Add reasoning → tool state machine transition by christiangenco · Pull Request #1160 · ml-explore/mlx-lm

christiangenco · 2026-04-16T15:47:48Z

Problem

When a tokenizer has both has_thinking and has_tool_calling set, the server-side state machine in _make_state_machine() has no edge from reasoning to tool. If the model emits the tool-call start token while still inside a <think> block (i.e. without first emitting </think>), the tool-call markers are accumulated as reasoning text and the tool parser never runs.

This affects models like Kimi K2.5 (mlx-community/Kimi-K2.5), which opens <think> at the start of every response and will happily emit <|tool_calls_section_begin|> from inside the think block.

Fix

In _make_state_machine(), when both has_thinking and has_tool_calling are true, add a (tool_call_start_tokens, "tool") edge to the reasoning state. This lets the state machine transition from reasoning directly to tool on the tool start token, the same way it does from normal.
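The shape of the change can be sketched as a transition table. This is a minimal, self-contained illustration and not the actual body of `_make_state_machine()`: the token IDs, the `make_transitions` helper, and the dictionary layout are all assumptions made up for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical token-ID sequences, for illustration only.
THINK_START = (1,)
THINK_END = (2,)
TOOL_START = (3,)
TOOL_END = (4,)

Edge = Tuple[Tuple[int, ...], str]


def make_transitions(has_thinking: bool, has_tool_calling: bool) -> Dict[str, List[Edge]]:
    """Build a state -> [(trigger token sequence, next state)] table."""
    transitions: Dict[str, List[Edge]] = {"normal": []}
    if has_thinking:
        transitions["normal"].append((THINK_START, "reasoning"))
        transitions["reasoning"] = [(THINK_END, "normal")]
    if has_tool_calling:
        transitions["normal"].append((TOOL_START, "tool"))
        transitions["tool"] = [(TOOL_END, "normal")]
        if has_thinking:
            # The edge described above: reasoning -> tool on the tool start token.
            transitions["reasoning"].append((TOOL_START, "tool"))
    return transitions
```

With both flags set, the `reasoning` state in this sketch carries two outgoing edges: one back to `normal` on the think-end token and one to `tool` on the tool start token.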

The fix reuses the ts = tokenizer.tool_call_start_tokens binding from the existing has_tool_calling block, so no re-encoding is needed.

Testing

Added test_state_machine_reasoning_to_tool_transition in tests/test_generate.py. It constructs a SequenceStateMachine starting in reasoning, feeds reasoning-state tokens, then the tool start token, and asserts the machine transitions reasoning → tool → normal correctly.
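For readers without the repository at hand, the behaviour the test checks can be approximated with a toy machine. This sketch does not use the real `SequenceStateMachine` API; `TRANSITIONS`, `trace_states`, and the single-token markers are hypothetical stand-ins chosen for the example.

```python
from typing import Dict, List

# Hypothetical single-token markers; a real tokenizer may use multi-token sequences.
THINK_END, TOOL_START, TOOL_END = 101, 102, 103

# Toy transition table that includes the reasoning -> tool edge.
TRANSITIONS: Dict[str, Dict[int, str]] = {
    "normal": {TOOL_START: "tool"},
    "reasoning": {THINK_END: "normal", TOOL_START: "tool"},
    "tool": {TOOL_END: "normal"},
}


def trace_states(tokens: List[int], initial: str = "reasoning") -> List[str]:
    """Return the state the toy machine is in after consuming each token."""
    state = initial
    states = []
    for tok in tokens:
        # Stay in the current state unless the token triggers an edge.
        state = TRANSITIONS[state].get(tok, state)
        states.append(state)
    return states


def test_reasoning_to_tool_transition() -> None:
    # Two reasoning tokens, tool start, one tool token, tool end, one normal token.
    tokens = [7, 8, TOOL_START, 9, TOOL_END, 10]
    assert trace_states(tokens) == [
        "reasoning", "reasoning", "tool", "tool", "normal", "normal",
    ]
```

Removing the `TOOL_START` entry from the `reasoning` row makes the toy machine stay in `reasoning` for the whole sequence, which mirrors the bug described under Problem.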

Ran the full suite locally on Apple Silicon (Python 3.10, pip install -e ".[test]", HF_HOME=. with the release test_data bundle):

Ran 191 tests in 82.768s
OK (skipped=1)

Pre-commit (black + isort) passes.

Note on Kimi K2.5 specifically

This fix alone does not fully resolve Kimi K2.5 tool calling, because that model also emits the function identifier (functions.name:idx) before <|tool_calls_section_begin|>. The function name is therefore still consumed as reasoning text before the tool state begins, and the kimi_k2 parser only sees the bare JSON arguments. A separate parser enhancement would be needed for full Kimi K2.5 support.
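The ordering problem can be seen with plain string handling. The sketch below is only an illustration of the description above: the sample string is invented for the example (it is not captured model output), and `split_at_tool_start` is a hypothetical helper, not part of the kimi_k2 parser.

```python
from typing import Tuple

TOOL_START = "<|tool_calls_section_begin|>"


def split_at_tool_start(text: str) -> Tuple[str, str]:
    """Split text into what precedes the tool start marker and what follows it."""
    head, _, tail = text.partition(TOOL_START)
    return head, tail


# Invented sample that follows the ordering described above:
# the function identifier appears before the tool start marker.
sample = "Need the forecast. functions.get_weather:0" + TOOL_START + '{"city": "Paris"}'

reasoning_text, tool_text = split_at_tool_start(sample)
# reasoning_text keeps the identifier; tool_text holds only the JSON arguments.
```

Under these assumptions the identifier lands on the reasoning side of the split, so anything that only inspects the text after the marker has no function name to work with.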

However, the missing reasoning → tool transition is a correctness gap in its own right that affects any thinking + tool-calling model where tool calls may be emitted from inside a think block. This PR closes that gap.

Thinking models that emit tool calls without first closing the <think> block (e.g. Kimi K2.5) never triggered the tool parser because there was no reasoning -> tool edge in the state machine. Add that transition when both has_thinking and has_tool_calling are enabled on the tokenizer.

christiangenco wants to merge 1 commit into ml-explore:main from christiangenco:fix/reasoning-to-tool-transition
