Add reasoning → tool state machine transition #1160

Open
christiangenco wants to merge 1 commit into ml-explore:main from christiangenco:fix/reasoning-to-tool-transition

@christiangenco

Problem

When a tokenizer has both has_thinking and has_tool_calling set, the server-side state machine in _make_state_machine() has no edge from reasoning to tool. If the model emits the tool-call start token while still inside a <think> block (i.e. without first emitting </think>), the tool-call markers are accumulated as reasoning text and the tool parser never runs.

This affects models like Kimi K2.5 (mlx-community/Kimi-K2.5), which opens <think> at the start of every response and can emit <|tool_calls_section_begin|> from inside the think block without closing it first.
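To make the failure mode concrete, here is a minimal toy sketch of a token-level state machine with no reasoning → tool edge. It is not the mlx-lm implementation: the table layout, the label_tokens helper, the sample tokens, and the <|tool_calls_section_end|> end marker are all assumptions made for illustration.

```python
# Toy model only: names and table layout are illustrative, not mlx-lm's API.
THINK_END = "</think>"
TOOL_START = "<|tool_calls_section_begin|>"
TOOL_END = "<|tool_calls_section_end|>"  # assumed end marker

# state -> list of (trigger token, next state).
# Note that "reasoning" has no edge to "tool".
TRANSITIONS_WITHOUT_EDGE = {
    "reasoning": [(THINK_END, "normal")],
    "normal": [(TOOL_START, "tool")],
    "tool": [(TOOL_END, "normal")],
}


def label_tokens(tokens, transitions, state="reasoning"):
    """Attribute each non-trigger token to the state it was seen in."""
    labeled = []
    for tok in tokens:
        for trigger, next_state in transitions.get(state, []):
            if tok == trigger:
                state = next_state  # trigger tokens are consumed, not labeled
                break
        else:
            labeled.append((tok, state))
    return labeled


# Made-up token stream: the model never emits </think>.
tokens = ["Let", "me", "call", TOOL_START, '{"city": "Paris"}', TOOL_END]
labeled = label_tokens(tokens, TRANSITIONS_WITHOUT_EDGE)
for tok, state in labeled:
    print(f"{state:>9}: {tok}")
```

In this toy model every token, including the tool-call markers and the JSON payload, is attributed to reasoning, so nothing is ever handed to a tool parser.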

Fix

In _make_state_machine(), when both has_thinking and has_tool_calling are true, add a (tool_call_start_tokens, "tool") edge to the reasoning state. This lets the state machine transition from reasoning directly to tool on the tool start token, the same way it does from normal.

The fix reuses the ts = tokenizer.tool_call_start_tokens binding from the existing has_tool_calling block, so no re-encoding is needed.
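The shape of the change can be sketched as a small table builder. This is a hedged illustration rather than the body of _make_state_machine(): the build_transitions function, its parameters, the dict-of-lists layout, and the token ids are hypothetical, and only the added (ts, "tool") edge on the reasoning state mirrors what this PR describes.

```python
def build_transitions(has_thinking, has_tool_calling,
                      think_start, think_end, tool_start, tool_end):
    """Hypothetical builder: state -> list of (token sequence, next state)."""
    transitions = {"normal": []}
    if has_thinking:
        transitions["normal"].append((think_start, "reasoning"))
        transitions["reasoning"] = [(think_end, "normal")]
    if has_tool_calling:
        ts = tool_start  # stands in for tokenizer.tool_call_start_tokens
        transitions["normal"].append((ts, "tool"))
        transitions["tool"] = [(tool_end, "normal")]
        if has_thinking:
            # The edge described in this PR: reasoning -> tool on the
            # same start tokens that already trigger normal -> tool.
            transitions["reasoning"].append((ts, "tool"))
    return transitions


# Made-up token id sequences, for illustration only.
table = build_transitions(
    has_thinking=True, has_tool_calling=True,
    think_start=(100,), think_end=(101,),
    tool_start=(200,), tool_end=(201,),
)
print(table["reasoning"])

tools_only = build_transitions(
    has_thinking=False, has_tool_calling=True,
    think_start=(100,), think_end=(101,),
    tool_start=(200,), tool_end=(201,),
)
```

Because the new edge is only added when both flags are set, tokenizers with thinking only or tool calling only end up with the same table as before in this sketch.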

Testing

Added test_state_machine_reasoning_to_tool_transition in tests/test_generate.py. It constructs a SequenceStateMachine starting in reasoning, feeds reasoning-state tokens, then the tool start token, and asserts the machine transitions reasoning → tool → normal correctly.
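For readers who want the shape of that test without opening the diff, the following is a rough sketch against a toy stand-in. ToyStateMachine, its step method, and the integer token ids are hypothetical; the real test uses SequenceStateMachine, whose constructor and methods may differ from what is shown here.

```python
import unittest


class ToyStateMachine:
    """Hypothetical stand-in, not mlx-lm's SequenceStateMachine."""

    def __init__(self, transitions, initial):
        self.transitions = transitions
        self.state = initial

    def step(self, token):
        """Consume one token and return the state after it."""
        for trigger, next_state in self.transitions.get(self.state, []):
            if token == trigger:
                self.state = next_state
                break
        return self.state


THINK_END, TOOL_START, TOOL_END = 1, 2, 3  # made-up token ids


class ReasoningToToolTest(unittest.TestCase):
    def test_reasoning_to_tool_transition(self):
        sm = ToyStateMachine(
            {
                "reasoning": [(THINK_END, "normal"), (TOOL_START, "tool")],
                "normal": [(TOOL_START, "tool")],
                "tool": [(TOOL_END, "normal")],
            },
            initial="reasoning",
        )
        # Ordinary tokens keep the machine in reasoning.
        for tok in (10, 11, 12):
            self.assertEqual(sm.step(tok), "reasoning")
        # The tool start token moves it straight to tool, with no </think>.
        self.assertEqual(sm.step(TOOL_START), "tool")
        self.assertEqual(sm.step(20), "tool")
        # The tool end token returns it to normal.
        self.assertEqual(sm.step(TOOL_END), "normal")
```

Saved to a file, the sketch can be run with python -m unittest.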

Ran the full suite locally on Apple Silicon (Python 3.10, pip install -e ".[test]", HF_HOME=. with the release test_data bundle):

Ran 191 tests in 82.768s
OK (skipped=1)

Pre-commit (black + isort) passes.

Note on Kimi K2.5 specifically

This fix alone does not fully resolve Kimi K2.5 tool calling, because that model also emits the function identifier (functions.name:idx) before <|tool_calls_section_begin|>. The function name is therefore still consumed as reasoning text before the tool state begins, and the kimi_k2 parser sees only the bare JSON arguments. A separate parser enhancement would be needed for full Kimi K2.5 support.

However, the missing reasoning → tool transition is a correctness gap in its own right that affects any thinking + tool-calling model where tool calls may be emitted from inside a think block. This PR closes that gap.
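The remaining Kimi K2.5 limitation noted above can be illustrated with plain string splitting. The sample output, the get_weather function name, and the <|tool_calls_section_end|> end marker are assumptions made up for this sketch; it does not call the kimi_k2 parser or any mlx-lm code.

```python
import json

TOOL_START = "<|tool_calls_section_begin|>"
TOOL_END = "<|tool_calls_section_end|>"  # assumed end marker

# Made-up model output: the function identifier precedes the start marker.
output = (
    "I should look up the weather. functions.get_weather:0"
    + TOOL_START
    + '{"city": "Paris"}'
    + TOOL_END
)

# Everything before the start marker stays in the reasoning state.
reasoning_text, _, rest = output.partition(TOOL_START)
# Everything between the markers is what a tool parser would receive.
payload = rest.split(TOOL_END, 1)[0]

print("reasoning:", reasoning_text)
print("tool payload:", payload)
print("arguments:", json.loads(payload))
```

Even with the reasoning → tool edge in place, the payload in this sketch carries the arguments but not the function name, which is why a parser-side change would still be needed.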

Thinking models that emit tool calls without first closing the
<think> block (e.g. Kimi K2.5) never triggered the tool parser
because there was no reasoning -> tool edge in the state machine.
Add that transition when both has_thinking and has_tool_calling
are enabled on the tokenizer.