-
-
Notifications
You must be signed in to change notification settings - Fork 12.9k
Bugfix: Prevent reasoning_content leak #32997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Bugfix: Prevent reasoning_content leak #32997
Conversation
…ool_calls This fixes a bug where reasoning_content was incorrectly flushed into the content field in the final streamed chunk when finish_reason='tool_calls'. The issue occurred when: - stream=true - OpenAI tool-call parser enabled - tool_choice='auto' - reasoning fields enabled (reasoning, reasoning_content) - speculative decoding enabled Per OpenAI's schema contract, when finish_reason='tool_calls', the response must only contain tool_calls and finish_reason, never content. Changes: 1. Add guard before final chunk creation to clear content/reasoning when finish_reason='tool_calls' 2. Add guards after tool call extraction in all paths (auto, required, named, harmony) to prevent content leakage during streaming 3. Ensure reasoning_content is never flushed into content when tool calls are present 4. Add test to verify no content leak when finish_reason=tool_calls Signed-off-by: RohanDisa <105740583+RohanDisa@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively addresses a bug where reasoning_content could leak into the content field during streaming with tool calls. The solution, which involves clearing content and reasoning fields at various points, is sound and is well-supported by a new regression test. My review includes a suggestion to ensure consistency in the fix across different code paths and a recommendation to refactor duplicated code to improve long-term maintainability.
| # in harmony path. Per OpenAI spec, tool call deltas | ||
| # must not contain content. | ||
| if tools_streamed_flag and delta_message: | ||
| delta_message.content = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with the other fixes in this file and to fully prevent data leakage, reasoning and reasoning_content should also be cleared here, not just content. This ensures that no residual reasoning data is present in tool call deltas in the harmony path.
| delta_message.content = None | |
| delta_message.content = None | |
| delta_message.reasoning = None | |
| delta_message.reasoning_content = None |
| # Ensure no content/reasoning leaks when tool calls | ||
| # are present. Per OpenAI spec, tool call deltas | ||
| # must not contain content. | ||
| delta_message.content = None | ||
| delta_message.reasoning = None | ||
| delta_message.reasoning_content = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This block of code for clearing content and reasoning fields is duplicated in several places within this function when handling tool calls. This code duplication can make future maintenance more difficult and increases the risk of inconsistent fixes.
To improve maintainability and reduce redundancy, I recommend refactoring this logic into a private helper method. For example:
def _clear_non_tool_call_fields(self, delta_message: DeltaMessage):
"""Ensures no content/reasoning leaks when tool calls are present."""
delta_message.content = None
delta_message.reasoning = None
delta_message.reasoning_content = NoneYou could then call this helper method in all the relevant locations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
Comment @cursor review or bugbot run to trigger another review on this PR
| # in harmony path. Per OpenAI spec, tool call deltas | ||
| # must not contain content. | ||
| if tools_streamed_flag and delta_message: | ||
| delta_message.content = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Harmony path misses clearing reasoning fields
Medium Severity
In the harmony path, when tools_streamed_flag is true, only delta_message.content is set to None, but reasoning and reasoning_content are not cleared. The extract_harmony_streaming_delta function can return a DeltaMessage with both reasoning and tool_calls set simultaneously (see stream_harmony.py lines 163-166). This allows reasoning to leak in intermediate streaming chunks for GPT-OSS models, which contradicts the PR's purpose. All other paths in this PR correctly clear all three fields.
Purpose
Fix a bug where reasoning_content was incorrectly flushed into content in the final streamed chunk when finish_reason='tool_calls'.
Root Cause & Solution
Stream finalization did not clear content/reasoning buffers when finish_reason='tool_calls', allowing leftover reasoning_content (especially from speculative decoding) to leak.
Added guards to clear content/reasoning fields before final chunk creation and after tool call extraction.
Test Plan
test_no_content_leak_when_finish_reason_tool_callsintests/entrypoints/openai/test_chat_with_tool_reasoning.py.Test Result
Fixes: #32921