Skip to content

fix: merge duplicate-index entries in streaming delta accumulator (#3201)#3446

Open
C1-BA-B1-F3 wants to merge 1 commit into
openai:mainfrom
C1-BA-B1-F3:fix/duplicate-index-streaming
Open

fix: merge duplicate-index entries in streaming delta accumulator (#3201)#3446
C1-BA-B1-F3 wants to merge 1 commit into
openai:mainfrom
C1-BA-B1-F3:fix/duplicate-index-streaming

Conversation

@C1-BA-B1-F3

Copy link
Copy Markdown

Problem

When the first streamed chunk contains multiple tool_calls entries with the same index (e.g. id/name + start of arguments, both at index 0), the accumulator stores them as separate list entries. Subsequent chunks only merge into the first one at that position, leaving the duplicate stranded with incomplete arguments.

This produces invalid final JSON for tool call arguments — some argument fragments end up in a second list entry that nobody reads.

Root cause: The accumulate_delta function stores the first encounter of a list value directly via acc[key] = delta_value without checking for duplicate indices.

Fix

Add _normalize_indexed_list() that merges list entries sharing the same integer index key before first storage. Applied to both:

  • src/openai/lib/streaming/_deltas.py (chat completions)
  • src/openai/lib/streaming/_assistants.py (assistants API)

Test plan

  • New test file: tests/test_accumulate_delta.py with 6 test cases covering:
    • Single index in first chunk (baseline)
    • Duplicate index in first chunk (the bug)
    • Duplicate index preserves type and id
    • Multiple distinct indices in first chunk
    • Multiple indices with duplicates in first chunk
    • Non-indexed lists unchanged

All 6 new tests pass. All 15 existing streaming tests pass.

Fixes #3201

When the first streamed chunk contains multiple tool_calls entries with
the same  (e.g. id/name + start of arguments, both at index 0),
the accumulator stored them as separate list entries. Subsequent chunks
only merged into the first one at that position, leaving the duplicate
stranded with incomplete arguments.

Fix: add  that merges list entries sharing
the same integer  key before first storage. Applied to both
 (chat completions) and  (assistants API).

Fixes openai#3201
@C1-BA-B1-F3 C1-BA-B1-F3 requested a review from a team as a code owner June 25, 2026 23:22

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 75b895d3d4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +33 to +34
if is_list(delta_value) and delta_value:
acc[key] = _normalize_indexed_list(delta_value)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Normalize duplicate tool calls in the initial chunk

When the duplicate tool_calls entries arrive in the very first chat streaming chunk, this branch is never reached: ChatCompletionStreamState._accumulate_chunk() returns _convert_initial_chunk_into_snapshot() while the snapshot is None, and that path copies choice.delta.to_dict() directly. In that provider scenario, the first chunk still stores two entries with index == 0, so the next chunk merges into tool_calls[0] and leaves the argument prefix stranded in tool_calls[1]; apply the same normalization in the initial conversion path or route the first chunk through the accumulator.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Streaming tool_call deltas with duplicate indexes in first chunk are accumulated incorrectly

1 participant