feat: add retry logic for chat completion streaming #1381

Open
PortRoyale wants to merge 1 commit into elie222:main from PortRoyale:feat/streaming-retry-logic

Conversation

@PortRoyale

Summary

Adds retry with exponential backoff for transient errors in chatCompletionStream. This helps with smaller models (e.g. 8B-parameter models), which may occasionally produce schema mistakes or encounter transient network issues.

Changes

  • Add retry loop with max 2 retries and exponential backoff (1s, 2s)
  • Detect JSON parse errors (SyntaxError, "Unexpected token")
  • Detect tool call/schema validation errors
  • Detect transient network errors (via existing isTransientNetworkError)
  • Improve error logging with error type classification
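The error detection described in the list above could be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the function name `isRetryableStreamError` and the exact match strings are assumptions based on the bullet points.

```typescript
// Hypothetical sketch of the error classification described above.
// Names and match strings are assumptions, not the PR's implementation.
function isRetryableStreamError(error: unknown): boolean {
  // JSON parse errors surface as SyntaxError or "Unexpected token" messages
  if (error instanceof SyntaxError) return true;
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes("Unexpected token")) return true;
  // Tool call / schema validation failures from smaller models
  if (/tool call|schema validation/i.test(message)) return true;
  // In the real code path, transient network errors would be delegated
  // to the existing isTransientNetworkError helper.
  return false;
}
```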

Motivation

The existing createGenerateObject and createGenerateText functions already have retry logic via withLLMRetry and withNetworkRetry. This brings similar resilience to the streaming chat completion path.

When using smaller local models (like 8B-parameter models via Ollama), occasional JSON malformation or schema validation failures can occur. Adding retry logic allows the system to recover gracefully from these transient failures.
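The retry behavior described above (max 2 retries, 1s then 2s backoff) could look roughly like this. This is a minimal sketch, not the PR's actual code: `withStreamRetry`, its signature, and the `baseDelayMs` parameter are assumptions for illustration.

```typescript
// Hypothetical sketch of the retry loop: max 2 retries with exponential
// backoff (1s, then 2s). withStreamRetry is an assumed name, not the PR's.
const MAX_RETRIES = 2;

async function withStreamRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  baseDelayMs = 1_000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up once the retry budget is spent, or on non-transient errors
      if (attempt >= MAX_RETRIES || !isRetryable(error)) throw error;
      // 1x the base delay on the first retry, 2x on the second
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

In the actual change, the operation would be the streaming chat completion call, and the retryable check would combine the JSON/schema detection with the existing isTransientNetworkError helper.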

Test plan

  • Verified TypeScript compiles without errors
  • Test with a local Ollama model to verify retry behavior on transient failures
  • Verify no regression in normal streaming behavior

🤖 Generated with Claude Code


Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

vercel bot commented Jan 23, 2026

@PortRoyale is attempting to deploy a commit to the Inbox Zero OSS Program Team on Vercel.

A member of the Team first needs to authorize it.


CLAassistant commented Jan 23, 2026

CLA assistant check
All committers have signed the CLA.


@cubic-dev-ai cubic-dev-ai bot left a comment


No issues found across 1 file
