fix: prevent stalled tool-call traces after Context Surface errors #27

Open
vishal-bala wants to merge 3 commits into main from fix/context-surface-tool-errors

Conversation

@vishal-bala vishal-bala commented May 15, 2026

Bugs found and fix proposed by @itay-ct

Motivation

Context Surface tool calls could fail in ways that left the UI trace in an unfinished state. In particular, validation failures could emit a tool-call event without a terminal tool-result event, so the frontend kept showing the tool as Running even after the SSE stream had completed.

The Reddash prompt also did not clearly tell the model which argument names Context Surface tools expect. That made it more likely for the model to call filter tools with field-specific keys like customer_id= instead of the required value=, or to call search_policy_by_text without the required query= argument.

Changes

Updated the Reddash system prompt to explicitly document Context Surface tool argument conventions. The prompt now instructs the model to pass a single value argument to filter tools, and documents search_policy_by_text as taking query. The common workflows in the prompt were also updated to show the correct call shape.
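To illustrate the argument conventions described above, here is a minimal sketch of a checker for tool-call shapes. It is not part of this PR: required_argument and check_call are hypothetical helpers, and filter_order_by_customer_id is a made-up tool name standing in for any filter_* tool. Only the value and query argument names come from the description.

```python
from typing import Any, Dict, List, Optional


def required_argument(tool_name: str) -> Optional[str]:
    """Return the argument name the prompt documents for a tool, if any."""
    if tool_name.startswith("filter_"):
        return "value"
    if tool_name == "search_policy_by_text":
        return "query"
    return None


def check_call(tool_name: str, args: Dict[str, Any]) -> List[str]:
    """List the ways a call deviates from the documented shape (hypothetical helper)."""
    required = required_argument(tool_name)
    if required is None:
        return []
    problems = []
    if required not in args:
        problems.append(f"missing required argument '{required}'")
    if tool_name.startswith("filter_"):
        # Filter tools are documented as taking a single argument,
        # so field-specific keys such as customer_id are flagged.
        problems += [f"unexpected argument '{k}'" for k in args if k != required]
    return problems


# Correct shape: the filter value is passed as value=.
print(check_call("filter_order_by_customer_id", {"value": "CUST_1"}))
# Shape the prompt change is meant to discourage: a field-specific key.
print(check_call("filter_order_by_customer_id", {"customer_id": "CUST_1"}))
```

The first call reports no problems; the second reports both the missing value argument and the unexpected customer_id key.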

Hardened the SSE bridge so tool traces always receive a terminal state when backend tool execution fails. The stream now handles LangGraph on_tool_error events by emitting a terminal tool-result with an error payload and duration. If the stream itself raises an exception, any pending tool calls are flushed as terminal error results before the final error and done events are emitted.
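The bookkeeping described above can be sketched roughly as follows. This is a simplified, synchronous illustration rather than the code in backend/app/main.py: the bridge function, the pending dictionary, and the payload field names (run_id, duration_ms, output, error, message) are assumptions, and the input events are plain dicts that only mimic the shape of LangGraph stream events.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator


def bridge(
    events: Iterable[Dict[str, Any]],
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Dict[str, Any]]:
    """Translate tool events into trace events, always closing every tool call."""
    pending: Dict[Any, Dict[str, Any]] = {}  # run_id -> name and start time

    def terminal(run_id: Any, payload: Dict[str, Any]) -> Dict[str, Any]:
        # Remove the call from the pending set and attach its duration.
        call = pending.pop(run_id, None)
        started = call["started"] if call else clock()
        return {
            "event": "tool-result",
            "run_id": run_id,
            "name": call["name"] if call else None,
            "duration_ms": int((clock() - started) * 1000),
            **payload,
        }

    try:
        for ev in events:
            kind, run_id = ev.get("event"), ev.get("run_id")
            data = ev.get("data") or {}
            if kind == "on_tool_start":
                pending[run_id] = {"name": ev.get("name"), "started": clock()}
                yield {"event": "tool-call", "run_id": run_id, "name": ev.get("name")}
            elif kind == "on_tool_end":
                yield terminal(run_id, {"output": data.get("output")})
            elif kind == "on_tool_error":
                # A failed tool still gets a terminal result, carrying the error.
                yield terminal(run_id, {"error": str(data.get("error"))})
    except Exception as exc:
        # Flush calls that never finished, then report the stream failure.
        for run_id in list(pending):
            yield terminal(run_id, {"error": f"stream failed: {exc}"})
        yield {"event": "error", "message": str(exc)}
    yield {"event": "done"}
```

In this sketch every tool-call is followed by exactly one tool-result, whether the tool succeeded, failed, or was interrupted by a stream-level exception, which is the invariant the UI trace relies on.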

Configured MCP tool wrappers to return structured JSON for validation errors instead of letting schema failures tear down the stream. This gives the agent a useful tool result and keeps the UI trace consistent.
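As a rough illustration of returning a structured result instead of raising, the sketch below wraps a tool function so that a validation failure becomes a JSON string. It deliberately avoids any framework: ToolValidationError, format_validation_error, safe_tool, and the JSON field names are all hypothetical, and the actual PR relies on the tool wrapper's handle_validation_error hook rather than a hand-written decorator like this one.

```python
import json
from typing import Any, Callable, List


class ToolValidationError(ValueError):
    """Hypothetical stand-in for a schema validation failure."""


def format_validation_error(tool_name: str, exc: Exception) -> str:
    """Serialize a validation failure as a JSON tool result (assumed field names)."""
    return json.dumps(
        {"error": "validation_error", "tool": tool_name, "message": str(exc)}
    )


def safe_tool(
    tool_name: str, func: Callable[..., Any], required: List[str]
) -> Callable[..., Any]:
    """Wrap func so missing required arguments yield JSON instead of an exception."""

    def wrapper(**kwargs: Any) -> Any:
        try:
            missing = [name for name in required if name not in kwargs]
            if missing:
                raise ToolValidationError(
                    "missing required argument(s): " + ", ".join(missing)
                )
            return func(**kwargs)
        except ToolValidationError as exc:
            # The agent receives a readable result and the stream keeps going.
            return format_validation_error(tool_name, exc)

    return wrapper


search = safe_tool(
    "search_policy_by_text", lambda query: f"results for {query}", ["query"]
)
print(search(query="refund"))  # normal tool output
print(search(text="refund"))   # JSON error describing the missing query argument
```

Returning the error as data lets the model read the message and retry with the correct argument name, while the trace still records a terminal result for the call.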

Updated frontend trace rendering so terminal error results are shown as Error, and calls that still have no result after stream completion are shown as No response instead of continuing to spin.
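The badge logic described above amounts to a small decision table. The sketch below expresses it in Python for consistency with the backend, although the real frontend code is not Python. The trace_status function and the shape of the result dict are assumptions, and "Done" is a placeholder for whatever label the UI uses for successful calls, which the description does not state.

```python
from typing import Any, Dict, Optional


def trace_status(result: Optional[Dict[str, Any]], stream_done: bool) -> str:
    """Pick the badge label for one tool call in the trace (illustrative only)."""
    if result is None:
        # No terminal result yet: keep spinning only while the stream is live.
        return "No response" if stream_done else "Running"
    if "error" in result:
        return "Error"
    return "Done"  # placeholder label for a successful result


print(trace_status(None, stream_done=False))
print(trace_status(None, stream_done=True))
print(trace_status({"error": "validation_error"}, stream_done=True))
```

The key change is the second case: once the stream has completed, a call without a result is no longer treated as in progress.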

Additional Changes

  • Added focused regression coverage for on_tool_error handling.
  • Added coverage for pending tool-call flushing on stream exceptions.
  • Added coverage verifying that MCP validation errors are formatted as structured JSON.
  • Added trace badge styling for errored and missing tool results.

Note

Medium Risk
Moderate risk because it changes SSE event sequencing and tool run-id bookkeeping used by the UI trace; mistakes could break streaming UX or misattribute tool results.

Overview
Prevents tool traces from getting stuck in a perpetual Running state by hardening the chat SSE bridge to always emit a terminal tool-result.

The backend now tracks pending tool calls, normalizes tool outputs, handles LangGraph on_tool_error events, and flushes any still-pending tool calls as error tool-results before emitting the stream's error and done events. MCP tool wrappers also return structured JSON on input validation failures via handle_validation_error.

Updates the Reddash prompt to clarify required tool argument names (value for filter_* tools, query for search_policy_by_text), and updates the frontend trace UI to label tool outcomes as Error or No response (with matching CSS) once the stream completes. Adds regression tests covering on_tool_error, missing run_id matching, pending-tool flushing, and validation error formatting.

Reviewed by Cursor Bugbot for commit 819baa9. Bugbot is set up for automated code reviews on this repo.

vishal-bala and others added 2 commits May 15, 2026 10:14
@vishal-bala vishal-bala self-assigned this May 15, 2026
@vishal-bala vishal-bala requested a review from jeremyplichta May 15, 2026 13:42
jit-ci Bot commented May 15, 2026

🛡️ Jit Security Scan Results

✅ No security findings were detected in this PR


Security scan by Jit

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 6c21bcc.

Comment thread backend/app/main.py
Co-authored-by: Itay Tevel <itay.tevel@redis.com>