feat: http error handling#1832
Conversation
📝 WalkthroughWalkthroughThis pull request adds HTTP status code propagation from provider and client exceptions through the LLM call chain. The Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant API as API Endpoint
participant RailEngine as Rail Engine
participant LLMProvider as LLM Provider
participant Handler as Error Handler
Client->>API: Request
API->>RailEngine: Execute Rails
RailEngine->>LLMProvider: Call LLM
LLMProvider-->>Handler: Raise Exception (with status)
Handler->>Handler: Extract status_code
Handler->>RailEngine: LLMCallException(status=code)
RailEngine->>Handler: Handle Error in Stream/Response
Handler->>Handler: Redact Secrets
Handler->>Handler: Set error.type="downstream_error"<br/>Set error.code=status
Handler-->>API: Error JSON with Status
API->>API: Classify Error Type
alt Has Status Code
API-->>Client: JSONResponse(status_code=extracted)
else No Status Code
API-->>Client: JSONResponse(status_code=500)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nemoguardrails/guardrails/iorails.py (1)
403-410:⚠️ Potential issue | 🟠 Major
downstream_errorchunks are not bypassed before output rails.Since
_generation_taskcan emiterror.type="downstream_error", the current bypass check (onlygeneration_error) can route these error chunks through output rails and alter/mask the original downstream error semantics.🔧 Proposed fix
for chunk in user_output_chunks: try: parsed = json.loads(chunk) - if isinstance(parsed, dict) and parsed.get("error", {}).get("type") == _GENERATION_ERROR_TYPE: + error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None + if error_type in {_GENERATION_ERROR_TYPE, "downstream_error"}: yield chunk return except (json.JSONDecodeError, TypeError): pass🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoguardrails/guardrails/iorails.py` around lines 403 - 410, The current bypass in the user_output_chunks loop only checks for _GENERATION_ERROR_TYPE, so error chunks with error.type == "downstream_error" emitted by _generation_task can be sent through output rails and get mutated; update the condition inside the loop that examines parsed.get("error", {}).get("type") to also detect and immediately yield/return when the type is "downstream_error" (or more generally any error types produced by _generation_task), e.g., extend the check that references _GENERATION_ERROR_TYPE in the block processing user_output_chunks to include "downstream_error" so those chunks are bypassed before hitting output rails.
🧹 Nitpick comments (1)
tests/test_http_error_handling.py (1)
295-347: Add one regression test for IORails output-rails streaming passthrough.Current streaming tests validate
process_chunk, but they don’t exercise thenemoguardrails/guardrails/iorails.pypath wheredownstream_errorchunks should bypass output rails unchanged.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_http_error_handling.py` around lines 295 - 347, Add a regression test that exercises the nemoguardrails/guardrails/iorails.py streaming passthrough path to verify that chunks with error.type == "downstream_error" are forwarded unchanged (bypassing output rails) rather than being transformed; specifically, create a test that feeds a downstream_error JSON payload (same shape as in existing tests that use process_chunk/ChunkError) into the iorails streaming handler used in production and assert that the emitted chunk equals the original payload (or that the handler returns/passes through a ChunkError whose .error fields match the input exactly), reusing the existing error payload shapes and symbols process_chunk and ChunkError to build the input and assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@nemoguardrails/guardrails/iorails.py`:
- Around line 403-410: The current bypass in the user_output_chunks loop only
checks for _GENERATION_ERROR_TYPE, so error chunks with error.type ==
"downstream_error" emitted by _generation_task can be sent through output rails
and get mutated; update the condition inside the loop that examines
parsed.get("error", {}).get("type") to also detect and immediately yield/return
when the type is "downstream_error" (or more generally any error types produced
by _generation_task), e.g., extend the check that references
_GENERATION_ERROR_TYPE in the block processing user_output_chunks to include
"downstream_error" so those chunks are bypassed before hitting output rails.
---
Nitpick comments:
In `@tests/test_http_error_handling.py`:
- Around line 295-347: Add a regression test that exercises the
nemoguardrails/guardrails/iorails.py streaming passthrough path to verify that
chunks with error.type == "downstream_error" are forwarded unchanged (bypassing
output rails) rather than being transformed; specifically, create a test that
feeds a downstream_error JSON payload (same shape as in existing tests that use
process_chunk/ChunkError) into the iorails streaming handler used in production
and assert that the emitted chunk equals the original payload (or that the
handler returns/passes through a ChunkError whose .error fields match the input
exactly), reusing the existing error payload shapes and symbols process_chunk
and ChunkError to build the input and assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: dddc9e53-a649-43a1-8489-1b45475ea967
📒 Files selected for processing (7)
nemoguardrails/actions/llm/utils.pynemoguardrails/exceptions.pynemoguardrails/guardrails/iorails.pynemoguardrails/guardrails/rail_action.pynemoguardrails/rails/llm/llmrails.pynemoguardrails/server/api.pytests/test_http_error_handling.py
Greptile SummaryThis PR fixes Guardrails always returning HTTP 200 by propagating downstream LLM/API HTTP status codes through to the Guardrails API response for non-streaming requests, and embedding status codes in SSE error chunk payloads for streaming responses.
|
| Filename | Overview |
|---|---|
| nemoguardrails/actions/llm/utils.py | Adds _extract_http_status helper and threads the extracted status into LLMCallException; logic is correct and well-guarded (returns None for status ≤ 0). |
| nemoguardrails/exceptions.py | Adds optional status field to LLMCallException; backward-compatible change with correct defaults. |
| nemoguardrails/guardrails/iorails.py | Inline error-building in _generation_task now emits downstream_error/status for HTTP errors; error-chunk detection in _stream_output_rails correctly expanded to match both error types. |
| nemoguardrails/guardrails/rail_action.py | New ModelEngineError/APIEngineError handler re-raises when a real HTTP status is present and fails closed otherwise; secrets are redacted in the RailResult reason. |
| nemoguardrails/rails/llm/llmrails.py | Introduces _build_streaming_error_payload to normalize all exception shapes into a consistent SSE error chunk; hardcodes "generation_error" string that must stay in sync with _GENERATION_ERROR_TYPE in iorails.py. |
| nemoguardrails/server/api.py | New except (LLMCallException, ModelEngineError, APIEngineError) block returns a JSONResponse with the propagated HTTP status; ChunkErrorMetadata.code widened to Union[str, int, None] to carry integer status codes. |
| tests/test_http_error_handling.py | Comprehensive new test suite covering status extraction, exception construction, rail-action routing, API endpoint propagation, and streaming error chunks. |
Sequence Diagram
sequenceDiagram
participant Client
participant API as api.py /v1/chat/completions
participant Rails as LLMRails / IORails
participant LLM as Downstream LLM/API
Note over Client,LLM: Non-streaming path
Client->>API: "POST (stream=false)"
API->>Rails: generate_async()
Rails->>LLM: HTTP call
LLM-->>Rails: HTTP 401/429/503 error
Rails->>Rails: "_extract_http_status(e) → status=401"
Rails-->>API: "raise LLMCallException(status=401)"
API->>API: except (LLMCallException, ModelEngineError, APIEngineError)
API->>API: "status = getattr(ex, status, None) or 500"
API-->>Client: "JSONResponse(status_code=401, body=OpenAI-shaped error)"
Note over Client,LLM: Streaming path
Client->>API: "POST (stream=true)"
API->>Rails: stream_async() → AsyncIterator
API-->>Client: StreamingResponse(200, text/event-stream)
Rails->>LLM: HTTP call
LLM-->>Rails: HTTP 429 error
Rails->>Rails: _build_streaming_error_payload(e)
Rails->>Rails: iorails checks error_type in (generation_error, downstream_error)
Rails-->>Client: "data: error chunk with code=429"
Rails-->>Client: data: [DONE]
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
nemoguardrails/rails/llm/llmrails.py:2097-2098
**Hardcoded `"generation_error"` string must stay in sync with `_GENERATION_ERROR_TYPE`**
`_build_streaming_error_payload` uses the literal string `"generation_error"` for the `error_type` fallback, while `iorails.py` defines `_GENERATION_ERROR_TYPE = "generation_error"` as a constant and checks `error_type in (_GENERATION_ERROR_TYPE, "downstream_error")` at line 780 when deciding whether to short-circuit an error chunk. If the constant is ever renamed in `iorails.py` (or if a third location adopts a different spelling), only one copy would be updated and error chunks produced by the LLMRails streaming path would silently escape into the output-rails pipeline instead of being yielded and terminating the stream. Consider moving the constant to a shared module (e.g., `nemoguardrails.guardrails.guardrails_types`) that both `iorails.py` and `llmrails.py` can import.
Reviews (8): Last reviewed commit: ":construction: added initial http error ..." | Re-trigger Greptile
| "type": "downstream_error" if status else _GENERATION_ERROR_TYPE, | ||
| "code": status or "generation_failed", |
There was a problem hiding this comment.
Inconsistent truthy vs
is not None check for status
if status on line 312 treats status=0 as falsy and falls back to _GENERATION_ERROR_TYPE, while rail_action.py uses if e.status is not None: (treating 0 as a real status). Similarly, llmrails.py uses if status is not None:. Using different semantics for the same sentinel value (status=0 meaning "no HTTP response") across these three files will produce inconsistent type / code fields in error payloads if a zero-status value ever reaches this path. Prefer a uniform if status is not None: guard throughout.
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/guardrails/iorails.py
Line: 312-313
Comment:
**Inconsistent truthy vs `is not None` check for `status`**
`if status` on line 312 treats `status=0` as falsy and falls back to `_GENERATION_ERROR_TYPE`, while `rail_action.py` uses `if e.status is not None:` (treating 0 as a real status). Similarly, `llmrails.py` uses `if status is not None:`. Using different semantics for the same sentinel value (`status=0` meaning "no HTTP response") across these three files will produce inconsistent `type` / `code` fields in error payloads if a zero-status value ever reaches this path. Prefer a uniform `if status is not None:` guard throughout.
How can I resolve this? If you propose a fix, please make it concise.| Returns ``None`` when no status can be determined or when the status | ||
| is ``0`` (used by ``LLMTimeoutError`` / ``LLMConnectionError`` for | ||
| client-side failures where no HTTP response was received). | ||
| """ | ||
| if isinstance(exception, LLMClientError): | ||
| return exception.status_code if exception.status_code > 0 else None | ||
|
|
||
| status = getattr(exception, "status_code", None) | ||
| if isinstance(status, int) and status > 0: | ||
| return status | ||
|
|
||
| response = getattr(exception, "response", None) | ||
| if response is not None: | ||
| status = getattr(response, "status_code", None) | ||
| if isinstance(status, int) and status > 0: | ||
| return status | ||
|
|
||
| return None | ||
|
|
||
|
|
||
| def _raise_llm_call_exception( | ||
| exception: Exception, | ||
| model: LLMModel, |
There was a problem hiding this comment.
_redact_secrets imported from a private module in multiple new call-sites
This PR adds imports of _redact_secrets (name-mangled with a leading _) from nemoguardrails.llm.clients._errors in iorails.py, llmrails.py, and api.py. Three distinct public modules now depend on an internal implementation detail. If the _errors module is refactored or renamed, all three import sites break silently at load time. Consider promoting _redact_secrets to a public utility (e.g., in nemoguardrails.utils or a dedicated nemoguardrails.llm.clients.utils) so callers have a stable contract.
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/actions/llm/utils.py
Line: 322-344
Comment:
**`_redact_secrets` imported from a private module in multiple new call-sites**
This PR adds imports of `_redact_secrets` (name-mangled with a leading `_`) from `nemoguardrails.llm.clients._errors` in `iorails.py`, `llmrails.py`, and `api.py`. Three distinct public modules now depend on an internal implementation detail. If the `_errors` module is refactored or renamed, all three import sites break silently at load time. Consider promoting `_redact_secrets` to a public utility (e.g., in `nemoguardrails.utils` or a dedicated `nemoguardrails.llm.clients.utils`) so callers have a stable contract.
How can I resolve this? If you propose a fix, please make it concise.| error_dict = extract_error_json(str(e)) | ||
| if "error" in error_dict: | ||
| error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", "")) | ||
| status = getattr(e, "status", None) | ||
| if status is not None: |
There was a problem hiding this comment.
Non-dict
"error" value from extract_error_json causes AttributeError, breaking stream termination
extract_error_json can return {"error": "some string"} (a non-dict value) when the upstream JSON body has the pattern {"error": "Unauthorized"} — parsed via json.loads or ast.literal_eval on the split fragment. The new code then calls error_dict["error"].get("message", "") which raises AttributeError: 'str' object has no attribute 'get'. This exception escapes the except handler block without pushing END_OF_STREAM to the streaming handler, leaving the client hanging indefinitely.
The same defect is present in the identical block around line 1271 (_generation_task).
Guard with an isinstance check before accessing the sub-dict:
if isinstance(error_dict.get("error"), dict):
error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
status = getattr(e, "status", None)
if status is not None:
error_dict["error"]["code"] = status
error_dict["error"]["type"] = "downstream_error"Apply the same fix to both error-enrichment blocks.
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/rails/llm/llmrails.py
Line: 903-907
Comment:
**Non-dict `"error"` value from `extract_error_json` causes `AttributeError`, breaking stream termination**
`extract_error_json` can return `{"error": "some string"}` (a non-dict value) when the upstream JSON body has the pattern `{"error": "Unauthorized"}` — parsed via `json.loads` or `ast.literal_eval` on the split fragment. The new code then calls `error_dict["error"].get("message", "")` which raises `AttributeError: 'str' object has no attribute 'get'`. This exception escapes the `except` handler block without pushing `END_OF_STREAM` to the streaming handler, leaving the client hanging indefinitely.
The same defect is present in the identical block around line 1271 (`_generation_task`).
Guard with an `isinstance` check before accessing the sub-dict:
```python
if isinstance(error_dict.get("error"), dict):
error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
status = getattr(e, "status", None)
if status is not None:
error_dict["error"]["code"] = status
error_dict["error"]["type"] = "downstream_error"
```
Apply the same fix to both error-enrichment blocks.
How can I resolve this? If you propose a fix, please make it concise.| error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None | ||
| if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"): | ||
| yield chunk | ||
| return | ||
| except (json.JSONDecodeError, TypeError): |
There was a problem hiding this comment.
AttributeError not caught when "error" is a plain string
parsed.get("error", {}) returns the stored value (e.g. "Unauthorized") when the key is present, not the {} default. Calling .get("type") on a str raises AttributeError, which is not caught by except (json.JSONDecodeError, TypeError). This silently crashes the streaming generator.
This is newly reachable in this PR: extract_error_json can return {"error": "Unauthorized"} (string value) when an upstream provider responds with {"error": "some string"}, and llmrails.py pushes that raw chunk to the streaming handler without converting it to dict format when the isinstance guard fails.
Replace the attribute access with an isinstance guard:
error_obj = parsed.get("error") if isinstance(parsed, dict) else None
error_type = error_obj.get("type") if isinstance(error_obj, dict) else None| error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None | |
| if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"): | |
| yield chunk | |
| return | |
| except (json.JSONDecodeError, TypeError): | |
| parsed = json.loads(chunk) | |
| error_obj = parsed.get("error") if isinstance(parsed, dict) else None | |
| error_type = error_obj.get("type") if isinstance(error_obj, dict) else None | |
| if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"): |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/guardrails/iorails.py
Line: 408-412
Comment:
**`AttributeError` not caught when `"error"` is a plain string**
`parsed.get("error", {})` returns the stored value (e.g. `"Unauthorized"`) when the key is present, not the `{}` default. Calling `.get("type")` on a `str` raises `AttributeError`, which is not caught by `except (json.JSONDecodeError, TypeError)`. This silently crashes the streaming generator.
This is newly reachable in this PR: `extract_error_json` can return `{"error": "Unauthorized"}` (string value) when an upstream provider responds with `{"error": "some string"}`, and `llmrails.py` pushes that raw chunk to the streaming handler without converting it to dict format when the `isinstance` guard fails.
Replace the attribute access with an `isinstance` guard:
```python
error_obj = parsed.get("error") if isinstance(parsed, dict) else None
error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
```
```suggestion
parsed = json.loads(chunk)
error_obj = parsed.get("error") if isinstance(parsed, dict) else None
error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"):
```
How can I resolve this? If you propose a fix, please make it concise.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
ac3cde2 to
2a1b6d0
Compare
| except Exception as e: | ||
| # If an exception occurs during generation, push it to the streaming handler as a json string | ||
| # This ensures the streaming pipeline is properly terminated | ||
| log.error(f"Error in generation task: {e}", exc_info=True) | ||
| error_message = str(e) | ||
| error_dict = extract_error_json(error_message) | ||
| error_dict = extract_error_json(str(e)) | ||
| if isinstance(error_dict.get("error"), dict): | ||
| error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", "")) | ||
| status = getattr(e, "status", None) | ||
| if status is not None: | ||
| error_dict["error"]["code"] = status | ||
| error_dict["error"]["type"] = "downstream_error" | ||
| error_payload = json.dumps(error_dict) | ||
| await streaming_handler.push_chunk(error_payload) | ||
| await streaming_handler.push_chunk(END_OF_STREAM) # type: ignore |
There was a problem hiding this comment.
_redact_secrets skipped when error value is a plain string
When extract_error_json parses the exception message into {"error": "some plain string"} (e.g. {"error": "Unauthorized"}), isinstance(error_dict.get("error"), dict) is False and the branch that calls _redact_secrets is never entered. The unredacted string is then serialised directly into the SSE error chunk via json.dumps(error_dict). The same issue exists in the generate_async exception handler (~line 903).
For third-party provider exceptions that don't go through _build_error_fields (i.e. not LLMClientError subclasses), str(e) may contain raw API keys. The dict-format path gets redacted; the string-format path does not. Apply _redact_secrets in the else branch:
error_dict = extract_error_json(str(e))
error_val = error_dict.get("error")
if isinstance(error_val, dict):
error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
status = getattr(e, "status", None)
if status is not None:
error_dict["error"]["code"] = status
error_dict["error"]["type"] = "downstream_error"
elif isinstance(error_val, str):
error_dict["error"] = _redact_secrets(error_val)Apply the same fix to the identical block in generate_async (~line 903).
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/rails/llm/llmrails.py
Line: 1270-1281
Comment:
**`_redact_secrets` skipped when error value is a plain string**
When `extract_error_json` parses the exception message into `{"error": "some plain string"}` (e.g. `{"error": "Unauthorized"}`), `isinstance(error_dict.get("error"), dict)` is `False` and the branch that calls `_redact_secrets` is never entered. The unredacted string is then serialised directly into the SSE error chunk via `json.dumps(error_dict)`. The same issue exists in the `generate_async` exception handler (~line 903).
For third-party provider exceptions that don't go through `_build_error_fields` (i.e. not `LLMClientError` subclasses), `str(e)` may contain raw API keys. The dict-format path gets redacted; the string-format path does not. Apply `_redact_secrets` in the `else` branch:
```python
error_dict = extract_error_json(str(e))
error_val = error_dict.get("error")
if isinstance(error_val, dict):
error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
status = getattr(e, "status", None)
if status is not None:
error_dict["error"]["code"] = status
error_dict["error"]["type"] = "downstream_error"
elif isinstance(error_val, str):
error_dict["error"] = _redact_secrets(error_val)
```
Apply the same fix to the identical block in `generate_async` (~line 903).
How can I resolve this? If you propose a fix, please make it concise.| elif isinstance(error_val, str): | ||
| error_dict["error"] = _redact_secrets(error_val) |
There was a problem hiding this comment.
String-format error chunks drop the status code and bypass iorails detection
When extract_error_json returns {"error": "Unauthorized"} (a string value — as produced by providers that use {"error": "some string"}), the elif isinstance(error_val, str) branch sets error_dict["error"] = _redact_secrets(error_val), producing {"error": "redacted string"}. Two problems follow:
iorails.pycheckserror_type in (_GENERATION_ERROR_TYPE, "downstream_error"), buterror_typeisNonewhenerror_objis a string, so the chunk is never detected as an error and leaks into the output rails pipeline instead of being yielded and terminating the stream.- The
statuscomputed on line 907 is never embedded in the chunk — the HTTP code is silently dropped.
The same defect appears in both generate_async (~line 911) and _generation_task (~line 1283).
Normalize string-format errors to the expected dict structure so the status code is preserved and iorails.py can detect it:
elif isinstance(error_val, str):
status = getattr(e, "status", None)
error_dict["error"] = {
"message": _redact_secrets(error_val),
"type": "downstream_error" if status is not None else _GENERATION_ERROR_TYPE,
"code": status if status is not None else "generation_failed",
}Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/rails/llm/llmrails.py
Line: 911-912
Comment:
**String-format error chunks drop the status code and bypass iorails detection**
When `extract_error_json` returns `{"error": "Unauthorized"}` (a string value — as produced by providers that use `{"error": "some string"}`), the `elif isinstance(error_val, str)` branch sets `error_dict["error"] = _redact_secrets(error_val)`, producing `{"error": "redacted string"}`. Two problems follow:
1. `iorails.py` checks `error_type in (_GENERATION_ERROR_TYPE, "downstream_error")`, but `error_type` is `None` when `error_obj` is a string, so the chunk is never detected as an error and leaks into the output rails pipeline instead of being yielded and terminating the stream.
2. The `status` computed on line 907 is never embedded in the chunk — the HTTP code is silently dropped.
The same defect appears in both `generate_async` (~line 911) and `_generation_task` (~line 1283).
Normalize string-format errors to the expected dict structure so the status code is preserved and `iorails.py` can detect it:
```python
elif isinstance(error_val, str):
status = getattr(e, "status", None)
error_dict["error"] = {
"message": _redact_secrets(error_val),
"type": "downstream_error" if status is not None else _GENERATION_ERROR_TYPE,
"code": status if status is not None else "generation_failed",
}
```
How can I resolve this? If you propose a fix, please make it concise.d3198e2 to
99a4932
Compare
| if isinstance(error_val, dict): | ||
| error_val["message"] = _redact_secrets(error_val.get("message", "")) | ||
| if status is not None: | ||
| error_val["code"] = status | ||
| error_val["type"] = "downstream_error" | ||
| elif isinstance(error_val, str): | ||
| error_dict["error"] = { | ||
| "message": _redact_secrets(error_val), | ||
| "type": "downstream_error" if status is not None else "generation_error", | ||
| "code": status if status is not None else "generation_failed", | ||
| } |
There was a problem hiding this comment.
Missing fallback
type field for dict-format errors without status code
When error_val is a dict and status is None (e.g., a generic Python exception whose string representation was parsed to {"error": {"message": "..."}} by extract_error_json), neither branch sets error_val["type"]. The iorails.py detection at line 539 checks error_type in ("generation_error", "downstream_error"). Without a type field, error_type is None, the condition is always False, and the error chunk is never short-circuited — raw error JSON leaks into the output-rails pipeline and eventually reaches the client as the bot response.
This is the most common path: generic Python errors (memory errors, asyncio timeouts, network errors at the Python level) all produce dict-format errors with no HTTP status. Add a fallback in the dict branch:
if isinstance(error_val, dict):
error_val["message"] = _redact_secrets(error_val.get("message", ""))
if status is not None:
error_val["code"] = status
error_val["type"] = "downstream_error"
else:
error_val.setdefault("type", "generation_error")The identical gap exists in the _generation_task handler (~line 1285).
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/rails/llm/llmrails.py
Line: 906-916
Comment:
**Missing fallback `type` field for dict-format errors without status code**
When `error_val` is a `dict` and `status is None` (e.g., a generic Python exception whose string representation was parsed to `{"error": {"message": "..."}}` by `extract_error_json`), neither branch sets `error_val["type"]`. The `iorails.py` detection at line 539 checks `error_type in ("generation_error", "downstream_error")`. Without a `type` field, `error_type` is `None`, the condition is always `False`, and the error chunk is never short-circuited — raw error JSON leaks into the output-rails pipeline and eventually reaches the client as the bot response.
This is the most common path: generic Python errors (memory errors, asyncio timeouts, network errors at the Python level) all produce dict-format errors with no HTTP status. Add a fallback in the `dict` branch:
```python
if isinstance(error_val, dict):
error_val["message"] = _redact_secrets(error_val.get("message", ""))
if status is not None:
error_val["code"] = status
error_val["type"] = "downstream_error"
else:
error_val.setdefault("type", "generation_error")
```
The identical gap exists in the `_generation_task` handler (~line 1285).
How can I resolve this? If you propose a fix, please make it concise.4175077 to
ba96199
Compare
a6be550 to
c69efe5
Compare
ba96199 to
9c5c237
Compare
Description
Guardrails returned HTTP 200 for all responses, including when downstream LLM or API calls failed with 401, 404, 429, 503, etc. This PR:
Currently, the following design is followed:
non-streaming:JSONResponse(status_code=N)with OpenAI-shaped body viacreate_error_chat_completion().model_dump()streamingHTTP status is committed to 200 whenStreamingResponsestarts — status codes are carried in SSE error chunk"code"field (matches OpenAI streaming error convention)What's not included (future work)
stream_async()to run input rails before creatingStreamingResponsePRs to follow once this is merged
Checklist
@tgasser-nv @Pouyanpi
Summary by CodeRabbit