feat: http error handling #1832
📝 Walkthrough

This pull request adds HTTP status code propagation from provider and client exceptions through the LLM call chain.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API as API Endpoint
    participant RailEngine as Rail Engine
    participant LLMProvider as LLM Provider
    participant Handler as Error Handler
    Client->>API: Request
    API->>RailEngine: Execute Rails
    RailEngine->>LLMProvider: Call LLM
    LLMProvider-->>Handler: Raise Exception (with status)
    Handler->>Handler: Extract status_code
    Handler->>RailEngine: LLMCallException(status=code)
    RailEngine->>Handler: Handle Error in Stream/Response
    Handler->>Handler: Redact Secrets
    Handler->>Handler: Set error.type="downstream_error"<br/>Set error.code=status
    Handler-->>API: Error JSON with Status
    API->>API: Classify Error Type
    alt Has Status Code
        API-->>Client: JSONResponse(status_code=extracted)
    else No Status Code
        API-->>Client: JSONResponse(status_code=500)
    end
```
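The "Set error.type / error.code" step in the diagram reduces to a small payload builder. A minimal sketch, assuming stand-in names (`build_error_payload` and `GENERATION_ERROR_TYPE` are illustrative, not the PR's actual helpers, though the field layout matches the payloads shown later in this review):

```python
import json
from typing import Optional

# Stand-in for the _GENERATION_ERROR_TYPE constant defined in iorails.py.
GENERATION_ERROR_TYPE = "generation_error"


def build_error_payload(message: str, status: Optional[int]) -> str:
    """Hypothetical sketch of the error-chunk construction step.

    Downstream HTTP failures carry their status code; local failures fall
    back to the generic generation-error type and sentinel code.
    """
    return json.dumps(
        {
            "error": {
                "message": message,
                "type": "downstream_error" if status is not None else GENERATION_ERROR_TYPE,
                "code": status if status is not None else "generation_failed",
            }
        }
    )


print(build_error_payload("Unauthorized", 401))
print(build_error_payload("generation failed", None))
```

Note the explicit `is not None` check rather than a truthiness test, matching the uniform-guard recommendation made further down in this review.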
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 2 failed (warnings)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)

nemoguardrails/guardrails/iorails.py (1)

**403-410: ⚠️ Potential issue | 🟠 Major**
**`downstream_error` chunks are not bypassed before output rails.** Since `_generation_task` can emit `error.type="downstream_error"`, the current bypass check (only `generation_error`) can route these error chunks through output rails and alter/mask the original downstream error semantics.

🔧 Proposed fix

```diff
 for chunk in user_output_chunks:
     try:
         parsed = json.loads(chunk)
-        if isinstance(parsed, dict) and parsed.get("error", {}).get("type") == _GENERATION_ERROR_TYPE:
+        error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None
+        if error_type in {_GENERATION_ERROR_TYPE, "downstream_error"}:
             yield chunk
             return
     except (json.JSONDecodeError, TypeError):
         pass
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@nemoguardrails/guardrails/iorails.py` around lines 403 - 410, The current bypass in the user_output_chunks loop only checks for _GENERATION_ERROR_TYPE, so error chunks with error.type == "downstream_error" emitted by _generation_task can be sent through output rails and get mutated; update the condition inside the loop that examines parsed.get("error", {}).get("type") to also detect and immediately yield/return when the type is "downstream_error" (or more generally any error types produced by _generation_task), e.g., extend the check that references _GENERATION_ERROR_TYPE in the block processing user_output_chunks to include "downstream_error" so those chunks are bypassed before hitting output rails.
🧹 Nitpick comments (1)
tests/test_http_error_handling.py (1)
**295-347: Add one regression test for IORails output-rails streaming passthrough.** Current streaming tests validate `process_chunk`, but they don't exercise the `nemoguardrails/guardrails/iorails.py` path where `downstream_error` chunks should bypass output rails unchanged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_http_error_handling.py` around lines 295 - 347, Add a regression test that exercises the nemoguardrails/guardrails/iorails.py streaming passthrough path to verify that chunks with error.type == "downstream_error" are forwarded unchanged (bypassing output rails) rather than being transformed; specifically, create a test that feeds a downstream_error JSON payload (same shape as in existing tests that use process_chunk/ChunkError) into the iorails streaming handler used in production and assert that the emitted chunk equals the original payload (or that the handler returns/passes through a ChunkError whose .error fields match the input exactly), reusing the existing error payload shapes and symbols process_chunk and ChunkError to build the input and assertions.
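A sketch of such a regression test, assuming a minimal stand-in for the bypass loop (a real test would drive the production `iorails.py` streaming handler; `bypass_error_chunks` here is a hypothetical reduction of that loop):

```python
import json

GENERATION_ERROR_TYPE = "generation_error"  # mirrors _GENERATION_ERROR_TYPE


def bypass_error_chunks(chunks):
    """Hypothetical reduction of the iorails output-rails loop: error chunks
    are yielded unchanged and terminate the stream; everything else passes
    through output rails (modelled here as an uppercasing transformation)."""
    for chunk in chunks:
        try:
            parsed = json.loads(chunk)
            error_obj = parsed.get("error") if isinstance(parsed, dict) else None
            error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
            if error_type in (GENERATION_ERROR_TYPE, "downstream_error"):
                yield chunk
                return
        except (json.JSONDecodeError, TypeError):
            pass
        yield chunk.upper()  # stand-in for an output-rails transformation


def test_downstream_error_passthrough():
    payload = json.dumps(
        {"error": {"message": "Unauthorized", "type": "downstream_error", "code": 401}}
    )
    out = list(bypass_error_chunks(["hello", payload, "never reached"]))
    # The error chunk is forwarded verbatim and the stream stops.
    assert out == ["HELLO", payload]


test_downstream_error_passthrough()
```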
📒 Files selected for processing (7)
- nemoguardrails/actions/llm/utils.py
- nemoguardrails/exceptions.py
- nemoguardrails/guardrails/iorails.py
- nemoguardrails/guardrails/rail_action.py
- nemoguardrails/rails/llm/llmrails.py
- nemoguardrails/server/api.py
- tests/test_http_error_handling.py
Greptile Summary

This PR propagates downstream HTTP status codes (401, 429, 503, etc.) through to the Guardrails API response instead of always returning HTTP 200, and embeds status codes in SSE streaming error chunks.
| Filename | Overview |
|---|---|
| nemoguardrails/actions/llm/utils.py | Adds _extract_http_status helper and threads the extracted status into LLMCallException — clean, well-documented, and defensively typed. |
| nemoguardrails/exceptions.py | Adds optional status field to LLMCallException with a backward-compatible default — straightforward and non-breaking. |
| nemoguardrails/guardrails/iorails.py | Propagates HTTP status in streaming error payloads and updates error-chunk detection to recognise both generation_error and downstream_error types; addressed previous isinstance guard issues. |
| nemoguardrails/guardrails/rail_action.py | Re-raises engine errors that carry an HTTP status (so they propagate to the API layer) and returns RailResult(is_safe=False) for errors without status; adds secret redaction. |
| nemoguardrails/rails/llm/llmrails.py | Centralises streaming error payload building into _build_streaming_error_payload; hardcodes "generation_error" literal instead of reusing the _GENERATION_ERROR_TYPE constant from iorails.py. |
| nemoguardrails/server/api.py | Adds typed exception handlers for LLMCallException/ModelEngineError/APIEngineError that return the downstream HTTP status; uses or 500 which is slightly fragile for status=0. |
| tests/test_http_error_handling.py | Comprehensive test coverage across all error-propagation paths: status extraction, exception construction, RailAction routing, API endpoint integration, and SSE streaming chunks. |
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Provider Exception] --> B{Is LLMClientError?}
    B -- Yes --> C[_extract_http_status: use status_code]
    B -- No --> D{Has .status_code attr?}
    D -- Yes/int>0 --> C
    D -- No --> E{Has .response.status_code?}
    E -- Yes/int>0 --> C
    E -- No --> F[Return None]
    C --> G[LLMCallException with status=N]
    G --> H{Streaming?}
    H -- No --> I[api.py: JSONResponse status=N]
    H -- Yes --> J[_build_streaming_error_payload]
    J --> K[SSE chunk: type=downstream_error, code=N]
    K --> L[iorails.py: detect error_type, yield and stop]
    F --> M[LLMCallException with status=None]
    M --> N[api.py: JSONResponse status=500]
```
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
nemoguardrails/rails/llm/llmrails.py:1941
**Hardcoded `"generation_error"` string diverges from `_GENERATION_ERROR_TYPE`**
`_build_streaming_error_payload` hardcodes `"generation_error"` while `iorails.py` defines `_GENERATION_ERROR_TYPE = "generation_error"` and uses it for detection at line 539. If `_GENERATION_ERROR_TYPE` is ever updated, `_build_streaming_error_payload` silently emits a type string that `iorails.py` won't recognise, causing error chunks to leak into the output-rails pipeline instead of short-circuiting. Import and reuse the constant to keep the two files in sync.
### Issue 2 of 2
nemoguardrails/server/api.py:598
**`or 500` silently swallows a zero status**
`getattr(ex, "status", None) or 500` treats `status=0` identically to `status=None` and returns 500. `LLMCallException` and `ModelEngineError`/`APIEngineError` currently never carry `status=0` (filtered to `None` by `_extract_http_status`), but the constructors accept any `Optional[int]`, so a caller that deliberately passes `status=0` would get a misleading 500. An explicit `None`-check is more self-documenting and less fragile.
```suggestion
status = getattr(ex, "status", None)
status = status if status is not None else 500
```
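The fragility being pointed out is easy to demonstrate in isolation. A minimal sketch contrasting the two guards (`E` is a hypothetical exception carrying the PR's optional `status` field):

```python
class E(Exception):
    """Stand-in for an exception with the optional `status` field."""

    def __init__(self, status=None):
        self.status = status


def status_with_or(ex):
    # Fragile: `or` treats status=0 the same as status=None.
    return getattr(ex, "status", None) or 500


def status_with_none_check(ex):
    # Self-documenting: only a missing status falls back to 500.
    status = getattr(ex, "status", None)
    return status if status is not None else 500


print(status_with_or(E(0)))          # 500 — the zero is swallowed
print(status_with_none_check(E(0)))  # 0 — the zero is preserved
```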
Reviews (7): Last reviewed commit: ":construction: added initial http error ..."
```python
"type": "downstream_error" if status else _GENERATION_ERROR_TYPE,
"code": status or "generation_failed",
```

**Inconsistent truthy vs `is not None` check for `status`**

`if status` on line 312 treats `status=0` as falsy and falls back to `_GENERATION_ERROR_TYPE`, while `rail_action.py` uses `if e.status is not None:` (treating 0 as a real status). Similarly, `llmrails.py` uses `if status is not None:`. Using different semantics for the same sentinel value (`status=0` meaning "no HTTP response") across these three files will produce inconsistent `type`/`code` fields in error payloads if a zero-status value ever reaches this path. Prefer a uniform `if status is not None:` guard throughout.
```python
    Returns ``None`` when no status can be determined or when the status
    is ``0`` (used by ``LLMTimeoutError`` / ``LLMConnectionError`` for
    client-side failures where no HTTP response was received).
    """
    if isinstance(exception, LLMClientError):
        return exception.status_code if exception.status_code > 0 else None

    status = getattr(exception, "status_code", None)
    if isinstance(status, int) and status > 0:
        return status

    response = getattr(exception, "response", None)
    if response is not None:
        status = getattr(response, "status_code", None)
        if isinstance(status, int) and status > 0:
            return status

    return None


def _raise_llm_call_exception(
    exception: Exception,
    model: LLMModel,
```
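The fallback chain above can be exercised with lightweight stand-ins. A sketch assuming hypothetical provider types (`FakeResponse` and `FakeProviderError` are illustrative, not real provider classes; the `LLMClientError` fast path is omitted):

```python
from typing import Optional


def extract_http_status(exception: Exception) -> Optional[int]:
    """Mirrors the fallback chain shown above: direct .status_code first,
    then .response.status_code, with non-positive values treated as 'no status'."""
    status = getattr(exception, "status_code", None)
    if isinstance(status, int) and status > 0:
        return status
    response = getattr(exception, "response", None)
    if response is not None:
        status = getattr(response, "status_code", None)
        if isinstance(status, int) and status > 0:
            return status
    return None


class FakeResponse:
    def __init__(self, status_code):
        self.status_code = status_code


class FakeProviderError(Exception):
    """Illustrative provider exception exposing either attribute shape."""

    def __init__(self, message, response=None, status_code=None):
        super().__init__(message)
        self.response = response
        if status_code is not None:
            self.status_code = status_code


print(extract_http_status(FakeProviderError("429", status_code=429)))             # 429
print(extract_http_status(FakeProviderError("503", response=FakeResponse(503))))  # 503
print(extract_http_status(FakeProviderError("timeout", status_code=0)))           # None
```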
**`_redact_secrets` imported from a private module in multiple new call-sites**

This PR adds imports of `_redact_secrets` (private by convention, with a leading `_`) from `nemoguardrails.llm.clients._errors` in `iorails.py`, `llmrails.py`, and `api.py`. Three distinct public modules now depend on an internal implementation detail. If the `_errors` module is refactored or renamed, all three import sites break at load time. Consider promoting `_redact_secrets` to a public utility (e.g., in `nemoguardrails.utils` or a dedicated `nemoguardrails.llm.clients.utils`) so callers have a stable contract.
```python
error_dict = extract_error_json(str(e))
if "error" in error_dict:
    error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
    status = getattr(e, "status", None)
    if status is not None:
```
**Non-dict `"error"` value from `extract_error_json` causes `AttributeError`, breaking stream termination**

`extract_error_json` can return `{"error": "some string"}` (a non-dict value) when the upstream JSON body has the pattern `{"error": "Unauthorized"}` — parsed via `json.loads` or `ast.literal_eval` on the split fragment. The new code then calls `error_dict["error"].get("message", "")`, which raises `AttributeError: 'str' object has no attribute 'get'`. This exception escapes the `except` handler block without pushing `END_OF_STREAM` to the streaming handler, leaving the client hanging indefinitely.

The same defect is present in the identical block around line 1271 (`_generation_task`).

Guard with an `isinstance` check before accessing the sub-dict:

```python
if isinstance(error_dict.get("error"), dict):
    error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
    status = getattr(e, "status", None)
    if status is not None:
        error_dict["error"]["code"] = status
        error_dict["error"]["type"] = "downstream_error"
```

Apply the same fix to both error-enrichment blocks.
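The failure mode is easy to reproduce with a dict shaped like `extract_error_json`'s string-valued output (the parse result is simulated here rather than produced by the real function):

```python
# Simulated output of extract_error_json for an upstream {"error": "Unauthorized"} body.
error_dict = {"error": "Unauthorized"}

# Unguarded access raises AttributeError on the string value:
try:
    error_dict["error"].get("message", "")
except AttributeError as exc:
    print(f"unguarded access fails: {exc}")

# The guarded version simply skips the dict-only enrichment:
if isinstance(error_dict.get("error"), dict):
    error_dict["error"]["message"] = "redacted"

print(error_dict)  # left unchanged: {'error': 'Unauthorized'}
```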
```python
error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None
if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"):
    yield chunk
    return
except (json.JSONDecodeError, TypeError):
```
**`AttributeError` not caught when `"error"` is a plain string**

`parsed.get("error", {})` returns the stored value (e.g. `"Unauthorized"`) when the key is present, not the `{}` default. Calling `.get("type")` on a `str` raises `AttributeError`, which is not caught by `except (json.JSONDecodeError, TypeError)`. This silently crashes the streaming generator.

This is newly reachable in this PR: `extract_error_json` can return `{"error": "Unauthorized"}` (string value) when an upstream provider responds with `{"error": "some string"}`, and `llmrails.py` pushes that raw chunk to the streaming handler without converting it to dict format when the `isinstance` guard fails.

Replace the attribute access with an `isinstance` guard:

```python
parsed = json.loads(chunk)
error_obj = parsed.get("error") if isinstance(parsed, dict) else None
error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"):
```
Codecov Report: ❌ Patch coverage is …
Force-pushed ac3cde2 to 2a1b6d0
```diff
     except Exception as e:
         # If an exception occurs during generation, push it to the streaming handler as a json string
         # This ensures the streaming pipeline is properly terminated
         log.error(f"Error in generation task: {e}", exc_info=True)
-        error_message = str(e)
-        error_dict = extract_error_json(error_message)
+        error_dict = extract_error_json(str(e))
+        if isinstance(error_dict.get("error"), dict):
+            error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
+            status = getattr(e, "status", None)
+            if status is not None:
+                error_dict["error"]["code"] = status
+                error_dict["error"]["type"] = "downstream_error"
         error_payload = json.dumps(error_dict)
         await streaming_handler.push_chunk(error_payload)
         await streaming_handler.push_chunk(END_OF_STREAM)  # type: ignore
```
**`_redact_secrets` skipped when error value is a plain string**

When `extract_error_json` parses the exception message into `{"error": "some plain string"}` (e.g. `{"error": "Unauthorized"}`), `isinstance(error_dict.get("error"), dict)` is `False` and the branch that calls `_redact_secrets` is never entered. The unredacted string is then serialised directly into the SSE error chunk via `json.dumps(error_dict)`. The same issue exists in the `generate_async` exception handler (~line 903).

For third-party provider exceptions that don't go through `_build_error_fields` (i.e. not `LLMClientError` subclasses), `str(e)` may contain raw API keys. The dict-format path gets redacted; the string-format path does not. Apply `_redact_secrets` in the `else` branch:

```python
error_dict = extract_error_json(str(e))
error_val = error_dict.get("error")
if isinstance(error_val, dict):
    error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
    status = getattr(e, "status", None)
    if status is not None:
        error_dict["error"]["code"] = status
        error_dict["error"]["type"] = "downstream_error"
elif isinstance(error_val, str):
    error_dict["error"] = _redact_secrets(error_val)
```

Apply the same fix to the identical block in `generate_async` (~line 903).
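`_redact_secrets` itself lives in the internal `nemoguardrails.llm.clients._errors` module; a generic stand-in that masks common key patterns might look like this (the regexes are illustrative assumptions, not the library's actual redaction rules):

```python
import re

# Illustrative patterns only — the real _redact_secrets may use different rules.
_SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key[\"'=:\s]+)\S+"),  # api_key=..., api-key: ...
    re.compile(r"(?i)(bearer\s+)\S+"),              # Authorization: Bearer ...
]


def redact_secrets_sketch(message: str) -> str:
    """Mask anything matching a known secret pattern, keeping the key name."""
    for pattern in _SECRET_PATTERNS:
        message = pattern.sub(r"\1[REDACTED]", message)
    return message


print(redact_secrets_sketch("401 Unauthorized: api_key=sk-abc123 rejected"))
```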
```python
elif isinstance(error_val, str):
    error_dict["error"] = _redact_secrets(error_val)
```
**String-format error chunks drop the status code and bypass iorails detection**

When `extract_error_json` returns `{"error": "Unauthorized"}` (a string value — as produced by providers that use `{"error": "some string"}`), the `elif isinstance(error_val, str)` branch sets `error_dict["error"] = _redact_secrets(error_val)`, producing `{"error": "redacted string"}`. Two problems follow:

1. `iorails.py` checks `error_type in (_GENERATION_ERROR_TYPE, "downstream_error")`, but `error_type` is `None` when `error_obj` is a string, so the chunk is never detected as an error and leaks into the output rails pipeline instead of being yielded and terminating the stream.
2. The `status` computed on line 907 is never embedded in the chunk — the HTTP code is silently dropped.

The same defect appears in both `generate_async` (~line 911) and `_generation_task` (~line 1283).

Normalize string-format errors to the expected dict structure so the status code is preserved and `iorails.py` can detect it:

```python
elif isinstance(error_val, str):
    status = getattr(e, "status", None)
    error_dict["error"] = {
        "message": _redact_secrets(error_val),
        "type": "downstream_error" if status is not None else _GENERATION_ERROR_TYPE,
        "code": status if status is not None else "generation_failed",
    }
```
Force-pushed d3198e2 to 99a4932
```python
if isinstance(error_val, dict):
    error_val["message"] = _redact_secrets(error_val.get("message", ""))
    if status is not None:
        error_val["code"] = status
        error_val["type"] = "downstream_error"
elif isinstance(error_val, str):
    error_dict["error"] = {
        "message": _redact_secrets(error_val),
        "type": "downstream_error" if status is not None else "generation_error",
        "code": status if status is not None else "generation_failed",
    }
```
**Missing fallback `type` field for dict-format errors without status code**

When `error_val` is a `dict` and `status` is `None` (e.g., a generic Python exception whose string representation was parsed to `{"error": {"message": "..."}}` by `extract_error_json`), neither branch sets `error_val["type"]`. The `iorails.py` detection at line 539 checks `error_type in ("generation_error", "downstream_error")`. Without a `type` field, `error_type` is `None`, the condition is always `False`, and the error chunk is never short-circuited — raw error JSON leaks into the output-rails pipeline and eventually reaches the client as the bot response.

This is the most common path: generic Python errors (memory errors, asyncio timeouts, network errors at the Python level) all produce dict-format errors with no HTTP status. Add a fallback in the `dict` branch:

```python
if isinstance(error_val, dict):
    error_val["message"] = _redact_secrets(error_val.get("message", ""))
    if status is not None:
        error_val["code"] = status
        error_val["type"] = "downstream_error"
    else:
        error_val.setdefault("type", "generation_error")
```

The identical gap exists in the `_generation_task` handler (~line 1285).
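The `setdefault` choice matters here: it only fills `type` when the key is absent, so an error that already carries a `downstream_error` type is never overwritten. A two-line demonstration:

```python
# setdefault fills the key only when it is missing.
err_without_type = {"message": "out of memory"}
err_without_type.setdefault("type", "generation_error")
print(err_without_type["type"])  # generation_error

# An existing "downstream_error" type is preserved.
err_with_type = {"message": "Unauthorized", "type": "downstream_error", "code": 401}
err_with_type.setdefault("type", "generation_error")
print(err_with_type["type"])  # downstream_error
```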
Force-pushed 4175077 to ba96199
Description
Guardrails returned HTTP 200 for all responses, including when downstream LLM or API calls failed with 401, 404, 429, 503, etc. This PR propagates those downstream status codes through to the API response and embeds them in streaming error chunks.
Currently, the following design is followed:

- non-streaming: `JSONResponse(status_code=N)` with an OpenAI-shaped body via `create_error_chat_completion().model_dump()`
- streaming: the HTTP status is committed to 200 when the `StreamingResponse` starts — status codes are carried in the SSE error chunk `"code"` field (matches the OpenAI streaming error convention)

What's not included (future work)

- `stream_async()` to run input rails before creating `StreamingResponse` — PRs to follow once this is merged
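Because the HTTP status is already committed to 200 once streaming begins, a client has to recover the real status from the chunk body. A rough sketch of parsing such a chunk (the `data:` SSE framing is an assumption; only the inner payload shape comes from this PR):

```python
import json

# Assumed SSE framing around the error payload shape used in this PR.
sse_chunk = 'data: {"error": {"message": "Rate limit exceeded", "type": "downstream_error", "code": 429}}\n\n'

payload = json.loads(sse_chunk[len("data: "):].strip())
error_obj = payload.get("error") if isinstance(payload, dict) else None
if isinstance(error_obj, dict) and error_obj.get("type") == "downstream_error":
    # The HTTP response said 200; the real downstream status lives here.
    print(error_obj["code"])  # 429
```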
Checklist
@tgasser-nv @Pouyanpi