
feat: http error handling #1832

Open
m-misiura wants to merge 1 commit into NVIDIA-NeMo:develop from m-misiura:http_errors_exceptions

Conversation


@m-misiura m-misiura commented Apr 28, 2026

Description

Guardrails previously returned HTTP 200 for all responses, even when downstream LLM or API calls failed with 401, 404, 429, 503, etc. This PR:

  • propagates HTTP status codes from downstream LLM/API errors through to the Guardrails API response instead of returning HTTP 200 for all errors
  • adds status codes to SSE streaming error chunks for streaming responses
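As an illustration of the second point, an SSE error chunk carrying the downstream status might look like the following sketch (build_sse_error_chunk is a hypothetical helper, not the PR's actual function; only the "type"/"code" field convention comes from this PR):

```python
import json
from typing import Optional

def build_sse_error_chunk(message: str, status: Optional[int]) -> str:
    # Embed the downstream HTTP status in the error chunk's "code" field,
    # following the OpenAI streaming error convention described in this PR.
    error = {
        "message": message,
        "type": "downstream_error" if status is not None else "generation_error",
        "code": status if status is not None else "generation_failed",
    }
    return "data: " + json.dumps({"error": error}) + "\n\n"

chunk = build_sse_error_chunk("Rate limited", 429)
```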

The current design is as follows:

  • non-streaming: JSONResponse(status_code=N) with an OpenAI-shaped body via create_error_chat_completion().model_dump()
  • streaming: the HTTP status is already committed to 200 once StreamingResponse starts, so status codes are carried in the SSE error chunk "code" field (matching the OpenAI streaming error convention)
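For the non-streaming path, the status resolution can be sketched roughly as follows (a stand-in, not the actual server code; create_error_chat_completion and the real JSONResponse wrapping live in nemoguardrails/server/api.py):

```python
from typing import Any, Dict, Optional, Tuple

def build_error_response(message: str, status: Optional[int]) -> Tuple[int, Dict[str, Any]]:
    # Explicit None check rather than `status or 500`, which would
    # misclassify a (theoretical) status of 0 as a server error.
    resolved = status if status is not None else 500
    body = {
        "error": {
            "message": message,
            "type": "downstream_error" if status is not None else "generation_error",
            "code": resolved if status is not None else "generation_failed",
        }
    }
    # In the real endpoint: JSONResponse(status_code=resolved, content=body)
    return resolved, body
```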

What's not included (future work)

  • pre-stream HTTP errors: input rail failures during streaming could return a proper HTTP error instead of 200+error-chunk, but this requires restructuring stream_async() to run input rails before creating StreamingResponse
  • client vs server key distinction: downstream 401/403 from a server-stored key should return 500, not forward the status

PRs to follow once this is merged

Checklist

  • I've read the CONTRIBUTING guidelines.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.

@tgasser-nv @Pouyanpi

Summary by CodeRabbit

  • Bug Fixes
    • Improved HTTP error handling to properly propagate upstream provider status codes in API responses.
    • Enhanced error classification to distinguish between downstream provider failures and internal generation errors.
    • Added sensitive data redaction in error messages to prevent credential leakage.
    • Fixed error response formatting for streaming endpoints to include proper HTTP status codes.


coderabbitai Bot commented Apr 28, 2026

📝 Walkthrough

This pull request adds HTTP status code propagation from provider and client exceptions through the LLM call chain. The LLMCallException now captures an optional HTTP status attribute extracted from exception sources, which is then used downstream to classify errors properly in streaming responses and API endpoints, replacing hardcoded generic error handling.
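A minimal sketch of the described change to LLMCallException (the real class lives in nemoguardrails/exceptions.py; the constructor shape here is an assumption, and only the optional status attribute comes from this PR):

```python
from typing import Optional

class LLMCallException(Exception):
    """Raised when a downstream LLM call fails.

    The optional ``status`` carries the downstream HTTP status code,
    defaulting to None so existing call sites keep working.
    """

    def __init__(self, inner_exception: Exception, status: Optional[int] = None):
        super().__init__(f"LLM call failed: {inner_exception}")
        self.inner_exception = inner_exception
        self.status = status
```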

Changes

  • Exception Core (nemoguardrails/exceptions.py, nemoguardrails/actions/llm/utils.py): Added a status: Optional[int] parameter to the LLMCallException constructor and new helper logic to extract HTTP status codes from various exception types (LLMClientError.status_code, a generic status_code attribute, response.status_code).
  • Streaming Error Handling (nemoguardrails/guardrails/iorails.py, nemoguardrails/rails/llm/llmrails.py): Updated streaming exception handlers to redact sensitive data from error messages, classify errors as "downstream_error" when a status is present, and populate error code fields with extracted HTTP status values.
  • Rail Action Error Routing (nemoguardrails/guardrails/rail_action.py): Added a dedicated exception handler for ModelEngineError and APIEngineError that re-raises when an HTTP status is available and otherwise returns an unsafe RailResult with a contextual error message.
  • API Endpoint Response (nemoguardrails/server/api.py): Enhanced the chat_completion endpoint to explicitly catch rail/engine exceptions, extract their HTTP status, redact secrets from error messages, and return OpenAI-compatible error responses with appropriate status codes. Widened the ChunkErrorMetadata.code type to accept both string and integer codes.
  • Test Coverage (tests/test_http_error_handling.py): New comprehensive test module validating status code extraction, exception propagation, rail action routing, API endpoint responses, and streaming error payload parsing across the full call chain.
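The extraction rules above could be exercised by tests along these lines (a standalone sketch that re-implements the helper for demonstration; the real helper and tests are in nemoguardrails/actions/llm/utils.py and tests/test_http_error_handling.py, and ProviderError/FakeResponse are invented stand-ins):

```python
from typing import Optional

def extract_http_status(exception: Exception) -> Optional[int]:
    """Illustrative re-implementation: prefer an explicit status_code
    attribute, then response.status_code; treat 0 / missing as 'no HTTP
    response received'."""
    status = getattr(exception, "status_code", None)
    if isinstance(status, int) and status > 0:
        return status
    response = getattr(exception, "response", None)
    if response is not None:
        status = getattr(response, "status_code", None)
        if isinstance(status, int) and status > 0:
            return status
    return None

class FakeResponse:
    def __init__(self, status_code: int):
        self.status_code = status_code

class ProviderError(Exception):
    def __init__(self, msg, response=None, status_code=None):
        super().__init__(msg)
        if response is not None:
            self.response = response
        if status_code is not None:
            self.status_code = status_code

assert extract_http_status(ProviderError("rate limited", status_code=429)) == 429
assert extract_http_status(ProviderError("bad key", response=FakeResponse(401))) == 401
assert extract_http_status(ProviderError("timeout", status_code=0)) is None
assert extract_http_status(ValueError("no http context")) is None
```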

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as API Endpoint
    participant RailEngine as Rail Engine
    participant LLMProvider as LLM Provider
    participant Handler as Error Handler

    Client->>API: Request
    API->>RailEngine: Execute Rails
    RailEngine->>LLMProvider: Call LLM
    LLMProvider-->>Handler: Raise Exception (with status)
    Handler->>Handler: Extract status_code
    Handler->>RailEngine: LLMCallException(status=code)
    RailEngine->>Handler: Handle Error in Stream/Response
    Handler->>Handler: Redact Secrets
    Handler->>Handler: Set error.type="downstream_error"<br/>Set error.code=status
    Handler-->>API: Error JSON with Status
    API->>API: Classify Error Type
    alt Has Status Code
        API-->>Client: JSONResponse(status_code=extracted)
    else No Status Code
        API-->>Client: JSONResponse(status_code=500)
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (2 warnings)

  • Docstring Coverage ⚠️ Warning — Docstring coverage is 17.50%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Test Results For Major Changes ⚠️ Warning — The PR description does not include test results or detailed testing information despite major feature changes requiring comprehensive validation. Resolution: add test execution results, coverage metrics, test scenarios, and validation examples to the PR description to demonstrate testing of the status code propagation functionality.

✅ Passed checks (4 passed)

  • Description Check ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title 'feat: http error handling' accurately and concisely describes the main change: adding HTTP error status code propagation from downstream LLM/API calls to Guardrails responses.
  • Linked Issues Check ✅ Passed — Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check ✅ Passed — Check skipped because no linked issues were found for this pull request.


@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nemoguardrails/guardrails/iorails.py (1)

403-410: ⚠️ Potential issue | 🟠 Major

downstream_error chunks are not bypassed before output rails.

Since _generation_task can emit error.type="downstream_error", the current bypass check (only generation_error) can route these error chunks through output rails and alter/mask the original downstream error semantics.

🔧 Proposed fix
             for chunk in user_output_chunks:
                 try:
                     parsed = json.loads(chunk)
-                    if isinstance(parsed, dict) and parsed.get("error", {}).get("type") == _GENERATION_ERROR_TYPE:
+                    error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None
+                    if error_type in {_GENERATION_ERROR_TYPE, "downstream_error"}:
                         yield chunk
                         return
                 except (json.JSONDecodeError, TypeError):
                     pass
🧹 Nitpick comments (1)
tests/test_http_error_handling.py (1)

295-347: Add one regression test for IORails output-rails streaming passthrough.

Current streaming tests validate process_chunk, but they don’t exercise the nemoguardrails/guardrails/iorails.py path where downstream_error chunks should bypass output rails unchanged.
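Such a regression test could be sketched as follows (a self-contained stand-in for the bypass logic; the real streaming handler in nemoguardrails/guardrails/iorails.py is not reproduced here, and passthrough_error_chunks is an invented name):

```python
import json

GENERATION_ERROR_TYPE = "generation_error"  # mirrors _GENERATION_ERROR_TYPE in iorails.py

def passthrough_error_chunks(chunks):
    """Yield error chunks unchanged and stop the stream, instead of
    routing them through output rails."""
    for chunk in chunks:
        try:
            parsed = json.loads(chunk)
        except (json.JSONDecodeError, TypeError):
            parsed = None
        error_obj = parsed.get("error") if isinstance(parsed, dict) else None
        error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
        if error_type in (GENERATION_ERROR_TYPE, "downstream_error"):
            yield chunk
            return
        yield chunk  # non-error chunks would continue through output rails

payload = json.dumps({"error": {"message": "Unauthorized", "type": "downstream_error", "code": 401}})
out = list(passthrough_error_chunks(["hello", payload, "never reached"]))
assert out == ["hello", payload]  # error chunk forwarded unchanged, stream stops
```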


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dddc9e53-a649-43a1-8489-1b45475ea967

📥 Commits

Reviewing files that changed from the base of the PR and between 8c5d832 and 61dd3d4.

📒 Files selected for processing (7)
  • nemoguardrails/actions/llm/utils.py
  • nemoguardrails/exceptions.py
  • nemoguardrails/guardrails/iorails.py
  • nemoguardrails/guardrails/rail_action.py
  • nemoguardrails/rails/llm/llmrails.py
  • nemoguardrails/server/api.py
  • tests/test_http_error_handling.py


greptile-apps Bot commented Apr 28, 2026

Greptile Summary

This PR propagates downstream HTTP status codes (401, 429, 503, etc.) through to the Guardrails API response instead of always returning HTTP 200, and embeds status codes in SSE streaming error chunks. The implementation covers the non-streaming (JSONResponse(status_code=N)) and streaming (status in the SSE error-chunk "code" field) paths, with secret redaction applied at each error surface.

Confidence Score: 5/5

Safe to merge — all previously flagged issues have been addressed and only minor style concerns remain.

All P0/P1 issues from previous review rounds have been resolved. The two remaining findings are P2 style issues (hardcoded string constant and or-based None guard) that don't affect runtime correctness. Test coverage is thorough across all new code paths.

No files require special attention.

Important Files Changed

  • nemoguardrails/actions/llm/utils.py — Adds the _extract_http_status helper and threads the extracted status into LLMCallException; clean, well-documented, and defensively typed.
  • nemoguardrails/exceptions.py — Adds an optional status field to LLMCallException with a backward-compatible default; straightforward and non-breaking.
  • nemoguardrails/guardrails/iorails.py — Propagates HTTP status in streaming error payloads and updates error-chunk detection to recognise both generation_error and downstream_error types; addresses previous isinstance guard issues.
  • nemoguardrails/guardrails/rail_action.py — Re-raises engine errors that carry an HTTP status (so they propagate to the API layer) and returns RailResult(is_safe=False) for errors without status; adds secret redaction.
  • nemoguardrails/rails/llm/llmrails.py — Centralises streaming error payload building into _build_streaming_error_payload; hardcodes the "generation_error" literal instead of reusing the _GENERATION_ERROR_TYPE constant from iorails.py.
  • nemoguardrails/server/api.py — Adds typed exception handlers for LLMCallException/ModelEngineError/APIEngineError that return the downstream HTTP status; uses `or 500`, which is slightly fragile for status=0.
  • tests/test_http_error_handling.py — Comprehensive test coverage across all error-propagation paths: status extraction, exception construction, RailAction routing, API endpoint integration, and SSE streaming chunks.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Provider Exception] --> B{Is LLMClientError?}
    B -- Yes --> C[_extract_http_status: use status_code]
    B -- No --> D{Has .status_code attr?}
    D -- Yes/int>0 --> C
    D -- No --> E{Has .response.status_code?}
    E -- Yes/int>0 --> C
    E -- No --> F[Return None]
    C --> G[LLMCallException with status=N]
    G --> H{Streaming?}
    H -- No --> I[api.py: JSONResponse status=N]
    H -- Yes --> J[_build_streaming_error_payload]
    J --> K[SSE chunk: type=downstream_error, code=N]
    K --> L[iorails.py: detect error_type, yield and stop]
    F --> M[LLMCallException with status=None]
    M --> N[api.py: JSONResponse status=500]
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
nemoguardrails/rails/llm/llmrails.py:1941
**Hardcoded `"generation_error"` string diverges from `_GENERATION_ERROR_TYPE`**

`_build_streaming_error_payload` hardcodes `"generation_error"` while `iorails.py` defines `_GENERATION_ERROR_TYPE = "generation_error"` and uses it for detection at line 539. If `_GENERATION_ERROR_TYPE` is ever updated, `_build_streaming_error_payload` silently emits a type string that `iorails.py` won't recognise, causing error chunks to leak into the output-rails pipeline instead of short-circuiting. Import and reuse the constant to keep the two files in sync.

### Issue 2 of 2
nemoguardrails/server/api.py:598
**`or 500` silently swallows a zero status**

`getattr(ex, "status", None) or 500` treats `status=0` identically to `status=None` and returns 500. `LLMCallException` and `ModelEngineError`/`APIEngineError` currently never carry `status=0` (filtered to `None` by `_extract_http_status`), but the constructors accept any `Optional[int]`, so a caller that deliberately passes `status=0` would get a misleading 500. An explicit `None`-check is more self-documenting and less fragile.

```suggestion
        status = getattr(ex, "status", None)
        status = status if status is not None else 500
```

Reviews (7): Last reviewed commit: ":construction: added initial http error ..."

Comment thread: nemoguardrails/guardrails/iorails.py (Outdated)
Comment on lines +312 to +313

```python
"type": "downstream_error" if status else _GENERATION_ERROR_TYPE,
"code": status or "generation_failed",
```

P2 Inconsistent truthy vs is not None check for status

if status on line 312 treats status=0 as falsy and falls back to _GENERATION_ERROR_TYPE, while rail_action.py uses if e.status is not None: (treating 0 as a real status). Similarly, llmrails.py uses if status is not None:. Using different semantics for the same sentinel value (status=0 meaning "no HTTP response") across these three files will produce inconsistent type / code fields in error payloads if a zero-status value ever reaches this path. Prefer a uniform if status is not None: guard throughout.
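The divergence is easy to demonstrate (status=0 is the sentinel described above):

```python
status = 0  # sentinel: LLMTimeoutError / LLMConnectionError mean "no HTTP response received"

# Truthy guard (as on iorails.py line 312): 0 falls back to the generic type
type_truthy = "downstream_error" if status else "generation_error"

# Explicit guard (as in rail_action.py / llmrails.py): 0 counts as a real status
type_explicit = "downstream_error" if status is not None else "generation_error"

assert type_truthy == "generation_error"
assert type_explicit == "downstream_error"
```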


Author

addressed in commit: fe65848

Comment on lines +322 to 344

```python
    Returns ``None`` when no status can be determined or when the status
    is ``0`` (used by ``LLMTimeoutError`` / ``LLMConnectionError`` for
    client-side failures where no HTTP response was received).
    """
    if isinstance(exception, LLMClientError):
        return exception.status_code if exception.status_code > 0 else None

    status = getattr(exception, "status_code", None)
    if isinstance(status, int) and status > 0:
        return status

    response = getattr(exception, "response", None)
    if response is not None:
        status = getattr(response, "status_code", None)
        if isinstance(status, int) and status > 0:
            return status

    return None


def _raise_llm_call_exception(
    exception: Exception,
    model: LLMModel,
```

P2 _redact_secrets imported from a private module in multiple new call-sites

This PR adds imports of _redact_secrets (name-mangled with a leading _) from nemoguardrails.llm.clients._errors in iorails.py, llmrails.py, and api.py. Three distinct public modules now depend on an internal implementation detail. If the _errors module is refactored or renamed, all three import sites break silently at load time. Consider promoting _redact_secrets to a public utility (e.g., in nemoguardrails.utils or a dedicated nemoguardrails.llm.clients.utils) so callers have a stable contract.
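A public home for the helper could look like the following stand-in (illustrative only; the patterns below are assumptions, not the actual _redact_secrets implementation in nemoguardrails.llm.clients._errors):

```python
import re

# Hypothetical credential patterns; the real helper's patterns are not shown in this PR.
_SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key\s*[=:]\s*)\S+"),
    re.compile(r"(?i)(authorization:\s*bearer\s+)\S+"),
    re.compile(r"\bsk-[A-Za-z0-9]{8,}\b"),
]

def redact_secrets(message: str) -> str:
    """Replace credential-looking substrings with a placeholder."""
    for pattern in _SECRET_PATTERNS:
        if pattern.groups:
            # Keep the key/header prefix, redact only the secret value.
            message = pattern.sub(lambda m: m.group(1) + "[REDACTED]", message)
        else:
            message = pattern.sub("[REDACTED]", message)
    return message
```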


Comment thread: nemoguardrails/rails/llm/llmrails.py (Outdated)
Comment on lines +903 to +907

```python
error_dict = extract_error_json(str(e))
if "error" in error_dict:
    error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
status = getattr(e, "status", None)
if status is not None:
```

P1 Non-dict "error" value from extract_error_json causes AttributeError, breaking stream termination

extract_error_json can return {"error": "some string"} (a non-dict value) when the upstream JSON body has the pattern {"error": "Unauthorized"} — parsed via json.loads or ast.literal_eval on the split fragment. The new code then calls error_dict["error"].get("message", "") which raises AttributeError: 'str' object has no attribute 'get'. This exception escapes the except handler block without pushing END_OF_STREAM to the streaming handler, leaving the client hanging indefinitely.

The same defect is present in the identical block around line 1271 (_generation_task).

Guard with an isinstance check before accessing the sub-dict:

if isinstance(error_dict.get("error"), dict):
    error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
    status = getattr(e, "status", None)
    if status is not None:
        error_dict["error"]["code"] = status
        error_dict["error"]["type"] = "downstream_error"

Apply the same fix to both error-enrichment blocks.


Author

addressed in ac3cde2

Comment thread: nemoguardrails/guardrails/iorails.py (Outdated)
Comment on lines 408 to 412

```python
error_type = parsed.get("error", {}).get("type") if isinstance(parsed, dict) else None
if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"):
    yield chunk
    return
except (json.JSONDecodeError, TypeError):
```

P1 AttributeError not caught when "error" is a plain string

parsed.get("error", {}) returns the stored value (e.g. "Unauthorized") when the key is present, not the {} default. Calling .get("type") on a str raises AttributeError, which is not caught by except (json.JSONDecodeError, TypeError). This silently crashes the streaming generator.

This is newly reachable in this PR: extract_error_json can return {"error": "Unauthorized"} (string value) when an upstream provider responds with {"error": "some string"}, and llmrails.py pushes that raw chunk to the streaming handler without converting it to dict format when the isinstance guard fails.

Replace the attribute access with an isinstance guard:

```python
error_obj = parsed.get("error") if isinstance(parsed, dict) else None
error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
```

Suggested change:

```suggestion
                    parsed = json.loads(chunk)
                    error_obj = parsed.get("error") if isinstance(parsed, dict) else None
                    error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
                    if error_type in (_GENERATION_ERROR_TYPE, "downstream_error"):
```

Author

addressed in 169daae


codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 89.70588% with 7 lines in your changes missing coverage. Please review.

Files with missing lines:

  • nemoguardrails/rails/llm/llmrails.py — patch 72.72%, 6 lines missing ⚠️
  • nemoguardrails/actions/llm/utils.py — patch 93.75%, 1 line missing ⚠️


@m-misiura m-misiura force-pushed the http_errors_exceptions branch from ac3cde2 to 2a1b6d0 on April 30, 2026 08:46
Comment on lines 1270 to 1281

```diff
         except Exception as e:
             # If an exception occurs during generation, push it to the streaming handler as a json string
             # This ensures the streaming pipeline is properly terminated
             log.error(f"Error in generation task: {e}", exc_info=True)
-            error_message = str(e)
-            error_dict = extract_error_json(error_message)
+            error_dict = extract_error_json(str(e))
+            if isinstance(error_dict.get("error"), dict):
+                error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
+                status = getattr(e, "status", None)
+                if status is not None:
+                    error_dict["error"]["code"] = status
+                    error_dict["error"]["type"] = "downstream_error"
             error_payload = json.dumps(error_dict)
             await streaming_handler.push_chunk(error_payload)
             await streaming_handler.push_chunk(END_OF_STREAM)  # type: ignore
```

P1 security _redact_secrets skipped when error value is a plain string

When extract_error_json parses the exception message into {"error": "some plain string"} (e.g. {"error": "Unauthorized"}), isinstance(error_dict.get("error"), dict) is False and the branch that calls _redact_secrets is never entered. The unredacted string is then serialised directly into the SSE error chunk via json.dumps(error_dict). The same issue exists in the generate_async exception handler (~line 903).

For third-party provider exceptions that don't go through _build_error_fields (i.e. not LLMClientError subclasses), str(e) may contain raw API keys. The dict-format path gets redacted; the string-format path does not. Apply _redact_secrets in the else branch:

```python
error_dict = extract_error_json(str(e))
error_val = error_dict.get("error")
if isinstance(error_val, dict):
    error_dict["error"]["message"] = _redact_secrets(error_dict["error"].get("message", ""))
    status = getattr(e, "status", None)
    if status is not None:
        error_dict["error"]["code"] = status
        error_dict["error"]["type"] = "downstream_error"
elif isinstance(error_val, str):
    error_dict["error"] = _redact_secrets(error_val)
```

Apply the same fix to the identical block in generate_async (~line 903).


Author

addressed in cfa774e

Comment thread: nemoguardrails/rails/llm/llmrails.py (Outdated)
Comment on lines +911 to +912

```python
elif isinstance(error_val, str):
    error_dict["error"] = _redact_secrets(error_val)
```

**P1: String-format error chunks drop the status code and bypass iorails detection**

When `extract_error_json` returns `{"error": "Unauthorized"}` (a string value, as produced by providers that use `{"error": "some string"}`), the `elif isinstance(error_val, str)` branch sets `error_dict["error"] = _redact_secrets(error_val)`, producing `{"error": "redacted string"}`. Two problems follow:

1. `iorails.py` checks `error_type in (_GENERATION_ERROR_TYPE, "downstream_error")`, but `error_type` is `None` when `error_obj` is a string, so the chunk is never detected as an error and leaks into the output rails pipeline instead of being yielded to terminate the stream.
2. The `status` computed on line 907 is never embedded in the chunk; the HTTP code is silently dropped.

The same defect appears in both `generate_async` (~line 911) and `_generation_task` (~line 1283).

Normalize string-format errors to the expected dict structure so the status code is preserved and `iorails.py` can detect it:

```python
elif isinstance(error_val, str):
    status = getattr(e, "status", None)
    error_dict["error"] = {
        "message": _redact_secrets(error_val),
        "type": "downstream_error" if status is not None else _GENERATION_ERROR_TYPE,
        "code": status if status is not None else "generation_failed",
    }
```
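
Treating the suggested branch as a standalone function makes its behavior easy to check in isolation. In this sketch, `_redact_secrets` is stubbed as the identity and `_GENERATION_ERROR_TYPE` is assumed to be the string `"generation_error"`; both are stand-ins for the real nemoguardrails internals:

```python
_GENERATION_ERROR_TYPE = "generation_error"  # assumed value of the module constant

def _redact_secrets(message: str) -> str:
    return message  # stub; the real helper masks credentials

def normalize_string_error(error_val: str, status=None) -> dict:
    """Wrap a string-format provider error in the dict shape iorails.py expects."""
    return {
        "message": _redact_secrets(error_val),
        "type": "downstream_error" if status is not None else _GENERATION_ERROR_TYPE,
        "code": status if status is not None else "generation_failed",
    }

# A provider 401 keeps its HTTP status and is detectable as an error chunk:
chunk = normalize_string_error("Unauthorized", status=401)
detected = chunk["type"] in (_GENERATION_ERROR_TYPE, "downstream_error")
```

With this shape, the status code survives into the SSE chunk and the detection condition quoted above matches.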

@m-misiura
Author

@greptileai

@m-misiura force-pushed the http_errors_exceptions branch from d3198e2 to 99a4932 on April 30, 2026 09:23
Comment thread on nemoguardrails/rails/llm/llmrails.py (outdated), lines +906 to +916:

```python
if isinstance(error_val, dict):
    error_val["message"] = _redact_secrets(error_val.get("message", ""))
    if status is not None:
        error_val["code"] = status
        error_val["type"] = "downstream_error"
elif isinstance(error_val, str):
    error_dict["error"] = {
        "message": _redact_secrets(error_val),
        "type": "downstream_error" if status is not None else "generation_error",
        "code": status if status is not None else "generation_failed",
    }
```
Contributor

**P1: Missing fallback `type` field for dict-format errors without status code**

When `error_val` is a `dict` and `status` is `None` (e.g., a generic Python exception whose string representation was parsed to `{"error": {"message": "..."}}` by `extract_error_json`), neither branch sets `error_val["type"]`. The `iorails.py` detection at line 539 checks `error_type in ("generation_error", "downstream_error")`. Without a `type` field, `error_type` is `None`, the condition is always `False`, and the error chunk is never short-circuited: raw error JSON leaks into the output-rails pipeline and eventually reaches the client as the bot response.

This is the most common path: generic Python errors (memory errors, asyncio timeouts, network errors at the Python level) all produce dict-format errors with no HTTP status. Add a fallback in the `dict` branch:

```python
if isinstance(error_val, dict):
    error_val["message"] = _redact_secrets(error_val.get("message", ""))
    if status is not None:
        error_val["code"] = status
        error_val["type"] = "downstream_error"
    else:
        error_val.setdefault("type", "generation_error")
```

The identical gap exists in the `_generation_task` handler (~line 1285).
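
The effect of the `setdefault` fallback can be demonstrated with a small standalone sketch. Here `is_error_chunk` mirrors the `iorails.py` condition quoted in the comment; the type and code values are taken from the review text, not verified against the codebase:

```python
def tag_dict_error(error_val: dict, status=None) -> dict:
    """Ensure every dict-format error carries a type the stream check can see."""
    if status is not None:
        error_val["code"] = status
        error_val["type"] = "downstream_error"
    else:
        # Fallback suggested in the review: generic Python failures get
        # tagged so they are short-circuited instead of leaking downstream.
        error_val.setdefault("type", "generation_error")
    return error_val

def is_error_chunk(error_obj) -> bool:
    """Mirrors the iorails.py detection described above."""
    error_type = error_obj.get("type") if isinstance(error_obj, dict) else None
    return error_type in ("generation_error", "downstream_error")

generic = tag_dict_error({"message": "asyncio.TimeoutError"})         # no HTTP status
downstream = tag_dict_error({"message": "Rate limited"}, status=429)  # provider 429
```

Without the `else` branch, `generic` would have no `type` key and `is_error_chunk(generic)` would be `False`, reproducing the leak described above.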


@m-misiura force-pushed the http_errors_exceptions branch from 4175077 to ba96199 on April 30, 2026 09:58
