Conversation

@aantn aantn commented Dec 23, 2025

Summary

  • add an OOM kill toolset with a trigger_oom_kill tool that allocates ~30GB to force an OOM kill for stress testing
  • register the new toolset in the built-in toolset loader so it can be enabled via configuration

Testing

  • not run (not requested)

Codex Task

Summary by CodeRabbit

  • New Features

    • Added experimental, disabled-by-default OOM‑kill stress toolsets (Python and Bash variants) to allocate ~30GB for a configurable duration (default 300s); require explicit enablement and an environment guard.
    • Added runtime memory‑limit enforcement for subprocesses and automatic OOM detection with contextual hints in command output.
  • Documentation

    • Added guidance on enabling, invoking, and safely restricting the OOM‑kill toolsets with examples and strong non‑production cautions.
  • Tests

    • Added tests for memory‑limit parsing, prefix generation, and OOM‑hint behavior.

coderabbitai bot commented Dec 23, 2025

Walkthrough

Adds an experimental OOM‑kill toolset (Python + YAML) and TriggerOOMKill tool; registers the Python toolset; adds memory‑limit utilities and ulimit-prefixing for subprocesses; updates Bash tool execution to use the prefix and append OOM hints; adds tests and docs for controlled stress testing.

Changes

  • Python toolset implementation (holmes/plugins/toolsets/oom_kill.py): New TriggerOOMKill tool and OOMKillToolset class (disabled by default). Validates hold_seconds, builds a ~30GB allocation Python command, invokes it via bash execution with a timeout, and exposes prerequisites and an example config.
  • Python toolset registration (holmes/plugins/toolsets/__init__.py): Imports OOMKillToolset and instantiates it in load_python_toolsets, adding the toolset to the Python toolset list.
  • YAML/bash toolset manifest (holmes/plugins/toolsets/oom_kill.yaml): Adds the oom_kill_bash toolset (DANGEROUS/CORE) with an ALLOW_HOLMES_OOMKILL_TOOLSET prerequisite and a trigger_oom_kill_bash tool that runs an inline Python allocation and validates hold_seconds.
  • Memory-limit utilities (holmes/utils/memory_limit.py): New utilities TOOL_MEMORY_LIMIT_ENV, TOOL_MEMORY_LIMIT_DEFAULT, parse_size_to_kb, get_memory_limit_kb, get_ulimit_prefix, and check_oom_and_append_hint. Parses sizes, builds a ulimit prefix, and annotates outputs when an OOM kill is suspected (a hedged sketch of these helpers follows this list).
  • Subprocess / Bash integration (holmes/core/tools.py, holmes/plugins/toolsets/bash/common/bash.py): Commands are prefixed with get_ulimit_prefix() before execution, and command outputs are passed through check_oom_and_append_hint. Error handling is adjusted for unexpected exceptions and FileNotFoundError scenarios.
  • Tests (tests/core/test_tool_memory_limit.py): New parameterized tests for size parsing, env-driven memory limit resolution, ulimit prefix generation, and OOM hint detection (monkeypatch is used to control the env).
  • Documentation (docs/development/stress-testing-oom-kill.md): New docs describing both toolsets, the enablement guards (env var plus per-toolset enable), the hold_seconds parameter, invocation examples, and safety guidance for non-production use.
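
To make the memory-limit entry above concrete, here is a minimal sketch of how such helpers could fit together. The names (TOOL_MEMORY_LIMIT_ENV, parse_size_to_kb, get_memory_limit_kb, get_ulimit_prefix, check_oom_and_append_hint), the size regex, and the OOM indicators are taken from elsewhere in this thread; the bodies, default-unit handling, and hint wording are assumptions, not the PR's actual implementation.

```python
# Hedged sketch only: names, regex, and OOM indicators come from this PR's
# review thread; the function bodies are illustrative assumptions.
import os
import re

TOOL_MEMORY_LIMIT_ENV = "HOLMES_TOOL_MEMORY_LIMIT"
TOOL_MEMORY_LIMIT_DEFAULT = "2GB"

_UNIT_TO_KB = {
    "": 1, "B": 1.0 / 1024, "K": 1, "KB": 1,
    "M": 1024, "MB": 1024, "G": 1024 ** 2, "GB": 1024 ** 2,
    "T": 1024 ** 3, "TB": 1024 ** 3,
}

def parse_size_to_kb(size: str) -> int:
    """Parse a human-readable size such as '4GB', '512MB', or '1.5G' into KB."""
    match = re.match(r"^(\d+(?:\.\d+)?)\s*([KMGT]?B?)?$", size.strip().upper())
    if not match:
        raise ValueError(f"Cannot parse memory size: {size!r}")
    value, unit = float(match.group(1)), match.group(2) or ""
    return int(value * _UNIT_TO_KB[unit])

def get_memory_limit_kb() -> int:
    """Resolve the limit from the env var, falling back to the 2GB default."""
    raw = os.environ.get(TOOL_MEMORY_LIMIT_ENV, TOOL_MEMORY_LIMIT_DEFAULT)
    try:
        return parse_size_to_kb(raw)
    except ValueError:
        return parse_size_to_kb(TOOL_MEMORY_LIMIT_DEFAULT)

def get_ulimit_prefix() -> str:
    """Shell prefix capping virtual memory; '|| true' degrades gracefully."""
    return f"ulimit -v {get_memory_limit_kb()} || true; "

def check_oom_and_append_hint(output: str, return_code: int) -> str:
    """Append an [OOM] hint when the exit code or output suggests an OOM kill."""
    indicators = ("Killed", "MemoryError", "Cannot allocate memory", "bad_alloc")
    if return_code in (137, -9) or any(s in output for s in indicators):
        output += (
            f"\n[OOM] The command appears to have exceeded the memory limit. "
            f"To increase it, set {TOOL_MEMORY_LIMIT_ENV} (e.g. 4GB or 8GB)."
        )
    return output
```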

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Loader as Toolset Loader
    participant Trigger as TriggerOOMKill
    participant Executor as Bash Executor
    participant Host as Child Process

    User->>Loader: request trigger_oom_kill(params)
    Loader->>Trigger: dispatch invoke(params, context)
    Trigger->>Trigger: validate hold_seconds
    alt invalid duration
        Trigger-->>User: StructuredToolResult (error)
    else valid
        Trigger->>Executor: execute command (ulimit prefix + inline Python alloc, timeout=hold+30)
        Executor->>Host: spawn process (allocates ~30GB, sleeps)
        Host-->>Executor: exit code + stdout/stderr
        Executor->>Executor: check_oom_and_append_hint(output, return_code)
        Executor-->>Trigger: result (stdout/stderr, code, possible OOM hint)
        Trigger-->>User: StructuredToolResult (success/failure)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Subtasks #851 — Modifies holmes/plugins/toolsets/__init__.py to add and instantiate a Python toolset; overlaps loader registration changes.

Suggested reviewers

  • moshemorad
  • Avi-Robusta

Pre-merge checks

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 65.38%, below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The title clearly and accurately summarizes the main change: adding a toolset to trigger OOM kills. It is concise, specific, and directly reflects the primary objective of the pull request.

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7a5b967 and 08418f0.

📒 Files selected for processing (1)
  • tests/core/test_tool_memory_limit.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/core/test_tool_memory_limit.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.12)
  • GitHub Check: llm_evals
  • GitHub Check: build

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
holmes/plugins/toolsets/oom_kill.py (2)

46-78: LGTM with minor note on unused parameter.

The implementation correctly validates inputs and provides detailed error messages. The timeout calculation (hold_seconds + 30) provides reasonable buffer for process startup. The static analysis flagged context as unused, but this is acceptable since it's part of the Tool._invoke interface contract.

Optional: Use underscore prefix to signal intentionally unused parameter
-    def _invoke(self, params: dict, context: ToolInvokeContext) -> StructuredToolResult:
+    def _invoke(self, params: dict, _context: ToolInvokeContext) -> StructuredToolResult:

88-111: Appropriate safety configuration, but integration tests are missing.

The toolset has proper safety guards: enabled=False, experimental=True, is_default=False, and the tool requires explicit confirmation. However, based on learnings, new toolsets require integration tests. While the PR description notes testing was not requested, consider adding at least a basic test to verify:

  1. The toolset loads correctly
  2. The confirmation guard works (rejects without confirm=True)
  3. The hold_seconds validation works
Optional: Use underscore prefix for intentionally unused config parameter
-    def prerequisites_callable(self, config: dict[str, Any]) -> tuple[bool, str]:
+    def prerequisites_callable(self, _config: dict[str, Any]) -> tuple[bool, str]:
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9eec37 and bdabdf0.

📒 Files selected for processing (2)
  • holmes/plugins/toolsets/__init__.py
  • holmes/plugins/toolsets/oom_kill.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/plugins/toolsets/__init__.py
  • holmes/plugins/toolsets/oom_kill.py
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/__init__.py
  • holmes/plugins/toolsets/oom_kill.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: New toolsets require integration tests
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/__init__.py
  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
🧬 Code graph analysis (2)
holmes/plugins/toolsets/__init__.py (1)
holmes/plugins/toolsets/oom_kill.py (1)
  • OOMKillToolset (88-111)
holmes/plugins/toolsets/oom_kill.py (2)
holmes/core/tools.py (8)
  • CallablePrerequisite (525-526)
  • StructuredToolResult (79-105)
  • StructuredToolResultStatus (52-76)
  • Tool (177-369)
  • ToolInvokeContext (166-174)
  • ToolParameter (157-163)
  • Toolset (538-782)
  • ToolsetTag (145-148)
holmes/plugins/toolsets/bash/common/bash.py (1)
  • execute_bash_command (5-58)
🪛 Ruff (0.14.10)
holmes/plugins/toolsets/oom_kill.py

46-46: Unused method argument: context

(ARG002)


106-106: Unused method argument: config

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: llm_evals
  • GitHub Check: build (3.10)
  • GitHub Check: build
🔇 Additional comments (3)
holmes/plugins/toolsets/__init__.py (1)

38-38: LGTM!

The import and registration of OOMKillToolset follow the established patterns for other Python toolsets. The toolset is correctly disabled by default (as configured in the toolset class), which is appropriate for this dangerous stress-testing tool.

Also applies to: 100-100

holmes/plugins/toolsets/oom_kill.py (2)

1-14: LGTM!

Imports are properly organized at the top of the file as per coding guidelines, with necessary types and utilities imported.


17-44: The toolset parameter passed to super().__init__() will not raise a validation error. The base Tool class uses Pydantic's default extra='ignore' configuration, which silently ignores undeclared parameters. However, since toolset is not defined as a field in the Tool class, it's being discarded. If the toolset needs to be stored on the Tool instance, add it as an explicit field; otherwise, remove the parameter from the super() call.

Likely an incorrect or invalid review comment.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (3)
holmes/plugins/toolsets/oom_kill.yaml (3)

18-20: Enhance error message with received value.

The confirmation check is a good safety mechanism. However, per coding guidelines, error messages should include the actual parameter values to enable better LLM self-correction and debugging.

🔎 Proposed enhancement
 confirm = "{{ confirm }}"
 if str(confirm).lower() not in ("true", "yes", "1"):
-    sys.exit("Confirmation required: set confirm=true to run the OOM trigger.")
+    sys.exit(f"Confirmation required: set confirm=true to run the OOM trigger. Received: {confirm!r}")

Based on coding guidelines for toolsets requiring detailed error messages.


22-28: Improve exception handling specificity and error detail.

Two improvements needed:

  1. The exception handling is too broad (except Exception) and should specifically catch ValueError for integer parsing failures.
  2. Error messages should include the actual parameter value to help with debugging and LLM self-correction.
🔎 Proposed improvements
 try:
     hold_seconds = int("{{ hold_seconds|default(300) }}")
-except Exception:
-    sys.exit("hold_seconds must be an integer.")
+except ValueError as e:
+    sys.exit(f"hold_seconds must be an integer. Received: '{{ hold_seconds }}'. Error: {e}")
 
 if hold_seconds <= 0:
-    sys.exit("hold_seconds must be positive.")
+    sys.exit(f"hold_seconds must be positive. Received: {hold_seconds}")

Based on coding guidelines requiring detailed error messages with parameter values.


30-30: Consider making memory size configurable.

The hardcoded 30GB allocation may not be suitable for all environments. Making this configurable would improve flexibility for different testing scenarios.

💡 Optional enhancement to add size parameter
+try:
+    size_gb = int("{{ size_gb|default(30) }}")
+except ValueError:
+    sys.exit("size_gb must be an integer.")
+
+if size_gb <= 0:
+    sys.exit(f"size_gb must be positive. Received: {size_gb}")
+
-size_bytes = 30 * 1024 * 1024 * 1024
+size_bytes = size_gb * 1024 * 1024 * 1024
 print(f"Allocating {size_bytes / 1024 / 1024 / 1024:.0f} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s")
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bdabdf0 and 199ee26.

📒 Files selected for processing (1)
  • holmes/plugins/toolsets/oom_kill.yaml
🧰 Additional context used
📓 Path-based instructions (1)
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/oom_kill.yaml
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Applied to files:

  • holmes/plugins/toolsets/oom_kill.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.10)
  • GitHub Check: llm_evals
  • GitHub Check: build
🔇 Additional comments (1)
holmes/plugins/toolsets/oom_kill.yaml (1)

1-10: LGTM! Appropriate warnings for destructive toolset.

The metadata clearly marks this as dangerous and restricts usage to controlled environments, which is essential for a tool designed to trigger OOM kills.

github-actions bot commented Dec 23, 2025

Dev Docker images are ready for this commit:

Use the tag 08418f0 (shown in the commands below) to pull the image for testing.

⚠️ Temporary images are deleted after 30 days. Copy to a permanent registry before using them:

gcloud auth configure-docker us-central1-docker.pkg.dev
docker pull us-central1-docker.pkg.dev/robusta-development/temporary-builds/holmes:08418f0
docker tag us-central1-docker.pkg.dev/robusta-development/temporary-builds/holmes:08418f0 me-west1-docker.pkg.dev/robusta-development/development/holmes-dev:08418f0
docker push me-west1-docker.pkg.dev/robusta-development/development/holmes-dev:08418f0

Patch Helm values in one line (choose the chart you use):

  • HolmesGPT chart:
helm upgrade --install holmesgpt ./helm/holmes \
  --set registry=me-west1-docker.pkg.dev/robusta-development/development \
  --set image=holmes-dev:08418f0
  • Robusta wrapper chart:
helm upgrade --install robusta robusta/robusta \
  --reuse-values \
  --set holmes.registry=me-west1-docker.pkg.dev/robusta-development/development \
  --set holmes.image=holmes-dev:08418f0

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 199ee26 and 0058135.

📒 Files selected for processing (2)
  • holmes/plugins/toolsets/oom_kill.py
  • holmes/plugins/toolsets/oom_kill.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • holmes/plugins/toolsets/oom_kill.yaml
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/plugins/toolsets/oom_kill.py
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/oom_kill.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
🪛 Ruff (0.14.10)
holmes/plugins/toolsets/oom_kill.py

39-39: Unused method argument: context

(ARG002)


90-90: Unused method argument: config

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.12)
  • GitHub Check: llm_evals
  • GitHub Check: build
🔇 Additional comments (5)
holmes/plugins/toolsets/oom_kill.py (5)

1-14: LGTM! Imports are properly organized.

All imports are correctly placed at the top of the file and include necessary type hints, following the coding guidelines.


17-37: LGTM! Clear safety warnings and proper parameter definition.

The description appropriately warns users about the dangerous nature of this tool, and the parameter is well-defined with a reasonable default.


64-69: LGTM! Consistent and informative one-liner.

The method correctly retrieves the parameter with the same default value and returns a clear, concise description.


72-88: LGTM! Excellent safety defaults for a dangerous toolset.

The toolset is appropriately configured with:

  • enabled=False (must be explicitly enabled)
  • experimental=True (signals caution)
  • is_default=False (not included by default)
  • Clear warning in the description

These are exactly the right safeguards for a stress-testing tool that intentionally crashes processes.


90-95: LGTM! Minimal configuration appropriate for this toolset.

Both methods correctly indicate that no special configuration is required for the OOM kill toolset, which is appropriate given its straightforward nature.

Comment on lines 39 to 62
def _invoke(self, params: dict, context: ToolInvokeContext) -> StructuredToolResult:
hold_seconds = params.get("hold_seconds", 300)
if not isinstance(hold_seconds, int) or hold_seconds <= 0:
return StructuredToolResult(
status=StructuredToolResultStatus.ERROR,
error="hold_seconds must be a positive integer.",
params=params,
)

command = textwrap.dedent(
f"""
python - <<'PY'
import time

size_bytes = 30 * 1024 * 1024 * 1024
print(f"Allocating {{size_bytes / 1024 / 1024 / 1024:.0f}} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s")
data = bytearray(size_bytes)
time.sleep({hold_seconds})
PY
"""
).strip()

timeout = hold_seconds + 30
return execute_bash_command(cmd=command, timeout=timeout, params=params)

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Find execute_bash_command implementation
rg -n 'def execute_bash_command' --type=py -A 30

Repository: HolmesGPT/holmesgpt

Length of output: 2512


execute_bash_command must include full error output in error messages to comply with toolset guidelines.

The implementation currently omits command output when returning error results. Line 30 returns Error: Command "{cmd}" returned non-zero exit status {process.returncode} but does not include the stdout/stderr output (which contains the actual error details). Per coding guidelines, "All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response."

Update the error path to include result_data similar to the success case (line 19) so that command output is available to the LLM for self-correction.

🧰 Tools
🪛 Ruff (0.14.10)

39-39: Unused method argument: context

(ARG002)

🤖 Prompt for AI Agents
In holmes/plugins/toolsets/oom_kill.py around lines 39 to 62, the error path
from execute_bash_command currently returns an error message that omits the
command output; change the call or handling so that when execute_bash_command
returns an ERROR result it includes result_data containing the command, stdout
and stderr (mirroring the success case), and ensure the
StructuredToolResult.error string concatenates a clear message with the full
stdout/stderr and the executed command and return code so the LLM can see the
underlying failure details for self-correction.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
holmes/plugins/toolsets/oom_kill.yaml (1)

25-28: Memory may not trigger OOM without touching pages.

As noted in a previous review, Linux memory overcommitment means bytearray(size_bytes) reserves virtual memory without allocating physical pages. The OOM killer won't be triggered until pages are actually written.

🧹 Nitpick comments (3)
holmes/plugins/toolsets/oom_kill.yaml (1)

4-4: Consider linking to the stress-testing documentation.

The docs_url is empty but documentation exists at docs/development/stress-testing-oom-kill.md. Consider populating this once the docs are published.

holmes/plugins/toolsets/oom_kill.py (2)

40-40: Unused context parameter is expected for method signature.

The static analysis flags context as unused, but this is an override of the abstract _invoke method from Tool base class. The signature is required. Consider adding # noqa: ARG002 or using _context to suppress the warning.

🔎 Proposed fix to suppress linter warning
-    def _invoke(self, params: dict, context: ToolInvokeContext) -> StructuredToolResult:
+    def _invoke(self, params: dict, context: ToolInvokeContext) -> StructuredToolResult:  # noqa: ARG002

94-96: Consider using _config to indicate intentionally unused parameter.

The config parameter is unused because no special configuration validation is needed. Using an underscore prefix signals this intent and suppresses linter warnings.

🔎 Proposed fix
-    def prerequisites_callable(self, config: dict[str, Any]) -> tuple[bool, str]:
+    def prerequisites_callable(self, _config: dict[str, Any]) -> tuple[bool, str]:
         # No special configuration is required for this toolset.
         return True, ""
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0058135 and 735d861.

📒 Files selected for processing (3)
  • docs/development/stress-testing-oom-kill.md
  • holmes/plugins/toolsets/oom_kill.py
  • holmes/plugins/toolsets/oom_kill.yaml
🧰 Additional context used
📓 Path-based instructions (3)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/plugins/toolsets/oom_kill.py
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/oom_kill.py
  • holmes/plugins/toolsets/oom_kill.yaml
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When writing documentation in the docs/ directory, always add a blank line between headers/bold text and lists for proper MkDocs rendering

Files:

  • docs/development/stress-testing-oom-kill.md
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
🧬 Code graph analysis (1)
holmes/plugins/toolsets/oom_kill.py (2)
holmes/core/tools.py (9)
  • CallablePrerequisite (525-526)
  • StructuredToolResult (79-105)
  • StructuredToolResultStatus (52-76)
  • Tool (177-369)
  • ToolInvokeContext (166-174)
  • ToolParameter (157-163)
  • Toolset (538-782)
  • ToolsetTag (145-148)
  • ToolsetEnvironmentPrerequisite (534-535)
holmes/plugins/toolsets/bash/common/bash.py (1)
  • execute_bash_command (5-58)
🪛 LanguageTool
docs/development/stress-testing-oom-kill.md

[style] ~5-~5: Try moving the adverb to make the sentence clearer.
Context: ... stress tests. This guide explains how to intentionally trigger OOM kills using the built-in OOM toolsets, how to...

(SPLIT_INFINITIVE)

🪛 markdownlint-cli2 (0.18.1)
docs/development/stress-testing-oom-kill.md

71-71: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.14.10)
holmes/plugins/toolsets/oom_kill.py

40-40: Unused method argument: context

(ARG002)


94-94: Unused method argument: config

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: llm_evals
  • GitHub Check: build
🔇 Additional comments (4)
holmes/plugins/toolsets/oom_kill.py (2)

63-63: LGTM on execute_bash_command usage.

The execute_bash_command function correctly includes command output in result_data for both success and error cases (per the implementation in bash.py), satisfying the coding guideline for detailed error messages.


73-92: LGTM on toolset initialization.

Good safety defaults: enabled=False, experimental=True, is_default=False, plus environment prerequisite gate. The toolset won't be loaded without explicit opt-in.

docs/development/stress-testing-oom-kill.md (2)

1-79: Documentation structure looks good.

The documentation includes appropriate warnings, clear enablement steps for both Helm/ArgoCD and local CLI modes, and safety considerations. MkDocs formatting with blank lines between headers and lists is correctly applied.
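
As a rough illustration of those enablement steps, a local-mode configuration could look like the sketch below. The environment guard and the oom_kill_bash toolset name come from this PR; the config keys follow the common Holmes toolset-config pattern and should be treated as assumptions rather than a quote from the docs file.

```yaml
# Hypothetical sketch: export the guard, then enable the toolset in the Holmes
# config file. Exact keys and toolset names may differ from the actual docs.
#   export ALLOW_HOLMES_OOMKILL_TOOLSET=true
toolsets:
  oom_kill_bash:
    enabled: true
```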


12-12: Documentation mentions ulimit that will prevent allocation.

The documentation states the bash variant "applies ulimit -v 2097152 (2 GiB virtual memory cap) before allocation to reduce blast radius." However, this 2GB limit will cause the 30GB allocation to fail with MemoryError. The documentation should be updated once the code issue is resolved.
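
To illustrate why the cap conflicts with the allocation (ulimit -v takes its argument in KB, so 2097152 KB is 2 GiB), a quick local reproduction might look like the following; this snippet is illustrative and not taken from the docs:

```bash
# In a subshell, cap virtual memory at 2 GiB, then try to allocate 30 GB.
# Under the cap, Python raises MemoryError instead of being OOM-killed.
(
  ulimit -v 2097152
  python3 -c "data = bytearray(30 * 1024**3)"
)
```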

Comment on lines 54 to 57
size_bytes = 30 * 1024 * 1024 * 1024
print(f"Allocating {{size_bytes / 1024 / 1024 / 1024:.0f}} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s")
data = bytearray(size_bytes)
time.sleep({hold_seconds})

⚠️ Potential issue | 🟠 Major

Memory allocation may not trigger OOM without touching pages.

Similar to the YAML variant, bytearray(size_bytes) on Linux with memory overcommitment reserves virtual memory without allocating physical pages. To reliably trigger the OOM killer, pages must be written to.

🔎 Proposed fix to touch every page
             size_bytes = 30 * 1024 * 1024 * 1024
             print(f"Allocating {{size_bytes / 1024 / 1024 / 1024:.0f}} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s")
             data = bytearray(size_bytes)
+            # Touch every page to force physical allocation
+            page_size = 4096
+            for i in range(0, size_bytes, page_size):
+                data[i] = 1
+            print("Memory allocation complete")
             time.sleep({hold_seconds})
🤖 Prompt for AI Agents
In holmes/plugins/toolsets/oom_kill.py around lines 54 to 57, allocating a large
bytearray under Linux with overcommit may only reserve virtual address space and
not fault in pages; to reliably trigger the OOM killer you must touch (write to)
each page. Fix by determining the system page size (os.sysconf or
mmap.PAGESIZE), iterating over the allocation in page-sized steps and writing at
least one byte to each page (e.g., via a memoryview or mmap) to fault pages into
RAM, then sleep for hold_seconds; ensure you handle large allocations in a
streaming loop to avoid creating huge intermediate structures and optionally
catch KeyboardInterrupt to free memory before exit.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
docs/development/stress-testing-oom-kill.md (1)

102-104: Add language identifier to fenced code block.

The code block still lacks a language specifier. Use text or plaintext since this is a conceptual invocation example.

🔎 Proposed fix
 Example invocation (conceptual):
-```
+```text
 trigger_oom_kill: allocate ~30GB and sleep for 120s
🧹 Nitpick comments (1)
docs/development/stress-testing-oom-kill.md (1)

5-5: Consider moving the adverb for clarity.

The phrase "how to intentionally trigger" forms a split construction. Reordering as "how to trigger OOM kills intentionally" would improve readability.

🔎 Proposed fix
-This guide explains how to intentionally trigger OOM kills using the built-in OOM toolsets, how to enable them safely, and how to confirm they are available.
+This guide explains how to trigger OOM kills intentionally using the built-in OOM toolsets, how to enable them safely, and how to confirm they are available.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 735d861 and 465745d.

📒 Files selected for processing (1)
  • docs/development/stress-testing-oom-kill.md
🧰 Additional context used
📓 Path-based instructions (1)
docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

When writing documentation in the docs/ directory, always add a blank line between headers/bold text and lists for proper MkDocs rendering

Files:

  • docs/development/stress-testing-oom-kill.md
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: New toolsets require integration tests
🪛 LanguageTool
docs/development/stress-testing-oom-kill.md

[style] ~5-~5: Try moving the adverb to make the sentence clearer.
Context: ... stress tests. This guide explains how to intentionally trigger OOM kills using the built-in OOM toolsets, how to...

(SPLIT_INFINITIVE)

🪛 markdownlint-cli2 (0.18.1)
docs/development/stress-testing-oom-kill.md

102-102: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.11)
  • GitHub Check: llm_evals
  • GitHub Check: build


Changed tag from 'cli' to 'core' so the toolset is available when
Holmes runs as a server in Kubernetes, not just in CLI mode.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

aantn and others added 2 commits December 25, 2025 07:05
- Add ulimit-based memory protection to prevent OOM from crashing main process
- Support configurable limit via HOLMES_TOOL_MEMORY_LIMIT env var (default: 2GB)
- Parse human-readable sizes (e.g., "4GB", "512MB", "1.5G")
- Detect OOM kills and show helpful hint about increasing the limit
- Apply protection to both YAML tools and bash toolset

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
holmes/plugins/toolsets/oom_kill.yaml (1)

17-27: Memory may not be physically allocated due to OS overcommitment.

On Linux with memory overcommitment enabled (default), bytearray(size_bytes) reserves virtual memory but doesn't allocate physical pages until written. The OOM killer won't trigger until pages are actually touched. Additionally, there's no error handling for MemoryError if allocation fails before triggering OOM.

🔎 Proposed fix to force physical allocation and add error handling
          python - <<'PY'
          import time

          hold_seconds = int("{{ hold_seconds|default(300) }}")
          if hold_seconds <= 0:
              raise SystemExit("hold_seconds must be positive.")

          size_bytes = 30 * 1024 * 1024 * 1024
          print(f"Allocating {size_bytes / 1024 / 1024 / 1024:.0f} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s")
-          data = bytearray(size_bytes)
+          try:
+              data = bytearray(size_bytes)
+              # Touch every page to force physical allocation (defeats copy-on-write)
+              page_size = 4096
+              for i in range(0, size_bytes, page_size):
+                  data[i] = 1
+              print("Memory allocation complete, holding...")
+          except MemoryError as e:
+              raise SystemExit(f"MemoryError during allocation: {e}. System may have insufficient memory or ulimit restrictions.")
          time.sleep(hold_seconds)
          PY

Based on coding guidelines, toolsets should return detailed error messages to enable LLM self-correction.

🧹 Nitpick comments (2)
tests/core/test_tool_memory_limit.py (1)

157-162: Assertion at line 162 is fragile and may produce false negatives.

The assertion assert "4GB" in result or "8GB" in result checks for example values in the hint text. If the hint wording changes, this test could fail even when OOM detection works correctly. Consider asserting on the presence of the [OOM] marker and TOOL_MEMORY_LIMIT_ENV (which you already do), rather than specific example values.

🔎 Proposed fix to make assertion more robust
     def test_hint_appended_on_oom_indicators(self, return_code: int, output: str):
         """Test that hint is appended when OOM indicators are detected."""
         result = check_oom_and_append_hint(output, return_code)
         assert "[OOM]" in result
         assert TOOL_MEMORY_LIMIT_ENV in result
-        assert "4GB" in result or "8GB" in result  # Example values in hint
+        # Hint should mention how to increase the limit
+        assert "To increase" in result
holmes/utils/memory_limit.py (1)

107-113: Consider potential false positives in OOM detection.

The substring check "Killed" in output could match legitimate output containing the word "Killed" (e.g., process status messages, log entries about other processes). While the combination with other indicators reduces risk, consider using a more specific pattern like checking if "Killed" appears at the start of a line or as a standalone message.

🔎 Proposed refinement for more precise "Killed" detection
     is_oom = (
         return_code in (137, -9)
-        or "Killed" in output
+        or output.strip() == "Killed"  # Exact match for Linux OOM killer
+        or output.strip().endswith("Killed")  # Common pattern: "<pid> Killed"
         or "MemoryError" in output
         or "Cannot allocate memory" in output
         or "bad_alloc" in output
     )

Alternatively, if broader matching is preferred, the current approach is acceptable given the additional return code checks provide corroboration.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b96b092 and d63ee32.

📒 Files selected for processing (5)
  • holmes/core/tools.py
  • holmes/plugins/toolsets/bash/common/bash.py
  • holmes/plugins/toolsets/oom_kill.yaml
  • holmes/utils/memory_limit.py
  • tests/core/test_tool_memory_limit.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/utils/memory_limit.py
  • tests/core/test_tool_memory_limit.py
  • holmes/plugins/toolsets/bash/common/bash.py
  • holmes/core/tools.py
tests/**

📄 CodeRabbit inference engine (CLAUDE.md)

Tests: Match source structure under tests/

Files:

  • tests/core/test_tool_memory_limit.py
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/oom_kill.yaml
  • holmes/plugins/toolsets/bash/common/bash.py
holmes/core/tools.py

📄 CodeRabbit inference engine (CLAUDE.md)

All tools have read-only access by design

Files:

  • holmes/core/tools.py
🧠 Learnings (7)
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.yaml
  • holmes/plugins/toolsets/bash/common/bash.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Applied to files:

  • holmes/plugins/toolsets/oom_kill.yaml
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/plugins/toolsets/bash/common/bash.py
  • holmes/core/tools.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Use LiteLLM for multi-provider support (OpenAI, Anthropic, Azure, etc.) in LLM integration

Applied to files:

  • holmes/core/tools.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/prompts/**/*.jinja2 : Prompts: holmes/plugins/prompts/{name}.jinja2 file structure

Applied to files:

  • holmes/core/tools.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/prompts/**/*.jinja2 : Use Jinja2 templates for investigation prompts

Applied to files:

  • holmes/core/tools.py
🧬 Code graph analysis (3)
tests/core/test_tool_memory_limit.py (1)
holmes/utils/memory_limit.py (4)
  • parse_size_to_kb (19-58)
  • get_memory_limit_kb (61-76)
  • get_ulimit_prefix (79-87)
  • check_oom_and_append_hint (90-124)
holmes/plugins/toolsets/bash/common/bash.py (2)
holmes/core/tools.py (2)
  • StructuredToolResult (80-106)
  • StructuredToolResultStatus (53-77)
holmes/utils/memory_limit.py (2)
  • get_ulimit_prefix (79-87)
  • check_oom_and_append_hint (90-124)
holmes/core/tools.py (3)
holmes/plugins/prompts/__init__.py (1)
  • load_and_render_prompt (27-54)
holmes/utils/config_utils.py (1)
  • merge_transformers (11-91)
holmes/utils/memory_limit.py (2)
  • get_ulimit_prefix (79-87)
  • check_oom_and_append_hint (90-124)
🪛 Ruff (0.14.10)
holmes/utils/memory_limit.py

40-40: Avoid specifying long messages outside the exception class

(TRY003)


56-56: Avoid specifying long messages outside the exception class

(TRY003)

holmes/plugins/toolsets/bash/common/bash.py

10-10: subprocess call with shell=True identified, security issue

(S602)

holmes/core/tools.py

502-502: subprocess call with shell=True identified, security issue

(S602)


514-514: Consider moving this statement to an else block

(TRY300)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.12)
  • GitHub Check: llm_evals
  • GitHub Check: build
🔇 Additional comments (11)
holmes/plugins/toolsets/oom_kill.yaml (1)

1-12: Toolset structure and safety gates look appropriate.

The prerequisite requiring ALLOW_HOLMES_OOMKILL_TOOLSET environment variable provides a good safety mechanism, and the warning in additional_instructions clearly communicates the destructive nature.

holmes/plugins/toolsets/bash/common/bash.py (2)

9-23: Memory protection integration looks solid.

The integration of get_ulimit_prefix() and check_oom_and_append_hint() provides proper memory limit enforcement and helpful OOM detection feedback. The || true in the ulimit prefix ensures graceful degradation on systems where ulimit is unsupported.

The shell=True flagged by static analysis (S602) is intentional and necessary here since this is a bash command executor by design.
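
For context, the wiring described above could look roughly like this; it is a hedged reconstruction from the review summary (ulimit prefix before execution, OOM hint appended to output), not the exact contents of bash.py, and the helper name is hypothetical:

```python
# Hedged sketch of the integration described above; the real function in
# holmes/plugins/toolsets/bash/common/bash.py may be structured differently.
import subprocess

from holmes.utils.memory_limit import check_oom_and_append_hint, get_ulimit_prefix

def run_with_memory_guard(cmd: str, timeout: int) -> tuple[int, str]:
    """Run a bash command with the ulimit prefix and annotate OOM-looking output."""
    guarded_cmd = get_ulimit_prefix() + cmd  # e.g. "ulimit -v <kb> || true; <cmd>"
    proc = subprocess.run(
        guarded_cmd,
        shell=True,                 # intentional: this is a bash command executor
        executable="/bin/bash",
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    output = (proc.stdout or "") + (proc.stderr or "")
    return proc.returncode, check_oom_and_append_hint(output, proc.returncode)
```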


50-56: Error message update is appropriate.

The simplified error message correctly reflects that FileNotFoundError can occur when /bin/bash is not found, which is the primary failure mode with shell=True and explicit executable="/bin/bash".

holmes/core/tools.py (2)

8-8: Imports are properly organized at the top of the file.

The new imports for time, Table, and memory limit utilities are correctly placed at module level, following the coding guidelines for Python files.

Also applies to: 34-34, 43-45


498-521: Memory protection in __execute_subprocess is consistent and well-implemented.

The implementation correctly mirrors the pattern in bash.py:

  1. Prefixes commands with ulimit protection via get_ulimit_prefix()
  2. Post-processes output with check_oom_and_append_hint() for helpful error feedback

The shell=True (S602) is intentional for command execution. The TRY300 hint about moving the return to an else block is a minor style preference that doesn't affect correctness.

tests/core/test_tool_memory_limit.py (2)

1-174: Test coverage is comprehensive and well-structured.

The test suite provides thorough coverage of:

  • Size parsing with multiple units, decimals, and whitespace variations
  • Invalid input handling
  • Environment variable behavior (default, custom, fallback on invalid)
  • OOM detection with multiple indicators (return codes 137/-9, various error messages)
  • Hint content verification

Good use of @pytest.mark.parametrize for covering edge cases efficiently.
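
As a flavor of what that parameterization could look like (the module path and function name are from this PR; the specific cases and expected kilobyte values here are assumptions based on binary units):

```python
# Hypothetical cases in the spirit of tests/core/test_tool_memory_limit.py;
# expected values assume binary units and a kilobyte return value.
import pytest

from holmes.utils.memory_limit import parse_size_to_kb

@pytest.mark.parametrize(
    ("size", "expected_kb"),
    [
        ("512MB", 512 * 1024),
        ("4GB", 4 * 1024 * 1024),
        ("1.5G", int(1.5 * 1024 * 1024)),
    ],
)
def test_parse_size_to_kb(size: str, expected_kb: int) -> None:
    assert parse_size_to_kb(size) == expected_kb
```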


57-59: The whitespace test cases are valid and will parse correctly.

The regex pattern r"^(\d+(?:\.\d+)?)\s*([KMGT]?B?)?$" correctly supports both test cases:

  • " 2GB " is stripped to "2GB" before regex matching
  • "2 GB" remains after strip and the \s* in the regex explicitly matches zero or more whitespace characters between the number and unit

Both test cases at lines 57-59 will pass as expected.
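
A quick interactive check of the pattern, using the regex exactly as quoted above, confirms that both forms parse once the input is stripped:

```python
# Both stripped inputs match: "2GB" directly, and "2 GB" via the \s* between
# the number and the unit.
import re

pattern = re.compile(r"^(\d+(?:\.\d+)?)\s*([KMGT]?B?)?$")
for raw in (" 2GB ", "2 GB"):
    match = pattern.match(raw.strip())
    print(repr(raw), "->", match.groups() if match else None)
# Prints ('2', 'GB') for both inputs.
```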

holmes/utils/memory_limit.py (4)

1-16: Module structure and constants are well-defined.

Clear docstring explaining the module purpose, and the constants TOOL_MEMORY_LIMIT_ENV and TOOL_MEMORY_LIMIT_DEFAULT provide sensible defaults with clear naming.


19-58: parse_size_to_kb implementation is robust.

The function handles various input formats (GB, MB, KB, TB with or without 'B' suffix), decimal values, and whitespace. The regex is well-constructed and the error handling with ValueError is appropriate for a parsing function.

The TRY003 hints from static analysis suggest moving error messages to exception classes, but inline messages here improve readability without significant maintenance burden.


61-87: Environment variable handling and ulimit prefix generation are correct.

get_memory_limit_kb properly falls back to the default with a warning on invalid input, preventing misconfiguration from breaking tool execution. The || true in the ulimit prefix ensures graceful degradation on systems where ulimit is not supported or restricted.


90-124: OOM hint provides actionable guidance.

The hint includes the current limit and clear instructions on how to increase it, which aligns with the coding guideline that toolsets should return detailed error messages to enable LLM self-correction.

Follow the established pattern for Tool subclasses that need toolset reference.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
holmes/plugins/toolsets/oom_kill.py (1)

51-65: Memory allocation may not trigger OOM without touching pages (previously flagged).

The bytearray(size_bytes) allocation on line 58 may only reserve virtual memory on systems with memory overcommitment enabled (common on Linux), without actually allocating physical RAM. To reliably trigger the OOM killer, you must write to (touch) each page after allocation.

Additionally, verify that execute_bash_command returns detailed error messages including stdout/stderr output when the command fails, as required by the toolset guidelines. A previous review noted this concern.

🔎 Proposed fix to touch pages and ensure OOM trigger
             size_bytes = 30 * 1024 * 1024 * 1024
             print(f"Allocating {{size_bytes / 1024 / 1024 / 1024:.0f}} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s")
             data = bytearray(size_bytes)
+            # Touch every page to force physical allocation
+            page_size = 4096
+            for i in range(0, size_bytes, page_size):
+                data[i] = 1
+            print("Memory touched, holding allocation")
             time.sleep({hold_seconds})

Run the following script to verify that execute_bash_command includes full error details:

#!/bin/bash
# Verify execute_bash_command includes stdout/stderr in error results

rg -n 'def execute_bash_command' --type=py -A 40 holmes/plugins/toolsets/bash/common/bash.py
🧹 Nitpick comments (1)
holmes/plugins/toolsets/oom_kill.py (1)

96-101: Consider prefixing unused parameter with underscore.

The config parameter in prerequisites_callable is flagged as unused by static analysis. If the interface allows, consider renaming it to _config to signal it's intentionally unused. Alternatively, if the exact signature is required by the interface, the current implementation is acceptable.

🔎 Optional refactor
-    def prerequisites_callable(self, config: dict[str, Any]) -> tuple[bool, str]:
+    def prerequisites_callable(self, _config: dict[str, Any]) -> tuple[bool, str]:
         # No special configuration is required for this toolset.
         return True, ""
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d63ee32 and 09dedea.

📒 Files selected for processing (1)
  • holmes/plugins/toolsets/oom_kill.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/plugins/toolsets/oom_kill.py
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/oom_kill.py
🧠 Learnings (4)
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
🪛 Ruff (0.14.10)
holmes/plugins/toolsets/oom_kill.py

42-42: Unused method argument: context

(ARG002)


96-96: Unused method argument: config

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.12)
  • GitHub Check: build (3.10)
  • GitHub Check: llm_evals
  • GitHub Check: build (3.11)
  • GitHub Check: build
🔇 Additional comments (5)
holmes/plugins/toolsets/oom_kill.py (5)

1-16: LGTM!

Imports are correctly placed at the top of the file and include necessary type hint support, following the coding guidelines.


18-40: LGTM!

The tool is properly initialized with clear documentation of its dangerous nature. The toolset field declaration follows the established pattern for Tool subclasses that need a toolset reference.


42-49: LGTM!

Parameter validation correctly ensures hold_seconds is a positive integer and returns a structured error for invalid input.


67-72: LGTM!

The one-liner description is clear and consistent with the _invoke implementation.


75-94: LGTM!

The toolset is properly configured with appropriate safety guards: disabled by default, requires explicit environment variable enablement, and is clearly marked as experimental and dangerous. This follows best practices for potentially harmful toolsets.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 09dedea and 7a5b967.

📒 Files selected for processing (2)
  • holmes/plugins/toolsets/oom_kill.py
  • holmes/utils/memory_limit.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use Ruff for formatting and linting (configured in pyproject.toml)
Type hints required (mypy configuration in pyproject.toml)
ALWAYS place Python imports at the top of the file, not inside functions or methods

Files:

  • holmes/plugins/toolsets/oom_kill.py
  • holmes/utils/memory_limit.py
holmes/plugins/toolsets/**

📄 CodeRabbit inference engine (CLAUDE.md)

holmes/plugins/toolsets/**: Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Files:

  • holmes/plugins/toolsets/oom_kill.py
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: New toolsets require integration tests
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : Toolsets: holmes/plugins/toolsets/{name}.yaml or {name}/ directory structure

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/** : All toolsets MUST return detailed error messages from underlying APIs to enable LLM self-correction, including the exact query/command executed, time ranges, parameters, filters used, and full API error response

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/plugins/toolsets/bash.yaml : Bash toolset validates commands for safety

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
📚 Learning: 2025-12-21T13:17:48.366Z
Learnt from: CR
Repo: HolmesGPT/holmesgpt PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-21T13:17:48.366Z
Learning: Applies to holmes/core/tool_calling_llm.py : Implement structured tool calling with automatic retry and error handling

Applied to files:

  • holmes/plugins/toolsets/oom_kill.py
🧬 Code graph analysis (1)
holmes/plugins/toolsets/oom_kill.py (1)
holmes/core/tools.py (9)
  • CallablePrerequisite (529-530)
  • StructuredToolResult (80-106)
  • StructuredToolResultStatus (53-77)
  • Tool (178-370)
  • ToolInvokeContext (167-175)
  • ToolParameter (158-164)
  • Toolset (542-786)
  • ToolsetTag (146-149)
  • ToolsetEnvironmentPrerequisite (538-539)
🪛 Ruff (0.14.10)
holmes/plugins/toolsets/oom_kill.py

41-41: Unused method argument: context

(ARG002)


54-54: Local variable data is assigned to but never used

Remove assignment to unused variable data

(F841)


86-86: Unused method argument: config

(ARG002)

holmes/utils/memory_limit.py

40-40: Avoid specifying long messages outside the exception class

(TRY003)


56-56: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: build (3.10)
  • GitHub Check: build (3.11)
  • GitHub Check: build (3.12)
  • GitHub Check: llm_evals
  • GitHub Check: build
🔇 Additional comments (2)
holmes/utils/memory_limit.py (1)

1-126: LGTM! Well-structured memory limit utilities.

The module provides comprehensive memory limit parsing, ulimit prefix generation, and OOM detection. The implementation is correct and handles edge cases appropriately.
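For readers unfamiliar with the pattern, a rough sketch of what "memory limit parsing + ulimit prefix generation" typically looks like is below; the function names, environment variable, and defaults are illustrative assumptions, not the actual contents of holmes/utils/memory_limit.py.

import os
import re

_UNIT_TO_KB = {"K": 1, "M": 1024, "G": 1024 * 1024}

def parse_size_to_kb_sketch(value: str) -> int:
    """Parse sizes such as '512M' or '2G' into kilobytes (unit-less values are treated as KB here)."""
    match = re.fullmatch(r"(\d+)\s*([KMG]?)B?", value.strip(), re.IGNORECASE)
    if not match:
        raise ValueError(f"unrecognized memory size: {value!r}")
    number, unit = int(match.group(1)), (match.group(2) or "K").upper()
    return number * _UNIT_TO_KB[unit]

def ulimit_prefix_sketch(env_var: str = "TOOL_MEMORY_LIMIT", default: str = "1G") -> str:
    """Build a shell prefix that caps the child command's virtual memory."""
    limit_kb = parse_size_to_kb_sketch(os.environ.get(env_var, default))
    # bash's `ulimit -v` takes kilobytes; the trailing `&&` lets callers prepend this
    # string to an arbitrary command, e.g. ulimit_prefix_sketch() + "kubectl get pods"
    return f"ulimit -v {limit_kb} && "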

The static-analysis TRY003 warnings (lines 40, 56) about long exception messages are style suggestions; if preferred, the messages can be moved into constants or dedicated exception classes, as in the sketch below.
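One illustrative way to satisfy TRY003, assuming a helper exception (the class name below is made up):

class InvalidMemoryLimitError(ValueError):
    """Raised when a memory-limit string cannot be parsed."""

    def __init__(self, value: str) -> None:
        # The long, descriptive message lives here instead of at the raise site
        super().__init__(f"invalid memory limit {value!r}; expected a size such as '512M' or '2G'")

# the raise site then shrinks to: raise InvalidMemoryLimitError(raw_value)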

holmes/plugins/toolsets/oom_kill.py (1)

65-84: Good safety guards for dangerous toolset.

The toolset correctly implements safety measures:

  • Disabled by default
  • Requires explicit environment variable ALLOW_HOLMES_OOMKILL_TOOLSET
  • Marked as experimental
  • Clear warning in description

This follows best practices for potentially destructive operations.

Comment on lines +41 to +56
def _invoke(self, params: dict, context: ToolInvokeContext) -> StructuredToolResult:
hold_seconds = params.get("hold_seconds", 300)
if not isinstance(hold_seconds, int) or hold_seconds <= 0:
return StructuredToolResult(
status=StructuredToolResultStatus.ERROR,
error="hold_seconds must be a positive integer.",
params=params,
)

size_bytes = 30 * 1024 * 1024 * 1024
print(
f"Allocating {{size_bytes / 1024 / 1024 / 1024:.0f}} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s"
)
data = bytearray(size_bytes) # type: ignore
time.sleep({hold_seconds})


⚠️ Potential issue | 🔴 Critical

Critical bugs and a missing return statement.

The _invoke method has multiple critical issues that will cause runtime failures:

  1. Line 52: The f-string uses double braces {{size_bytes...}}, which escapes them and produces literal text "{size_bytes / 1024 / 1024 / 1024:.0f}" instead of interpolating the value.

  2. Line 55: time.sleep({hold_seconds}) creates a set and passes it to sleep, which expects a number. This will raise TypeError: 'set' object cannot be interpreted as an integer.

  3. Missing return: The method doesn't return a StructuredToolResult on the success path, violating the tool interface contract.

Additionally, as noted in past reviews, bytearray(size_bytes) on Linux with memory overcommitment may only reserve virtual memory without allocating physical pages. To reliably trigger OOM, pages must be written to.

🔎 Proposed fix for syntax errors, missing return, and reliable OOM triggering
     def _invoke(self, params: dict, context: ToolInvokeContext) -> StructuredToolResult:
         hold_seconds = params.get("hold_seconds", 300)
         if not isinstance(hold_seconds, int) or hold_seconds <= 0:
             return StructuredToolResult(
                 status=StructuredToolResultStatus.ERROR,
                 error="hold_seconds must be a positive integer.",
                 params=params,
             )
 
         size_bytes = 30 * 1024 * 1024 * 1024
         print(
-            f"Allocating {{size_bytes / 1024 / 1024 / 1024:.0f}} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s"
+            f"Allocating {size_bytes / 1024 / 1024 / 1024:.0f} GB of memory to intentionally trigger OOM kill; sleeping for {hold_seconds}s"
         )
         data = bytearray(size_bytes)  # type: ignore
-        time.sleep({hold_seconds})
+        # Touch every page to force physical allocation
+        page_size = 4096
+        for i in range(0, size_bytes, page_size):
+            data[i] = 1
+        print("Memory allocation complete")
+        time.sleep(hold_seconds)
+        
+        return StructuredToolResult(
+            status=StructuredToolResultStatus.SUCCESS,
+            data="OOM kill triggered successfully",
+            params=params,
+        )
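Both failure modes are easy to reproduce in isolation; the snippet below is project-independent and runnable as-is.

import time

size_bytes = 30 * 1024 * 1024 * 1024

# Double braces escape interpolation, so this prints the literal template text
print(f"{{size_bytes / 1024 / 1024 / 1024:.0f}} GB")
# Single braces interpolate as intended and print "30 GB"
print(f"{size_bytes / 1024 / 1024 / 1024:.0f} GB")

# Braces around a bare value build a set literal, which time.sleep() rejects
try:
    time.sleep({1})
except TypeError as exc:
    print(f"time.sleep({{1}}) raised: {exc}")
time.sleep(1)  # the fix: pass the number itself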
🧰 Tools
🪛 Ruff (0.14.10)

41-41: Unused method argument: context

(ARG002)


54-54: Local variable data is assigned to but never used

Remove assignment to unused variable data

(F841)

🤖 Prompt for AI Agents
In holmes/plugins/toolsets/oom_kill.py around lines 41 to 56, fix three issues:
replace the escaped f-string braces so the GB value is interpolated correctly,
change time.sleep({hold_seconds}) to pass the numeric hold_seconds value (no
braces), and ensure the method returns a StructuredToolResult on success;
additionally to reliably trigger OOM on Linux, touch/write to the allocated
memory pages (e.g., iterate writing to each page or use memoryview writes) so
physical pages are committed, and wrap allocation/sleep in try/except to return
a StructuredToolResult with ERROR on exceptions and a SUCCESS result on
completion.

@github-actions

Results of HolmesGPT evals

  • ask_holmes: 4/37 test cases were successful, 23 regressions, 6 setup failures, 3 mock failures
Test suite Test case Status
ask 01_how_many_pods 🚧
ask 02_what_is_wrong_with_pod 🚧
ask 04_related_k8s_events
ask 05_image_version
ask 09_crashpod
ask 10_image_pull_backoff
ask 110_k8s_events_image_pull
ask 11_init_containers
ask 13a_pending_node_selector_basic
ask 14_pending_resources
ask 15_failed_readiness_probe
ask 163_compaction_follow_up
ask 17_oom_kill 🚧
ask 18_oom_kill_from_issues_history
ask 19_detect_missing_app_details
ask 20_long_log_file_search
ask 24_misconfigured_pvc
ask 24a_misconfigured_pvc_basic
ask 28_permissions_error 🚧
ask 39_failed_toolset
ask 41_setup_argo
ask 42_dns_issues_steps_new_tools ⚠️
ask 43_current_datetime_from_prompt
ask 45_fetch_deployment_logs_simple
ask 51_logs_summarize_errors
ask 53_logs_find_term
ask 54_not_truncated_when_getting_pods
ask 59_label_based_counting 🚧
ask 60_count_less_than
ask 61_exact_match_counting 🚧
ask 63_fetch_error_logs_no_errors
ask 79_configmap_mount_issue
ask 83_secret_not_found
ask 86_configmap_like_but_secret
ask 93_calling_datadog[0] 🔧
ask 93_calling_datadog[1] 🔧
ask 93_calling_datadog[2] 🔧

Legend

  • ✅ the test was successful
  • ➖ the test was skipped
  • ⚠️ the test failed but is known to be flaky or known to fail
  • 🚧 the test had a setup failure (not a code regression)
  • 🔧 the test failed due to mock data issues (not a code regression)
  • 🚫 the test was throttled by API rate limits/overload
  • ❌ the test failed and should be fixed before merging the PR

@aantn aantn marked this pull request as draft December 28, 2025 09:35