docs: add detection-integration guide for downstream receipt consumers (#418)

luckyPipewrench · web-flow · commit ccce9b2c8293 · 2026-04-21T08:19:11.000-04:00
* docs: add detection-integration guide for downstream receipt consumers

  New guide explaining how SIEM rules, analyst review, and long-window LLM-based detectors all consume the same signed action-receipt stream. Includes a 40-line runnable Python example that verifies a chain via pipelock-verify and routes each verified receipt to a pluggable handler.

  Explicit "What this does not solve" section covers compromised mediators, real-time coverage gaps, receipts-as-input-not-substitute, agent-side compromise, and the same-user deployment ceiling.

  The existing tool-response-injection harness gains a short pointer to the new guide.

* docs: address review findings on detection-integration

  Three corrections from a close review of the first draft:

  1. Gate the "every proxy decision produces a signed receipt" claim on flight_recorder.signing_key_path being set. Without a signing key, pipelock still enforces but the evidence stream is not emitted. Docs now say so and point at the config.
  2. Rewrite the SIEM section. Receipts live in the flight-recorder JSONL file; the emit pipeline (webhook, syslog, OTLP) carries a separate security-event envelope. They are complementary streams, not the same stream in different wrappers. Guide now recommends a file shipper (Filebeat, Fluent Bit, Vector) tailing flight_recorder.dir and points readers at siem-integration.md for the emit format.
  3. Fix the Python example to filter entries to type == "action_receipt" (evidence files contain non-receipt entries) and carry the outer envelope's session_id into the yielded record. The handler prints session_id now.

  Verified the updated script against the conformance corpus: valid-chain passes, broken-chain rejects with CHAIN BROKEN.

* docs: align receipt-signing language across config, flight-recorder, and detection-integration

  Three corrections after a review pass:

  1. configuration.md: signing_key_path description no longer implies full hot-reload rotation. Reload re-reads key bytes when the same path stays configured; changing the configured path requires restart.
  2. flight-recorder.md: remove stale reference to the pipelock-assess keystore. The receipt-signing key is loaded from flight_recorder.signing_key_path and is separate from the assess key. Add a note clarifying that replacing key file contents at a fixed path is an advanced operation; the operator-safe path is still a restart so the old chain closes cleanly.
  3. detection-integration.md: gate the intro claim on signing being enabled, fix the key-rotation guidance to match configuration.md, and describe the worked-example evidence file as mixed (action_receipt plus other recorder entries) rather than receipt-only.

* docs: add CNCF Landscape badge to README

  Pipelock was listed in the CNCF Landscape under Provisioning > Security & Compliance on 2026-04-20 (cncf/landscape#4807). Badge placed alongside OpenSSF Scorecard + OpenSSF Best Practices so the ecosystem-trust signals group together, ahead of the CI/quality row.

* docs: capitalize Pipelock in prose per style guide
diff --git a/README.md b/README.md
@@ -10,6 +10,7 @@
   <a href="https://github.com/luckyPipewrench/pipelock/releases"><img alt="Release" src="https://img.shields.io/github/v/release/luckyPipewrench/pipelock"></a>
   <a href="LICENSE"><img alt="Core Apache 2.0" src="https://img.shields.io/badge/Core-Apache_2.0-blue.svg"></a>
   <a href="enterprise/LICENSE"><img alt="Enterprise ELv2" src="https://img.shields.io/badge/Enterprise-ELv2-orange.svg"></a>
+  <a href="https://landscape.cncf.io/?item=provisioning--security-compliance--pipelock"><img alt="CNCF Landscape: Security &amp; Compliance" src="https://img.shields.io/badge/CNCF%20Landscape-Security%20%26%20Compliance-1a73e8?logo=cncf&logoColor=white"></a>
 </p>
 
 <p align="center">
diff --git a/docs/configuration.md b/docs/configuration.md
@@ -1743,7 +1743,7 @@ flight_recorder:
 | `retention_days` | `0` | Auto-expire files after N days (0 = keep forever) |
 | `redact` | `true` | DLP-redact evidence content before writing. Receipt entries get field-level redaction (target/pattern scrubbed, signature preserved). |
 | `sign_checkpoints` | `true` | Ed25519 sign checkpoint entries |
-| `signing_key_path` | (empty) | Ed25519 private key for action receipts. When set, every proxy decision produces a signed receipt. Generate a key with `pipelock keygen <name>`. Verify receipts with `pipelock verify-receipt <file>`. Hot-reloadable: add, remove, or rotate keys via SIGHUP. |
+| `signing_key_path` | (empty) | Ed25519 private key for signed action receipts. When set, every proxy decision produces a signed receipt. Without it, the flight recorder can still write non-receipt evidence entries. Generate a key with `pipelock keygen <name>`. Verify receipts with `pipelock verify-receipt <file>`. In `pipelock run`, changing the configured path requires restart; reload re-reads updated key bytes only when the same path stays configured. |
 | `max_entries_per_file` | `10000` | Rotate to a new file after this many entries |
 | `raw_escrow` | `false` | Encrypt raw (pre-redaction) detail to sidecar files |
 | `escrow_public_key` | (required if raw_escrow) | X25519 public key (hex) for escrow encryption |
diff --git a/docs/guides/detection-integration.md b/docs/guides/detection-integration.md
@@ -0,0 +1,323 @@
+<!--
+Copyright 2026 Josh Waldrep
+SPDX-License-Identifier: Apache-2.0
+-->
+
+# Detection integration
+
+Pipelock is a real-time agent firewall. It blocks what it can see on the wire.
+That is a hard, narrow job, and it is not the whole detection story.
+
+Long-running agents move through multi-week attack chains, reason in
+natural language, and produce actions that look benign in isolation.
+No real-time gateway will catch all of that. The gate is the first
+line, not the last.
+
+This guide is for people building the layer that runs behind the gate.
+SIEM engineers, SOC analysts, and researchers training detection models
+all need the same thing upstream: structured, tamper-evident evidence of
+what the agent actually did. When receipt signing is enabled, Pipelock
+emits that evidence as signed action receipts. This guide covers how to
+consume them.
+
+## Real-time gateways are not enough
+
+Picture a malicious MCP server that runs a two-step attack across a
+week.
+
+Step one, Monday: the agent calls an innocent-looking tool. The tool
+response tells the agent to install a new hook that exfiltrates data
+to `https://google.com/report`. To an inline detector, that looks
+like a bad tool response, but the destination is benign. The
+content might sail through, especially under a looser policy.
+
+Step two, the following Monday: a different tool response tells the
+agent to change the hook's destination. To an inline detector, that
+request looks like a one-line config edit. The agent already has
+the hook, so changing an endpoint is not structurally suspicious.
+A real-time gateway viewing only that single moment has no reason
+to block it.
+
+Put the two steps next to each other and the intent is obvious. A
+detector looking at a week of activity can see the shape. A detector
+looking at a single request cannot.
+
+Long-window detection is the only way to catch attacks that are
+designed to look like two benign things. The question is what you
+feed that detector.
+
+## The primitive: signed action receipts
+
+When `flight_recorder.signing_key_path` is set in the Pipelock
+config, every proxy decision produces a signed action receipt.
+Receipts are Ed25519-signed, JSON-structured, and linked into a
+SHA-256 hash chain so any deletion or reordering is detectable
+after the fact. Without a signing key configured, Pipelock still
+enforces, and the flight recorder can still write other evidence
+entries, but the signed receipt stream is not produced.
+
+Generate a key with `pipelock keygen <name>`, set
+`flight_recorder.signing_key_path`, and start or restart Pipelock.
+If you replace the key file contents at the same configured path,
+reload will re-read that file. Changing the configured path still
+requires a restart.
+
+A receipt carries the fields a downstream detector needs to reason
+about the decision:
+
+| Field | What |
+|-------|------|
+| `action_id` | UUIDv7, unique per decision. Stable identifier for the record. |
+| `timestamp` | RFC 3339 wall-clock time the decision was made. |
+| `verdict` | `block`, `warn`, `exemption`, `allow`, `strip`, `redirect`, or `ask`. Deterministic. |
+| `layer` | Which scanner triggered (`mcp_response_scan`, `dlp_header`, `response_scan`, `airlock`, etc.). |
+| `pattern` | Named rule inside the layer (e.g., `Prompt Injection`, `aws_access_key`). |
+| `transport` | `fetch`, `forward`, `websocket`, `mcp_stdio`, `mcp_http_upstream`, `mcp_http_listener`, `connect`, `intercept`. |
+| `session_id` | Groups receipts from the same agent session. |
+| `principal` / `actor` | Who initiated the action and who enforced it. |
+| `policy_hash` | SHA-256 of the canonical policy config at decision time. Changes whenever the policy changes. |
+| `side_effect_class` / `reversibility` | Classification of the attempted action. |
+| `chain_prev_hash` / `chain_seq` | Hash-chain linkage to the prior receipt in the stream. |
+| `signature` / `signer_key` | Ed25519 signature and public key. |
+
+These are fixed fields with a fixed schema. A detector parses them
+with a JSON reader, not a log regex.
+
+Full canonical schema and field reference:
+<https://pipelab.org/learn/action-receipt-spec/>
+
+## Three downstream consumers
+
+The same receipt stream serves three different detection styles
+without modification.
+
+### SIEM rules
+
+Ship the flight-recorder JSONL file to Splunk, Datadog, Elastic, or
+any SIEM that ingests JSON. The standard pattern is a file shipper
+(Filebeat, Fluent Bit, Vector, or equivalent) tailing
+`flight_recorder.dir` and forwarding each line. Filter entries to
+`type == "action_receipt"` at the shipper or at the SIEM.
+
+The receipt fields map cleanly to structured search: group by
+`detail.action_record.action_id` or by `session_id` to reconstruct
+an agent's behavior, filter on `detail.action_record.verdict=block`
+to audit enforcement events, and alert on
+`detail.action_record.pattern=aws_access_key` for secret-exfil
+attempts.
+
+Pipelock also ships a separate real-time emit pipeline (webhook,
+syslog, OTLP) that streams a different envelope format focused on
+security events and severity levels. That pipeline is for operator
+alerting, not for forwarding the full receipt chain. See
+[`siem-integration.md`](siem-integration.md) for that format.
+Receipts and emit events are complementary streams, not the same
+stream in different wrappers.
+
+### Analyst review
+
+When a SIEM rule fires or an agent's session looks suspicious, an
+analyst can pull the full receipt stream for that `session_id` and
+reconstruct every decision in order. The hash chain confirms the
+stream has not been edited since it was written. The `policy_hash`
+confirms which policy version was in force. The `signature`
+confirms the record came from Pipelock and not from a tampered
+agent log.
+
+This is the audit-trail use case. Receipts are designed to be
+presentable to a third party (auditor, incident responder, internal
+review) without requiring trust in Pipelock itself. The signature
+and the chain are the trust anchors.
+
+### Long-window LLM detection
+
+If your detection strategy involves feeding agent history to a
+reasoning model and asking "is this suspicious", what you feed the
+model matters more than which model you pick. Raw log text lets an
+attacker shape the model's view by shaping the agent's reasoning
+tokens. Structured receipts do not.
+
+A receipt stream is a deterministic sequence of fixed-field events.
+`verdict`, `layer`, `pattern`, `policy_hash`, and `timestamp` are
+not up for attacker manipulation. The attacker cannot inject text
+into the stream that changes what these fields say. The stream is
+signed and chain-linked, so injection is detectable.
+
+This is the architecture Zack Korman gestures at in his April 2026
+video on AI agent threat detection: a funnel of cheaper LLMs
+filtering events, stronger LLMs confirming, and an agentic layer
+trying to disprove the finding. Whether that funnel works is its
+own question. But if it does, it only works on inputs the attacker
+cannot massage. Structured receipts are that kind of input. Raw
+reasoning tokens are not.
+
+## Worked example
+
+The [`tool-response-injection`](/examples/tool-response-injection/)
+example ships with Pipelock and runs the whole loop end-to-end. It
+uses a deliberately malicious MCP server that returns a prompt-
+injection payload disguised as a game result.
+
+Run the harness:
+
+```bash
+cd examples/tool-response-injection
+python3 demo.py
+```
+
+The harness produces three artifacts worth looking at:
+
+1. **`evidence/evidence-proxy-0.jsonl`**: the MCP stdio evidence file.
+   It contains the signed receipt stream as `action_receipt` entries
+   and may also contain other recorder entries such as checkpoints.
+2. **`evidence-proxy-0.jsonl` from the HTTP upstream run**: same
+   event shape, different `transport` field.
+3. **`signing.key.pub` hex output**: the public key printed to
+   stdout, the only thing a third party needs to verify the stream.
+
+The harness verifies the stream inline with Python. For an
+independent check, install the reference verifier:
+
+```bash
+pip install pipelock-verify
+python -m pipelock_verify evidence/evidence-proxy-0.jsonl --key <public-key-hex>
+```
+
+Or use the Go CLI that ships with Pipelock:
+
+```bash
+pipelock verify-receipt evidence/evidence-proxy-0.jsonl --key <public-key-hex>
+```
+
+Both exit 0 on success, 1 on any signature failure, chain break, or
+reordering. The verifiers are byte-for-byte equivalent.
+
+### Consuming receipts in a detector
+
+Once the stream verifies, a downstream detector reads it like any
+JSONL source. This is a complete working example:
+
+```python
+"""Verify a pipelock receipt stream, then route each verified receipt
+to a pluggable handler for whatever detector runs downstream.
+
+Requires: pip install pipelock-verify
+Run: python verify_and_route.py evidence.jsonl <public-key-hex>
+"""
+
+import json
+import subprocess
+import sys
+from typing import Callable, Iterator
+
+
+def verified_receipts(path: str, pubkey_hex: str) -> Iterator[dict]:
+    """Yield each receipt only if the full stream verifies.
+
+    Evidence files can contain non-receipt entries (checkpoints, other
+    event types). We filter for type == "action_receipt" and carry the
+    outer envelope's session_id into the yielded record, since it is
+    the primary grouping key detectors use."""
+    check = subprocess.run(
+        ["pipelock-verify", path, "--key", pubkey_hex],
+        capture_output=True,
+        text=True,
+    )
+    if check.returncode != 0:
+        raise RuntimeError(
+            f"verification failed: {check.stderr.strip() or check.stdout.strip()}"
+        )
+    with open(path, encoding="utf-8") as f:
+        for line in f:
+            if not line.strip():
+                continue
+            entry = json.loads(line)
+            if entry.get("type") != "action_receipt":
+                continue
+            record = entry["detail"]["action_record"]
+            record.setdefault("session_id", entry.get("session_id"))
+            yield record
+
+
+def default_handler(receipt: dict) -> None:
+    """Replace this with your SIEM forwarder, alert pipeline, or
+    feature extractor for an LLM classifier."""
+    print(
+        f"{receipt['timestamp']} {receipt.get('session_id') or '-':24s} "
+        f"{receipt['transport']:20s} {receipt['verdict']:10s} "
+        f"{receipt.get('layer') or '-':24s} {receipt.get('pattern') or '-'}"
+    )
+
+
+def route(path: str, pubkey_hex: str, handler: Callable[[dict], None]) -> None:
+    for receipt in verified_receipts(path, pubkey_hex):
+        handler(receipt)
+
+
+if __name__ == "__main__":
+    if len(sys.argv) != 3:
+        sys.exit("usage: verify_and_route.py <path> <public-key-hex>")
+    route(sys.argv[1], sys.argv[2], default_handler)
+```
+
+Replace `default_handler` with whatever your pipeline needs (`forward_to_siem` and `score_for_llm_funnel` are placeholders you supply):
+
+```python
+def route_to_siem(receipt: dict) -> None:
+    if receipt["verdict"] == "block":
+        forward_to_siem(receipt)
+    if receipt.get("layer") == "mcp_response_scan":
+        score_for_llm_funnel(receipt)
+```
+
+The shape of the consumer is up to you. The shape of the input is
+fixed by the receipt spec.
+
+## What this does not solve
+
+This is the important section. Signed receipts solve one narrow
+problem. They do not solve several others.
+
+**Compromised mediators can still lie.** A receipt proves Pipelock
+recorded a decision. It does not prove Pipelock made the right
+decision. If a scanner pattern is wrong, the signed record is a
+signed wrong answer.
+
+**Real-time gateways still miss multi-week attacks.** The worked
+example above catches a prompt-injection payload in a single tool
+response because that payload is visible in flight. A slow-boiling
+attack where each individual step looks benign needs a long-window
+detector running on the receipt stream, not a faster gateway.
+
+**Receipts are input to detection, not a substitute for it.** A
+stream of signed records does not tell you which sessions are
+compromised. Someone or something still has to look at the stream
+and make that call. What receipts give you is a trustworthy place
+to look.
+
+**Agent-side attacks are out of scope.** Pipelock sees what the
+agent tries to do on the network. If an attacker has already
+compromised the agent process itself (code execution, same-user
+file access, shared memory), the receipts can document what the
+agent then tried to do, but they cannot prevent the compromise or
+retroactively verify the agent's internal state.
+
+**Same-user deployments have a known ceiling.** If Pipelock runs
+as the same Unix user as the agent, the agent can delete or
+truncate the receipt file. The `demo_capability_separation.py`
+script in the harness demonstrates this limit directly. Running
+Pipelock under a separate user (or in a separate container) is a
+deployment-level fix, not a product-level one.
+
+## Where to go from here
+
+- **Receipt format spec:** <https://pipelab.org/learn/action-receipt-spec/>
+- **Verification mechanics:** [`receipt-verification.md`](receipt-verification.md)
+- **SIEM transport options:** [`siem-integration.md`](siem-integration.md)
+- **Transport coverage matrix:** [`receipt-transports.md`](receipt-transports.md)
+- **Worked example:** [`examples/tool-response-injection/`](/examples/tool-response-injection/)
+- **PyPI verifier:** <https://pypi.org/project/pipelock-verify/>
+
+If you are integrating Pipelock receipts into a detection pipeline
+and run into something the spec does not cover, open an issue at
+<https://github.com/luckyPipewrench/pipelock/issues>.
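
As a companion to the guide's long-window argument, here is a minimal sketch of about the simplest long-window check one could run on verified receipts: group them by a key and flag groups whose receipts from one layer are spread over at least a chosen time span. The field names (`session_id`, `layer`, `timestamp`) come from the receipt table in the guide. The function names, the seven-day threshold, and the sample records are hypothetical, and a real detector would weigh far more than elapsed time.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def parse_ts(value: str) -> datetime:
    """Parse an RFC 3339 timestamp. Assumption: whole seconds or a
    3- or 6-digit fraction; other forms need Python 3.11+ or a
    dedicated parser."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def groups_spanning(receipts, layer, min_gap, key="session_id"):
    """Return the sorted group keys that have at least two receipts
    from `layer` whose timestamps are at least `min_gap` apart."""
    seen = defaultdict(list)
    for receipt in receipts:
        group = receipt.get(key)
        if group and receipt.get("layer") == layer:
            seen[group].append(parse_ts(receipt["timestamp"]))
    return sorted(
        group
        for group, stamps in seen.items()
        if len(stamps) >= 2 and max(stamps) - min(stamps) >= min_gap
    )


if __name__ == "__main__":
    # Hypothetical records shaped like the guide's two-Monday scenario.
    sample = [
        {"session_id": "s1", "layer": "mcp_response_scan",
         "timestamp": "2026-04-13T09:00:00Z"},
        {"session_id": "s1", "layer": "mcp_response_scan",
         "timestamp": "2026-04-20T09:00:00Z"},
        {"session_id": "s2", "layer": "mcp_response_scan",
         "timestamp": "2026-04-20T09:05:00Z"},
    ]
    print(groups_spanning(sample, "mcp_response_scan", timedelta(days=7)))
```

The `key` parameter is there because a single `session_id` may not live for a week in every deployment; grouping by another receipt field is a one-argument change.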
diff --git a/docs/guides/flight-recorder.md b/docs/guides/flight-recorder.md
@@ -40,7 +40,8 @@ flight_recorder:
 | `raw_escrow` | false | Write an encrypted sidecar with the unredacted detail for each entry. |
 | `escrow_public_key` | "" | X25519 hex public key for escrow encryption. Required when `raw_escrow: true`. |
 
-The agent private key used for signing is the same key used for `pipelock assess` signing. It is loaded from the keystore at `~/.pipelock/` (or the path configured with `--keystore`).
+The receipt-signing private key is loaded from
+`flight_recorder.signing_key_path`.
 
 ### Rotating the signing key
 
@@ -50,6 +51,12 @@ Pipelock **rejects `flight_recorder.signing_key_path` changes at hot-reload time
 2. Swap the key file referenced by `signing_key_path`.
 3. Start pipelock. It opens a new chain with the new key.
 
+If you keep the same `signing_key_path` and replace the key file at
+that path, a reload re-reads the file contents. Treat that as an
+advanced operation: the documented operator-safe path is still a
+restart so the old chain closes cleanly before the new key starts
+signing.
+
 The new chain is a separate verifiable unit. Verifiers that expect one chain per `session_id` must be updated to treat the key change as a chain boundary. A proper in-place rotation (key-rotation marker inside the chain, continuous verification across the switch) is tracked as a v2.2.1 feature.
 
 ## Evidence File Format
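
The rotation note above says verifiers should treat a key change as a chain boundary. A minimal sketch of that bookkeeping follows, assuming each receipt record carries `signer_key` and an integer `chain_seq` (field names taken from the receipt table in the detection-integration guide). The step-by-one numbering within a chain and the helper names are assumptions for illustration. The sketch only segments the stream and sanity-checks ordering; signature and hash verification still belong to `pipelock verify-receipt` or `pipelock-verify`.

```python
from itertools import groupby


def chain_segments(receipts):
    """Split an ordered list of receipt records into consecutive runs
    signed by the same key. Each run is treated as its own chain."""
    return [
        (key, list(run))
        for key, run in groupby(receipts, key=lambda r: r.get("signer_key"))
    ]


def seq_gaps(segment):
    """Return (previous, current) chain_seq pairs that do not step by
    exactly one. Assumption: chain_seq increments by one per receipt."""
    seqs = [r["chain_seq"] for r in segment]
    return [(a, b) for a, b in zip(seqs, seqs[1:]) if b != a + 1]


if __name__ == "__main__":
    # Hypothetical records: key "aa" is rotated out for key "bb".
    stream = [
        {"signer_key": "aa", "chain_seq": 0},
        {"signer_key": "aa", "chain_seq": 1},
        {"signer_key": "bb", "chain_seq": 0},
        {"signer_key": "bb", "chain_seq": 2},
    ]
    for key, segment in chain_segments(stream):
        print(key, len(segment), seq_gaps(segment))
```

Segmenting before checking sequence numbers keeps a key change from being reported as a gap, while a missing receipt inside one key's run still shows up.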
diff --git a/examples/tool-response-injection/README.md b/examples/tool-response-injection/README.md
@@ -113,5 +113,12 @@ For stdio mode, replace the subprocess command in `demo.py` with your server com
 
 The harness is most useful when your server has an innocent tool description but a risky tool response body. That is the gap this example is meant to surface.
 
+## Using The Evidence Downstream
+The harness proves the receipt stream. What to do with it is a separate
+question. See [`docs/guides/detection-integration.md`](../../docs/guides/detection-integration.md)
+for how SIEM rules, analyst review, and long-window LLM detectors all
+consume the same receipt format, plus a forty-line Python example that
+verifies a stream and routes each receipt to a pluggable handler.
+
 ## Security Note
 This example emits deliberate prompt-injection payloads for testing and demonstration. It is a detector harness, not a weapon.