Phase B.2 — Claude Code hooks for agent-runtime

Date: 2026-05-03
Phase: B.2 (telemetry, agent-side hooks)
Predecessor: B.1 (m8trx-claude-isolate wrapper telemetry patch)
Successor: B.3 (host-side heartbeat)

Goal

Install Claude Code hooks inside the M8trx agent-runtime container that emit one tool_call event to the brain ingestion API for every tool Claude invokes. Operators get fleet-wide visibility into "what tools are being called, by which agent, how often, with what payload sizes" — without ever seeing tool argument or output content.

Why now

Phase B.1 instrumented the wrapper (m8trx-claude-isolate), which runs outside the agent-runtime container and emits session.start / session.end from the bash EXIT trap. The wrapper can't see anything that happens inside Claude — including which tools fire. Phase B.2 closes that gap with hooks running inside the container.

Architecture

                     ┌─ paperclipai container ─┐
                     │  m8trx-claude-isolate    │
                     │  (B.1 wrapper)           │ ── session.start ───┐
                     │  trap EXIT → session.end │ ── session.end  ────┼──► brain /v1/events
                     │                          │                     │
                     │  docker run -e BRAIN_*   │                     │
                     │     │                    │                     │
                     │     ▼                    │                     │
                     │  ┌─ agent-runtime ────┐  │                     │
                     │  │  claude (Node)     │  │                     │
                     │  │   ↓ PostToolUse    │  │                     │
                     │  │  brain-hook (sh)   │ ─┼─ tool_call ─────────┘
                     │  │  reads stdin JSON  │  │
                     │  │  curl POST 2s t/o  │  │
                     │  │  exit 0 always     │  │
                     │  └────────────────────┘  │
                     └──────────────────────────┘

Env vars are passed by the wrapper via docker run -e (already wired in B.1, see agent-artifacts/m8trx-claude-isolate.modified lines 245–254): BRAIN_URL, BRAIN_API_KEY, BRAIN_AGENT_ID, BRAIN_RUN_ID. The hook reads these from its env. If BRAIN_URL or BRAIN_API_KEY is unset, the hook is a silent no-op, the same opt-in pattern as the wrapper; a missing BRAIN_AGENT_ID is handled the same way (see Env contract).

Hook coverage

PostToolUse only, matcher *. Fires once per completed tool call.

The Stop hook was considered (originally listed in docs/RESUME.md) but rejected: it would only emit session.end events that duplicate what the wrapper's EXIT trap already produces. Adding more hook types (UserPromptSubmit, PreToolUse) is out of scope for MVP — easy to add later when fleet data tells us what we're missing.

Components

Three files, all under agent-artifacts/claude-hooks/:

1. settings.json

The Claude Code hooks config. Installed at /etc/claude-code/managed-settings.json inside the agent-runtime container — the highest-priority settings tier, which the customer cannot override by dropping a .claude/settings.json into their workspace.

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "*",
        "hooks": [
          { "type": "command", "command": "/usr/local/bin/brain-hook" }
        ]
      }
    ]
  }
}

2. brain-hook

POSIX sh script. Installed at /usr/local/bin/brain-hook, mode 0755, root-owned, world-readable+executable (the node user inside the container needs to execute it).

Behaviour:

  1. Read stdin (Claude Code's PostToolUse JSON).
  2. If BRAIN_URL, BRAIN_API_KEY, or BRAIN_AGENT_ID is unset → exit 0 immediately (no POST attempted).
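  3. Parse stdin with jq to extract what the event needs: tool_name, plus the sizes reported as input_bytes and output_bytes.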
  4. Build event JSON (see Event payload contract below).
  5. POST with curl --silent --max-time 2 --retry 0 to ${BRAIN_URL%/}/v1/events, with Authorization: Bearer ${BRAIN_API_KEY} and Content-Type: application/json.
  6. On any failure (jq parse, curl exit non-zero, brain non-2xx): if BRAIN_DEBUG=1, write one line to stderr; otherwise silent.
  7. Always exit 0.
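
The guard, debug, and always-exit-0 steps above could be sketched roughly as follows. This is a minimal illustration, not the shipped script: the function names (telemetry_enabled, debug_log, brain_hook_main) are hypothetical, an empty variable is treated the same as an unset one (an assumption), and steps 4 and 5 are left as a comment.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical, not taken from the real brain-hook.

# Step 2: telemetry is opt-in; all three variables must be non-empty.
# (Assumption: an empty value counts as unset.)
telemetry_enabled() {
  [ -n "${BRAIN_URL:-}" ] && [ -n "${BRAIN_API_KEY:-}" ] && [ -n "${BRAIN_AGENT_ID:-}" ]
}

# Step 6: one short stderr line, and only when BRAIN_DEBUG=1.
debug_log() {
  if [ "${BRAIN_DEBUG:-0}" = "1" ]; then
    printf 'brain-hook: %s\n' "$1" >&2
  fi
}

brain_hook_main() {
  input=$(cat)                     # step 1: consume the PostToolUse JSON
  telemetry_enabled || return 0    # step 2: silent no-op when unconfigured
  # Step 3: stdin must parse as JSON before anything is sent.
  # jq -e also rejects a bare null/false, which is fine for an object payload.
  if ! printf '%s' "$input" | jq -e . >/dev/null 2>&1; then
    debug_log "stdin parse failed"
    return 0
  fi
  # Steps 4 and 5 (build the event, POST it with curl) would go here.
  return 0                         # step 7: never signal failure to Claude Code
}
```

A real script would presumably end with something like `brain_hook_main; exit 0`, so that even an unexpected failure inside the function cannot leak a non-zero status.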

Implementation notes:

3. README.md

Short doc covering:

Event payload contract

{
  "event_id": "<uuid v4>",
  "ts": "2026-05-03T19:42:11.000Z",
  "event_type": "tool_call",
  "agent_id": "<$BRAIN_AGENT_ID>",
  "run_id":   "<$BRAIN_RUN_ID, omitted if empty>",
  "payload": {
    "tool_name":    "<from stdin .tool_name>",
    "input_bytes":  <int>,
    "output_bytes": <int>
  }
}
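
One way the contract above might be assembled with jq is sketched below. Several details are assumptions rather than part of the spec: the function name build_event is hypothetical; the sizes are taken as the UTF-8 byte length of the compact JSON serialisation of the .tool_input and .tool_response fields on stdin; the UUID comes from the Linux kernel (with uuidgen as a fallback); and the millisecond part of ts is fixed at 000 because POSIX date has no sub-second field.

```shell
#!/bin/sh
# Sketch only: build_event is a hypothetical helper name.
# Reads the PostToolUse JSON on stdin and prints one event object.
build_event() {
  # Assumption: a random (v4) UUID from the Linux kernel, else uuidgen.
  event_id=$(cat /proc/sys/kernel/random/uuid 2>/dev/null || uuidgen 2>/dev/null) || return 1
  # Assumption: milliseconds are hard-coded, since POSIX date cannot produce them.
  ts=$(date -u +%Y-%m-%dT%H:%M:%S.000Z)
  jq -c \
    --arg event_id "$event_id" \
    --arg ts "$ts" \
    --arg agent_id "${BRAIN_AGENT_ID:-}" \
    --arg run_id "${BRAIN_RUN_ID:-}" '
    # Size of a value as compact JSON, in UTF-8 bytes; a missing field counts as 0.
    def bytes: if . == null then 0 else (tojson | utf8bytelength) end;
    { event_id: $event_id, ts: $ts, event_type: "tool_call", agent_id: $agent_id }
    # run_id is omitted entirely when BRAIN_RUN_ID is empty.
    + (if $run_id == "" then {} else { run_id: $run_id } end)
    + { payload: {
          tool_name:    .tool_name,
          input_bytes:  (.tool_input | bytes),
          output_bytes: (.tool_response | bytes)
        } }'
}
```

Only sizes and the tool name leave the container; the argument and output values themselves never appear in the event, which is the property the Goal section asks for.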

Env contract

Var            | Set by                                          | Required for telemetry | Hook behaviour if unset
---------------|-------------------------------------------------|------------------------|------------------------------
BRAIN_URL      | wrapper (docker run -e)                         | yes                    | silent no-op
BRAIN_API_KEY  | wrapper                                         | yes                    | silent no-op
BRAIN_AGENT_ID | wrapper (from $PAPERCLIP_AGENT_ID)              | yes                    | silent no-op
BRAIN_RUN_ID   | wrapper (from $PAPERCLIP_RUN_ID)                | no                     | field omitted from payload
BRAIN_DEBUG    | operator override (docker run -e BRAIN_DEBUG=1) | no                     | suppress all stderr (default)

Error handling

Failure | Behaviour
--------|----------
BRAIN_URL or BRAIN_API_KEY unset | Exit 0 immediately, no POST.
BRAIN_AGENT_ID unset | Exit 0 immediately. (Should never happen with the wrapper; defensive.)
jq parse fails on stdin | Exit 0. If BRAIN_DEBUG=1, log "brain-hook: stdin parse failed".
curl fails at the transport layer (timeout, DNS, connection refused) | Exit 0. If BRAIN_DEBUG=1, log "brain-hook: POST failed (curl exit N)".
Brain returns non-2xx (4xx bad event shape, 5xx server error) | Exit 0. If BRAIN_DEBUG=1, log status + response body. This is the "we shipped a bad hook" canary. The hook must explicitly check HTTP status: curl --silent alone exits 0 for any received response, so the implementation needs --fail/--fail-with-body or an explicit --write-out '%{http_code}' capture.
Hook crashes (syntax error) | set -e + EXIT trap forces exit 0.
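
The transport-failure and non-2xx rows can be told apart with the --write-out approach mentioned above. The sketch below is one possible shape, under stated assumptions: the helper names (debug_log, report_post_result, post_event) are hypothetical, and the wording of the non-2xx debug line is invented here, since the spec only says "status + response body".

```shell
#!/bin/sh
# Sketch only: helper names and the "POST rejected" wording are hypothetical.

debug_log() {
  if [ "${BRAIN_DEBUG:-0}" = "1" ]; then
    printf 'brain-hook: %s\n' "$1" >&2
  fi
}

# Turn curl's exit code and the HTTP status into at most one debug line.
# Usage: report_post_result CURL_EXIT HTTP_STATUS RESPONSE_BODY
report_post_result() {
  if [ "$1" -ne 0 ]; then
    debug_log "POST failed (curl exit $1)"      # timeout, DNS, refused, ...
    return 0
  fi
  case "$2" in
    2??) ;;                                     # accepted by brain
    *)   debug_log "POST rejected (HTTP $2): $3" ;;
  esac
  return 0
}

# POST the event JSON passed as $1; always returns 0.
post_event() {
  body_file=$(mktemp) || return 0
  # --write-out prints only the status code; the body goes to a temp file.
  status=$(curl --silent --max-time 2 --retry 0 \
    --output "$body_file" --write-out '%{http_code}' \
    -H "Authorization: Bearer ${BRAIN_API_KEY}" \
    -H "Content-Type: application/json" \
    --data "$1" \
    "${BRAIN_URL%/}/v1/events")
  rc=$?
  # Flatten and truncate the body so debug output stays on one short line.
  body=$(tr '\n' ' ' <"$body_file" | cut -c1-200)
  rm -f "$body_file"
  report_post_result "$rc" "$status" "$body"
}
```

Keeping the classification in its own function means the debug behaviour can be exercised without a network, for example `BRAIN_DEBUG=1 report_post_result 0 400 'bad event'`.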

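The exit 0-always invariant matters because Claude Code treats a non-zero hook exit as an error: exit code 2 is a blocking error whose stderr is fed back to the model, and any other non-zero code is surfaced as a non-blocking hook error. Telemetry must never break the customer's task.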

Stderr behaviour: silent by default. With BRAIN_DEBUG=1, each failure writes one short line. Claude Code surfaces hook stderr in its transcript log; it does not dump it into the model's context or the user's terminal, so debug-mode output is operator-visible only.

Testing

Standalone hook test (in scope for this phase)

A bin/test-brain-hook.sh script in the brain repo, runnable against the local brain server, covering:

  1. Happy path: pipe a valid PostToolUse JSON to the hook with correct env, then verify the event landed by querying postgres directly: docker compose exec -T postgres psql -U brain brain -c "select event_type, agent_id, run_id, payload from events order by ts desc limit 1". Expect tool_call, the right agent_id, the right run_id (or NULL), and payload matching {"tool_name":..., "input_bytes":..., "output_bytes":...}. Direct psql is used because the dashboard API exposes only aggregated rollups (/customers, /agents, /tools, /overview), not individual event lookups.
  2. No telemetry config: BRAIN_URL unset, no event posted.
  3. Bad URL with debug on: BRAIN_URL=http://192.0.2.1:1 (RFC 5737 unroutable) + BRAIN_DEBUG=1, verify one stderr line within ~2s and exit 0.
  4. Malformed stdin with debug on: echo "not json" | … + BRAIN_DEBUG=1, verify one "stdin parse failed" stderr line and exit 0.
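
Cases 2 to 4 share one shape: feed the hook some stdin, then check the exit status and the number of stderr lines. A small helper along these lines might keep the script short. It is a sketch under assumptions: expect_hook and the HOOK variable are hypothetical names, and the default path simply follows the agent-artifacts/claude-hooks/ location given under Components.

```shell
#!/bin/sh
# Sketch only: expect_hook and HOOK are hypothetical names.
# HOOK is the hook under test; callers export the BRAIN_* variables they need.
HOOK=${HOOK:-agent-artifacts/claude-hooks/brain-hook}

# Usage: expect_hook NAME STDIN EXPECTED_STDERR_LINES
# Passes only if the hook exits 0 and writes exactly that many stderr lines.
expect_hook() {
  name=$1 stdin=$2 want_lines=$3
  err_file=$(mktemp) || return 1
  printf '%s' "$stdin" | "$HOOK" 2>"$err_file" >/dev/null
  rc=$?                                   # exit status of the hook itself
  got_lines=$(wc -l <"$err_file" | tr -d ' ')
  rm -f "$err_file"
  if [ "$rc" -eq 0 ] && [ "$got_lines" -eq "$want_lines" ]; then
    echo "ok   $name"
  else
    echo "FAIL $name (exit $rc, $got_lines stderr lines)"
    return 1
  fi
}
```

Case 2 would then look roughly like `( unset BRAIN_URL; expect_hook "no telemetry config" '{"tool_name":"Bash"}' 0 )`. The happy path still needs the psql query from case 1, which this helper does not cover.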

Bootstrap: the script mints a fresh test key via docker compose exec brain-api node bin/mint-key.js cust_m8trx_test "M8TRX Test" (the original key from yesterday's session was shown once and is no longer recoverable).

End-to-end test inside agent-runtime container — deferred

Real in-container e2e requires editing M8trxAgent's images/agent-runtime/Dockerfile to add jq and COPY the two files. The RESUME plan explicitly defers that integration step (this phase only documents the Dockerfile diff in README.md). True e2e lands when M8trxAgent's Dockerfile takes its first non-placeholder commit.

Out of scope

Open questions

None at design-approval time. All five clarifying questions were resolved interactively in the brainstorming session before this spec was written.