
The Agent Contract

Five rules an AI coding agent should follow when Roam is available.

When Roam is installed in a repo, agents have a local code graph, a call-graph, dependency edges, runtime hot-paths, git history, and an algorithmic-pattern library — all queryable through deterministic CLI commands or MCP tools. The contract below turns those capabilities into a five-step discipline.

Each step is one command. Each maps to one engineering question agents commonly miss. Together they prevent the failure modes we see in agent-generated PRs: missed clones, broken tests, accidental O(n²), risky refactors, hidden coupling.

Surface at a glance

Authoritative counts from roam surface --json (v13.2). The MCP server exposes a tunable preset surface so the agent sees what it needs, not the entire registry.

Surface | Count | Notes
CLI commands | 241 | 234 canonical + 7 aliases.
MCP tools registered | 227 | Full registry; pick a subset via preset.
MCP tools in core preset | 57 | Default for new agent integrations.
Languages | 28 | Tier-1 extractors plus tree-sitter fallback.
Canonical envelope | 1 | One JSON shape across all 227 tools: parse once, branch on status.

1. Before editing — ask for context

$ roam context <symbol-or-file>

Returns the exact files + line ranges to read before changing a symbol, prioritised by callers, callees, and tests. Stops the agent reading the wrong files or skipping the relevant ones. Same data shape as roam preflight, but compact — ideal for prompt-context.

2. Before deleting — ask for impact

$ roam impact <symbol>

Full blast radius via Personalised PageRank: who calls this symbol, transitively, weighted by churn and runtime hotness. If the answer surfaces a caller the agent didn't know about, the deletion is unsafe. Pair with roam safe-delete <symbol> for the binary verdict.

3. Before merging — ask for critique

$ git diff main..HEAD | roam critique

Patch-level structural review. Catches clones-not-edited (the agent updated one of three near-identical implementations), layer violations (the diff just imported HTTP into the domain layer), and blast-radius spikes. Exits 5 on BLOCK findings — wire into CI directly.
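
A minimal sketch of a CI gate built on that exit code. It assumes `git` and `roam` are on the PATH and that the base branch is `main`; the helper names `is_blocked` and `critique_diff` are hypothetical, and how exit codes other than 5 should be treated is not specified here, so the sketch only singles out the BLOCK case.

```python
import subprocess

# Per the text above, `roam critique` exits 5 on BLOCK findings.
BLOCK_EXIT_CODE = 5


def is_blocked(returncode: int) -> bool:
    """Return True when the critique exit code signals BLOCK findings."""
    return returncode == BLOCK_EXIT_CODE


def critique_diff(base: str = "main") -> int:
    """Pipe `git diff <base>..HEAD` into `roam critique`; return its exit code."""
    diff = subprocess.run(
        ["git", "diff", f"{base}..HEAD"],
        capture_output=True,
        text=True,
        check=True,  # fail loudly if git itself errors
    )
    # The diff is fed on stdin, mirroring the shell pipe shown above.
    result = subprocess.run(["roam", "critique"], input=diff.stdout, text=True)
    return result.returncode
```

A CI step could then call `sys.exit(1 if is_blocked(critique_diff()) else 0)` to fail the job only on BLOCK findings.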

4. Before refactoring — simulate first

$ roam simulate move <symbol> <new-file>
$ roam simulate extract <file> <line-range>
$ roam simulate delete <symbol>

Clone the graph, apply the transform, report what breaks before any source file changes. roam mutate applies the transform after the simulation passes; roam plan-refactor produces a step-by-step plan with caller-update ordering.

5. Before optimising — ask for algorithmic risk

$ roam math
$ roam algo --confidence high

The differentiator. Detects code that is correct but computationally wrong — exactly the class of patterns agents ship that pass tests and fail at scale: nested-loop lookups becoming O(n²), N+1 queries, regex compiled inside hot loops, repeated JSON parsing, quadratic string concatenation, branching recursion without memoisation.

Pairs with roam n1, roam missing-index, and roam hotspots for the full performance-shape sweep.

The whole contract, in one block

# Before any edit
$ roam context <target>

# Before any deletion
$ roam impact <target>
$ roam safe-delete <target>

# Before any merge
$ git diff main..HEAD | roam critique

# Before any refactor
$ roam simulate <op> <target>
$ roam plan-refactor <target>

# Before any optimisation pass
$ roam math
$ roam hotspots --danger

The canonical JSON envelope

Every roam --json <cmd> call and every MCP tool response is shaped by one canonical envelope. Marshal against this shape and the same code handles success, partial success, and the four failure modes Roam treats as first-class signal: missing prerequisite, intermediate-layer signal loss, empty stdout, and degraded resolution.

{
  "command": "<tool_name>",
  "status": "<index_not_built | advisory_warnings | partial_failure | hard_failure | usage_error | rate_limited | stale_index>",
  "isError": true,
  "summary": {
    "verdict": "<one-line, imperative, concrete-noun terminal>",
    "level": "<blocker | warning | info>",
    "partial_success": false,
    "state": "<machine-readable state>"
  },
  "error_code": "<closed enum>",
  "error": "<human-readable error text, never raw JSON dump>",
  "hint": "<imperative: what to do next>",
  "next_command": "<copy-pasteable roam command, when applicable>",
  "retry_after_seconds": 60,
  "agent_contract": {
    "facts": ["<concrete-noun anchored fact>", "..."],
    "next_commands": ["<copy-pasteable>", "# explanation..."]
  },
  "_meta": { "timestamp": "...", "index_age_s": 42 }
}

Hard invariants on the shape:

Worked example: the cold-start envelope an MCP tool returns when .roam/index.db does not exist yet — /docs/mcp-usage#the-cold-start-envelope.
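
To illustrate "parse once, branch on status", here is a hedged sketch of a consumer for the envelope above. The field names come from the shape shown; the function name `next_step`, the returned action dictionaries, and the order in which fields are checked are assumptions for illustration, not Roam's own client logic.

```python
def next_step(envelope: dict) -> dict:
    """Decide what an agent might do next from one canonical envelope.

    Assumed precedence: honour rate limiting first, then any suggested
    next_command, then stop on an error, otherwise continue.
    """
    if envelope.get("status") == "rate_limited":
        # retry_after_seconds is part of the envelope shape shown above.
        return {"action": "wait", "seconds": envelope.get("retry_after_seconds", 0)}
    if envelope.get("next_command"):
        # next_command is documented as copy-pasteable, so pass it through as-is.
        return {"action": "run", "command": envelope["next_command"]}
    if envelope.get("isError"):
        return {"action": "stop", "reason": envelope.get("error", "")}
    return {"action": "continue"}
```

Because every tool shares the envelope, one function like this can sit in front of all MCP tool calls instead of per-tool error handling.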

Verdict vocabulary — concrete nouns only

Verdicts and agent_contract.facts end on a concrete-noun terminal so downstream models stay in analytical mode rather than collapsing to summary mode. A fact-string is anchored when its terminal token (last word, punctuation stripped) is one of the canonical anchors. The live set lives in src/roam/output/formatter.py and is mirrored by the LAW 4 lint at tests/test_law4_lint.py.

Family | Representative anchors
Code structure | files, symbols, edges, nodes, cycles, clusters, layers, modules, commands, tools, capabilities, imports, endpoints, dependencies, packages, routes
Findings | findings, hotspots, smells, violations, warnings, errors, alerts, issues, gaps, leaks, secrets, vulnerabilities
Quality metrics | keys, values, chars, lines, tokens, bytes, items, entries, records, fields
State qualifiers | passed, failed, scanned, checked, affected, scored, confirmed, analyzed, skipped, reached
Time units | days, weeks, months, years, hours, minutes, seconds, milliseconds

Wrong: "7 of 10 capabilities are AI-safe" (ends on AI-safe, not anchored). Right: "7 of 10 AI-safe capabilities" (ends on capabilities, anchored). When an agent generates a fact for its own internal reasoning, the same discipline keeps the next step actionable.
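
The anchoring rule can be sketched in a few lines. This is an illustrative check, not the implementation in src/roam/output/formatter.py: the `ANCHORS` set below is only a subset of the representative anchors listed above, and lowercasing the terminal token is an assumption.

```python
import string

# Illustrative subset of the representative anchors; the live set is larger.
ANCHORS = frozenset({
    "files", "symbols", "edges", "capabilities", "findings", "hotspots",
    "violations", "warnings", "lines", "tokens", "passed", "failed",
    "days", "seconds",
})


def terminal_token(fact: str) -> str:
    """Last word of the fact with surrounding punctuation stripped."""
    words = fact.split()
    if not words:
        return ""
    # Lowercasing is an assumption; the rule above only mentions punctuation.
    return words[-1].strip(string.punctuation).lower()


def is_anchored(fact: str) -> bool:
    """True when the fact ends on one of the known anchors."""
    return terminal_token(fact) in ANCHORS
```

With this sketch, `is_anchored("7 of 10 capabilities are AI-safe")` is false and `is_anchored("7 of 10 AI-safe capabilities")` is true, matching the wrong/right pair above.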

The eight evidence questions

Roam compiles every AI-assisted change into one ChangeEvidence packet that answers these eight questions. Each question maps to commands the agent already runs as part of the contract above — the packet just makes the answers portable for a reviewer.

Question | Primitives
Who acted? | roam runs, roam replay, MCP receipt
What authority existed? | roam mode, roam permit, roam lease, roam constitution
What context was read? | roam context, roam retrieve, pr-bundle
What changed? | roam diff, roam pr-analyze, roam workspace
What could break? | roam impact, roam preflight, roam test-impact, roam vuln-reach
What policy applied? | roam rules, roam laws, roam check-rules
What verified it? | roam tests, roam critique, roam pr-bundle, roam runs verify
Who accepted risk? | roam permit, roam pr-bundle, audit trail

Roam maps to and supports evidence for governance controls; it does not certify or make compliant. The evidence packet is portable input for the external GRC tool of your choice.

MCP boundary security

Roam owns the inside-server half of the agent's trust boundary: it runs locally as the same user as the editor, gates writes through a four-mode policy, and emits a tamper-evident decision receipt on every sensitive call. It is explicitly not a network gateway and does not proxy model traffic — gateway-class defences (semantic prompt-injection scanning, response interception, cross-server aggregation) are a complementary layer owned by your MCP host. See the Discussion #37 reply for the full inside-server-vs-gateway framing, and /docs/mcp-usage#security-stance for the public stance.

Egress secret redaction

Every MCP tool response passes through a structural secret-pattern scan on the egress path (redact_secrets_in_string / redact_secrets_in_value) before the bytes leave the server. Hits replace the secret with a stable placeholder and stamp the receipt's redactions field with the closed-enum reason — today the only emitted reason on the MCP path is secret; the canonical vocabulary admits secret, pii, sensitive_content, size_limit, policy, user_opt_in_required, machine_local_path, schema_strict, and producer_not_available. Patterns cover GitHub PAT (classic and fine-grained), OpenAI / Anthropic sk- keys, AWS AKIA, Bearer tokens, PEM private-key markers, and JWT. Per-pattern hit counts ride in extra["redaction_details"] as {pattern_id: hit_count}.
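
As a rough sketch of what a structural egress scan with per-pattern hit counts might look like: the regexes, pattern ids, and placeholder format below are illustrative assumptions, not Roam's actual patterns, and `redact_sketch` is a hypothetical name rather than the `redact_secrets_in_string` function mentioned above.

```python
import re

# Illustrative patterns only; real detectors are typically stricter.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat_classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact_sketch(text: str) -> tuple[str, dict[str, int]]:
    """Replace matches with a stable placeholder; return text and hit counts."""
    counts: dict[str, int] = {}
    for pattern_id, pattern in PATTERNS.items():
        # subn returns the rewritten text plus the number of replacements.
        text, hits = pattern.subn(f"[REDACTED:{pattern_id}]", text)
        if hits:
            counts[pattern_id] = hits
    return text, counts
```

The returned dictionary has the same `{pattern_id: hit_count}` shape described for `extra["redaction_details"]`; a non-empty dictionary would correspond to a `secret` entry in the receipt's redactions field.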

Four-mode policy enforcement

Every MCP wrapper resolves the caller's mode (read_only / safe_edit / migration / autonomous_pr) and checks the tool's required_mode before dispatch. The receipt's policy_decision is a closed enum — allow / deny / not_evaluated — reflecting an actual enforcement decision at the MCP boundary, not a hard-coded allow. Resolution priority (highest wins): explicit --mode flag, ROAM_AGENT_MODE env var, .roam/active_mode file, default safe_edit. A gateway can map external roles to Roam modes and pass the resolved mode in per call.
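
The resolution priority can be sketched as a first-match lookup. The mode names, environment variable, and file path are taken from the paragraph above; the function names and the choice to raise on an unknown mode are assumptions.

```python
import os
from pathlib import Path
from typing import Optional

MODES = ("read_only", "safe_edit", "migration", "autonomous_pr")
DEFAULT_MODE = "safe_edit"


def pick_mode(flag: Optional[str], env_value: Optional[str],
              file_value: Optional[str]) -> str:
    """Return the first source that is set: flag, env var, file, then default."""
    for candidate in (flag, env_value, file_value):
        if candidate:
            if candidate not in MODES:
                # Rejecting unknown modes is an assumption of this sketch.
                raise ValueError(f"unknown mode: {candidate}")
            return candidate
    return DEFAULT_MODE


def resolve_mode(flag: Optional[str] = None, repo_root: str = ".") -> str:
    """Gather the three sources from the environment and the repo, then pick."""
    active_file = Path(repo_root) / ".roam" / "active_mode"
    file_value = active_file.read_text().strip() if active_file.is_file() else None
    return pick_mode(flag, os.environ.get("ROAM_AGENT_MODE"), file_value)
```

Keeping `pick_mode` free of I/O makes the precedence easy to test; a gateway that maps external roles to Roam modes would pass its result in as the explicit flag.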

HMAC-linked decision receipts

Each receipt's sha256 content hash is linked into the HMAC-chained run ledger at .roam/runs/<run_id>/events.jsonl, so receipt tampering is detectable offline. verify_chain_with_receipts() in src/roam/runs/signing.py extends the standard four-state run-verify envelope (ok / tampered / unsigned / empty) with a receipt_integrity closed enum: ok (every linked receipt's on-disk sha256 matches), missing (a ledger event names a receipt file no longer on disk), tampered (a receipt file no longer hashes to the value the chain anchors), and not_linked (receipts exist on disk that no ledger event anchors). Pre-link chains hash byte-identical to before — no migration is needed.
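
A simplified sketch of how the four receipt_integrity states could be derived. It is not `verify_chain_with_receipts()`: it skips the HMAC chain entirely, takes in-memory dictionaries instead of reading files, assumes the sha256 is computed over the raw receipt bytes, and assumes a precedence of tampered, then missing, then not_linked when several conditions hold at once.

```python
import hashlib


def receipt_integrity(anchored: dict[str, str], on_disk: dict[str, bytes]) -> str:
    """Classify receipts against ledger-anchored hashes.

    anchored: receipt name -> sha256 hex digest recorded in the ledger.
    on_disk:  receipt name -> bytes of the receipt file currently on disk.
    """
    # A receipt that exists but no longer hashes to the anchored value.
    for name, expected in anchored.items():
        if name in on_disk and hashlib.sha256(on_disk[name]).hexdigest() != expected:
            return "tampered"
    # A ledger event names a receipt that is no longer on disk.
    if any(name not in on_disk for name in anchored):
        return "missing"
    # Receipts exist on disk that no ledger event anchors.
    if any(name not in anchored for name in on_disk):
        return "not_linked"
    return "ok"
```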

Receipt schema export

The McpDecisionReceipt dataclass exports to a Draft 2020-12 JSON Schema via scripts/export_mcp_receipt_schema.py. Gateway and policy-enforcement-point developers can validate emitted receipts against the schema before tailing them into SIEM. The schema is stable; changes are limited to additive fields.

What every McpDecisionReceipt carries

One JSON file per sensitive tool call, stored at .roam/mcp_receipts/<run_id>/<tool_call>.json. Frozen dataclass; canonical-JSON serialisation; stable sha256 content hash. Raw inputs and outputs are never stored — only their digests.

Field | Purpose
tool_call | Opaque per-invocation id (<tool>_<12-hex>).
client_id | MCP client process id from ROAM_MCP_CLIENT_ID.
tool_name | Canonical tool name (e.g. roam_preflight).
actor_ref_id | Agent id from ROAM_AGENT_ID; ties to the ActorRef vocabulary.
declared_side_effects | Tuple of read_only / write_filesystem / etc. from the tool registry.
required_mode | read_only / safe_edit / migration / autonomous_pr.
input_hash | sha256 of canonical-JSON input args. Never the args themselves.
policy_decision | Closed enum: allow / deny / not_evaluated.
output_ref / output_hash | Artifact id for large output, or sha256 for small. Mutually exclusive.
run_event_id | Link to .roam/runs/<id>/events.jsonl row.
redactions | Closed-enum tuple of redaction reasons; stable across versions.
extra | Free-form structured detail (e.g. redaction_details per-pattern counts).

Reviewers and gateways consume receipts alongside the eight evidence questions on /docs/architecture#eight-evidence-questions and the worked PR-Replay packet under /audit#evidence.
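
To make the digest-only fields concrete, here is a hedged sketch of how an `input_hash` and a `tool_call` id of the documented form could be produced. The canonicalisation used here (sorted keys, compact separators, UTF-8) is an assumption and may differ from Roam's canonical-JSON rules, so digests from this sketch should not be expected to match real receipts.

```python
import hashlib
import json
import secrets


def canonical_json(value) -> bytes:
    """Serialise deterministically: sorted keys, no insignificant whitespace."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def input_hash(args: dict) -> str:
    """sha256 hex digest of the canonical-JSON args; the args are not stored."""
    return hashlib.sha256(canonical_json(args)).hexdigest()


def new_tool_call_id(tool_name: str) -> str:
    """Opaque id shaped like <tool>_<12-hex>; 6 random bytes give 12 hex chars."""
    return f"{tool_name}_{secrets.token_hex(6)}"
```

Sorting keys means two callers passing the same arguments in a different order produce the same digest, which is what lets a reviewer compare receipts without seeing the raw inputs.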

Wire it into your agent

Roam ships a Model Context Protocol server and an auto-generated Claude Code skill. Setup is one command per editor — see /docs/integration-tutorials for Claude Code, Cursor, Codex, Gemini, and Amp.

Or run roam skill-generate --target claude to emit a SKILL.md directly from the live capability registry — the agent gets accurate, up-to-date instructions for every command without you hand-writing the prompt.

The principle behind the contract

Every command in Roam should answer a real engineering question that an agent or reviewer would ask. If a command doesn't map to a clear question, it's hidden, marked experimental, or improved. The goal isn't fewer commands — it's a clearer mental model.

Five questions × five commands = the discipline. Everything else is specialisation on top.

Where to next

Ready to try Roam? Install the free CLI, or read the command reference for the full surface.

See it run: The 5-minute canonical demo — install → health → preflight → critique → signed ChangeEvidence packet, end to end.

Agent onboarding companions: