When Roam is installed in a repo, agents have a local code graph, a call-graph, dependency edges, runtime hot-paths, git history, and an algorithmic-pattern library — all queryable through deterministic CLI commands or MCP tools. The contract below turns those capabilities into a five-step discipline.
Each step is one command. Each maps to one engineering question agents commonly miss. Together they prevent the failure modes we see in agent-generated PRs: missed clones, broken tests, accidental O(n²), risky refactors, hidden coupling.
Surface at a glance
Authoritative counts from roam surface --json
(v13.2). The MCP server exposes a tunable preset surface so the
agent sees what it needs, not the entire registry.
| Surface | Count | Notes |
|---|---|---|
| CLI commands | 241 | 234 canonical + 7 aliases. |
| MCP tools registered | 227 | Full registry — pick a subset via preset. |
| MCP tools in core preset | 57 | Default for new agent integrations. |
| Languages | 28 | Tier-1 extractors plus tree-sitter fallback. |
| Canonical envelope | 1 | One JSON shape across all 227 tools — parse once, branch on status. |
1. Before editing — ask for context
$ roam context <symbol-or-file>
Returns the exact files + line ranges to read before changing a
symbol, prioritised by callers, callees, and tests. Stops the
agent reading the wrong files or skipping the relevant ones.
Same data shape as roam preflight, but compact —
ideal for prompt-context.
2. Before deleting — ask for impact
$ roam impact <symbol>
Full blast radius via Personalised PageRank: who calls this
symbol, transitively, weighted by churn and runtime hotness.
If the answer surfaces a caller the agent didn't know about,
the deletion is unsafe. Pair with
roam safe-delete <symbol> for the binary verdict.
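To make the ranking idea concrete, here is a minimal sketch of Personalised PageRank over a reversed call graph (edges point from a symbol to its callers). It illustrates the general technique only, not Roam's implementation: the function name, the graph, and the damping value are assumptions, and the churn and runtime-hotness weighting mentioned above is left out.

```python
from typing import Dict, List


def personalised_pagerank(
    callers_of: Dict[str, List[str]],
    seed: str,
    damping: float = 0.85,
    iterations: int = 50,
) -> Dict[str, float]:
    """Rank symbols by how strongly a change to `seed` reaches them.

    `callers_of[x]` lists the symbols that call `x` (hypothetical input shape).
    """
    nodes = set(callers_of) | {seed}
    for callers in callers_of.values():
        nodes.update(callers)

    rank = {node: 0.0 for node in nodes}
    rank[seed] = 1.0  # all restart probability sits on the symbol being changed

    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        for node, score in rank.items():
            callers = callers_of.get(node, [])
            if callers:
                share = damping * score / len(callers)
                for caller in callers:
                    nxt[caller] += share
            else:
                # Top-level entry points have no callers: send the walk back to the seed.
                nxt[seed] += damping * score
            # Teleport step: always restart at the seed, which is what makes it "personalised".
            nxt[seed] += (1.0 - damping) * score
        rank = nxt
    return rank


graph = {
    "parse_config": ["load_settings", "validate"],
    "load_settings": ["main"],
    "unrelated": ["other_main"],
}
ranks = personalised_pagerank(graph, seed="parse_config")
```

In this toy graph the direct caller `load_settings` outranks the transitive caller `main`, and symbols that cannot be reached from the seed keep a score of zero.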
3. Before merging — ask for critique
$ git diff main..HEAD | roam critique
Patch-level structural review. Catches clones-not-edited
(the agent updated one of three near-identical implementations),
layer violations (the diff just imported HTTP into the domain
layer), and blast-radius spikes. Exits 5 on
BLOCK findings — wire into CI directly.
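A CI wrapper around this step might look like the sketch below. Only the exit code 5 for BLOCK findings comes from the text above; treating 0 as a pass and every other code as a tool error is an assumption, and the helper names are hypothetical.

```python
import subprocess

BLOCK_EXIT_CODE = 5  # documented above: `roam critique` exits 5 on BLOCK findings


def gate_from_exit_code(code: int) -> str:
    """Map a critique exit code to a CI decision."""
    if code == 0:
        return "pass"  # assumption: 0 means no blocking findings
    if code == BLOCK_EXIT_CODE:
        return "block"
    # Other exit codes are not described in this section; surface them as errors.
    return "error"


def run_critique(base: str = "main") -> str:
    """Pipe `git diff <base>..HEAD` into `roam critique` and return the gate decision."""
    diff = subprocess.run(
        ["git", "diff", f"{base}..HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    critique = subprocess.run(
        ["roam", "critique"],
        input=diff.stdout,
        capture_output=True,
        text=True,
    )
    return gate_from_exit_code(critique.returncode)
```

Keeping the exit-code mapping in its own function lets the gate logic be unit-tested without a repository or a Roam install.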
4. Before refactoring — simulate first
$ roam simulate move <symbol> <new-file>
$ roam simulate extract <file> <line-range>
$ roam simulate delete <symbol>
Clone the graph, apply the transform, report what breaks before
any source file changes. roam mutate applies the
transform after the simulation passes; roam plan-refactor
produces a step-by-step plan with caller-update ordering.
5. Before optimising — ask for algorithmic risk
$ roam math
$ roam algo --confidence high
The differentiator. Detects code that is correct but computationally wrong — exactly the class of patterns agents ship that pass tests and fail at scale: nested-loop lookups becoming O(n²), N+1 queries, regex compiled inside hot loops, repeated JSON parsing, quadratic string concatenation, branching recursion without memoisation.
Pairs with roam n1, roam missing-index,
and roam hotspots for the full performance-shape
sweep.
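As an illustration of the first pattern in that list, the two functions below return the same result, but the first hides a nested scan. This is a generic example of the pattern class, not output from Roam; the function names are made up for the sketch.

```python
from typing import List


def shared_ids_quadratic(a: List[int], b: List[int]) -> List[int]:
    # `x in b` scans the whole list on every iteration: O(len(a) * len(b)).
    return [x for x in a if x in b]


def shared_ids_linear(a: List[int], b: List[int]) -> List[int]:
    # Build the lookup once; set membership is O(1) on average.
    seen = set(b)
    return [x for x in a if x in seen]


left = list(range(0, 2000, 2))
right = list(range(0, 2000, 3))
```

Both versions pass the same unit tests on small inputs, which is why this class of issue tends to surface only at scale.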
The whole contract, in one block
# Before any edit
$ roam context <target>
# Before any deletion
$ roam impact <target>
$ roam safe-delete <target>
# Before any merge
$ git diff | roam critique
# Before any refactor
$ roam simulate <op> <target>
$ roam plan-refactor <target>
# Before any optimisation pass
$ roam math
$ roam hotspots --danger
The canonical JSON envelope
Every roam --json <cmd> call and every MCP tool
response is shaped by one canonical envelope. Marshal against
this shape and the same code handles success, partial success,
and the four failure modes Roam treats as first-class signal:
missing prerequisite, intermediate-layer signal loss, empty
stdout, and degraded resolution.
{
"command": "<tool_name>",
"status": "<index_not_built | advisory_warnings | partial_failure | hard_failure | usage_error | rate_limited | stale_index>",
"isError": true,
"summary": {
"verdict": "<one-line, imperative, concrete-noun terminal>",
"level": "<blocker | warning | info>",
"partial_success": false,
"state": "<machine-readable state>"
},
"error_code": "<closed enum>",
"error": "<human-readable error text, never raw JSON dump>",
"hint": "<imperative: what to do next>",
"next_command": "<copy-pasteable roam command, when applicable>",
"retry_after_seconds": 60,
"agent_contract": {
"facts": ["<concrete-noun anchored fact>", "..."],
"next_commands": ["<copy-pasteable>", "# explanation..."]
},
"_meta": { "timestamp": "...", "index_age_s": 42 }
}
Hard invariants on the shape:
- isError: true sits inside a successful JSON-RPC result; it is not a protocol-level error. Protocol errors do not reliably reach the LLM context window.
- Partial results beat total failure. When the underlying check produced any structured signal, the envelope carries it (annotated with partial_success: true) rather than collapsing to a generic COMMAND_FAILED.
- The error field carries actionable guidance, never a raw exception trace.
- summary.verdict stands alone: an agent that reads only the verdict gets a complete next-action signal.
- next_command (when set) is a literal copy-paste-executable roam <subcommand> string, not a description.
Worked example: the cold-start envelope an MCP tool returns
when .roam/index.db does not exist yet —
/docs/mcp-usage#the-cold-start-envelope.
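A client can therefore parse once and branch on a handful of fields. The sketch below uses only field names from the envelope shown above; the routing policy and the `NextAction` type are assumptions for illustration, not part of Roam.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class NextAction:
    kind: str  # one of: "proceed", "wait", "run_command", "stop"
    detail: Optional[str] = None


def next_action(raw: str) -> NextAction:
    """Decide what an agent does next from one envelope (hypothetical policy)."""
    env = json.loads(raw)
    summary = env.get("summary", {})

    if not env.get("isError", False):
        return NextAction("proceed", summary.get("verdict"))
    if env.get("status") == "rate_limited" and env.get("retry_after_seconds") is not None:
        return NextAction("wait", str(env["retry_after_seconds"]))
    if env.get("next_command"):
        # next_command is documented as literally copy-paste-executable.
        return NextAction("run_command", env["next_command"])
    # Fall back to the human-readable guidance fields.
    return NextAction("stop", env.get("hint") or env.get("error"))


cold_start = json.dumps(
    {
        "command": "<tool_name>",
        "status": "index_not_built",
        "isError": True,
        "next_command": "roam <subcommand>",
    }
)
action = next_action(cold_start)
```

Because `isError` rides inside a successful result, the same function handles both the success path and the failure path without a separate protocol-error handler.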
Verdict vocabulary — concrete nouns only
Verdicts and agent_contract.facts end on a
concrete-noun terminal so downstream models stay in
analytical mode rather than collapsing to summary mode. A
fact-string is anchored when its terminal token (last word,
punctuation stripped) is one of the canonical anchors. The
live set lives in
src/roam/output/formatter.py and is mirrored by
the LAW 4 lint at tests/test_law4_lint.py.
| Family | Representative anchors |
|---|---|
| Code structure | files, symbols, edges, nodes, cycles, clusters, layers, modules, commands, tools, capabilities, imports, endpoints, dependencies, packages, routes |
| Findings | findings, hotspots, smells, violations, warnings, errors, alerts, issues, gaps, leaks, secrets, vulnerabilities |
| Quality metrics | keys, values, chars, lines, tokens, bytes, items, entries, records, fields |
| State qualifiers | passed, failed, scanned, checked, affected, scored, confirmed, analyzed, skipped, reached |
| Time units | days, weeks, months, years, hours, minutes, seconds, milliseconds |
Wrong: "7 of 10 capabilities are AI-safe" (ends on
AI-safe, not anchored). Right:
"7 of 10 AI-safe capabilities" (ends on
capabilities, anchored). When an agent generates a
fact for its own internal reasoning, the same discipline keeps
the next step actionable.
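The anchoring rule above (last word, punctuation stripped, must be a canonical anchor) can be sketched in a few lines. The `ANCHORS` set here is a small subset copied from the table for illustration; the live set is in src/roam/output/formatter.py. Lower-casing the terminal token is an assumption.

```python
import string

# Illustrative subset of the anchors table above, not the live set.
ANCHORS = {
    "files", "symbols", "edges", "capabilities",
    "findings", "violations", "lines", "tokens",
    "passed", "failed", "seconds",
}


def is_anchored(fact: str) -> bool:
    """Return True when the fact's terminal token is a known anchor."""
    words = fact.split()
    if not words:
        return False
    terminal = words[-1].strip(string.punctuation).lower()
    return terminal in ANCHORS
```

Applied to the two examples above, the first string fails the check and the second passes.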
The eight evidence questions
Roam compiles every AI-assisted change into one
ChangeEvidence packet that answers these eight
questions. Each question maps to commands the agent already
runs as part of the contract above — the packet just makes the
answers portable for a reviewer.
| Question | Primitives |
|---|---|
| Who acted? | roam runs, roam replay, MCP receipt |
| What authority existed? | roam mode, roam permit, roam lease, roam constitution |
| What context was read? | roam context, roam retrieve, pr-bundle |
| What changed? | roam diff, roam pr-analyze, roam workspace |
| What could break? | roam impact, roam preflight, roam test-impact, roam vuln-reach |
| What policy applied? | roam rules, roam laws, roam check-rules |
| What verified it? | roam tests, roam critique, roam pr-bundle, roam runs verify |
| Who accepted risk? | roam permit, roam pr-bundle, audit trail |
Roam maps to governance controls and supplies evidence for them; it does not certify compliance or make a system compliant. The evidence packet is portable input for the external GRC tool of your choice.
MCP boundary security
Roam owns the inside-server half of the agent's trust boundary: it runs locally as the same user as the editor, gates writes through a four-mode policy, and emits a tamper-evident decision receipt on every sensitive call. It is explicitly not a network gateway and does not proxy model traffic — gateway-class defences (semantic prompt-injection scanning, response interception, cross-server aggregation) are a complementary layer owned by your MCP host. See the Discussion #37 reply for the full inside-server-vs-gateway framing, and /docs/mcp-usage#security-stance for the public stance.
Egress secret redaction
Every MCP tool response passes through a structural secret-pattern
scan on the egress path
(redact_secrets_in_string /
redact_secrets_in_value) before the bytes leave the
server. Hits replace the secret with a stable placeholder and
stamp the receipt's redactions field with the
closed-enum reason — today the only emitted reason on the MCP
path is secret; the canonical vocabulary admits
secret, pii,
sensitive_content, size_limit,
policy, user_opt_in_required,
machine_local_path, schema_strict,
and producer_not_available. Patterns cover GitHub
PAT (classic and fine-grained), OpenAI / Anthropic
sk- keys, AWS AKIA, Bearer tokens, PEM private-key
markers, and JWT. Per-pattern hit counts ride in
extra["redaction_details"] as
{pattern_id: hit_count}.
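The scan-and-count behaviour can be pictured with the sketch below. It is not Roam's `redact_secrets_in_string`: the function name, the three regular expressions, the pattern ids, and the placeholder format are all assumptions chosen to show the shape of the `{pattern_id: hit_count}` detail.

```python
import re
from typing import Dict, Tuple

# Hypothetical patterns covering three of the secret families named above.
PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "github_pat_classic": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def redact_sketch(text: str) -> Tuple[str, Dict[str, int]]:
    """Replace matches with a stable placeholder and count hits per pattern."""
    counts: Dict[str, int] = {}
    for pattern_id, pattern in PATTERNS.items():
        text, hits = pattern.subn(f"[REDACTED:{pattern_id}]", text)
        if hits:
            counts[pattern_id] = hits
    return text, counts


clean, details = redact_sketch("id=AKIAIOSFODNN7EXAMPLE ok")
```

A stable placeholder keeps the surrounding response readable for the agent while the per-pattern counts give a reviewer something to audit.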
Four-mode policy enforcement
Every MCP wrapper resolves the caller's mode
(read_only / safe_edit /
migration / autonomous_pr) and checks
the tool's required_mode before dispatch. The
receipt's policy_decision is a closed enum —
allow / deny /
not_evaluated — reflecting an actual enforcement
decision at the MCP boundary, not a hard-coded allow. Resolution
priority (highest wins): explicit --mode flag,
ROAM_AGENT_MODE env var,
.roam/active_mode file, default
safe_edit. A gateway can map external roles to
Roam modes and pass the resolved mode in per call.
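The resolution order above can be expressed as a small pure function. This is a sketch under stated assumptions: the function name and signature are hypothetical, the file argument is the already-read contents of .roam/active_mode, and rejecting an unknown mode with an exception is a choice made for the example.

```python
from typing import Mapping, Optional

MODES = ("read_only", "safe_edit", "migration", "autonomous_pr")
DEFAULT_MODE = "safe_edit"


def resolve_mode(
    flag: Optional[str],
    env: Mapping[str, str],
    active_mode_file: Optional[str],
) -> str:
    """Pick the caller's mode: --mode flag, then env var, then file, then default."""
    candidates = (flag, env.get("ROAM_AGENT_MODE"), active_mode_file)
    for candidate in candidates:
        if candidate and candidate.strip():
            mode = candidate.strip()
            if mode not in MODES:
                raise ValueError(f"unknown mode: {mode!r}")
            return mode
    return DEFAULT_MODE
```

A gateway that maps external roles to Roam modes would supply the first argument on each call, which always wins over the ambient settings.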
HMAC-linked decision receipts
Each receipt's sha256 content hash is linked into the
HMAC-chained run ledger at
.roam/runs/<run_id>/events.jsonl, so receipt
tampering is detectable offline.
verify_chain_with_receipts() in
src/roam/runs/signing.py extends the standard
four-state run-verify envelope
(ok / tampered /
unsigned / empty) with a
receipt_integrity closed enum:
ok (every linked receipt's on-disk sha256 matches),
missing (a ledger event names a receipt file no
longer on disk), tampered (a receipt file no longer
hashes to the value the chain anchors), and
not_linked (receipts exist on disk that no ledger
event anchors). Pre-link chains hash byte-identical to before —
no migration is needed.
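The four `receipt_integrity` states can be pictured with the sketch below, which compares the hashes a ledger anchors against receipt bytes on disk. It is not `verify_chain_with_receipts()`: the function name, the input shapes, and the precedence when several conditions hold at once (missing, then tampered, then not_linked) are assumptions.

```python
import hashlib
from typing import Dict


def receipt_integrity_sketch(
    anchored: Dict[str, str],   # receipt file name -> sha256 hex anchored by the ledger
    on_disk: Dict[str, bytes],  # receipt file name -> raw file contents
) -> str:
    """Classify receipts as ok / missing / tampered / not_linked."""
    if any(name not in on_disk for name in anchored):
        return "missing"
    for name, expected in anchored.items():
        if hashlib.sha256(on_disk[name]).hexdigest() != expected:
            return "tampered"
    if any(name not in anchored for name in on_disk):
        return "not_linked"
    return "ok"


receipt = b'{"tool_name":"roam_preflight"}'
ledger = {"a.json": hashlib.sha256(receipt).hexdigest()}
```

Because the check needs only the ledger and the receipt files, it can run offline on an archived run directory.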
Receipt schema export
The McpDecisionReceipt dataclass exports to a
Draft 2020-12 JSON Schema via
scripts/export_mcp_receipt_schema.py. Gateway and
policy-enforcement-point developers can validate emitted
receipts against the schema before tailing them into a SIEM. The
schema is stable; changes are limited to additive fields.
What every McpDecisionReceipt carries
One JSON file per sensitive tool call, stored at
.roam/mcp_receipts/<run_id>/<tool_call>.json.
Frozen dataclass; canonical-JSON serialisation; stable sha256
content hash. Raw inputs and outputs are never stored — only
their digests.
| Field | Purpose |
|---|---|
| tool_call | Opaque per-invocation id (<tool>_<12-hex>). |
| client_id | MCP client process id from ROAM_MCP_CLIENT_ID. |
| tool_name | Canonical tool name (e.g. roam_preflight). |
| actor_ref_id | Agent id from ROAM_AGENT_ID; ties to the ActorRef vocabulary. |
| declared_side_effects | Tuple of read_only / write_filesystem / etc. from the tool registry. |
| required_mode | read_only / safe_edit / migration / autonomous_pr. |
| input_hash | sha256 of canonical-JSON input args. Never the args themselves. |
| policy_decision | Closed enum: allow / deny / not_evaluated. |
| output_ref / output_hash | Artifact id for large output, or sha256 for small. Mutually exclusive. |
| run_event_id | Link to .roam/runs/<id>/events.jsonl row. |
| redactions | Closed-enum tuple of redaction reasons; stable across versions. |
| extra | Free-form structured detail (e.g. redaction_details per-pattern counts). |
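To show why storing a digest of canonical JSON is enough to compare calls, here is a minimal sketch of an `input_hash`-style digest. The canonicalisation used (sorted keys, compact separators, UTF-8) is an assumption; Roam's exact canonical-JSON rules are not spelled out in this section, so digests from this sketch should not be expected to match real receipts.

```python
import hashlib
import json
from typing import Any


def canonical_json(value: Any) -> bytes:
    """Serialise deterministically so equal inputs give equal bytes (assumed rules)."""
    return json.dumps(
        value, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")


def input_hash_sketch(args: dict) -> str:
    """sha256 hex digest of the canonical form; the args themselves are not kept."""
    return hashlib.sha256(canonical_json(args)).hexdigest()


first = input_hash_sketch({"symbol": "parse_config", "depth": 2})
second = input_hash_sketch({"depth": 2, "symbol": "parse_config"})
```

Key order does not affect the digest, so two calls with the same arguments produce the same hash without the receipt ever holding the arguments.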
Reviewers and gateways consume receipts alongside the eight evidence questions on /docs/architecture#eight-evidence-questions and the worked PR-Replay packet under /audit#evidence.
Wire it into your agent
Roam ships a Model Context Protocol server and an auto-generated Claude Code skill. Setup is one command per editor — see /docs/integration-tutorials for Claude Code, Cursor, Codex, Gemini, and Amp.
Or run roam skill-generate --target claude to emit a
SKILL.md directly from the live capability registry — the agent
gets accurate, up-to-date instructions for every command without
you hand-writing the prompt.
The principle behind the contract
Every command in Roam should answer a real engineering question that an agent or reviewer would ask. If a command doesn't map to a clear question, it's hidden, marked experimental, or improved. The goal isn't fewer commands — it's a clearer mental model.
Five questions × five commands = the discipline. Everything else is specialisation on top.
Where to next
Ready to try Roam? Install the free CLI · or read the command reference for the full surface.
See it run:
The 5-minute canonical demo —
install → health → preflight → critique → signed
ChangeEvidence packet, end to end.
Agent onboarding companions:
- /llms.txt — the LLM-friendly index of every Roam doc page. Drop this into your agent's bootstrap prompt to give it the full surface in one fetch.
- /docs/getting-started — the five-minute setup path for the human running the agent.
- /docs/mcp-usage — the MCP-specific call patterns: cold-start envelope, slow-tool handle pattern, typical agent flow, parameter aliases.
- /docs/mcp-usage#security-stance — the public inside-server-vs-gateway stance and which class of defence Roam owns versus which class belongs to the host.
- /docs/architecture#eight-evidence-questions: the eight questions every ChangeEvidence packet answers about an AI-assisted change.
- /audit#evidence: a worked PR-Replay evidence packet on a real repo.
- /docs/architecture — how the call graph, evidence layer, and detector registry fit together. Read once; agents do not need it per call.