System Design
Roam Architecture
Roam parses source once, stores structural facts in SQLite, and exposes deterministic query primitives to CLI and MCP clients — then emits structured evidence (receipts, ledger, packets) that any gateway can act on.
That evidence stream has three parts: the McpDecisionReceipt per sensitive tool call, the HMAC-chained run ledger, and the ChangeEvidence packet. Hosts (Claude Code, Cursor) own per-call approval; gateways own policy + correlation + audit aggregation; Roam ships the structured evidence stream those layers consume. See the MCP layering discussion for the full split.
Pipeline at a Glance
Repository ──> Index Pipeline ──> SQLite Storage
                                        │
                       ┌────────────────┼────────────────┐
                       ▼                ▼                ▼
                Graph Analytics    Retrieval +       CLI / MCP
                Rules Engine       Patch Verifier    Interfaces
                Security           Code Graph        JSON / SARIF
                                   Attestation
The index is built once per repo with roam init. Subsequent runs are
incremental — only changed files re-parse. All downstream consumers (CLI, MCP, CI gates)
read from the same SQLite artefact at .roam/index.db.
Subsystem Responsibilities
| Subsystem | Main modules | Responsibility |
|---|---|---|
| Index Pipeline | index/indexer.py, index/parser.py, index/symbols.py | Build and refresh the structural index from source + git history. |
| Storage | db/schema.py, db/connection.py | SQLite schema, migrations, batched query helpers. |
| Graph Intelligence | graph/builder.py, graph/layers.py, graph/clusters.py, graph/pagerank.py | Centrality, layering, communities, cycle analysis, AST clone clustering. |
| Retrieval | retrieve/pipeline.py, retrieve/rerank.py | Graph-aware FTS5 + structural reranker (PageRank + co-change + clones + runtime hot). |
| Patch Verifier | critique/checks.py, critique/aggregator.py | Diff parsing + clones-not-edited + blast-radius + intent-alignment for roam critique. |
| Taint & Reachability | security/taint_engine.py | Graph-reach BFS over edges with sanitiser-stop nodes; OpenVEX-correct. |
| Code Graph Attestation | attest/cga.py | in-toto v1 statement builder. Merkle root over symbol fingerprints + edge bundle digest. Cosign-signable. |
| Fleet Planner | fleet/manifest.py | Multi-agent partitioner (Louvain + co-change + PageRank anchors); emits .roam-fleet.json. |
| Rule Engine | rules/builtin.py, rules/engine.py | Built-in rules + YAML rule packs (path / symbol / AST / dataflow patterns). |
| Interfaces | commands/cmd_*.py, mcp_server.py, mcp_extras/ | Deterministic queries for CLI and MCP clients. Sampling-driven compression, watcher-based invalidation, per-session memory. |
| Output Contracts | output/formatter.py, output/sarif.py, output/schema_registry.py | Stable text / JSON / SARIF envelopes for agents and CI; every --json error path returns a parseable envelope. |
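The Taint & Reachability row describes a breadth-first reach over graph edges that stops at sanitiser nodes. A minimal sketch of that idea, assuming a plain adjacency-dict graph; the function name `reachable_sinks` and the node names are hypothetical and are not taken from security/taint_engine.py:

```python
from collections import deque

def reachable_sinks(edges, sources, sinks, sanitisers):
    """Return sinks reachable from any source without passing through a sanitiser."""
    seen = set(sources)
    queue = deque(sources)
    hits = set()
    while queue:
        node = queue.popleft()
        if node in sinks:
            hits.add(node)
        if node in sanitisers:
            continue  # sanitiser-stop: taint does not propagate past this node
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return hits

# Hypothetical call graph: one path goes through a sanitiser, one does not.
edges = {
    "read_param": ["build_query", "escape"],
    "escape": ["render"],
    "build_query": ["db_exec"],
}
found = reachable_sinks(edges, ["read_param"], {"db_exec", "render"}, {"escape"})
print(sorted(found))
```

Only `db_exec` is reported here, because the path to `render` is cut at the `escape` node.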
Index Pipeline Stages
- Discovery: collect tracked files (via git ls-files + .gitignore) and classify file roles.
- Parsing: tree-sitter parse per file with language routing across 28 supported languages.
- Extraction: symbols (classes, functions, methods, fields), signatures, docstrings, references.
- Resolution: convert references into graph edges (caller→callee, import chains, inheritance).
- Metrics: cognitive complexity, centrality (PageRank, betweenness), churn, co-change, cognitive load.
- Persistence: upsert into SQLite with incremental diffing; only changed files re-parse.
discover -> parse -> extract -> resolve -> metrics -> persist
(incremental path executes only changed files)
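One way the incremental path could decide which files to re-parse is by comparing content hashes against the last stored values. This is a sketch under that assumption only; the `file_hashes` table and the `changed_files` helper are hypothetical, and Roam's real change detection may use a different signal (for example git state). File deletions are not handled here.

```python
import hashlib
import sqlite3

def changed_files(conn, current):
    """Compare content hashes to the stored ones; return paths needing a re-parse."""
    stored = dict(conn.execute("SELECT path, sha256 FROM file_hashes"))
    changed = []
    for path, content in sorted(current.items()):
        digest = hashlib.sha256(content).hexdigest()
        if stored.get(path) != digest:
            changed.append(path)
            # Upsert so the next run sees the new hash.
            conn.execute(
                "INSERT INTO file_hashes(path, sha256) VALUES (?, ?) "
                "ON CONFLICT(path) DO UPDATE SET sha256 = excluded.sha256",
                (path, digest),
            )
    return changed

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file_hashes (path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)")
first = changed_files(conn, {"a.py": b"x = 1\n", "b.py": b"y = 2\n"})
second = changed_files(conn, {"a.py": b"x = 1\n", "b.py": b"y = 3\n"})
print(first, second)
```

The first run reports every file; the second reports only the file whose bytes changed.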
Command-to-Data Flow
Example: roam preflight AuthService
CLI cmd_preflight
-> ensure_index()
-> query symbols/edges/metrics
-> run health/rule checks
-> aggregate verdict + risk factors
-> render text or JSON envelope
Example: roam cga emit --include-taint --sign
CLI cmd_cga
-> ensure_index()
-> attest.cga.build_statement()
-> _symbol_fingerprints() # Merkle root over (qname, kind, sig, path)
-> _edge_bundle_digest() # graph snapshot fingerprint
-> security.taint_engine.run() # graph-reach BFS, sanitizer stops
-> _finding_to_vex_claim() # OpenVEX status + justification
-> in-toto v1 Statement
(predicateType: https://roam-code.com/spec/CodeGraph/v1)
-> cosign sign-blob --bundle # optional, graceful skip if absent
-> .roam/attestations/<sha>.intoto.json + .sig
Property: the CGA chain is reproducible — same source tree + same git HEAD → same Merkle root → same predicate digest. Signing layers identity onto a deterministic fingerprint.
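The reproducibility property follows from hashing a canonical ordering of the symbol fingerprints. The sketch below illustrates the idea with a simple binary Merkle tree over (qname, kind, sig, path) tuples. The leaf encoding, the domain-separation bytes, and the odd-level duplication rule are all assumptions made for illustration, not the encoding used by attest/cga.py.

```python
import hashlib

def leaf_hash(qname, kind, sig, path):
    # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    h = hashlib.sha256(b"\x00")
    for field in (qname, kind, sig, path):
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(4, "big") + data)
    return h.digest()

def merkle_root(symbols):
    """Root over sorted symbol fingerprints; input order does not matter."""
    level = sorted(leaf_hash(*s) for s in symbols)
    if not level:
        return hashlib.sha256(b"").hexdigest()
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()

symbols = [
    ("auth.AuthService.login", "method", "(self, user, pw)", "auth/service.py"),
    ("auth.AuthService", "class", "", "auth/service.py"),
    ("billing.total", "function", "(items)", "billing/core.py"),
]
root_a = merkle_root(symbols)
root_b = merkle_root(list(reversed(symbols)))
print(root_a == root_b)
```

Sorting the leaves before building the tree is what makes the root independent of traversal order; changing any signature changes the root.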
Tradeoff: static structure gives speed and determinism, but cannot model
fully dynamic runtime behavior without trace ingestion (roam ingest-trace).
Agent OS Substrate
On top of the analysis core, Roam ships an 11-package control-plane substrate that
lets agents earn the right to change code. Every package is repo-local (under
.roam/), zero-network, and additive to the index.
| Package | What it does |
|---|---|
| atomic_io | POSIX + Windows-safe atomic writes (os.replace) for every ledger and bundle file. |
| agents_md/ | Compositional AGENTS.md generator; consumes the rest of the substrate. |
| constitution/ | Capstone .roam/constitution.yml unifying laws, rules, memory, gates. |
| db/findings.py | Cross-detector finding registry (roam findings list/show/count); USER_VERSION 17. |
| laws/ | Invariant mining (roam laws mine/check); self-installing. |
| leases/ | Multi-agent coordination (roam lease claim/release/list). |
| memory/ | Repo-local agent memory at .roam/memory.jsonl. |
| modes/ | Four cumulative action modes: read_only / safe_edit / migration / autonomous_pr. |
| policy/ | Graph-aware rule clauses (reachable_from, imports_from, ...). |
| runs/ | Per-run event ledger + HMAC tamper detection (roam runs verify). |
| world_model/ | Four detectors: side-effects, idempotency, causal-graph, tx-boundaries. |
The canonical agent loop:
1. roam runs start # open run, get ROAM_RUN_ID (HMAC-signed events)
2. roam mode safe_edit # declare action surface
3. roam pr-bundle init # start proof bundle
4. roam preflight <sym> # gate before edit
5. roam impact <sym> # blast radius
6. <edit>
7. roam diff | roam critique # review
7a. roam findings list # cross-detector findings on the workspace
8. roam pr-bundle emit # close bundle with proofs
9. roam runs end --with-pr-bundle-emit
10. roam replay <id> # narrate the run
11. roam agent-score # composite 0..100 score
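Step 1 of the loop opens a run whose events are HMAC-signed, and roam runs verify checks the ledger for tampering. The general technique of an HMAC-chained ledger can be sketched as follows. The entry layout, the all-zero genesis value, and the function names are assumptions for illustration, not the format used by the runs/ package.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed starting value for the chain

def _mac(key, prev, event):
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hmac.new(key, (prev + body).encode("utf-8"), hashlib.sha256).hexdigest()

def append_event(ledger, key, event):
    """Append an event whose MAC also covers the previous entry's MAC."""
    prev = ledger[-1]["mac"] if ledger else GENESIS
    ledger.append({"event": event, "prev": prev, "mac": _mac(key, prev, event)})

def verify(ledger, key):
    """Recompute every MAC in order; any edit or reorder breaks the chain."""
    prev = GENESIS
    for entry in ledger:
        expected = _mac(key, prev, entry["event"])
        if entry["prev"] != prev or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev = entry["mac"]
    return True

key = b"demo-key-not-for-production"
ledger = []
append_event(ledger, key, {"step": "runs start"})
append_event(ledger, key, {"step": "preflight", "symbol": "AuthService"})
ok_before = verify(ledger, key)
ledger[0]["event"]["step"] = "tampered"   # simulate an after-the-fact edit
ok_after = verify(ledger, key)
print(ok_before, ok_after)
```

Because each MAC covers the previous one, editing an early event invalidates that entry and every entry after it.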
Findings Registry
A normalised cross-detector table that answers "what's wrong with this workspace
right now?" in one query — instead of running ten detectors and reconciling
ten output shapes. Detectors keep their detector-specific tables and ALSO upsert a row
to the central findings table, which becomes the surface for CLI consumers,
SARIF emit, and suppression management.
Schema
| Column | Purpose |
|---|---|
| finding_id_str | Stable string identifier (UNIQUE). Deterministic: rerunning a detector refreshes the same row in place. Convention: "<detector>:<subject>:<hash>". |
| subject_kind | What kind of thing the finding is about: symbol, file, edge, commit, package, etc. |
| subject_id | Foreign key into the table named by subject_kind. Nullable: not every subject maps to a row id. |
| claim | Human-readable summary of the finding. |
| evidence_json | Detector-specific structured fields. Schema is owned by the detector, not by the registry. |
| confidence | One of heuristic, structural, static_analysis, runtime. See the tier table below. |
| source_detector | Which detector emitted the row: clones, dead, complexity, etc. |
| source_version | Detector version stamp. Consumers can spot rows produced under a stale detector shape. |
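The "refreshes the same row in place" behaviour maps naturally onto a SQLite upsert keyed on the UNIQUE finding_id_str column. The sketch below uses a cut-down table with only some of the columns above; the `finding_id` and `upsert_finding` helpers, and the choice of what goes into the hash, are hypothetical.

```python
import hashlib
import json
import sqlite3

def finding_id(detector, subject, fingerprint):
    """Build "<detector>:<subject>:<hash>" from a stable fingerprint string."""
    digest = hashlib.sha256(fingerprint.encode("utf-8")).hexdigest()[:12]
    return f"{detector}:{subject}:{digest}"

def upsert_finding(conn, detector, subject, fingerprint, claim, evidence, confidence):
    fid = finding_id(detector, subject, fingerprint)
    conn.execute(
        """INSERT INTO findings
             (finding_id_str, subject_kind, claim, evidence_json, confidence, source_detector)
           VALUES (?, 'symbol', ?, ?, ?, ?)
           ON CONFLICT(finding_id_str) DO UPDATE SET
             claim = excluded.claim,
             evidence_json = excluded.evidence_json""",
        (fid, claim, json.dumps(evidence, sort_keys=True), confidence, detector),
    )
    return fid

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE findings (
         finding_id_str TEXT UNIQUE NOT NULL,
         subject_kind TEXT NOT NULL,
         claim TEXT NOT NULL,
         evidence_json TEXT NOT NULL,
         confidence TEXT NOT NULL,
         source_detector TEXT NOT NULL)"""
)
id1 = upsert_finding(conn, "dead", "billing.old_total", "no-callers",
                     "No callers found", {"callers": 0}, "structural")
id2 = upsert_finding(conn, "dead", "billing.old_total", "no-callers",
                     "No callers found (rechecked)", {"callers": 0}, "structural")
row_count = conn.execute("SELECT COUNT(*) FROM findings").fetchone()[0]
print(id1 == id2, row_count)
```

Rerunning the detector produces the same identifier, so the second call updates the existing row instead of adding a new one.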
Confidence tiers
Every finding carries a confidence label drawn from a closed enumeration of four tiers. Detectors pick the tier that matches their evidence — never mint new strings.
| Tier | Definition | Example |
|---|---|---|
| heuristic | Name-pattern matching, length thresholds, fuzzy NLP signals. | vibe-check's comment_anomalies: comments don't match code semantics. |
| structural | Graph-pattern matching over the symbol / edge / call graph. | n1's loop-with-dependent-write: a loop body issues a DB call that depends on the loop variable. |
| static_analysis | Deterministic AST / CFG / dataflow analysis. | complexity scores; missing-index's unconditional-predicate finding. |
| runtime | Requires ingested runtime traces (OpenTelemetry / Jaeger / Zipkin / coverage). | hotspots's UPGRADE / CONFIRMED / DOWNGRADE classification. |
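A closed enumeration like this is straightforward to enforce in Python with `enum.Enum`. The sketch below shows one way a detector could be prevented from minting new tier strings; the class and function names are hypothetical.

```python
from enum import Enum

class Confidence(str, Enum):
    HEURISTIC = "heuristic"
    STRUCTURAL = "structural"
    STATIC_ANALYSIS = "static_analysis"
    RUNTIME = "runtime"

def parse_confidence(label):
    """Reject any label outside the four-tier enumeration."""
    try:
        return Confidence(label)
    except ValueError:
        raise ValueError(f"unknown confidence tier: {label!r}") from None

tier = parse_confidence("static_analysis")
print(tier.value)
```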
CLI surface
roam findings list # all findings on this workspace
roam findings list --detector clones # filter by detector
roam findings list --subject-kind symbol # filter by subject kind
roam findings show <finding_id> # one record, full evidence
roam findings count # per-detector totals
Full reference and flags: command reference for roam findings.
Detectors that persist findings
28 detectors persist findings to the registry. The registry stores last-run state per
detector, so counts are not cumulative, and roam findings count returns 0 for any
detector that has not been run on the current corpus. Run roam findings count for the
live per-detector tally on your workspace.
The table below covers the original 16-detector substrate plus boundary and
test-hermeticity; the consumer / aggregator detectors (critique, doctor, fan,
fingerprint, health, llm-smells, dark-matter) re-emit derived findings from these
upstream detectors.
| Detector | Wave | Tier | What it finds |
|---|---|---|---|
| clones | W95 | structural | copy-paste / structural duplicates |
| dead | W99 | structural | unreachable symbols |
| complexity | W102 | static_analysis | cognitive complexity hotspots |
| smells | W109 | heuristic | god class / long method / feature envy (24 kinds) |
| n1 | W110 | structural | loop-with-dependent-query patterns |
| missing-index | W111 | static_analysis | unindexed predicate columns |
| over-fetch | W114 | static_analysis | SELECT * / wildcard column reads |
| bus-factor | W115 | heuristic | single-owner critical components |
| auth-gaps | W116 | structural | endpoints missing auth checks |
| vulns | W117 | static_analysis | reachable vulnerable dependencies |
| invariants / laws | W119 | structural | mined invariant violations |
| hotspots | W120 | runtime | runtime-trace classified hotspots |
| taint | W122 | static_analysis | source → sink dataflow leaks |
| vibe-check | W125 | heuristic | AI-rot anomalies (8 pattern families) |
| orphan-imports | W132 | structural | imported-but-unused modules |
| conventions | W133 | heuristic | naming / layout convention drift |
| pr-risk | W134 | structural | per-PR risk factors |
| duplicates | W136 | heuristic | near-duplicate symbol families |
| audit-trail-conformance | W145 | static_analysis | audit-trail integrity checks |
| audit-trail-verify | W146 | static_analysis | signed-ledger verification |
| boundary | - | static_analysis / structural | public-by-accident exports, wrong-direction layer imports |
| test-hermeticity | - | structural / static_analysis | non-hermetic test calls (network, time, random, fs, env, subprocess) |
Agent loop integration
The registry slots into the canonical agent loop as step 7a — between roam critique
(review the diff) and roam pr-bundle emit (close the proof bundle):
...
7. roam diff | roam critique # review the change
7a. roam findings list # cross-detector findings on the workspace
8. roam pr-bundle emit # close bundle with proofs
...
This is the "what's wrong with this workspace right now?" gate. An agent runs it before closing a proof bundle so the bundle either references the surviving findings as accepted or carries evidence that the change made them go away.
Evidence Compiler
Roam is a local evidence compiler for AI-assisted software change. The findings
registry above is one input layer; the evidence compiler aggregates findings, run events,
policy decisions, tests, and approvals into typed ChangeEvidence packets.
One shared record renders into PR Replay reports, SARIF, in-toto attestations, or
OSCAL-shaped exports — no exporter owns its own data-gathering logic.
The eight evidence questions
Every sellable Roam report answers these eight questions. If a surface cannot answer
one yet, the report says so explicitly via the producer_not_available
redaction marker — never silent omission.
- Who acted? Human, agent id, MCP client id, tool id (runs, replay, MCP receipt).
- What authority existed? Mode, permits, leases, scopes, policy decision (mode, permit, lease, constitution).
- What context was read? Files, symbols, commands, handles, hashes (pr-bundle, context, retrieve).
- What changed? Diff hash, changed files, changed subjects (diff, graph-diff, pr-analyze).
- What could break? Blast radius, callers, tests, vulnerable paths (impact, preflight, test-impact, vuln-reach).
- What policy applied? Rules, laws, controls, exceptions (rules, laws, constitution).
- What verified it? Tests run / required, gates, attestations (tests, critique, pr-bundle, runs verify).
- Who accepted risk? Approval, accepted risk, reviewer, timestamp (permit, pr-bundle, run ledger).
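The "never silent omission" rule can be pictured as a small normalisation step: every question gets an entry, and any gap is recorded with the producer_not_available marker. The question keys and the `answer_questions` helper below are hypothetical names chosen for this sketch; only the marker string comes from the text above.

```python
QUESTIONS = (
    "who_acted", "what_authority", "what_context", "what_changed",
    "what_could_break", "what_policy", "what_verified", "who_accepted_risk",
)

def answer_questions(collected):
    """Fill every question; mark gaps explicitly instead of dropping them."""
    answers, redactions = {}, []
    for question in QUESTIONS:
        value = collected.get(question)
        answers[question] = value
        if value is None:
            redactions.append({"field": question, "reason": "producer_not_available"})
    return {"answers": answers, "redactions": redactions}

report = answer_questions({
    "who_acted": {"agent_id": "agent-1"},
    "what_changed": {"diff_hash": "abc123"},
})
print(len(report["answers"]), len(report["redactions"]))
```

All eight keys are present in the output even though only two producers supplied data; the other six appear in the redaction list.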
Data model
| Type | Purpose |
|---|---|
| ChangeEvidence | One evidence packet per code-change scope. Carries evidence_id, schema_version, repo_id, git_range, commit_sha, diff_hash, run_ids, mode, started_at, completed_at, verdict, risk_level, changed_subjects, findings, policy_decisions, tests, approvals, accepted_risks, artifacts, redactions, and a content_hash. |
| EvidenceSubject | Portable identifier wrapper around things Roam already sees: symbol, file, endpoint, package, module, directory, commit, rule, control, run, bundle, finding, test, artifact. Survives reindexing so reports, SARIF rows, and attestations stay stable across rebuilds. |
| EvidenceLink | Typed edges inside a packet (12-member closed enumeration): derived_from, touches, calls, tested_by, triggered, blocked_by, allowed_by, accepted_by, satisfies_control, maps_to_standard, supersedes, mitigates. |
| EvidenceArtifact | File or data reference with a content hash and an optional path. Large artifacts are referenced by hash rather than embedded so the packet stays small and redaction metadata stays meaningful. |
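To make the data model concrete, here is a reduced sketch of a packet as plain dataclasses with deterministic JSON serialisation and a content hash. It carries only a handful of the fields listed above, and the `EvidenceSubject.ref` field plus the hashing recipe (SHA-256 over sorted-key JSON) are assumptions, not Roam's published schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field

@dataclass(frozen=True)
class EvidenceSubject:
    kind: str   # e.g. "symbol", "file", "commit"
    ref: str    # hypothetical portable identifier

@dataclass
class ChangeEvidence:
    evidence_id: str
    schema_version: str
    commit_sha: str
    diff_hash: str
    changed_subjects: list = field(default_factory=list)
    redactions: list = field(default_factory=list)

    def canonical_json(self):
        # Sorted keys and fixed separators give byte-stable output.
        return json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))

    def content_hash(self):
        return hashlib.sha256(self.canonical_json().encode("utf-8")).hexdigest()

def make_packet(diff_hash):
    return ChangeEvidence(
        evidence_id="ev-1",
        schema_version="0",
        commit_sha="deadbeef",
        diff_hash=diff_hash,
        changed_subjects=[EvidenceSubject("symbol", "auth.AuthService.login")],
    )

same = make_packet("abc").content_hash() == make_packet("abc").content_hash()
differs = make_packet("abc").content_hash() != make_packet("xyz").content_hash()
print(same, differs)
```

Keeping the hash a pure function of the canonical JSON is what lets every projection refer to the same packet by one stable digest.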
Projections
Every external format is a projection from the same evidence packet. No exporter owns its own data-gathering logic.
| Projection | Use |
|---|---|
| Markdown / PDF | PR Replay, Due Diligence, AI Adoption Readiness, Incident Replay. |
| SARIF | GitHub Code Scanning and CI annotations. |
| in-toto / CGA | Signed proofs for code graph and change evidence. |
| OSCAL-like JSON / YAML | Governance evidence and control mapping. |
| OpenTelemetry spans / events | Agent observability bridges (LangSmith / Langfuse / Helicone). |
| CycloneDX VEX / OpenVEX | Vulnerability reachability context. |
| CDEvents / CloudEvents | Later CI/CD event interoperability. |
Recipes
The compiler runs as a local recipe DAG, not a workflow engine. Each recipe declares its
inputs, steps, required evidence, gates, report sections, and exports — then the runner
executes existing Roam commands through Python APIs, collects their JSON envelopes,
normalises them into a ChangeEvidence packet, and renders the configured
reports and exports.
recipe:
inputs: # what the recipe needs (git range, scope, mode)
steps: # ordered list of Roam command invocations
required_evidence: # which fields the packet must contain to close
gates: # pass/fail conditions
report_sections: # Markdown / structured-output sections
exports: # projection list (sarif, in-toto, oscal-like, ...)
Example recipes: pr-replay, governance-evidence-pack,
codebase-due-diligence, ai-adoption-readiness,
security-reachability-triage, post-incident-replay,
migration-assurance.
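The runner described above (execute steps, collect envelopes, check required evidence, evaluate gates) can be sketched in a few lines. Everything here is illustrative: `run_recipe`, the callables standing in for Roam's Python APIs, and the envelope keys are hypothetical, and the sketch runs steps as a simple ordered list rather than a full DAG.

```python
def run_recipe(recipe, commands):
    """Run steps in order, merge their envelopes, then check evidence and gates."""
    packet = {}
    for step in recipe["steps"]:
        envelope = commands[step]()   # stand-in for a Roam Python API call
        packet.update(envelope)
    missing = [key for key in recipe["required_evidence"] if key not in packet]
    failed = [name for name, gate in recipe["gates"].items() if not gate(packet)]
    return {
        "packet": packet,
        "missing": missing,
        "failed_gates": failed,
        "ok": not missing and not failed,
    }

recipe = {
    "steps": ["diff", "impact"],
    "required_evidence": ["diff_hash", "blast_radius", "tests"],
    "gates": {"small_blast_radius": lambda p: p.get("blast_radius", 0) <= 10},
}
commands = {
    "diff": lambda: {"diff_hash": "abc123"},
    "impact": lambda: {"blast_radius": 4},
}
result = run_recipe(recipe, commands)
print(result["ok"], result["missing"])
```

In this run the gate passes but the recipe does not close, because no step produced the required `tests` evidence.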
Execution phases
The compiler ships in small phases that share the same evidence model.
- Phase 0: Vocabulary freeze (in flight). Enum-like constants for evidence subject kinds, link kinds, artifact kinds, claim severities, and redaction reasons; one test proving existing command outputs map onto these kinds.
- Phase 1: Schema v0 (in flight). Pure dataclasses for ChangeEvidence / EvidenceSubject / EvidenceLink / EvidenceArtifact; deterministic JSON serialisation; stable content hash; schema_version and redactions[]; no DB migration.
- Phase 2: Envelope collector. Helper that turns existing JSON envelopes into ChangeEvidence for the PR Replay path, with a warning list for fields that do not yet map cleanly.
- Phase 3: PR Replay compiled report. One evidence JSON, one Markdown report, optional PDF projection, and a "Suggested Review configuration" generated from the same packet.
- Phase 4: Governance control mapping. YAML control map from Roam evidence types to governance-control language, OSCAL-like JSON / YAML projection, and a sample report under templates/audit-report/.
- Phase 5: Projection consolidation. Central SARIF / in-toto / OTel / VEX projections driven by the shared evidence layer; per-command emitters become thin compatibility wrappers. Customer-pulled.
Confidence checks
- Every customer-facing recipe emits Markdown plus structured JSON.
- Every evidence packet carries schema_version, content hash, and explicit redaction metadata.
- Every governance claim uses "maps to" or "supports evidence for", never "certifies compliance".
- Every large artifact can be referenced by path or hash instead of embedded.
- Every exporter is a projection from shared evidence, not a second source of truth.
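The governance-wording check lends itself to a simple lint over generated report text. The sketch below flags lines that use certification wording; the regular expression and the `lint_wording` name are assumptions for illustration and do not reproduce the wording-lint that ships with Roam.

```python
import re

# Assumed pattern: any "certify/certifies/certified compliance" phrasing.
BANNED = re.compile(r"\bcertif(?:y|ies|ied)\s+compliance\b", re.IGNORECASE)

def lint_wording(report_text):
    """Return (line_number, line) pairs that use banned certification wording."""
    return [
        (number, line.strip())
        for number, line in enumerate(report_text.splitlines(), start=1)
        if BANNED.search(line)
    ]

report = (
    "This change maps to the access-review control.\n"
    "Roam certifies compliance with the control.\n"
    "The run ledger supports evidence for the audit-trail control.\n"
)
violations = lint_wording(report)
print(violations)
```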
SLSA Source Track L3
SLSA Source Track L3 is the strongest signed projection from the
evidence compiler. Roam maps to SLSA SRC-L3
requirements by emitting two in-toto v1 statements alongside every code-change scope:
a Code Graph Attestation (CGA) that pins the structural fingerprint of the analysed
tree, and a SLSA Verification Summary Attestation (VSA) that projects the same
ChangeEvidence packet into the SLSA-shaped predicate consumers expect.
Two predicates
Roam emits two attestation predicates from one shared evidence record:
- CGA (Code Graph Attestation): predicateType https://roam-code.com/spec/CodeGraph/v1. Merkle root over symbol fingerprints plus an edge bundle digest. Reproducible: same source tree at the same git HEAD produces the same predicate digest. Emitted by roam cga emit.
- VSA (Verification Summary Attestation): predicateType https://slsa.dev/verification_summary/v1 (SLSA v1.2). Projects ChangeEvidence into the SLSA VSA predicate shape so external verifiers (slsa-verifier, Sigstore, Rekor consumers) ingest Roam evidence without learning the roam-specific CGA predicate. Emitted by roam cga emit --also-vsa as a sibling artifact, or by roam pr-bundle emit --slsa-l3 through the proof-bundle pipeline.
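Both predicates travel inside the same in-toto v1 Statement envelope and differ only in predicateType and predicate body. The sketch below builds two such statements over one subject. The envelope keys follow the in-toto Statement v1 layout; the subject name, the subject bytes, and the placeholder predicate bodies are hypothetical and do not reflect the fields Roam actually emits.

```python
import hashlib
import json

STATEMENT_TYPE = "https://in-toto.io/Statement/v1"
CGA_TYPE = "https://roam-code.com/spec/CodeGraph/v1"
VSA_TYPE = "https://slsa.dev/verification_summary/v1"

def make_statement(subject_name, subject_bytes, predicate_type, predicate):
    """Wrap a predicate in an in-toto v1 Statement with a SHA-256 subject digest."""
    return {
        "_type": STATEMENT_TYPE,
        "subject": [{
            "name": subject_name,
            "digest": {"sha256": hashlib.sha256(subject_bytes).hexdigest()},
        }],
        "predicateType": predicate_type,
        "predicate": predicate,
    }

packet_bytes = b'{"evidence_id":"ev-1"}'   # stand-in for a serialised evidence packet
cga = make_statement("change-evidence", packet_bytes, CGA_TYPE,
                     {"merkle_root": "<placeholder>"})
vsa = make_statement("change-evidence", packet_bytes, VSA_TYPE,
                     {"verificationResult": "PASSED"})
print(json.dumps(vsa, indent=2))
```

Because both statements are derived from the same bytes, their subject digests match, which is what lets a verifier tie the VSA back to the CGA.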
Standalone CGA limitation
A standalone CGA does not reach SRC L3. A CGA is not a Verification
Summary Attestation, so on its own its supply-chain claim falls short of the SRC-L3
requirement set. For the canonical L3 path, pair the CGA with a sibling VSA via
roam cga emit --also-vsa, or run the same wiring through
roam pr-bundle emit --slsa-l3. The VSA is byte-identical across both
entry points because it projects the same shared evidence packet.
CI auto-trigger
roam ci-setup --with-slsa-l3 scaffolds a GitHub Actions workflow at
.github/workflows/roam-slsa-src-l3.yml that auto-triggers
roam pr-bundle emit --slsa-l3 --sign --keyless on every PR. The keyless
cosign path uses Fulcio short-lived certificates plus Rekor transparency-log entries;
the workflow emits the VSA, the run-ledger-root statement, the cosign signature
triplet, and the Rekor record alongside the proof bundle.
Honest banner
Roam maps to SLSA SRC L3 requirements and supports evidence for the L3 claim. Roam itself does not certify that L3 is reached — the user is responsible for the offline verifier step plus Rekor publishing. The wording-lint that ships with the evidence compiler enforces this rule on every generated report: Roam emits the evidence; the verifier asserts the claim.
Why SQLite
- Zero-dep: ships with Python; no client/server, no infrastructure.
- FTS5: full-text search built-in, used for symbol search and retrieval.
- Determinism: same input source → identical SQLite file (modulo timestamps). Diffable, reproducible.
- Portable: roam index-export emits a tarball with manifest SHA-256 + optional cosign signature; index-import verifies before extracting.
- Local: index lives at .roam/index.db in your repo. No source code leaves the machine for the free CLI.
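The FTS5 point can be illustrated with the standard library alone. The sketch below builds a small full-text table and searches it; the `symbol_fts` table and its columns are hypothetical and are not Roam's schema, and the example assumes the local SQLite build includes FTS5 (most Python distributions do).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE symbol_fts USING fts5(qname, docstring)")
conn.executemany(
    "INSERT INTO symbol_fts(qname, docstring) VALUES (?, ?)",
    [
        ("auth.AuthService.login", "Validate credentials and open a session."),
        ("auth.AuthService.logout", "Close the active session."),
        ("billing.Invoice.total", "Sum line items with tax."),
    ],
)

def search(term):
    """Return qualified names whose indexed text matches the term, best match first."""
    rows = conn.execute(
        "SELECT qname FROM symbol_fts WHERE symbol_fts MATCH ? ORDER BY rank",
        (term,),
    )
    return [row[0] for row in rows]

hits = search("session")
print(hits)
```

No server or extension install is involved: the index and the search both live in the single SQLite file.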
See it run
The 5-minute canonical demo —
install → health → preflight → critique → signed
ChangeEvidence packet, end to end. The architecture
above in one sitting.