Governing Multi-Step AI Development Workflows with Mneme HQ
Give every agent in your pipeline — planner, coder, reviewer, deployer — a shared memory of your architectural decisions. One decision store. Every stage enforced.
Each agent in your pipeline starts with zero project memory.
Multi-agent development workflows — where a planner scopes tasks, a coder implements them, and a reviewer validates the output — are increasingly common. But each agent operates independently. The planner doesn't know what the coder knows. The reviewer doesn't know what was decided three sprints ago. Without a shared memory layer, architectural drift compounds at every stage of the pipeline.
The result: the planner proposes approaches that violate architecture rules, the coder implements them, and the reviewer — lacking the decision context — approves code that breaks conventions the whole team agreed on months ago.
Planner
Scopes tasks, picks approaches — no memory of past decisions
Coder
Implements the plan — violates architecture rules it can't see
Reviewer
Evaluates output — approves violations it has no context to catch
Deployer
Ships — architectural drift reaches production
Rule: All DB access via repository layer. Raw SQL in handlers banned (ADR-004).
✗ FAIL decision/search-via-indexed-view
Rule: Search must use the search_index view, not raw table scans.
→ Plan rejected before coder agent is invoked.
No existing orchestration layer enforces project decisions.
| Approach | Limitation | With Mneme HQ |
|---|---|---|
| Agent system prompts | Static; can't cover full decision history; each agent has its own context | Shared decision store queried by any agent at any stage |
| Shared context window | Grows too large; stale; not structured for retrieval | Semantic retrieval — only relevant decisions surfaced per query |
| Reviewer agent | Evaluates code without decision context; can't catch architecture violations | Reviewer queries Mneme HQ — violations flagged with rationale |
| CI checks | Post-pipeline; violations already implemented; expensive to fix | Gate at each stage — planner, coder, reviewer all pre-checked |
A shared architectural governance layer for every agent in the pipeline.
Build a shared decision store
All architectural decisions, constraints, and anti-patterns live in one decisions/ directory — a single source of truth every agent can query.
Gate each pipeline stage
Add a mneme check call before invoking the next agent. If the plan violates decisions, stop before the coder runs. If code violates decisions, stop before the reviewer runs.
Inject decision context per stage
Each agent receives only the decisions relevant to its stage — planners see architecture rules, coders see implementation constraints, reviewers see the full violation report.
Log violations for audit
Every check is logged. You get a full audit trail of which decisions were evaluated, which violations were caught, and at which stage — across every pipeline run.
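The gating and audit steps above can be sketched in a few lines. This is a minimal illustration, not Mneme HQ's implementation: `CheckResult`, `gate`, `fake_check`, and the audit record fields are hypothetical names introduced here, and `fake_check` is a toy stand-in for a real decision check.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    """Hypothetical stand-in for the result of a decision check."""
    stage: str
    violations: List[str] = field(default_factory=list)

    def has_violations(self) -> bool:
        return bool(self.violations)


def gate(stage: str, artifact: str,
         check: Callable[[str, str], CheckResult],
         audit_log: List[Dict]) -> bool:
    """Run the check for one stage, record it, and return True if the pipeline may proceed."""
    result = check(artifact, stage)
    audit_log.append({
        "timestamp": time.time(),
        "stage": stage,
        "violations": list(result.violations),
        "blocked": result.has_violations(),
    })
    return not result.has_violations()


def fake_check(artifact: str, stage: str) -> CheckResult:
    # Toy rule for illustration only: flag any artifact that mentions raw SQL.
    if "raw SQL" in artifact:
        return CheckResult(stage, ["decision/no-direct-db-queries"])
    return CheckResult(stage)


audit_log: List[Dict] = []
plan_ok = gate("plan", "add search via raw SQL in handler", fake_check, audit_log)
if not plan_ok:
    print("Plan blocked:", audit_log[-1]["violations"])
```

Passing the check function in as a parameter keeps the gate independent of how decisions are actually evaluated, so the same wrapper could sit in front of the coder, reviewer, and deployer stages.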
Wiring Mneme HQ into a multi-agent pipeline.
```python
# Before invoking coder agent
plan = planner_agent.run(task)
check = mneme.check(plan, mode="strict", stage="plan")
if check.has_violations():
    # Return violations to planner for revision
    plan = planner_agent.revise(plan, violations=check.violations)

# Before invoking reviewer agent
code = coder_agent.run(plan)
check = mneme.check(code, mode="strict", stage="code")
if check.has_violations():
    # Block reviewer; return to coder with context
    code = coder_agent.fix(code, violations=check.violations)
```
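The snippet above revises once and moves on. One possible extension, sketched here under stated assumptions, is to re-check after every revision and give up after a fixed number of attempts. `gate_with_retries`, `toy_check`, and `toy_revise` are hypothetical helpers written for this example; the toy functions stand in for a real decision check and a real agent.

```python
from typing import Callable, List, Tuple

Check = Callable[[str], List[str]]          # returns violated decision ids
Revise = Callable[[str, List[str]], str]    # returns a revised artifact


def gate_with_retries(artifact: str, check: Check, revise: Revise,
                      max_attempts: int = 3) -> Tuple[str, List[str]]:
    """Re-check after every revision until clean or attempts run out."""
    violations = check(artifact)
    attempts = 0
    while violations and attempts < max_attempts:
        artifact = revise(artifact, violations)
        violations = check(artifact)
        attempts += 1
    return artifact, violations


# Toy stand-ins so the sketch runs without Mneme HQ or a real agent.
def toy_check(text: str) -> List[str]:
    return ["decision/no-direct-db-queries"] if "raw SQL" in text else []


def toy_revise(text: str, violations: List[str]) -> str:
    return text.replace("raw SQL in handler", "the repository layer")


final_plan, remaining = gate_with_retries(
    "add search via raw SQL in handler", toy_check, toy_revise
)
```

If `remaining` is non-empty after the last attempt, the caller can stop the pipeline and escalate to a human rather than invoking the next agent.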
```
$ mneme check "plan: add search via raw SQL in handler" --stage plan
Checking against 14 decisions (stage: plan)...
✗ FAIL decision/no-direct-db-queries
Stage: plan — approach violates repository layer constraint.
✗ FAIL decision/search-via-indexed-view
Stage: plan — raw table scan violates search architecture decision.
✓ PASS decision/search-pagination-required
✓ PASS decision/rate-limit-on-search-endpoints
Pipeline gate: BLOCKED — 2 violations at plan stage.
Coder agent not invoked. Violations returned to planner.
```
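An orchestrator that shells out to the CLI needs to turn output like the sample above into data. The sketch below assumes each result is printed on its own line in the `✓ PASS decision/...` or `✗ FAIL decision/...` form shown in the sample; the real output format may differ between versions, and `parse_check_output` is a hypothetical helper, not part of Mneme HQ.

```python
import re
from typing import Dict, List

# Matches result lines such as "✗ FAIL decision/no-direct-db-queries".
RESULT_LINE = re.compile(r"^\s*(?:✓|✗)\s+(PASS|FAIL)\s+(decision/\S+)")


def parse_check_output(output: str) -> Dict[str, List[str]]:
    """Group decision ids by status; lines that are not results are ignored."""
    results: Dict[str, List[str]] = {"PASS": [], "FAIL": []}
    for line in output.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            status, decision = match.groups()
            results[status].append(decision)
    return results


sample = """Checking against 14 decisions (stage: plan)...
✗ FAIL decision/no-direct-db-queries
✗ FAIL decision/search-via-indexed-view
✓ PASS decision/search-pagination-required
✓ PASS decision/rate-limit-on-search-endpoints
"""

parsed = parse_check_output(sample)
blocked = bool(parsed["FAIL"])
```

Scraping text output is brittle, so the Python API is likely the better fit wherever the orchestrator is already written in Python.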
What teams see after adding decision gates to their pipeline.
Common questions.
Does Mneme HQ work as a Python library or only as a CLI?
Both. Mneme HQ is available as a CLI (mneme check) and a Python API (from mneme import check), so it can be embedded directly into orchestration code: LangGraph, custom pipelines, or any Python-based agent framework.
Can different agents query different subsets of decisions?
Yes. Use --tags to scope checks per stage: for example, --tags architecture for the planner, --tags security,compliance for the coder, and no filter for the reviewer, which then sees everything.
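As a rough sketch of per-stage scoping, the mapping below builds a CLI argument list from the tags named in the answer above. The `review` stage name, the `STAGE_TAGS` table, and `build_check_command` are assumptions made for this example, as is the idea that `--stage` and `--tags` can be combined in a single call.

```python
from typing import Dict, List, Optional

# Tags per stage, following the example in the text; None means "no filter".
STAGE_TAGS: Dict[str, Optional[List[str]]] = {
    "plan": ["architecture"],
    "code": ["security", "compliance"],
    "review": None,  # assumed stage name; the reviewer sees every decision
}


def build_check_command(text: str, stage: str) -> List[str]:
    """Build an argument list suitable for subprocess.run(cmd)."""
    cmd = ["mneme", "check", text, "--stage", stage]
    tags = STAGE_TAGS.get(stage)
    if tags:
        cmd += ["--tags", ",".join(tags)]
    return cmd


plan_cmd = build_check_command("plan: add search via raw SQL in handler", "plan")
review_cmd = build_check_command("diff: search handler", "review")
```

Building the command as a list rather than a single string avoids shell-quoting problems when the plan text contains spaces or quotes.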