Event-sourced memory for agent work
Zaxy turns agent work into durable, auditable memory: a hash-chained append-only log as the source of truth, cited Memory Checkout as the trust contract, salience-based forgetting that attenuates instead of deleting, compaction recovery for long sessions, and coordinated worker missions that merge back into one replayable project history.
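To make the idea of a hash-chained append-only log concrete, here is a minimal sketch. It is an illustration under assumptions, not Zaxy's actual event format: the field names (`prev`, `payload`, `hash`), the all-zero genesis value, and the event types are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: list, payload: dict) -> dict:
    """Append a payload, linking it to the hash of the previous event."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    event = {
        "prev": prev_hash,
        "payload": payload,
        "hash": event_hash(prev_hash, payload),
    }
    log.append(event)
    return event


def verify_chain(log: list) -> bool:
    """Recompute every link; an edited or reordered event breaks the chain."""
    prev_hash = GENESIS
    for event in log:
        if event["prev"] != prev_hash:
            return False
        if event["hash"] != event_hash(prev_hash, event["payload"]):
            return False
        prev_hash = event["hash"]
    return True


log = []
append_event(log, {"type": "session.genesis"})  # hypothetical event type
append_event(log, {"type": "finding.accepted", "text": "token expiry is 15m"})
chain_ok = verify_chain(log)

# Rewrite the first event's payload without recomputing hashes.
tampered = [dict(e) for e in log]
tampered[0] = {**tampered[0], "payload": {"type": "session.forged"}}
tamper_detected = not verify_chain(tampered)
```

Because each hash covers the previous one, changing any earlier event invalidates every later link, which is what makes the log auditable.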
Cognitive memory
Zaxy 2.x implements the memory literature as boring, testable scoring over an immutable log: retrieval strengthens, disuse decays, surprise gates encoding. Forgetting is pure projection policy — attenuated memories leave default ranking but stay one explicit query away, with a replayable record of why they faded. Every default in this list was flipped or held by an evaluation lane that ships in the repo.
salience ranking: Confirmed memories outrank stale ones; invalidated memories attenuate below a floor. Pinned and authority-reviewed memories are exempt. Cold-start ranking is byte-identical to plain retrieval: measured, not asserted.
compaction recovery: After a context compaction, a session-resumed hook hands the agent back its open tasks, accepted findings, and known unknowns, with every line citing the exact log events it came from.
memory_checkout(..., max_tokens=N): Token-budgeted packing with explicit elision reporting and cache-stable packet ordering. Consolidated content renders byte-identically across calls, so provider prompt caching hits.
tool profiles: The default core profile lists 8 tools instead of 47, an 83.5% smaller listing surface, while every tool stays callable by name. `--profile full` restores the complete listing.
memory_feeling_of_knowing: Experimental sub-millisecond metamemory pre-check: would checkout likely return something? Lets agents decide whether retrieval is worth the call before paying for it.
zaxy memory mine-procedures: Recurring successful tool sequences become review-pending procedure candidates with trace citations. Nothing becomes authoritative without review, the same gate that applies to every generated abstraction.
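The salience behavior described above can be sketched as a multiplier on the plain retrieval score. This is a hedged illustration, not Zaxy's scoring code: the `Memory` fields, the `FLOOR` and `HALF_LIFE_DAYS` constants, and the boost formula are all assumptions chosen to show the shape of the policy.

```python
from dataclasses import dataclass

FLOOR = 0.2            # assumed attenuation floor
HALF_LIFE_DAYS = 30.0  # assumed decay half-life for disuse


@dataclass
class Memory:
    id: str
    relevance: float          # plain retrieval score
    confirmations: int = 0    # times the memory was confirmed on use
    idle_days: float = 0.0    # time since last retrieval
    invalidated: bool = False
    pinned: bool = False
    reviewed: bool = False    # authority-reviewed


def salience(m: Memory) -> float:
    """Multiplier applied on top of the plain retrieval score."""
    if m.pinned or m.reviewed:
        return 1.0                # exempt from decay and attenuation
    if m.invalidated:
        return FLOOR / 2          # pushed below the floor, never deleted
    boost = 1.0 + 0.1 * m.confirmations           # retrieval strengthens
    decay = 0.5 ** (m.idle_days / HALF_LIFE_DAYS)  # disuse decays
    return boost * decay


def rank(memories, include_attenuated=False):
    """Sort by relevance * salience; hide sub-floor memories by default."""
    scored = [(m.relevance * salience(m), salience(m), m) for m in memories]
    if not include_attenuated:
        scored = [row for row in scored if row[1] >= FLOOR]
    return [m.id for _, _, m in sorted(scored, key=lambda row: -row[0])]


# Cold start: no confirmations, no idle time, so salience is exactly 1.0.
cold = [Memory("b", 0.8), Memory("a", 0.9), Memory("c", 0.7)]
plain_order = [m.id for m in sorted(cold, key=lambda m: -m.relevance)]
cold_order = rank(cold)

mixed = [
    Memory("fresh", 0.5, confirmations=3),
    Memory("stale", 0.5, idle_days=60),
    Memory("old", 0.9, invalidated=True),
    Memory("pin", 0.4, idle_days=365, pinned=True),
]
default_order = rank(mixed)
explicit_order = rank(mixed, include_attenuated=True)
```

In this sketch forgetting is purely a projection choice: the invalidated memory is absent from `default_order` but reappears in `explicit_order`, and nothing is removed from the underlying data.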
Coordinate
Spawning agents is easy. The hard part is turning isolated investigations into one trustworthy state of work. Zaxy records each worker in its own Eventloom session, reviews findings with evidence, marks stale and conflicting claims, and promotes only accepted facts into the parent mission.
The coordinator owns accepted project history, decisions, handoff, and Memory Checkout state.
Agents investigate in isolated logs, so exploration does not contaminate authoritative memory.
Human or coordinator-agent review accepts, rejects, defers, or promotes findings with cited provenance.
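The review gate between worker findings and parent mission state can be sketched as a small state machine. Everything here is hypothetical and for illustration only: the `Finding` and `Mission` classes, the status strings, and the rule that promotion requires a prior accept with citations are assumptions, not Zaxy's API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ACCEPT = "accepted"
    REJECT = "rejected"
    DEFER = "deferred"
    PROMOTE = "promoted"


@dataclass
class Finding:
    id: str
    worker: str
    text: str
    citations: list = field(default_factory=list)  # event ids backing the claim
    status: str = "pending"


@dataclass
class Mission:
    accepted_state: list = field(default_factory=list)  # parent-visible facts
    review_log: list = field(default_factory=list)      # append-only decisions


def review(mission: Mission, finding: Finding, decision: Decision, reviewer: str) -> None:
    """Record a review decision; only promoted findings reach parent state."""
    if decision in (Decision.ACCEPT, Decision.PROMOTE) and not finding.citations:
        raise ValueError("accepting or promoting requires cited evidence")
    if decision is Decision.PROMOTE and finding.status != Decision.ACCEPT.value:
        raise ValueError("only accepted findings can be promoted")
    mission.review_log.append(
        {"finding": finding.id, "decision": decision.value, "reviewer": reviewer}
    )
    finding.status = decision.value
    if decision is Decision.PROMOTE:
        mission.accepted_state.append(finding.text)


mission = Mission()
cited = Finding("f1", "auth-api", "login fails on expired refresh token",
                ["evt-12", "evt-15"])
uncited = Finding("f2", "auth-api", "maybe a DNS issue")

review(mission, cited, Decision.ACCEPT, "coordinator")
review(mission, cited, Decision.PROMOTE, "coordinator")
review(mission, uncited, Decision.DEFER, "coordinator")
```

The worker's uncited guess stays in the review log as deferred, while the parent's accepted state contains only the cited, promoted finding.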
Architecture
The Eventloom log remains the append-only project record and the source
of truth. The default local runtime is embedded LadybugDB, launched and
cleaned by zaxy init and zaxy doctor. Neo4j remains the sidecar control
backend; pgGraph, LatticeDB, and Pathlight are advanced integration
tracks for teams that need an alternate deployment or observability
posture.
Purpose control plane
Zaxy carries purpose-conditioned checkout through retrieval, diagnostics, feedback, compaction, and Coordinate accepted state. This is still framed as project-local agent work memory, not a broad Company Brain claim.
memory_checkout(..., purpose="coding"): Applies deterministic purpose emphasis, recall floors, scoring profile selection, and checkout guidance.
zaxy memory purpose status: Replays active profile, checkout quality, accepted Coordinate state, and feedback posture without graph mutation.
zaxy memory purpose lanes: Shows purpose-specific checkout lanes, cited source groups, and suppression candidates.
zaxy memory purpose feedback: Surfaces positive and negative outcome history so future retrieval can prioritize useful purpose-specific memory.
Interfaces
coordination_checkout: accepted parent state plus diagnostic worker-local findings
coordination_approval_packet: reviewable accept/reject/defer/promote payloads
memory_checkout: answerability, current_citation_count, salience diagnostics, budget reporting, and memory_feedback guidance
memory_feeling_of_knowing: experimental likely/possible/unlikely pre-check with calibration markers
CoordinationAdapter: dependency-light Python wrapper with LangGraph and CrewAI helper paths
dashboard --enable-coordinate-review: opt-in human review controls over replay-backed state; read-only remains the default
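The budget reporting that memory_checkout exposes can be pictured with a small packing sketch. This is not the real implementation: the `pack` function, the item fields, and the whitespace token counter are stand-in assumptions. It shows only the two properties named in this document, explicit elision reporting and an ordering that does not depend on input order.

```python
def pack(items, max_tokens, count_tokens=lambda text: len(text.split())):
    """Greedily pack items under a token budget and report what was left out.

    count_tokens is a whitespace stand-in; a real tokenizer would differ.
    """
    # Deterministic order: score descending, then id, so the same set of
    # items renders identically no matter how the caller ordered them.
    ordered = sorted(items, key=lambda it: (-it["score"], it["id"]))
    included, elided, used = [], [], 0
    for it in ordered:
        cost = count_tokens(it["text"])
        if used + cost <= max_tokens:
            included.append(it)
            used += cost
        else:
            elided.append(it["id"])  # reported, never silently dropped
    packet = "\n".join(f"[{it['id']}] {it['text']}" for it in included)
    return {"packet": packet, "used_tokens": used, "elided": elided}


items = [
    {"id": "e1", "score": 0.9, "text": "auth tokens expire after fifteen minutes"},
    {"id": "e2", "score": 0.7,
     "text": "refresh endpoint returns 401 on clock skew over thirty seconds"},
    {"id": "e3", "score": 0.5, "text": "retry budget is three"},
]

result = pack(items, max_tokens=10)
shuffled = pack(list(reversed(items)), max_tokens=10)
```

Stable output matters for prompt caching because a provider cache typically keys on an exact prefix match, so a packet that reorders between calls would miss the cache even when its content is unchanged.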
Benchmark evidence
Current public benchmark evidence is intentionally narrow: the headline 500-question LongMemEval-compatible checkout diagnostic and the Harvey LAB external legal-agent memory-ablation report. Older backend shootouts, partial slices, suite gates, and debug reports are archived as development history rather than current claims.
Full ten-task external legal-agent memory-ablation run, +0.184 versus regular/no-memory and 9/10 task wins versus article-best rows.
Full 500-question LongMemEval-compatible checkout diagnostic: mean 0.956, Answer@5 0.910, Recall@5 1.000.
The headline 500 is a Zaxy same-harness checkout run, not an official LongMemEval end-to-end assistant score.
Archived reports remain useful for engineering history, but public benchmark claims now route through the benchmark hub.
The 2.2 ANN engagement rule ships exactly as far as internal lane evidence extends: HNSW engages at 100k+ vectors up to 64 dimensions, where it measured recall 1.0 with 12.9x faster index builds. Above that envelope, exact search remains the recommendation — the negative results ship in the same paper.
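Read as a decision rule, the ANN envelope above reduces to one guard. The sketch below uses the thresholds stated in the text; the function name, the return values, and the choice to treat both bounds as inclusive are assumptions.

```python
HNSW_MIN_VECTORS = 100_000  # "100k+ vectors", from the text
HNSW_MAX_DIMS = 64          # "up to 64 dimensions", from the text


def choose_index(n_vectors: int, dims: int) -> str:
    """Return "hnsw" only inside the measured envelope, else "exact"."""
    if n_vectors >= HNSW_MIN_VECTORS and dims <= HNSW_MAX_DIMS:
        return "hnsw"
    return "exact"


small_corpus = choose_index(20_000, 64)
in_envelope = choose_index(250_000, 32)
high_dims = choose_index(250_000, 768)
```

Defaulting to exact search outside the envelope keeps the rule conservative: approximate search is used only where it was measured.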
These rows are release evidence and disclosure status, not a universal memory leaderboard.
| Artifact | Status | What it supports | What it does not support |
|---|---|---|---|
| Harvey LAB external memory-ablation | complete | Full 10-task legal-agent benchmark evidence: 0.788 mean criterion pass rate, +0.184 vs regular/no-memory, +0.081 vs article-best rows, 9/10 task wins. | Same-harness full-suite scores for non-Zaxy systems beyond the article-published matrix. |
| LongMemEval-compatible checkout 500 | current headline | Same-harness checkout diagnostic: mean 0.956, Answer@5 0.910, Recall@5 1.000, citation coverage 1.000. | Official LongMemEval end-to-end assistant accuracy or external memory-system leaderboard ranking. |
Install
pipx install zaxy-memory
zaxy init
zaxy memory log --eventloom-path .eventloom --limit 5
zaxy memory bootstrap --eventloom-path .eventloom
zaxy doctor --eventloom-path .eventloom
zaxy coordinate start "ship auth refactor" --mission auth-main
zaxy coordinate worker create --mission auth-main --worker auth-api
zaxy coordinate assign --mission auth-main --worker auth-api "trace failures"
zaxy coordinate brief --mission auth-main
zaxy coordinate checkout --mission auth-main
Zaxy writes `.env.local`, records session genesis and heartbeat, checks graph posture, and prints the MCP command or config path.
Session history lives in .eventloom/ as append-only JSONL. The graph is a rebuildable projection.
memory log, memory bootstrap, doctor, and hook-status expose last-checkout, capture, and stale-memory posture.
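The relationship between the append-only JSONL history and a rebuildable projection can be sketched as a replay function. The event shapes (`task.opened`, `task.closed`), the file name, and the projection (a dictionary of open tasks) are hypothetical; the sketch writes to a temporary directory rather than a real .eventloom/ folder.

```python
import json
import tempfile
from pathlib import Path


def append(path: Path, event: dict) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")


def replay(path: Path) -> dict:
    """Rebuild the open-task projection from scratch by reading the whole log."""
    open_tasks = {}
    with path.open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue
            event = json.loads(line)
            if event["type"] == "task.opened":
                # Keep the source line so each projected fact cites its event.
                open_tasks[event["task"]] = {
                    "title": event["title"],
                    "cited_line": line_no,
                }
            elif event["type"] == "task.closed":
                open_tasks.pop(event["task"], None)
    return open_tasks


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "session.jsonl"
    append(log_path, {"type": "task.opened", "task": "t1", "title": "audit login"})
    append(log_path, {"type": "task.opened", "task": "t2", "title": "trace failures"})
    append(log_path, {"type": "task.closed", "task": "t1"})
    first = replay(log_path)
    second = replay(log_path)  # discarding and rebuilding gives the same state
```

Since the projection is a pure function of the log, it can be deleted and regenerated at any time without losing history.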
Documentation