# Matryca Plumber - AI Agent Context & Execution Guide (v1.9.14)

> **SYSTEM DIRECTIVE FOR AI AGENTS:** You operate on the user's **local Logseq OG vault** via Matryca Plumber.  
> **CRITICAL RULE:** DO NOT `git clone` this repository. DO NOT `pip install`. Run the **PyPI** release with **`uvx`** so you use a real, versioned wheel — not a guessed dev tree.

Canonical copy (2026 standard path): `.well-known/llms.txt` (same content as this file).

---

## 0. Graph path (REQUIRED — no `--graph` flag)

Matryca Plumber v1.9.14 does **not** accept `--graph` on the CLI. You **must** point at the vault root (folder containing `pages/` and usually `journals/`) with the environment variable **`LOGSEQ_GRAPH_PATH`**.

**Set once per shell session (copy-paste):**
```bash
export LOGSEQ_GRAPH_PATH="/absolute/path/to/your/logseq/graph"
```

**One-shot (inline env for a single command):**
```bash
LOGSEQ_GRAPH_PATH="/absolute/path/to/your/logseq/graph" uvx matryca-plumber --help
```

If `LOGSEQ_GRAPH_PATH` is unset or invalid, graph commands return an error — fix the path before retrying.
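
A pre-flight check can catch a bad path before any graph command runs. The sketch below is a hypothetical helper (`check_graph_path` is not part of the Plumber CLI) and only tests what this section describes: the variable is set and the folder contains `pages/`. A missing `journals/` folder is reported as a note, since the text says it is usually, not always, present.

```bash
# Hypothetical pre-flight helper; not part of the matryca-plumber CLI.
check_graph_path() {
  local root="${LOGSEQ_GRAPH_PATH:-}"
  if [[ -z "$root" ]]; then
    echo "LOGSEQ_GRAPH_PATH is unset" >&2
    return 1
  fi
  # The vault root is the folder that contains pages/.
  if [[ ! -d "$root/pages" ]]; then
    echo "no pages/ folder under $root" >&2
    return 1
  fi
  # journals/ is usual but not required, so only print a note.
  if [[ ! -d "$root/journals" ]]; then
    echo "note: no journals/ folder under $root" >&2
  fi
  return 0
}
```

One possible use is to chain it in front of a command, for example `check_graph_path && uvx matryca-plumber --help`.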

---

## 1. Execution & Installation (PyPI / uvx)

Matryca Plumber is on PyPI as **`matryca-plumber`**. Use **`uvx`** (uv tool runner); no permanent install required.

**Entrypoint (console script):**
```bash
uvx matryca-plumber <subcommand> ...
```

**Discover commands:**
```bash
uvx matryca-plumber --help
uvx matryca-plumber read --help
```

**Machine-readable JSON:** Global flag **`--json`** must appear **before** the subcommand:
```bash
uvx matryca-plumber --json read page "My Project"
```
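
Because the flag position is easy to get wrong, a small wrapper can pin `--json` in front of the subcommand. This is a sketch only: `plumber_json` and the `PLUMBER_RUNNER` variable are illustrative names invented here (not Plumber settings); the variable exists so the wrapper can be dry-run with `echo` instead of `uvx`.

```bash
# Hypothetical wrapper; PLUMBER_RUNNER is an illustrative variable, not a Plumber setting.
plumber_json() {
  # Split the runner on whitespace so "uvx matryca-plumber" becomes two words.
  local -a runner
  read -r -a runner <<< "${PLUMBER_RUNNER:-uvx matryca-plumber}"
  # --json goes directly after the program name, ahead of the subcommand.
  "${runner[@]}" --json "$@"
}
```

With `PLUMBER_RUNNER=echo`, calling `plumber_json read page "My Project"` prints the argument order that would be passed, which makes the placement of `--json` easy to verify.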

---

## 2. Core CLI (verified v1.9.14)

Subcommands: `read`, `search`, `mutate`, `refactor`, `lint`, `context`, `service`, `plumber`.  
Shorthand daemon/UI verbs (routed to `plumber`): `start`, `stop`, `status`, `ui`, `audit`, `cluster`.

### 2.0 Plumber commands — UI vs daemon (do not confuse)

| Command | Starts | Does **not** start |
|---------|--------|-------------------|
| `status` / `ui` (or `plumber status` / `plumber ui`) | Sovereign UI + API on `http://127.0.0.1:8500` | maintenance daemon |
| `plumber start` | background maintenance daemon | browser / UI server |
| `plumber start --foreground` | foreground daemon (terminal logs) | browser / UI server |
| `plumber stop` | nothing; stops the running daemon | n/a |

**Lazy UI bootstrap (v1.9.10+):** `status` / `ui` bind `:8500` in seconds; the in-memory graph index loads on the first analytics request. **v1.9.11:** settings save, graph-path save, L1 provision, and **Start Engine** also use lazy bootstrap so large vaults do not hit the 10 s UI fetch timeout. Use **Start Engine** in the UI or `plumber start` to run Phase 1/2 maintenance (the daemon subprocess loads the AST eagerly).

**Common mistake:** `plumber start` alone does **not** open the dashboard — run `status` in another terminal or use **Start Engine** after opening the UI.

### 2.1 Extract graph data as JSON (DO NOT grep `.md` files)

Use **`read`** with a **positional** `target_type`, then an optional `query` string.

| `target_type` | `query` | Use when |
|---------------|---------|----------|
| `page` | Logseq page title | Full spatial page context |
| `subtree` | `Page Title\|block-uuid` or JSON | Token-efficient block extract |
| `memory` | (omit) | L1 session memory files |
| `bootstrap_status` | (omit) | Phase 1 semaphore (`bootstrap_complete`, Soft Gate) |
| `dashboard` | (omit) | Vault / daemon overview |
| `block_ast` | `Page Title\|block-uuid` | Single block AST excerpt |
| `xray_page` | Page title | X-Ray aliases `[0]`…`[n]` for mutations |
| `structural_hops` | hop query | Link/tag neighborhood report |

**Command (canonical):**
```bash
export LOGSEQ_GRAPH_PATH="/absolute/path/to/your/logseq/graph"
uvx matryca-plumber --json read page "My Project"
```

**Semantic macro (bundled context):**
```bash
uvx matryca-plumber context load "My Project"
uvx matryca-plumber context load "My Project|aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
```

**Search (examples):**
```bash
uvx matryca-plumber --json search bm25 "redis cache"
uvx matryca-plumber --json search journal_tasks "7"
```

Always **parse JSON from stdout** when using `--json`. Never hand-parse raw `pages/*.md` or `journals/*.md`.

### 2.2 MCP (FastMCP **stdio** — NOT HTTP port 8080)

MCP is a **stdio** sidecar: the host (Cursor, Claude Desktop, **Hermes Agent**, etc.) spawns `matryca-plumber` and talks JSON-RPC over stdin/stdout. There is **no** `mcp --port` flag in the v1.9 line.

**Lazy AST handshake (v1.9.6+):** MCP lifespan defers full-vault AST parsing until the **first graph tool call**. `initialize` + `tools/list` complete in seconds; `read_graph_data` with `target_type=bootstrap_status` or `target_type=memory` does **not** require the AST index.

**Requirements:**
1. `LOGSEQ_GRAPH_PATH` set to the vault root.
2. `MATRYCA_MCP_ENABLED=true` (off by default for safety).

**Host config pattern (Cursor / Claude Desktop):**
```json
{
  "mcpServers": {
    "matryca-logseq": {
      "command": "uvx",
      "args": ["matryca-plumber"],
      "env": {
        "LOGSEQ_GRAPH_PATH": "/absolute/path/to/your/logseq/graph",
        "MATRYCA_MCP_ENABLED": "true"
      }
    }
  }
}
```

With no CLI subcommand and MCP enabled, `uvx matryca-plumber` starts the **stdio** MCP server.  
For interactive graph work without MCP, prefer the CLI in section 2.1.

#### Hermes Agent (`~/.hermes/config.yaml`)

Hermes requires the host MCP client extra: `cd ~/.hermes/hermes-agent && uv pip install -e ".[mcp]"`.

```yaml
mcp_servers:
  matryca-plumber:
    command: uvx
    args: [matryca-plumber]
    env:
      MATRYCA_MCP_ENABLED: "true"
      LOGSEQ_GRAPH_PATH: /absolute/path/to/vault
    enabled: true
    connect_timeout: 120   # handshake (initialize + tools/list) — not vault parse time
    timeout: 300           # per tool call; first graph tool pays AST load on large vaults
```

| Setting | Purpose |
|---------|---------|
| `connect_timeout` | Hermes **handshake** only (`initialize`, `tools/list`). **60–120 s** is enough with lazy AST. |
| `timeout` | Each **tool invocation**. Raise for large vaults on the **first** graph read/search (AST cold start). Rule of thumb: `(pages + journals) × ~0.2 s` on slow mounts — measure once. |
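
The rule of thumb for `timeout` can be worked through as plain arithmetic. The helper below is a hypothetical sketch (`estimate_timeout_seconds` is not a Plumber or Hermes command); it takes the page and journal counts as arguments and applies the ~0.2 s per file figure quoted above, rounded up to whole seconds. The result is a starting point, not a measurement.

```bash
# Hypothetical helper: estimate a first-call timeout from file counts.
# 0.2 s per file is the rule-of-thumb figure from the table, not a measured value.
estimate_timeout_seconds() {
  local pages=$1 journals=$2
  local total=$(( pages + journals ))
  # Integer ceiling of total * 0.2, which equals total / 5 rounded up.
  echo $(( (total + 4) / 5 ))
}
```

For a made-up vault of 1200 pages and 300 journals, `estimate_timeout_seconds 1200 300` prints `300`, which lines up with the `timeout: 300` in the sample config.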

Full guide: `docs/integrations/hermes-agent.md` · stderr telemetry: `AST cache bootstrap started|complete` in `~/.hermes/logs/mcp-stderr.log`.

### 2.3 AX robustness — page titles & write targets (v1.9.7+)

Local LLMs often send **wrong page title formats**. Plumber normalizes before lookup:

| You may send | Plumber accepts |
|--------------|-----------------|
| `Domain/Topic` (semantic) | Canonical Logseq title |
| `Domain___Topic` or `Domain___Topic.md` | Same (namespace encoding) |
| `pages/Domain___Topic.md` | Strips prefix/suffix |
| Wrong casing `DOMAIN/topic` | Case-insensitive match |

**Never send** path traversal (`../`, `../../etc/passwd`) — rejected with a clear error; MCP session survives.
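
An agent can also screen titles on its own side before sending them. The guard below is a minimal sketch under the assumption that a `..` path segment is the pattern to avoid; `is_safe_title` is a hypothetical name, and Plumber's own rejection still applies whether or not a check like this runs first.

```bash
# Hypothetical client-side guard; Plumber performs its own rejection regardless.
is_safe_title() {
  local title=$1
  # An empty title cannot name a page.
  [[ -n "$title" ]] || return 1
  # Wrap in slashes so a ".." segment matches at the start, middle, or end.
  case "/$title/" in
    */../*) return 1 ;;
  esac
  return 0
}
```

Titles such as `Domain/Topic` or `pages/Domain___Topic.md` pass this check, while `../../etc/passwd` does not.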

**`mutate_graph` / `write_outline` targets:**

| `target` | When to use |
|----------|-------------|
| `parent-block-uuid` | After `xray_page` or spatial read |
| `[n]` | X-Ray alias from `.matryca_xray_state.json` |
| `Page Title\|block-uuid` or `Page Title\|[n]` | **Recommended** when the model might hallucinate UUIDs |

If the block ref is invalid but the **page exists**, Plumber **safe-appends** at page bottom and returns `warnings` (check them). Bare unknown aliases like `[42]` without a page still fail with `ok: false`.
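
The recommended page-scoped form can be composed with a small helper so that a page title always accompanies the block reference. This is a sketch: `make_target` is a hypothetical name, and the check assumes block UUIDs follow the 8-4-4-4-12 hexadecimal shape used in this guide's examples.

```bash
# Hypothetical helper for composing a page-scoped write target.
make_target() {
  local page=$1 ref=$2
  # Assumed UUID shape: 8-4-4-4-12 hex digits, as in the examples in this guide.
  local uuid_re='^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$'
  # X-Ray alias shape: a non-negative integer in square brackets, e.g. [3].
  local alias_re='^\[[0-9]+\]$'
  if [[ -z "$page" ]]; then
    echo "page title is required" >&2
    return 1
  fi
  if [[ "$ref" =~ $uuid_re || "$ref" =~ $alias_re ]]; then
    printf '%s|%s\n' "$page" "$ref"
  else
    echo "ref must be a block UUID or an X-Ray alias like [3]: $ref" >&2
    return 1
  fi
}
```

The output can be passed as the `--target` value of a `mutate write_outline` call. Keeping the page title in the target is what lets Plumber fall back to a safe append when the block reference turns out to be wrong.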

Spec: `docs/openspec/agent-ax-robustness.md`

### 2.4 Security & Sandbox (v1.9.9+)

Graph reads and writes stay inside **`LOGSEQ_GRAPH_PATH`**. Hardening by release:

* **v1.9.9:** defense-in-depth beyond MCP title normalization.
* **v1.9.13 (Enterprise Resilience):** hardened parsing and RAG boundaries: TOCTOU-safe bounded JSON reads, `templates_subdir` traversal rejection, namespace-aware semantic cache keys, subtree heading fences (token-efficient excerpts), and string-aware LLM JSON recovery so braces inside string values do not truncate payloads.
* **v1.9.14:** journal-aware Phase 2 clustering (daily notes no longer inflate `[unclustered]` or cluster-focus prompts); entity-consolidation LLM turns are skipped for journal/date wikilink pairs.

| Control | Operator note |
|---------|----------------|
| Path sandbox | `read_graph_file_text()` on graph paths; `../` and symlink escape → `PathTraversalSecurityError` |
| Link registry | Tampered `.matryca_link_registry.json` paths are rejected before read |
| JSON size cap | `MATRYCA_JSON_MAX_BYTES` (default 64 MiB) on catalog/registry/daemon/cache loaders |
| UI token | Set `MATRYCA_UI_TOKEN` on shared hosts; `.env.example` templates `MATRYCA_UI_REQUIRE_EXPLICIT_TOKEN=true` |
| Debug NDJSON | `MATRYCA_LLM_DEBUG_LOG_PATH` must lie under allowed roots; secrets redacted when enabled |

**Do not** rely on raw filesystem reads of `pages/` — use Plumber tools (section 2.1). Full matrix: `SECURITY.md` · spec: `docs/openspec/security-sandbox.md`

### 2.5 Diagnostics & audit (no `doctor` command)

There is **no** `doctor` subcommand. Use these instead:

| Goal | Command |
|------|---------|
| Bootstrap / graph insights dashboard | `uvx matryca-plumber plumber audit` or `uvx matryca-plumber audit` |
| Semantic cluster neighborhoods | `uvx matryca-plumber plumber cluster` or `uvx matryca-plumber cluster` |
| Sovereign UI + pre-flight checks | `uvx matryca-plumber status` → UI on `http://127.0.0.1:8500` (no daemon until **Start Engine** or `plumber start`) |
| Block-reference lint | `uvx matryca-plumber lint block_refs` |
| Maintenance daemon (headless) | `uvx matryca-plumber plumber start` / `plumber stop` (no browser; pair with `status` for the cockpit) |

**JSON audit output:**
```bash
export LOGSEQ_GRAPH_PATH="/absolute/path/to/your/logseq/graph"
uvx matryca-plumber --json plumber audit
```

---

## 3. Zero-shot examples (small LLMs — copy exactly)

**A. Help + env check**
```bash
export LOGSEQ_GRAPH_PATH="$HOME/Logseq/graphs/MyGraph"
uvx matryca-plumber --help
```

**B. Read one page as JSON**
```bash
export LOGSEQ_GRAPH_PATH="$HOME/Logseq/graphs/MyGraph"
uvx matryca-plumber --json read page "My Project"
```

**C. Recent journal tasks (last 7 days) as JSON**
```bash
export LOGSEQ_GRAPH_PATH="$HOME/Logseq/graphs/MyGraph"
uvx matryca-plumber --json search journal_tasks "7"
```

**D. Run graph audit after user reports errors**
```bash
export LOGSEQ_GRAPH_PATH="$HOME/Logseq/graphs/MyGraph"
uvx matryca-plumber --json plumber audit
```

**E. Load bundled agent context (markdown stdout)**
```bash
export LOGSEQ_GRAPH_PATH="$HOME/Logseq/graphs/MyGraph"
uvx matryca-plumber context load "My Project"
```

**F. Write outline with page context (safe fallback if UUID wrong)**
```bash
export LOGSEQ_GRAPH_PATH="$HOME/Logseq/graphs/MyGraph"
uvx matryca-plumber --json mutate write_outline \
  --target "My Project|aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee" \
  --payload '{"text":"New bullet","children":[]}'
```

To read **today's journal page** with `read page`, pass the **exact Logseq page title** shown in the app (locale-dependent), not a file path like `journals/2026_06_05.md`.

---

## 4. Anti-patterns (DO NOT DO THIS)

* **DO NOT** `grep`, `find`, or write Python to parse `pages/` / `journals/` — use `uvx matryca-plumber read` / `search` / `context load`.
* **DO NOT** `git clone https://github.com/MarcoPorcellato/matryca-plumber` to "use" the tool — use **`uvx matryca-plumber`** from PyPI.
* **DO NOT** `pip install matryca-plumber` unless the user explicitly asked for a global install.
* **DO NOT** invent flags: no `--graph`, no `doctor`, no `mcp --port 8080`.
* **DO NOT** ask the user to install Python dependencies manually — **`uvx`** resolves the wheel.
* **DO NOT** hand-craft `pages/*.md` filenames with raw `/` — pass **semantic titles** (`Domain/Topic`); Plumber maps to `Domain___Topic.md`.
* **DO NOT** panic on `warnings` in mutate JSON — read them; Plumber may have recovered via safe append.

---

## 5. Further reading (humans & agents)

* Operator README: `README.md`
* Agent onboarding spec: `docs/openspec/agent-onboarding.md`
* **AX robustness spec:** `docs/openspec/agent-ax-robustness.md`
* LLM OS contract (two-tier, Soft Gate, Safe-Sync): `SYSTEM_PROMPT.md` § "LLM OS"
* Agent DX spec (CLI JSON, Journey Log — one cumulative `- 🤖 Matryca Activity` bullet per day in the daemon journal): `docs/openspec/agent-dx.md`
* Security (MCP gate, graph sandbox, bounded JSON, CLI redaction): `SECURITY.md` · `docs/openspec/security-sandbox.md`
* Hermes Agent MCP (lazy handshake, timeouts): `docs/integrations/hermes-agent.md`

---

## 6. LLM OS — two-tier architecture (MANDATORY for graph work)

Matryca Plumber implements a **dual-LLM** system. You are almost certainly **Tier 2 (Cognitive Agent)** — not the background Gardener.

| Tier | Role | Your action |
|------|------|-------------|
| **Tier 1 — Gardener** | Daemon Phase 1 harvest (`plumber start`) | **NEVER impersonate.** Do not run harvest, rewrite `### Matryca Semantic Index` blocks, or edit `master_catalog.json`. |
| **Tier 2 — Cognitive Agent** | You (MCP / CLI) | **MUST** follow the Master Index Soft Gate and Safe-Sync rules in [`SYSTEM_PROMPT.md`](SYSTEM_PROMPT.md) § "LLM OS". |

**Before any `search_graph` or targeted `read_graph_data` on L2 wiki pages:**

1. `read_graph_data` / `target_type=memory` — load L1 session rules.
2. `read_graph_data` / `target_type=bootstrap_status` — check Phase 1 semaphore.
3. `read_graph_data` / `target_type=page` / `query=Matryca Master Index` — scan the compiled catalog.
4. If the index is **missing**, **empty**, or Phase 1 is **in progress** → **pause** and present the user with 3 options (Local Daemon / Blind Search / Cloud Indexing). **WAIT** for explicit authorization before Blind Search or Cloud Indexing. **NEVER** guess page titles or `grep pages/` without authorization.
5. When the gate is green (or the user authorizes Option B, Blind Search), pinpoint the exact `[[Page Title]]` from the index, then call narrow reads (`page`, `subtree`, `xray_page`).

**CLI equivalent:**
```bash
uvx matryca-plumber --json read bootstrap_status
uvx matryca-plumber --json read page "Matryca Master Index"
```
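
The semaphore check in step 2 can be sketched as a shell predicate over the captured JSON. This is an illustration under a loud assumption: it treats `bootstrap_complete` as a JSON boolean somewhere in the `bootstrap_status` output, whose exact shape is not specified in this guide. `gate_is_green` is a hypothetical name, and the regex match is deliberately crude; a real JSON parser would be more robust.

```bash
# Hypothetical helper: decide whether the Phase 1 semaphore looks green.
# Assumes bootstrap_complete appears as a JSON boolean; the output shape is not documented here.
gate_is_green() {
  local status_json=$1
  local re='"bootstrap_complete"[[:space:]]*:[[:space:]]*true'
  [[ "$status_json" =~ $re ]]
}

# Possible usage (commented out so the sketch has no side effects):
# status=$(uvx matryca-plumber --json read bootstrap_status)
# gate_is_green "$status" || echo "Gate not green: pause and ask the user"
```

This covers only the semaphore. A missing or empty Master Index (step 4) still has to be checked separately before any narrow read.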

**Safe-Sync (summary):** READ only via Matryca tools on `pages/` + `journals/` under `LOGSEQ_GRAPH_PATH`. **NEVER** open Logseq's internal app database. WRITE only via `mutate_graph`, `refactor_blocks`, `ingest_document`, `store_fact` (atomic `.md` + OCC). Full contract: `SYSTEM_PROMPT.md`.
