You are iteration 5 of 10 in a /spec_kit:deep-review hunt for legacy residue from the local-LLM and embedding-default migration shipped in packet 014-local-embeddings-setup-a.

POST-014 CANONICAL DEFAULTS (treat as ground truth):
- Memory MCP: provider=hf-local, model=onnx-community/embeddinggemma-300m-ONNX, dtype=q8, dims=768
- CocoIndex: provider=sentence-transformers, model=google/embeddinggemma-300m, dtype=bf16, dims=768
- Provider resolution: EMBEDDINGS_PROVIDER=auto falls through to hf-local when no API keys are present
- Optional: EMBEDDINGS_PROVIDER=llama-cpp (explicit opt-in) with synchronous auto-migrate on startup
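
The resolution rule above can be sketched roughly as follows. This is an illustrative Python model only, not the real resolver in shared/embeddings/factory.ts. The function name `resolve_provider` is hypothetical, and the cloud-key order (Voyage before OpenAI) is an assumption; the canonical defaults only state that `auto` lands on hf-local when no keys are set and that llama-cpp is chosen only when requested explicitly.

```python
from typing import Mapping


def resolve_provider(env: Mapping[str, str]) -> str:
    """Hypothetical model of post-014 provider resolution (not the real factory)."""
    requested = (env.get("EMBEDDINGS_PROVIDER") or "").strip().lower() or "auto"

    # Any explicit value wins, including the opt-in llama-cpp provider.
    if requested != "auto":
        return requested

    # Assumed order of the cloud fallbacks kept in the factory chain.
    if env.get("VOYAGE_API_KEY"):
        return "voyage"
    if env.get("OPENAI_API_KEY"):
        return "openai"

    # No API keys present: auto falls through to the local default.
    return "hf-local"
```

A doc or config note that claims `auto` resolves to anything other than hf-local in the no-key case contradicts this model and is a candidate finding.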

PURGED FROM DEFAULTS (residue if asserted as active default):
- Voyage AI (voyageai, voyage-3, voyage-4, voyage-code-3) — kept as fallback in factory chain but NO LONGER the recommended/primary
- Qwen3 (qwen3, qwen-3, qwen-32k) — registry kept for opt-in, PURGED FROM DEFAULT DOCS
- MiniLM-384d — stale CocoIndex sqlite deleted in 004; no active code path
- Legacy generic context-index.sqlite (pre-filename-keyed) — deleted in 007
- ONNX runtime backend — REJECTED 2026-05-13 (014/014 sub-phase reverted)

SCOPE SURFACES (read-only):
1. Code .ts/.py/.cjs under: shared/, .opencode/skills/, scripts/, mcp_server/, cocoindex_code/
2. Markdown .md/SKILL.md/README/INSTALL_GUIDE under: .opencode/skills/**, .opencode/install_guides/, repo root
3. JSON/configs: description.json, graph-metadata.json (per packet), package.json, .utcp_config.json, .claude/mcp.json, root .mcp.json, opencode.json, _routes.yaml, .codex/config.toml, .gemini/settings.json, pyproject.toml, requirements*.txt, .opencode/settings*.json, .claude/settings*.json
4. Assets/templates/fixtures: assets/config_templates.md, prompt packs, test fixtures, frozen sample text
5. References: .opencode/skills/**/references/**

KNOWN STALE ANCHORS (must surface as P1 by iter 1-3):
- .opencode/skills/system-spec-kit/mcp_server/ENV_REFERENCE.md — "prefers voyage-4 then openai then hf-local"
- .opencode/skills/system-spec-kit/references/memory/embedding_resilience.md — 1024-dim cache key references
- .opencode/skills/mcp-coco-index/INSTALL_GUIDE.md — voyage-code-3 as primary
- .opencode/skills/mcp-coco-index/references/settings_reference.md — voyage as default
- .opencode/install_guides/README.md — text-embedding-3-small default export

INTENTIONAL HISTORICAL CONTEXT (do NOT flag as residue):
- .opencode/specs/system-spec-kit/026-graph-and-context-optimization/014-local-embeddings-setup-a/** migration narrative
- shared/embeddings/factory.ts provider fallback chain — LIVING resolver (falls through to hf-local when no key)
- .opencode/commands/doctor/{_routes.yaml,doctor_memory.yaml} provider-detection
- vitest fixtures exercising Voyage/OpenAI fallback paths — regression safety
- Code-graph or spec-memory rows with old strings (data, not source)
- This packet (021-local-llm-legacy-review/**) — review artifacts, not target

REVIEW DIMENSIONS (categorize each finding under exactly one):
- correctness: dead/unreachable code, incorrect defaults asserted in code, config-drift between committed configs
- traceability: stale docs/READMEs/SKILL.md/INSTALL_GUIDE/references claiming outdated defaults
- maintainability: fixture rot, asset rot, orphaned templates, legacy prompt-pack residue

SEVERITY LEVELS:
- P0 (Blocker): default-asserting code/doc that BREAKS post-014 behavior or actively misleads users into wrong setup
- P1 (Required): visible/user-facing doc or config asserting a different default than canonical post-014
- P2 (Suggestion): commented-out residue, obsolete example, redundant fixture
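
One way to keep findings inside the allowed dimension and severity values is to validate them before rendering a table row. The sketch below is a hypothetical helper, not part of the spec-kit tooling. It assumes the seven table columns required later in this prompt and escapes literal pipes as `\|`, as the prior-iteration rows do.

```python
from dataclasses import dataclass

SEVERITIES = ("P0", "P1", "P2")
DIMENSIONS = ("correctness", "traceability", "maintainability")


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for one row of the Findings table."""
    finding_id: str
    severity: str
    dimension: str
    location: str        # "path/to/file:N"
    evidence: str        # quoted snippet, without surrounding quotes
    disposition: str
    recommendation: str

    def __post_init__(self) -> None:
        # Reject values outside the vocabularies defined in this prompt.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")

    def to_row(self) -> str:
        cells = [
            self.finding_id, self.severity, self.dimension, self.location,
            f'"{self.evidence}"', self.disposition, self.recommendation,
        ]
        # Escape pipes so evidence text cannot break the markdown table.
        escaped = [cell.replace("|", "\\|") for cell in cells]
        return "| " + " | ".join(escaped) + " |"
```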

BANNED OPERATIONS (RM-8 Layer 1):
- NO rm, NO rm -rf, NO git rm, NO mv, NO sed -i, NO rmdir
- NO writes outside the iteration file at the exact path given below
- NO commits, NO branch creation
- READ-ONLY against repo except your iteration file
If you detect a need to mutate, record it as a "scope_violation" finding instead of executing.

YOUR TASK FOR THIS ITERATION (#5):
1. Focus dimension for this iter: traceability
2. Use rg/grep + Read across the scope surfaces. Discriminate residue from intentional historical context.
3. Produce EXACTLY ONE file: /Users/michelkerkmeester/MEGA/Development/Code_Environment/Public/.opencode/specs/system-spec-kit/026-graph-and-context-optimization/014-local-embeddings-setup-a/021-local-llm-legacy-review/review/iterations/iteration-005.md
4. The file must be markdown with this structure:

# Iteration 005 — Local-LLM Legacy Hunt

## Focus
[one paragraph: what you scanned this iteration and why]

## Findings

| ID | Severity | Dimension | File:Line | Evidence (quote) | Disposition | Recommendation |
|----|----------|-----------|-----------|------------------|-------------|----------------|
| L-005-001 | P1 | traceability | path/to/file:N | "quoted snippet" | confirmed-residue | [short fix recommendation] |
| ... | ... | ... | ... | ... | ... | ... |

## Iteration summary
- Files scanned: N
- New findings: N (P0=N, P1=N, P2=N)
- Out-of-scope/historical noted but NOT flagged: N
- Notes: [anything for the synthesizer]

CONSTRAINTS:
- 5–15 NEW findings per iter (avoid duplicating prior iterations — see prior findings below)
- Each finding MUST have a real file:line evidence quote (no hallucinated paths)
- If you cannot find new genuine residue, output fewer findings + note "saturation" in iteration summary
- Skip files inside /Users/michelkerkmeester/MEGA/Development/Code_Environment/Public/.opencode/specs/system-spec-kit/026-graph-and-context-optimization/014-local-embeddings-setup-a/021-local-llm-legacy-review (this review packet itself)
- Skip files inside .git/, node_modules/, __pycache__/, .venv/, dist/, build/, _sandbox/
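
A read-only scan that honors the skip rules above might look like the sketch below. It is a minimal illustration, not an existing script in the repo. The `scan` function, the `RESIDUE` pattern (a small subset of the purged strings) and the suffix list are all assumptions. It only reads files, and every hit is just a candidate that still has to be checked against the intentional-historical-context list before it is flagged.

```python
import os
import re
from pathlib import Path

# Directory names excluded by the constraints above.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv", "dist", "build", "_sandbox"}

# Illustrative subset of legacy strings; extend as needed.
RESIDUE = re.compile(
    r"voyage-(?:3|4|code-3)|all-MiniLM-L6-v2|qwen-?3|context-index\.sqlite",
    re.IGNORECASE,
)


def scan(root, skip_paths=(), suffixes=(".md", ".json", ".toml", ".yaml", ".ts", ".py", ".cjs", ".txt")):
    """Return (path, line_number, stripped_line) for each candidate hit under root."""
    skip_abs = {os.path.abspath(p) for p in skip_paths}  # e.g. this review packet
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [
            d for d in dirnames
            if d not in SKIP_DIRS
            and os.path.abspath(os.path.join(dirpath, d)) not in skip_abs
        ]
        for name in filenames:
            if not name.endswith(suffixes):
                continue
            path = Path(dirpath) / name
            try:
                text = path.read_text(encoding="utf-8", errors="replace")
            except OSError:
                continue  # unreadable file: skip it, never attempt to fix it
            for lineno, line in enumerate(text.splitlines(), start=1):
                if RESIDUE.search(line):
                    hits.append((str(path), lineno, line.strip()))
    return hits
```

Passing the review-packet directory through `skip_paths` keeps the packet's own artifacts out of the results.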

WRITE ONLY /Users/michelkerkmeester/MEGA/Development/Code_Environment/Public/.opencode/specs/system-spec-kit/026-graph-and-context-optimization/014-local-embeddings-setup-a/021-local-llm-legacy-review/review/iterations/iteration-005.md. DO NOT WRITE ANYWHERE ELSE.


PRIOR ITERATIONS FINDINGS (avoid duplicate flags — these are already covered):

--- iteration-001.md ---
| L-001-001 | P0 | correctness | .opencode/skills/system-spec-kit/shared/embeddings/factory.ts:819 | "const llamaCppAvailability = getLlamaCppAvailability();" | confirmed-residue | Remove `llama-cpp` from the implicit `auto` no-key path; only select it when `EMBEDDINGS_PROVIDER=llama-cpp` is explicit. |
| L-001-002 | P0 | correctness | .opencode/skills/system-spec-kit/shared/embeddings/factory.ts:822 | "name: 'llama-cpp'," | confirmed-residue | Change the no-key default branch to return `hf-local` with `onnx-community/embeddinggemma-300m-ONNX`, `q8`, 768 dims. |
| L-001-003 | P1 | correctness | .codex/config.toml:23 | "_NOTE_3_EMBEDDINGS_PROVIDER = \"Options: auto \| llama-cpp \| hf-local \| voyage \| openai. Default 'auto' resolves to llama-cpp; set EMBEDDINGS_PROVIDER=hf-local to opt out.\"" | confirmed-residue | Update Codex config notes so `auto` resolves to `hf-local` without API keys and `llama-cpp` is explicit opt-in. |
| L-001-004 | P1 | correctness | .claude/mcp.json:19 | "\"_NOTE_3_EMBEDDINGS_PROVIDER\": \"Options: auto \| llama-cpp \| hf-local \| voyage \| openai. Default 'auto' resolves to llama-cpp; set EMBEDDINGS_PROVIDER=hf-local to opt out.\"," | confirmed-residue | Align Claude MCP config note with post-014 provider resolution and remove the opt-out framing. |
| L-001-005 | P1 | correctness | opencode.json:29 | "\"_NOTE_3_EMBEDDINGS_PROVIDER\": \"Options: auto \| llama-cpp \| hf-local \| voyage \| openai. Default 'auto' resolves to llama-cpp; set EMBEDDINGS_PROVIDER=hf-local to opt out.\"," | confirmed-residue | Align OpenCode config note with `auto -> hf-local` for no-key startup. |
| L-001-006 | P1 | correctness | .gemini/settings.json:36 | "\"_NOTE_3_EMBEDDINGS_PROVIDER\": \"Options: auto \| llama-cpp \| hf-local \| voyage \| openai. Default 'auto' resolves to llama-cpp; set EMBEDDINGS_PROVIDER=hf-local to opt out.\"," | confirmed-residue | Align Gemini config note with `auto -> hf-local` and make `llama-cpp` opt-in only. |
| L-001-007 | P1 | traceability | .opencode/skills/system-spec-kit/mcp_server/ENV_REFERENCE.md:445 | "In `EMBEDDINGS_PROVIDER=auto`, the runtime prefers Voyage `voyage-4` (1024 dims) when `VOYAGE_API_KEY` is present, then OpenAI `text-embedding-3-small` (1536 dims) when `OPENAI_API_KEY` is present" | confirmed-residue | Rewrite the embedding default reference around the canonical hf-local default; keep cloud providers as explicit/optional fallback details. |
| L-001-008 | P1 | traceability | .opencode/skills/system-spec-kit/references/memory/embedding_resilience.md:23 | "\| Voyage Provider \| `shared/embeddings/providers/voyage.ts` \| Primary embedding provider \|" | confirmed-residue | Change the architecture table to name `hf-local` as the default provider and Voyage as optional/fallback. |
| L-001-009 | P1 | traceability | .opencode/skills/system-spec-kit/references/memory/embedding_resilience.md:48 | "│ 1. VOYAGE AI (Primary)                                            │" | confirmed-residue | Replace the fallback-order diagram with the post-014 resolver order and remove Voyage-as-primary language. |
| L-001-010 | P1 | traceability | .opencode/skills/mcp-coco-index/INSTALL_GUIDE.md:101 | "\| **Embedding** \| Configurable (default: all-MiniLM-L6-v2 local, recommended: Voyage Code 3) \|" | confirmed-residue | Update CocoIndex install guide to `sentence-transformers` + `google/embeddinggemma-300m`, `bf16`, 768 dims as the default. |
| L-001-011 | P1 | traceability | .opencode/skills/mcp-coco-index/references/settings_reference.md:43 | "\| `embedding.model`    \| string        \| `sentence-transformers/all-MiniLM-L6-v2`   \| Model identifier passed to the provider          \|" | confirmed-residue | Replace MiniLM defaults with `google/embeddinggemma-300m` and 768-dimensional expectations. |
| L-001-012 | P1 | traceability | .opencode/skills/mcp-coco-index/SKILL.md:271 | "\| `voyage/voyage-code-3` (primary) \| Cloud via LiteLLM \| 1024 \| `VOYAGE_API_KEY` required \| Higher quality code search \|" | confirmed-residue | Remove `primary` from Voyage Code 3 and document it only as optional LiteLLM/cloud configuration. |
| L-001-013 | P1 | maintainability | .opencode/skills/mcp-coco-index/assets/config_templates.md:75 | "\"_NOTE_2\": \"Default embedding: all-MiniLM-L6-v2 (local, no API key needed)\"," | confirmed-residue | Refresh config templates so generated client configs do not reintroduce the MiniLM default. |
| L-001-014 | P1 | traceability | .opencode/install_guides/README.md:671 | "export VOYAGE_EMBEDDINGS_MODEL=voyage-3.5  # Default" | confirmed-residue | Replace the stale Voyage default example with canonical hf-local defaults; keep Voyage only as optional explicit provider setup. |
| L-001-015 | P1 | traceability | .opencode/install_guides/README.md:678 | "export HF_EMBEDDINGS_MODEL=nomic-ai/nomic-embed-text-v1.5  # Default" | confirmed-residue | Update HF Local example to `onnx-community/embeddinggemma-300m-ONNX`, `q8`, 768 dims. |

--- iteration-002.md ---
| L-002-001 | P1 | traceability | README.md:139 | "# Option A: Voyage AI (recommended - best quality)" | confirmed-residue | Update the root quick start to lead with hf-local EmbeddingGemma defaults and move Voyage to optional explicit setup. |
| L-002-002 | P1 | traceability | README.md:517 | "- **Voyage AI** - Set `VOYAGE_API_KEY` env var. Best quality, recommended." | confirmed-residue | Remove Voyage-as-recommended wording from the embedding provider feature list. |
| L-002-003 | P1 | traceability | README.md:828 | "Semantic code search via vector embeddings (Voyage Code 3 and All-MiniLM-L6-v2 models)" | confirmed-residue | Replace the CocoIndex model summary with `google/embeddinggemma-300m` / sentence-transformers / bf16 / 768 dims. |
| L-002-004 | P1 | traceability | .opencode/skills/system-spec-kit/mcp_server/README.md:46 | "The default `auto` cascade is cloud key when configured, then `llama-cpp` when the local GGUF runtime is available, then `hf-local` as the local fallback." | confirmed-residue | Rewrite provider resolution docs so `auto` falls through to hf-local when no API keys are present; document `llama-cpp` as explicit opt-in only. |
| L-002-005 | P1 | traceability | .opencode/skills/system-spec-kit/mcp_server/README.md:52 | "`EMBEDDINGS_PROVIDER=auto` resolves to `llama-cpp` after cloud providers." | confirmed-residue | Remove the auto-to-llama-cpp claim and describe synchronous migration only for explicit `EMBEDDINGS_PROVIDER=llama-cpp`. |
| L-002-006 | P1 | traceability | .opencode/skills/system-spec-kit/README.md:361 | "Compares meaning via embeddings (Voyage AI 1024d)" | confirmed-residue | Update the hybrid-search channel docs to name hf-local EmbeddingGemma q8 768d as the Memory MCP default. |
| L-002-007 | P1 | traceability | .opencode/skills/system-spec-kit/README.md:691 | "Recommended. Best retrieval quality. Requires `VOYAGE_API_KEY`" | confirmed-residue | Remove the recommendation from the provider table and mark Voyage as optional/fallback. |
| L-002-008 | P1 | traceability | .opencode/skills/system-spec-kit/shared/README.md:340 | "`HF_EMBEDDINGS_MODEL` ... `nomic-ai/nomic-embed-text-v1.5`" | confirmed-residue | Replace the stale HF model default with `onnx-community/embeddinggemma-300m-ONNX`, q8, 768 dims. |
| L-002-009 | P1 | traceability | .opencode/skills/system-spec-kit/shared/README.md:345 | "Auto-detection: Voyage if `VOYAGE_API_KEY` exists (recommended)" | confirmed-residue | Align provider precedence with post-014 behavior and remove the Voyage recommendation. |
| L-002-010 | P1 | traceability | .opencode/skills/mcp-coco-index/README.md:76 | "`sentence-transformers/all-MiniLM-L6-v2` (local, no API key)" | confirmed-residue | Change the documented default model to `google/embeddinggemma-300m` with 768-dimensional expectations. |
| L-002-011 | P1 | traceability | .opencode/skills/mcp-coco-index/README.md:77 | "`voyage/voyage-code-3` via LiteLLM (1024-dim, requires `VOYAGE_API_KEY`)" | confirmed-residue | Remove `voyage-code-3` as the primary model and present it only as optional LiteLLM configuration if retained. |
| L-002-012 | P1 | traceability | .opencode/skills/mcp-coco-index/INSTALL_GUIDE.md:437 | "**Primary (recommended):** `voyage/voyage-code-3` via LiteLLM provider -- best code search quality." | confirmed-residue | Rewrite the CocoIndex model section so sentence-transformers + `google/embeddinggemma-300m` is primary/default. |

--- iteration-003.md ---
| L-003-001 | P1 | maintainability | .opencode/skills/mcp-coco-index/assets/config_templates.md:140 | `"_NOTE_2": "Default embedding: all-MiniLM-L6-v2 (local, no API key needed)",` | confirmed-residue | Update the Claude MCP config template to advertise `google/embeddinggemma-300m`, bf16, 768 dims as the CocoIndex default. |
| L-003-002 | P1 | maintainability | .opencode/skills/mcp-coco-index/assets/config_templates.md:160 | `_NOTE_2 = "Default embedding: all-MiniLM-L6-v2 (local, no API key needed)"` | confirmed-residue | Update the Codex config template note to the post-014 EmbeddingGemma default. |
| L-003-003 | P1 | maintainability | .opencode/skills/mcp-coco-index/INSTALL_GUIDE.md:354 | `"_NOTE_2_EMBEDDING": "Default: all-MiniLM-L6-v2 (local, no API key needed)",` | confirmed-residue | Refresh the OpenCode install-guide config sample so copied configs do not reintroduce MiniLM. |
| L-003-004 | P1 | maintainability | .opencode/skills/mcp-coco-index/INSTALL_GUIDE.md:409 | `"_NOTE_2_EMBEDDING": "Default: all-MiniLM-L6-v2 (local, no API key needed)",` | confirmed-residue | Refresh the Claude install-guide config sample to the canonical CocoIndex EmbeddingGemma default. |
| L-003-005 | P1 | maintainability | .opencode/skills/mcp-coco-index/INSTALL_GUIDE.md:429 | `_NOTE_2 = "Default embedding: all-MiniLM-L6-v2 (local, no API key needed)"` | confirmed-residue | Refresh the Codex install-guide config sample to the canonical CocoIndex EmbeddingGemma default. |
| L-003-006 | P2 | maintainability | .opencode/skills/mcp-coco-index/manual_testing_playbook/manual_testing_playbook.md:301 | `embedding.model matches a documented model such as the default local sentence-transformers/all-MiniLM-L6-v2 or a LiteLLM model like voyage/voyage-code-3` | confirmed-residue | Regenerate CFG-001 playbook text so manual verification checks for `google/embeddinggemma-300m` instead of accepting stale defaults. |
| L-003-007 | P2 | maintainability | .opencode/skills/mcp-coco-index/manual_testing_playbook/03--configuration/001-default-model-verification.md:42 | `embedding.model matches a documented model such as sentence-transformers/all-MiniLM-L6-v2 or voyage/voyage-code-3` | confirmed-residue | Regenerate the per-scenario CFG-001 file with the post-014 default model and dimensions. |
| L-003-008 | P2 | maintainability | .opencode/skills/mcp-coco-index/feature_catalog/03--indexing-pipeline/04-embedding-provider-selection.md:23 | `Default user settings choose sentence-transformers/all-MiniLM-L6-v2.` | confirmed-residue | Regenerate the feature-catalog current-reality section to name `sentence-transformers` + `google/embeddinggemma-300m`. |
| L-003-009 | P2 | maintainability | .opencode/skills/mcp-coco-index/tests/test_settings.py:50 | `assert "all-MiniLM-L6-v2" in s.embedding.model` | confirmed-residue | Update the default settings test to assert `google/embeddinggemma-300m`. |
| L-003-010 | P2 | maintainability | .opencode/skills/mcp-coco-index/tests/test_config.py:49 | `assert "all-MiniLM-L6-v2" in config.embedding_model` | confirmed-residue | Update the env-derived config test to assert the canonical EmbeddingGemma model. |
| L-003-011 | P2 | maintainability | .opencode/skills/system-spec-kit/mcp_server/tests/fixtures/sample-memories.json:77 | `"content": "Vector embeddings use 384 dimensions for semantic search",` | confirmed-residue | Update fixture content to 768 dimensions or make it model-neutral so tests do not preserve MiniLM assumptions. |
| L-003-012 | P2 | maintainability | .opencode/skills/system-spec-kit/mcp_server/tests/fixtures/similarity-test-cases.json:139 | `"embeddingModel": "all-MiniLM-L6-v2",` | confirmed-residue | Update similarity fixture metadata to EmbeddingGemma and 768 dimensions, or remove model-specific metadata if not required. |

--- iteration-004.md ---
| L-004-001 | P1 | correctness | .opencode/skills/system-spec-kit/shared/embeddings.ts:868 | `export const DEFAULT_MODEL_NAME: string = 'nomic-ai/nomic-embed-text-v1.5';` | confirmed-residue | Change the legacy facade fallback to `onnx-community/embeddinggemma-300m-ONNX` with q8/768 expectations, or remove the fallback if factory profiles are now the only source of truth. |
| L-004-002 | P1 | correctness | .opencode/skills/system-spec-kit/shared/embeddings/profile.ts:79 | "return \`${baseDir}/context-index.sqlite\`;" | confirmed-residue | Remove or quarantine the legacy generic sqlite filename branch so all active Memory MCP profiles use filename-keyed `context-index__<profile>.sqlite` paths. |
| L-004-003 | P1 | correctness | .opencode/skills/system-spec-kit/scripts/evals/map-ground-truth-ids.ts:34 | `const DB_PATH = path.join(DB_DIR, 'context-index.sqlite');` | confirmed-residue | Resolve the active Memory MCP database from the current embedding profile instead of hardcoding the deleted generic sqlite filename. |
| L-004-004 | P1 | correctness | .opencode/skills/system-spec-kit/scripts/evals/run-ablation.ts:50 | `const PROD_DB_PATH = path.join(DB_DIR, 'context-index.sqlite');` | confirmed-residue | Use the profile-derived post-014 sqlite path for ablation runs so evaluation reads the active hf-local EmbeddingGemma store. |
| L-004-005 | P1 | correctness | .opencode/skills/system-spec-kit/scripts/evals/run-bm25-baseline.ts:39 | `const PROD_DB_PATH = path.join(DB_DIR, 'context-index.sqlite');` | confirmed-residue | Replace the hardcoded generic database path with the active profile database resolver. |
| L-004-006 | P1 | correctness | .opencode/skills/system-spec-kit/mcp_server/lib/eval/memory-state-baseline.ts:65 | `return path.resolve(DEFAULT_DB_DIR, CONTEXT_DB_FILENAME);` | confirmed-residue | Stop defaulting baseline capture to `context-index.sqlite`; derive the current Memory MCP profile path unless an explicit override is supplied. |
| L-004-007 | P1 | correctness | .opencode/skills/system-spec-kit/scripts/memory/cleanup-index-scope-violations.ts:77 | `new URL('../../../mcp_server/database/context-index__voyage__voyage-4__1024.sqlite', import.meta.url),` | confirmed-residue | Make the cleanup script target the active profile database or require an explicit `--db` path; do not hardcode the old Voyage 1024 store. |
| L-004-008 | P1 | correctness | .codex/config.toml:21 | `_NOTE_1_DATABASE = "Default DB: context-index__llama-cpp__unsloth-embeddinggemma-300m-gguf__768__q8.sqlite (auto-derived from provider+model+dim+dtype; hf-local ONNX q8 is the fallback)."` | confirmed-residue | Update the Codex MCP config note so the default database is the hf-local EmbeddingGemma ONNX q8 profile; keep llama-cpp as explicit opt-in only. |
| L-004-009 | P1 | correctness | .claude/mcp.json:17 | `"_NOTE_1_DATABASE": "Default DB: context-index__llama-cpp__unsloth-embeddinggemma-300m-gguf__768__q8.sqlite (auto-derived from provider+model+dim+dtype; hf-local ONNX q8 is the fallback).",` | confirmed-residue | Align the Claude MCP default database note with post-014 hf-local defaults and remove llama-cpp-as-default wording. |
| L-004-010 | P1 | correctness | opencode.json:27 | `"_NOTE_1_DATABASE": "Default DB: context-index__llama-cpp__unsloth-embeddinggemma-300m-gguf__768__q8.sqlite (auto-derived from provider+model+dim+dtype; hf-local ONNX q8 is the fallback).",` | confirmed-residue | Align the OpenCode MCP config note with the hf-local EmbeddingGemma default database profile. |
| L-004-011 | P1 | correctness | .gemini/settings.json:34 | `"_NOTE_1_DATABASE": "Default DB: context-index__llama-cpp__unsloth-embeddinggemma-300m-gguf__768__q8.sqlite (auto-derived from provider+model+dim+dtype; hf-local ONNX q8 is the fallback).",` | confirmed-residue | Align the Gemini MCP config note with the hf-local EmbeddingGemma default database profile. |

