Pattern families → OrchestKit surfaces
Green = strong implementation · amber = partial · red = missing. Per-family deep dives follow below.
flowchart LR
  classDef good fill:#1d3325,stroke:#3fb950,color:#e6edf3
  classDef warn fill:#332a14,stroke:#d29922,color:#e6edf3
  classDef bad fill:#3a1d1d,stroke:#f85149,color:#e6edf3
  classDef fam fill:#2a1d3a,stroke:#d2a8ff,color:#e6edf3,font-weight:bold
  F1["Self-correction loops<br/>6.2/10"]:::fam
  F2["Verifier sub-agents<br/>7.0/10"]:::fam
  F3["Memory outer loop<br/>4.4/10"]:::fam
  F1 --> A1["prd-to-goal 9/10<br/>PRD → /goal assertions"]:::good
  F1 --> A2["brainstorm iterative-opt 8.5/10<br/>autoresearch metric loop"]:::good
  F1 --> A3["cover heal loop 7/10<br/>ci-sentinel 7/10"]:::warn
  F1 --> A4["rubric-file convention<br/>grader-in-loop · CMA Outcomes"]:::bad
  F2 --> B1["verify 6-7 fork graders<br/>composite GATES merge"]:::good
  F2 --> B2["adversarial refutation<br/>blind refuters + ledger + quorum"]:::good
  F2 --> B3["assess advisory-only<br/>no dimension blockers"]:::warn
  F2 --> B4["machine-readable<br/>rubric contract"]:::bad
  F3 --> C1["investigate 7/10<br/>dream · staleness cron · lint"]:::good
  F3 --> C2["consult 4/10<br/>only 2/6 auto-inject"]:::warn
  F3 --> C3["VERIFY loop never closes<br/>distill fragmented · journal unbuilt"]:::bad
Strongest: independent-context verification
verify/review-pr/assess already do what Lance recommends: grading in fork contexts, blind adversarial refuters, citation re-verification, anti-sycophancy. The bones of "Outcomes" exist.
Partial: loops exist but each is bespoke
prd-to-goal is a reference rubric-as-assertions implementation, but every loop (cover ×3, brainstorm stuck-5, ci-sentinel hourly) hard-codes its own convergence logic. There is no shared rubric file and no grader auditing /goal assertions.
Weakest: memory stops at step 2-3
Per Lance's taxonomy, OrchestKit memory behaves like "Opus 4.7": good investigation, low verification coverage, fragmented distillation (recent-decisions.md corrupted), mostly-manual consultation. The Fable 5 differentiator, closing fail → … → consult, is the open prize.
Family 1: Self-correction loops · 6.2/10
"Let Claude run, collect feedback via the goal or rubric, self-correct, and proceed until satisfied."
flowchart LR
  classDef have fill:#1d3325,stroke:#3fb950,color:#e6edf3
  classDef miss fill:#3a1d1d,stroke:#f85149,color:#e6edf3,stroke-dasharray:5 4
  R["Rubric / Goal"]:::have --> RUN["Run task"]:::have
  RUN --> FB["Collect feedback<br/>tests · metrics · CI"]:::have
  FB --> SC["Self-correct"]:::have
  SC --> RUN
  FB --> DONE{"rubric<br/>satisfied?"}:::have
  DONE -- yes --> STOP["stop"]:::have
  RF["rubric.json file contract<br/>(user-editable, cross-skill)"]:::miss -. missing .-> R
  GR["grader audits assertions<br/>on /goal timeout"]:::miss -. missing .-> DONE
  CMA["CMA Outcomes integration"]:::miss -. missing .-> STOP
| Surface | Loop mechanism | Score | Missing |
|---|---|---|---|
| prd-to-goal | PRD → /goal until observable assertions | 9.0 | assertion-quality grader |
| brainstorm iterative-opt | baseline → ideate → measure → keep/discard → stuck-5 | 8.5 | rubric file; underused |
| audit-full | /goal until findings.critical == 0 | 8.0 | configurable severity ladder |
| ScheduleWakeup | cache-aware poll → decide → reschedule | 7.5 | no rubric, imperative only |
| cover (healer) | generate → run → heal failures ×3 | 7.0 | hard-coded 3; no test-quality rubric |
| ci-sentinel | hourly classify-failing-PRs, propose-don't-apply | 7.0 | no auto-learning from approved fixes |
| verify / assess | rubrics exist but grade once; no loop | 5.0 | no "score < target → improve → re-run" |
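The missing "score < target → improve → re-run" step for verify / assess can be pictured with a minimal sketch. Everything here is hypothetical: `run` and `improve` stand in for whatever a skill uses to grade and to apply fixes, and `max_rounds` mirrors the hard-coded ×3 budget that cover uses. It illustrates the pattern only and is not a proposed OrchestKit primitive.

```python
from typing import Callable, Tuple


def loop_until_rubric(
    run: Callable[[], float],           # hypothetical: executes the task, returns a rubric score
    improve: Callable[[float], None],   # hypothetical: applies fixes based on the last score
    target: float,
    max_rounds: int = 3,
) -> Tuple[float, int]:
    """Re-run until the score reaches `target` or the round budget is spent."""
    score = run()
    rounds = 1
    while score < target and rounds < max_rounds:
        improve(score)
        score = run()
        rounds += 1
    return score, rounds


# Usage with canned scores standing in for a real grader.
canned = iter([4.0, 5.5, 7.2])
final_score, rounds_used = loop_until_rubric(
    run=lambda: next(canned),
    improve=lambda last: None,
    target=7.0,
    max_rounds=5,
)
```

The round budget and the target are the only two convergence knobs, which is the part each existing loop currently hard-codes on its own.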
Family 2: Verifier sub-agents · 7.0/10
"A verifier sub-agent tends to outperform self-critique, because grading is done in an independent context window." The gap is authority: who can actually block stop?
flowchart TB
  classDef good fill:#1d3325,stroke:#3fb950,color:#e6edf3
  classDef warn fill:#332a14,stroke:#d29922,color:#e6edf3
  classDef bad fill:#3a1d1d,stroke:#f85149,color:#e6edf3
  W["Producer (lead context)"] --> V1["verify: 6-7 fork graders<br/>8 dims, composite under 6.0 BLOCKS"]:::good
  W --> V2["review-pr: blind refuters<br/>quorum + ledger + no-auto-flip"]:::good
  W --> V3["assess: refuters but<br/>ADVISORY ONLY"]:::warn
  W --> V4["bare-eval: fresh-context judge<br/>primitive, no verdict surface"]:::warn
  V1 --> GATE{"stop-gate"}
  V2 --> GATE
  V3 -. no authority .-> GATE
  GATE --> X1["no dimension-level blockers<br/>(security 3.2 hides in passing composite)"]:::bad
  GATE --> X2["no Outcomes-style<br/>grader-must-approve-stop"]:::bad
| Surface | Independent context | Refutation | Stop-gating |
|---|---|---|---|
| verify | ✅ fork ×6-7 | - | YES: composite gates merge |
| review-pr | ✅ fork, no team | ✅ blind, quorum | YES: CRITICAL → request-changes |
| assess | ✅ fork ×4 | ✅ Phase 2.5 | NO: advisory only |
| cover | ✅ worktree/tier | - | implicit (flag after 3 heals) |
| bare-eval / eval-runner | ✅ --bare fresh | - | NO: pipeline primitive |
| brainstorm devil's advocate | ✅ agents | partial | NO: ranking only |
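To make the "security 3.2 hides in passing composite" problem concrete, here is a small sketch of a stop-gate with and without per-dimension blockers. The 6.0 composite threshold and the 3.2 security score come from the diagram above; the other dimension names, their scores, and the equal weights are made up for illustration, and `stop_gate` is a hypothetical helper rather than verify's actual implementation.

```python
from typing import Dict, List, Optional, Tuple


def stop_gate(
    scores: Dict[str, float],
    weights: Dict[str, float],
    min_composite: float = 6.0,
    blockers: Optional[Dict[str, float]] = None,  # hypothetical per-dimension floors
) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons). Blocks on a low composite or any dimension below its floor."""
    composite = sum(scores[d] * w for d, w in weights.items()) / sum(weights.values())
    reasons = []
    if composite < min_composite:
        reasons.append(f"composite {composite:.2f} < {min_composite}")
    for dim, floor in (blockers or {}).items():
        if scores[dim] < floor:
            reasons.append(f"{dim} {scores[dim]} < {floor}")
    return (not reasons, reasons)


# Illustrative scores: three healthy dimensions mask one weak one.
scores = {"correctness": 8.0, "tests": 7.5, "maintainability": 7.0, "security": 3.2}
weights = {name: 1.0 for name in scores}

print(stop_gate(scores, weights))                              # -> (True, [])
print(stop_gate(scores, weights, blockers={"security": 4.0}))  # -> (False, ['security 3.2 < 4.0'])
```

With equal weights the composite lands a little above 6.4, so a composite-only gate lets the change through. The floor on security is what turns the same scores into a block.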
Family 3: Memory progression · 4.4/10
Fable 5 completes fail → investigate → verify → distill → consult (verification coverage up to 73%). OrchestKit currently exits around step 2-3, the "Opus 4.7" profile.
flowchart LR
  classDef s3 fill:#3a1d1d,stroke:#f85149,color:#e6edf3
  classDef s7 fill:#1d3325,stroke:#3fb950,color:#e6edf3
  classDef s5 fill:#332a14,stroke:#d29922,color:#e6edf3
  classDef s4 fill:#332a14,stroke:#d29922,color:#e6edf3
  FAIL["Step 1 · FAIL · 3/10<br/>remember skill, handoffs<br/>no 'ignored-advice' tracking"]:::s3
  INV["Step 2 · INVESTIGATE · 7/10<br/>dream refs, staleness cron,<br/>memory-lint, validator"]:::s7
  VER["Step 3 · VERIFY · 5/10<br/>staleness classified BUT<br/>reports never consumed"]:::s5
  DIS["Step 4 · DISTILL · 3/10<br/>dream dedup exists BUT<br/>recent-decisions.md corrupted,<br/>experiment journal unbuilt"]:::s3
  CON["Step 5 · CONSULT · 4/10<br/>2/6 auto-inject; brainstorm<br/>prints a reminder, not results"]:::s4
  FAIL --> INV --> VER --> DIS --> CON
  CON -. outer loop never closes .-> FAIL
Gap M1: VERIFY never closes the loop
Nightly staleness reports have zero consumers; no signal records whether a consulted (or ignored) memory changed an outcome. This is the exact step Lance identifies as the Fable 5 differentiator.
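One way to picture the missing signal is an append-only log with one record per memory entry per run, noting whether the entry was consulted and how the run ended. This is a sketch under assumptions: the JSONL format, the field names, and both helper functions are hypothetical and do not describe anything OrchestKit ships.

```python
import json
import time
from pathlib import Path
from typing import List


def record_consult_outcome(log_path: str, memory_id: str, consulted: bool, outcome: str) -> dict:
    """Append one consult->outcome signal as a JSON line (hypothetical format)."""
    event = {"ts": time.time(), "memory_id": memory_id,
             "consulted": consulted, "outcome": outcome}
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event) + "\n")
    return event


def ignored_then_failed(log_path: str) -> List[str]:
    """Memory ids that were NOT consulted in runs that went on to fail."""
    ids = []
    for line in Path(log_path).read_text(encoding="utf-8").splitlines():
        event = json.loads(line)
        if not event["consulted"] and event["outcome"] == "fail":
            ids.append(event["memory_id"])
    return ids
```

A query like `ignored_then_failed` is the kind of consumer the staleness reports currently lack: it turns "advice was available but ignored" into something a later run can act on.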
Gap M2: DISTILL fragmented + corrupted index
recent-decisions.md (the distilled-rules index) contains mangled truncated entries and has no owning write mechanism. Brainstorm's experiment journal is referenced in phase-workflow.md but the TSV was never implemented.
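For a sense of scale, a TSV experiment journal can be very small. The sketch below is an assumption throughout: the column names are guessed from brainstorm's baseline → ideate → measure → keep/discard loop, and neither they nor `append_journal_row` come from phase-workflow.md.

```python
import csv
from pathlib import Path

# Hypothetical columns, inferred from the ideate -> measure -> keep/discard loop.
JOURNAL_FIELDS = ["iteration", "idea", "metric", "decision"]


def append_journal_row(path: str, row: dict) -> None:
    """Append one experiment to the TSV journal, writing the header on first use."""
    journal = Path(path)
    is_new = not journal.exists() or journal.stat().st_size == 0
    with journal.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=JOURNAL_FIELDS, delimiter="\t")
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

Append-only writes with a fixed header give the journal a single owning write path, which is the property recent-decisions.md is described as lacking.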
Gap M3: CONSULT is manual
priorDecisionsLoader injects "IMPORTANT: search memory" instead of running the search and injecting results. Only session-handoff + memory-lint auto-inject.
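The difference between injecting a reminder and injecting results can be sketched as follows. This is a Python illustration, not the real priorDecisionsLoader: `relevance` uses plain word overlap as a stand-in for whatever memory search exists, and `build_injection`, `min_relevance`, and `limit` are hypothetical names.

```python
from typing import List


def relevance(query: str, text: str) -> float:
    """Jaccard overlap of lowercase word sets; a stand-in for real memory search."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q and t else 0.0


def build_injection(query: str, decisions: List[str],
                    min_relevance: float = 0.2, limit: int = 3) -> str:
    """Return text to inject: the top matching decisions, or nothing if none clear the gate."""
    scored = sorted(((relevance(query, d), d) for d in decisions), reverse=True)
    hits = [d for score, d in scored if score >= min_relevance][:limit]
    if not hits:
        return ""  # relevance gate: inject nothing rather than a generic reminder
    return "Prior decisions relevant to this task:\n" + "\n".join(f"- {d}" for d in hits)


decisions = [
    "use /workflows, don't wrap",
    "security below 4.0 always blocks merge",
]
print(build_injection("security blocks merge", decisions))
```

Returning an empty string when nothing clears the threshold is the relevance gate: the loader stays silent instead of adding noise to every prompt.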
Gap heatmap: ranked by (impact × cross-family reach) / effort
Note how the rubric-file contract and stop-gating grader each appear in two families: one fix, double coverage.
| # | Gap | Family | Severity | Effort | Fix sketch |
|---|---|---|---|---|---|
| 1 | No machine-readable rubric contract | F1 + F2 | HIGH | 3-4 d | ork-rubric/1.0 schema; backfill verify/assess/review-pr/prd-to-goal |
| 2 | assess advisory-only; no stop-gating | F2 | HIGH | 2-3 d | emit assess-verdict.json; gate /ork:implement at <5.5 |
| 3 | Memory VERIFY loop never closes | F3 | HIGH | 1 wk | consume staleness reports in CI; record consult→outcome signals |
| 4 | recent-decisions.md corrupted; journal unbuilt | F3 | HIGH | 2-3 d | dream owns regeneration; implement brainstorm TSV journal |
| 5 | No grader-in-the-loop for /goal | F1 | MED | 2 d | post-timeout grader audits assertion quality |
| 6 | verify lacks dimension-level blockers | F2 | MED | 1-2 d | security <4.0 always blocks regardless of composite |
| 7 | CONSULT manual (4/6 surfaces) | F3 | MED | 2-3 d | priorDecisionsLoader runs search, injects results (relevance-gated) |
| 8 | No cross-skill loop primitive | F1 | MED | ? | likely a chain-patterns reference, NOT a skill ("/workflows: use don't wrap") |
| 9 | No CMA / Outcomes integration | F1 + F2 | LOW* | n/a | watch: hosted product surface, plugin can't reach it yet |
* LOW for the plugin today; HIGH strategically if CMA exposes an API the plugin can target.
Recommended sequencing
A → B → (C reassessed). Approach A closes the #1 gap of two families with one contract.
flowchart TB
  classDef rec fill:#2a1d3a,stroke:#d2a8ff,color:#e6edf3,font-weight:bold
  classDef nor fill:#161b22,stroke:#79c0ff,color:#e6edf3
  classDef opt fill:#161b22,stroke:#8b949e,color:#8b949e,stroke-dasharray:5 4
  A["A: Rubric contract + stop-gating grader · 8.3<br/>ork-rubric/1.0 schema · assess gates implement<br/>verify dimension blockers · /goal assertion grader<br/>≈ 1 week"]:::rec
  B["B: Close the memory VERIFY loop · 7.8<br/>consume staleness reports · consult→outcome signals<br/>fix recent-decisions.md · experiment journal TSV<br/>auto-inject prior decisions · ≈ 1 week"]:::nor
  C["C: loop-until-rubric primitive / CMA parity · 6.5<br/>devil's advocate: '/workflows: use, don't wrap'<br/>after A this may reduce to a documented pattern"]:::opt
  A -->|"rubric schema unlocks<br/>verification instrumentation"| B
  A -->|"grader gate makes C<br/>mostly free"| C
  B --> DONE["OrchestKit completes the progression<br/>= 'Fable 5 memory profile' per Lance's taxonomy"]
  C -. reassess .-> DONE
A: Rubric contract + gating · composite 8.3 · RECOMMENDED
One JSON schema (dimensions, weights, min_pass, min_blocker, bands) shared by verify/assess/review-pr/prd-to-goal. assess emits a verdict file that gates /ork:implement; verify gains dimension blockers; /goal gains a post-timeout assertion grader.
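A rubric file with those five fields might look like the sketch below. Only the field names (dimensions, weights, min_pass, min_blocker, bands) and the ork-rubric/1.0 label come from the text; the `schema` key, the value shapes, the example dimensions, and the validation rules are all assumptions made for illustration.

```python
import json

# Hypothetical rubric document; value shapes are assumed, not specified by the source.
RUBRIC_JSON = """
{
  "schema": "ork-rubric/1.0",
  "dimensions": ["correctness", "security", "tests"],
  "weights": {"correctness": 0.5, "security": 0.3, "tests": 0.2},
  "min_pass": 6.0,
  "min_blocker": {"security": 4.0},
  "bands": [[8.0, "strong"], [6.0, "pass"], [0.0, "fail"]]
}
"""


def load_rubric(text: str) -> dict:
    """Parse a rubric and reject internally inconsistent ones."""
    rubric = json.loads(text)
    dims = set(rubric["dimensions"])
    if set(rubric["weights"]) != dims:
        raise ValueError("weights must cover exactly the declared dimensions")
    if abs(sum(rubric["weights"].values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if not set(rubric["min_blocker"]) <= dims:
        raise ValueError("min_blocker refers to an undeclared dimension")
    return rubric


def band_for(rubric: dict, composite: float) -> str:
    """Map a composite score to its band label; bands are listed from highest floor down."""
    for floor, label in rubric["bands"]:
        if composite >= floor:
            return label
    return rubric["bands"][-1][1]


rubric = load_rubric(RUBRIC_JSON)
```

Validating on load is what makes one file safe to share across verify, assess, review-pr, and prd-to-goal: a malformed rubric fails once, up front, instead of differently in each skill.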
B: Memory verify loop · composite 7.8
Targets the weakest family and the Fable-5-specific differentiator. Weekly job diffs staleness reports + warns on PRs touching stale-memory files; consult events recorded; dream regenerates recent-decisions.md.
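The weekly job reduces to two set operations, sketched here under the assumption that a staleness report can be read as a list of stale memory file paths. The report shape, the example paths, and both function names are hypothetical.

```python
from typing import Iterable, List


def newly_stale(previous: Iterable[str], current: Iterable[str]) -> List[str]:
    """Paths flagged stale in the current report but not in the previous one."""
    return sorted(set(current) - set(previous))


def pr_warnings(stale_files: Iterable[str], changed_files: Iterable[str]) -> List[str]:
    """One warning per file that a PR touches and the report flags as stale."""
    touched = sorted(set(stale_files) & set(changed_files))
    return [f"warning: {path} is flagged stale; re-verify before relying on it"
            for path in touched]


# Illustrative paths only.
last_week = ["memory/api-limits.md"]
this_week = ["memory/api-limits.md", "memory/deploy-steps.md"]
print(newly_stale(last_week, this_week))  # -> ['memory/deploy-steps.md']
print(pr_warnings(this_week, ["memory/deploy-steps.md", "src/app.py"]))
```

Both functions are pure, so the same logic could run in a scheduled job and in a PR check without sharing state.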
C: Loop primitive · composite 6.5 · DEFER
A generic loop-until-rubric harness risks wrapping what CC-native /goal + Workflow already provide (prior decision: "use /workflows, don't wrap"). Build A first; C likely becomes a chain-patterns reference.
Goal Builder: turn this analysis into a /goal line
Pick scope and effort; copy the generated kickoff prompt + /goal line into Claude Code.