OrchestKit 8.12.3 — Post-Ship Verification & Hardening

Independent verification and assessment of the 5 PRs shipped to main @ 0d787d196, followed by 4 bundled low-severity follow-up fixes. Interactive playground for the fix/post-ship-hardening PR.

Verification verdict

Ran a fresh, independent build, drift check, typecheck, and skill/agent validation on a clean main worktree (rather than taking CI's word), then adversarially verified each handoff claim against the merged diffs. Result: 7/9 claims confirmed, 2 partial (both low-severity), 0 refuted. plugins/ is drift-free and all suites are green. The ship holds.

Quality assessment (/ork:assess dimensions)

Correctness: 9/10
Root-cause discipline: 9/10
Security / hygiene: 9/10
Preview-reversal call: 8/10

Claim-by-claim

Claim | Verdict | Note
#2118 doctor counts 111 skills / 37 agents | confirmed | shared/ & README excluded → 111 / 37
#2118 model bump → Opus 4.8 (5 spots) | confirmed | no @ts-ignore / silencing
#2115 dep majors + TS6 drop w/ RCA | confirmed | esbuild 0.28 · @types/node 24 · lucide 1.x
#2116 marketplace engine floor >=2.1.148 | confirmed | no leftover 2.1.113
#2118 AskUserQuestion preview-strip | confirmed | 92 keys removed, 0 remain
#5 ci-report "//0" = div-by-zero? | confirmed | jq default-operator, not division
#2116 override label = admin-bypass? (adversarial) | confirmed | NO bypass: sanctioned label, all required checks green
#2118 retired preview assertion = silencing? (adversarial) | partial | not silencing, but no replacement guard (this PR adds it)
#2117 docs floor + counts (hand-authored) | partial | headline fix holds; stragglers in adjacent pages
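
The "//0" row rests on how jq reads `//`: it is the alternative operator, which substitutes the right-hand side when the left-hand side is null or false, and no division takes place. A rough Python analogue for a single value is sketched below; the function name `jq_alternative` is hypothetical and only approximates jq's stream semantics.

```python
def jq_alternative(value, default):
    """Approximate jq's `value // default` for a single value."""
    # jq treats only null and false as "missing"; 0 and "" pass through unchanged.
    if value is None or value is False:
        return default
    return value


# `.count // 0` with a missing count falls back to 0 instead of dividing.
print(jq_alternative(None, 0))
print(jq_alternative(7, 0))
```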

The 4 bundled fixes

#1 Doc straggler floor + count root-cause

Root cause: the docs generator counted every directory (including src/skills/shared/, which has no SKILL.md) and globbed README.md as an agent, publishing 112 skills / 38 agents and emitting a bogus reference/agents/README.mdx. Separately, a hand-authored troubleshooting page still cited a stale CC floor.

scripts/_build-docs-generate.py
- skill_dirs = sorted(d for d in skills_dir.iterdir() if d.is_dir())
+ skill_dirs = sorted(d for d in skills_dir.iterdir()
+     if d.is_dir() and (d / "SKILL.md").exists())   # 112 → 111
- agent_files = sorted(agents_dir.glob("*.md"))
+ agent_files = sorted(f for f in agents_dir.glob("*.md")
+     if f.stem.lower() != "readme")                 # 38 → 37, drops README.mdx
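
The effect of the two filters can be reproduced on a throwaway directory tree. This is a minimal sketch, not the generator itself: the helper names (`count_skills`, `count_agents`, `build_fixture`) and the fixture layout are assumptions made for illustration, while the two filter expressions mirror the diff.

```python
import tempfile
from pathlib import Path


def count_skills(skills_dir: Path) -> int:
    # Only directories that actually ship a SKILL.md count as skills.
    return len([d for d in skills_dir.iterdir()
                if d.is_dir() and (d / "SKILL.md").exists()])


def count_agents(agents_dir: Path) -> int:
    # README.md lives next to the agents but is not an agent.
    return len([f for f in agents_dir.glob("*.md")
                if f.stem.lower() != "readme"])


def build_fixture(root: Path):
    """Create 2 real skills plus shared/, and 2 agents plus README.md."""
    skills, agents = root / "skills", root / "agents"
    for name in ("alpha", "beta"):
        (skills / name).mkdir(parents=True)
        (skills / name / "SKILL.md").write_text("# skill\n")
    (skills / "shared").mkdir()  # helper dir, no SKILL.md
    agents.mkdir()
    for name in ("planner.md", "reviewer.md", "README.md"):
        (agents / name).write_text("# doc\n")
    return skills, agents


with tempfile.TemporaryDirectory() as tmp:
    skills_dir, agents_dir = build_fixture(Path(tmp))
    naive_skills = sum(1 for d in skills_dir.iterdir() if d.is_dir())
    naive_agents = len(list(agents_dir.glob("*.md")))
    print("skills:", naive_skills, "->", count_skills(skills_dir))
    print("agents:", naive_agents, "->", count_agents(agents_dir))
```

On this fixture the naive counts are each one too high, the same off-by-one shape as 112 → 111 and 38 → 37.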

docs/site/content/docs/troubleshooting/index.mdx
- requires >= 2.1.34
+ requires >= 2.1.148

Deferred (not a number-swap): skills/reference-skills.mdx lists 160 distinct slugs (only 16 are real reference skills) while claiming "61" — deep content rot, filed as a separate audit issue rather than papered over here.

#2 AskUserQuestion preview regression guard

Root cause: #2118 stripped all preview fields (CC side-by-side nav bug, 2026-05-28) but downgraded the test to an unconditional pass and left preview in the permitted-key set — a future skill could silently re-add it. This adds a real guard + a re-enable tripwire.

tests/skills/structure/test-askuserquestion-schema.sh
- VALID_OPTION_KEYS = {"label", "description", "preview"}
+ VALID_OPTION_KEYS = {"label", "description"}   # preview FORBIDDEN (nav bug)
+ if "preview" in keys: violations.append(... forbidden: dead up/down nav ...)

tests/skills/test-skill-cc-features.sh
- pass "AUQ previews: N skills use preview (informational)"
+ if auq_skills_with_preview == 0: pass ... else: FAIL (regression)
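
The guard's logic can be sketched as a small key check. This is an illustration under assumptions, not the contents of the test script: `find_violations`, `FORBIDDEN_OPTION_KEYS`, and the shape of `options` (a list of dicts) are hypothetical; only the permitted key set and the forbidden `preview` key come from the diff.

```python
VALID_OPTION_KEYS = {"label", "description"}
FORBIDDEN_OPTION_KEYS = {"preview"}  # hypothetical name; preview is banned per the diff


def find_violations(skill_name, options):
    """Return one message per forbidden or unknown key in a skill's options."""
    violations = []
    for index, option in enumerate(options):
        keys = set(option)
        for key in sorted(keys & FORBIDDEN_OPTION_KEYS):
            violations.append(
                f"{skill_name}: option {index} uses forbidden key '{key}'")
        for key in sorted(keys - VALID_OPTION_KEYS - FORBIDDEN_OPTION_KEYS):
            violations.append(
                f"{skill_name}: option {index} uses unknown key '{key}'")
    return violations


clean = [{"label": "Yes", "description": "Proceed"}]
regressed = [{"label": "Yes", "description": "Proceed", "preview": "..."}]
print(find_violations("demo-skill", clean))
print(find_violations("demo-skill", regressed))
```

Reporting the forbidden key separately from unknown keys keeps the failure message specific, so a re-added preview is recognizable as the known regression and not a generic schema error.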

#3 release-please-guard self-clears on override

Root cause: the guard fires only on opened/synchronize/reopened, so a release-please-override label applied after a failing run never re-dispatched it — leaving a stale red check on the merge commit that reads like an admin bypass.

.github/workflows/release-please-guard.yml
- types: [opened, synchronize, reopened]
+ types: [opened, synchronize, reopened, labeled, unlabeled]
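
Why the extra activity types matter can be shown by replaying an event sequence. This is a simplified model, not the workflow's real logic: it assumes the guard fails unless the override label is present, and `final_check_state` is a hypothetical helper.

```python
OLD_TYPES = {"opened", "synchronize", "reopened"}
NEW_TYPES = OLD_TYPES | {"labeled", "unlabeled"}
OVERRIDE_LABEL = "release-please-override"


def final_check_state(events, trigger_types):
    """Replay (action, label) events; return the last recorded guard result."""
    labels = set()
    state = None
    for action, label in events:
        if action == "labeled":
            labels.add(label)
        elif action == "unlabeled":
            labels.discard(label)
        if action in trigger_types:
            # Assumption: the guard passes only when the override label is set.
            state = "pass" if OVERRIDE_LABEL in labels else "fail"
    return state


# The label arrives after the first (failing) run.
events = [("opened", None), ("labeled", OVERRIDE_LABEL)]
print(final_check_state(events, OLD_TYPES))  # fail: stale red check
print(final_check_state(events, NEW_TYPES))  # pass: guard re-ran on the label
```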

#4 cost-estimator: Opus 4.8 pricing + alias

Root cause: no claude-opus-4-8 pricing row existed and the opus alias still pointed at 4.7 — a session reported by full id claude-opus-4-8 missed exact+partial match and fell back to Sonnet pricing (under-counted cost).

src/hooks/src/lib/cost-estimator.ts
+ 'claude-opus-4-8': { input: 5.0, output: 25.0, cache_read: 0.5, cache_write: 6.25 },
- opus: 'claude-opus-4-7'
+ opus: 'claude-opus-4-8'
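
The fallback path described in the root cause can be sketched as a three-step lookup: alias, exact match, partial match, then a default. This is an assumption-laden illustration in Python, not the TypeScript in cost-estimator.ts: `resolve_pricing_key`, the prefix-based partial match, the `claude-sonnet` fallback key, and every price other than the Opus 4.8 row are hypothetical placeholders.

```python
FALLBACK_KEY = "claude-sonnet"  # hypothetical key for the Sonnet fallback row

PLACEHOLDER = {"input": 0.0, "output": 0.0}  # stand-in; real values not shown in the source
OPUS_4_8 = {"input": 5.0, "output": 25.0, "cache_read": 0.5, "cache_write": 6.25}

BEFORE = {"claude-opus-4-7": PLACEHOLDER, FALLBACK_KEY: PLACEHOLDER}
AFTER = {**BEFORE, "claude-opus-4-8": OPUS_4_8}
ALIASES_BEFORE = {"opus": "claude-opus-4-7"}
ALIASES_AFTER = {"opus": "claude-opus-4-8"}


def resolve_pricing_key(model_id, pricing, aliases):
    """Return the pricing-table key that would be used for model_id."""
    model_id = aliases.get(model_id, model_id)
    if model_id in pricing:            # exact match
        return model_id
    for key in pricing:                # partial match (assumed: key is a prefix)
        if model_id.startswith(key):
            return key
    return FALLBACK_KEY                # silent fallback, the under-count path


print(resolve_pricing_key("claude-opus-4-8", BEFORE, ALIASES_BEFORE))
print(resolve_pricing_key("claude-opus-4-8", AFTER, ALIASES_AFTER))
print(resolve_pricing_key("opus", AFTER, ALIASES_AFTER))
```

Before the fix the full id resolves to the fallback row; after it, both the full id and the alias resolve to the new Opus 4.8 row.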

Independent build evidence

npm run build → git diff --quiet -- plugins/   exit 0 (NO drift)
npm run typecheck (tsc --noEmit)               clean
npm run test:skills (632 rules · 111 skills)   pass
npm run test:agents (37 agents)                pass
AUQ guard: 0 skills author a preview field     guard holds