Why it matters
The earlier 61-skill LLM audit (6.9M tokens) found 5 drift cases, all in SKILL.md bodies. This sub-second grep found 125, because the drift lived in the references/ and rules/ files the audit never scanned (the agent-teams example blocks, duplicated across skills). For this bug class, a deterministic gate beats an LLM sweep: 25× as many findings (125 vs. 5), ~7M fewer tokens, and it runs on every PR.
The drift it catches (before → after)
Before — stale, fails at runtime
SendMessage(type="message",
            recipient="frontend-dev",
            content="API contract ready")
SendMessage(type="broadcast", content="...")
PushNotification(title="ork:cover complete",
                 body=f"{pct}% coverage")
After — live schema
SendMessage(to="frontend-dev",
            message="API contract ready")
# no broadcast primitive: send per-teammate
SendMessage(to="backend-dev", message="...")
PushNotification(
    message=f"ork:cover complete — {pct}% coverage",
    status="proactive")
Transform rules (applied by the codemod, gated by the verifier)
| Pattern | Fix |
| --- | --- |
| `SendMessage(type:"message", recipient:, content:)` | → `to:` / `message:`, drop `type` |
| `SendMessage(type:"shutdown_request", …)` before `TeamDelete()` | → delete (`TeamDelete` handles teardown) |
| `SendMessage(type:"broadcast", …)` | → per-teammate send (no broadcast primitive) |
| `PushNotification(title=, body=)` | → `message=` + `status="proactive"` |
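As an illustration of the first rule only, here is a minimal Python sketch of the rewrite. It is not the codemod shipped in this PR, whose implementation is not shown here. The names `STALE_SEND` and `migrate_send_message` are hypothetical, and the sketch assumes plain double-quoted string arguments in the fixed order `type`, `recipient`, `content`.

```python
import re

# Hypothetical pattern for the stale form; assumes plain double-quoted
# literals and the keyword order type, recipient, content.
STALE_SEND = re.compile(
    r'SendMessage\(\s*type\s*=\s*"message"\s*,\s*'
    r'recipient\s*=\s*(?P<to>"[^"]*")\s*,\s*'
    r'content\s*=\s*(?P<msg>"[^"]*")\s*\)'
)


def migrate_send_message(text: str) -> str:
    """Rewrite stale SendMessage calls to the to=/message= form, dropping type."""
    return STALE_SEND.sub(r"SendMessage(to=\g<to>, message=\g<msg>)", text)


before = (
    'SendMessage(type="message",\n'
    '            recipient="frontend-dev",\n'
    '            content="API contract ready")'
)
after = migrate_send_message(before)
```

Calls that do not match the stale shape, such as the broadcast form, pass through unchanged, so the remaining rules would need their own handling.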
How it works
tests/skills/structure/test-tool-call-signatures.sh
  → walk src/skills/**/*.md
  → for each SendMessage( / PushNotification( : extract top-level param names
    (string- & depth-aware parser, reused from test-askuserquestion-schema.sh)
  → ERROR on any name not in the live-schema allowed set
    (conservative: unknown NAMES only; never flags partial/illustrative snippets)
  → wired into `npm run test:skills`; extend by adding a tool to the registry
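The check itself is a shell script. The Python sketch below only illustrates the same idea, a string- and depth-aware scan for top-level parameter names, and is not the script's actual code. The `ALLOWED` sets are an assumption inferred from the "After" examples above, and the real live schema may accept more parameters. All function names are hypothetical.

```python
import re

# Assumed allowed sets, inferred only from the "After" examples;
# the real live schema may accept additional parameters.
ALLOWED = {
    "SendMessage": {"to", "message"},
    "PushNotification": {"message", "status"},
}

CALL_RE = re.compile(r"\b(SendMessage|PushNotification)\(")
# A parameter name sits at the start of the argument list or after a comma.
PARAM_RE = re.compile(r"(?:^|,)\s*([A-Za-z_]\w*)\s*[=:]")


def top_level_params(text, start):
    """Return keyword names at paren depth 1, scanning from just after '('."""
    flat = []                    # depth-1, non-string characters; rest blanked
    depth, quote, i = 1, None, start
    while i < len(text):
        ch = text[i]
        if quote:                # inside a string literal: blank everything
            if ch == "\\":
                i += 1           # skip the escaped character
            elif ch == quote:
                quote = None
            flat.append(" ")
        elif ch in "\"'":
            quote = ch
            flat.append(" ")
        elif ch in "([{":
            depth += 1
            flat.append(" ")
        elif ch in ")]}":
            depth -= 1
            if depth == 0:       # matching close of the call
                break
            flat.append(" ")
        else:
            flat.append(ch if depth == 1 else " ")
        i += 1
    return PARAM_RE.findall("".join(flat))


def find_drift(text):
    """List (tool, param) pairs whose param name is not in the allowed set."""
    errors = []
    for m in CALL_RE.finditer(text):
        tool = m.group(1)
        for name in top_level_params(text, m.end()):
            if name not in ALLOWED[tool]:
                errors.append((tool, name))
    return errors
```

Because only unknown names are reported, an elided snippet such as `SendMessage(...)` yields no parameter names and is never flagged. Colons or equals signs inside string values, as in `"ork:cover complete"`, are ignored.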