6 min read · For developers using Claude Code's /goal command in production
/goal like a todo wastes the command. The pattern that holds: clear goal, measurable success, shown proof, hard limits. That's a 4-field rubric — the same shape ThumbGate's rubric-engine enforces at gate-fire time. Pair them and the agent cannot fake completion.A common usage of Claude Code's /goal command looks like this:
/goal fix the auth bug
This is a todo, not a goal. There is no way for the agent or you to know when the work is actually done. The agent will declare success after the first plausible-looking change, you will discover it was wrong an hour later, and the same conversation will re-enter the loop.
Each field maps to a specific failure mode the agent commits when the field is missing:
| Field | What it answers | Failure mode if missing |
|---|---|---|
| 1. Clear goal | What is the outcome, in one sentence, that someone outside the project would understand? | Scope creep. Agent expands the task to whatever it can find. |
| 2. Measurable success | What single check, run after the work, returns 0 / pass / a specific number? | "Looks done." Agent declares completion on optimistic intermediate signals. |
| 3. Shown proof | What output, file, or screenshot will be in the final message proving the check ran? | Hallucinated completion. Agent reports a green test that was never executed. |
| 4. Hard limits | What is the deadline, retry cap, or scope cliff that stops the work even if not yet "done"? | Infinite spin. Agent keeps trying variants past any reasonable budget. |
Same task, todo-shape vs goal-shape:
Todo shape (anti-pattern): /goal fix the auth bug Goal shape: /goal fix the auth bug success: npm test --testPathPattern=auth returns exit 0 with 12 passing proof: paste the final 'PASS auth' line of the test output limit: stop after 3 implementation attempts; if still failing, file a /thumbsdown
The agent now has a contract it cannot fake. The success criterion is mechanical (exit code + count). The proof is inspectable (a literal line of output). The limit is a stop condition that triggers a feedback capture instead of a fifth wasted attempt.
The 4-field pattern is exactly the shape of a rubric. ThumbGate's scripts/rubric-engine.js evaluates each completed agent task against a 4-field rubric:
| Claude Code /goal field | ThumbGate rubric field | What ThumbGate does |
|---|---|---|
| Clear goal | rubric.goal |
Stored as the canonical task description in the feedback log |
| Measurable success | rubric.verification.check |
The check is run before the agent's "done" claim is promoted to memory |
| Shown proof | rubric.verification.evidence |
Captured into the lesson DB; missing proof = no promotion |
| Hard limits | rubric.budget |
Tied to budget-guard.js — when limit hit, ThumbGate forces a thumbs-down capture and surfaces it for review |
/goal sets the contract at task start. ThumbGate's rubric-engine enforces it at task end. The agent cannot promote a "done" memory unless every rubric field has the proof attached.
Three concrete agent failure modes the paired pattern blocks:
In your project root:
npx thumbgate init
Then, in any conversation, use the goal-shape phrasing above. ThumbGate's rubric-engine runs as part of the PreToolUse hook chain; no extra config required. The first time the agent tries to promote a "done" claim without proof, you'll see the gate fire in your dashboard.
Works with Claude Code, Cursor, Codex, Gemini, Amp, OpenCode, and any MCP-compatible agent.