Claude Code /goal vs Todo: The 4-Field Pattern That Actually Holds

6 min read · For developers using Claude Code's /goal command in production

TL;DR: Treating /goal like a todo wastes the command. The pattern that holds: clear goal, measurable success, shown proof, hard limits. That's a 4-field rubric — the same shape ThumbGate's rubric-engine enforces at gate-fire time. Pair them and the agent cannot fake completion.

The todo-shaped /goal anti-pattern

A common usage of Claude Code's /goal command looks like this:

/goal fix the auth bug

This is a todo, not a goal. There is no way for the agent or you to know when the work is actually done. The agent will declare success after the first plausible-looking change, you will discover it was wrong an hour later, and the same conversation will re-enter the loop.

The verifiable-contract reframe: A goal is not a wish. It is a contract with a clear outcome, a measurable success criterion, an inspectable proof, and a stop condition. Anything short of all four is a todo.

The 4-field pattern

Each field guards against a specific failure mode that shows up when the field is missing:

| Field | What it answers | Failure mode if missing |
|---|---|---|
| 1. Clear goal | What is the outcome, in one sentence, that someone outside the project would understand? | Scope creep. Agent expands the task to whatever it can find. |
| 2. Measurable success | What single check, run after the work, returns 0 / pass / a specific number? | "Looks done." Agent declares completion on optimistic intermediate signals. |
| 3. Shown proof | What output, file, or screenshot will be in the final message proving the check ran? | Hallucinated completion. Agent reports a green test that was never executed. |
| 4. Hard limits | What is the deadline, retry cap, or scope cliff that stops the work even if not yet "done"? | Infinite spin. Agent keeps trying variants past any reasonable budget. |
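
The four fields can be treated as a small data structure. The sketch below is illustrative only: `GoalContract`, `missing_fields`, and `is_todo` are hypothetical names, not part of Claude Code or ThumbGate. It reports which fields are absent and the failure mode from the table above that each gap leaves open.

```python
from dataclasses import dataclass
from typing import Optional

# Failure mode left open by each missing field, taken from the table above.
FAILURE_MODES = {
    "goal": "scope creep",
    "success": "looks done",
    "proof": "hallucinated completion",
    "limit": "infinite spin",
}


@dataclass
class GoalContract:
    """Hypothetical container for the four fields (not a Claude Code or ThumbGate type)."""
    goal: Optional[str] = None
    success: Optional[str] = None
    proof: Optional[str] = None
    limit: Optional[str] = None


def missing_fields(contract: GoalContract) -> dict[str, str]:
    """Map each empty field to the failure mode it leaves open."""
    return {
        name: risk
        for name, risk in FAILURE_MODES.items()
        if not (getattr(contract, name) or "").strip()
    }


def is_todo(contract: GoalContract) -> bool:
    """Anything short of all four fields counts as a todo."""
    return bool(missing_fields(contract))


todo = GoalContract(goal="fix the auth bug")
print(is_todo(todo), missing_fields(todo))
```

Run against the todo-shaped example, the sketch flags three open failure modes; filling in `success`, `proof`, and `limit` clears them.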

Concrete /goal phrasing

Same task, todo-shape vs goal-shape:

Todo shape (anti-pattern):
/goal fix the auth bug

Goal shape:
/goal fix the auth bug
  success: npm test -- --testPathPattern=auth returns exit 0 with 12 passing
  proof: paste the final 'PASS auth' line of the test output
  limit: stop after 3 implementation attempts; if still failing, file a /thumbsdown

The agent now has a contract it cannot fake. The success criterion is mechanical (exit code + count). The proof is inspectable (a literal line of output). The limit is a stop condition that triggers a feedback capture instead of a fourth wasted attempt.
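
What "mechanical" means here can be sketched in a few lines. This is an illustration under stated assumptions, not how Claude Code or ThumbGate evaluates a goal: `run_check` and `run_with_limit` are hypothetical helpers, and a Python one-liner stands in for the real test command so the sketch runs anywhere.

```python
import subprocess
import sys
from typing import Optional


def run_check(command: list[str], proof_marker: str) -> Optional[str]:
    """Return the proof line only if the command exits 0 AND prints the marker."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return None
    # Some test runners write their report to stderr, so search both streams.
    for line in (result.stdout + result.stderr).splitlines():
        if proof_marker in line:
            return line.strip()
    return None


def run_with_limit(command: list[str], proof_marker: str, max_attempts: int = 3) -> dict:
    """Re-run the check up to max_attempts times, then stop instead of spinning."""
    for attempt in range(1, max_attempts + 1):
        # A real agent would change the code here before re-running the check.
        proof = run_check(command, proof_marker)
        if proof is not None:
            return {"status": "done", "attempts": attempt, "proof": proof}
    return {"status": "limit_hit", "attempts": max_attempts, "proof": None}


# Stand-in commands (assumptions): one passes, one exits non-zero.
PASSING = [sys.executable, "-c", "print('PASS auth')"]
FAILING = [sys.executable, "-c", "import sys; print('FAIL auth'); sys.exit(1)"]

print(run_with_limit(PASSING, "PASS auth"))
print(run_with_limit(FAILING, "PASS auth"))
```

The design point is that "done" needs both signals at once: a zero exit code without the proof line, or a proof-looking line alongside a non-zero exit, does not count.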

Where ThumbGate plugs in

The 4-field pattern is exactly the shape of a rubric. ThumbGate's scripts/rubric-engine.js evaluates each completed agent task against a 4-field rubric:

| Claude Code /goal field | ThumbGate rubric field | What ThumbGate does |
|---|---|---|
| Clear goal | rubric.goal | Stored as the canonical task description in the feedback log |
| Measurable success | rubric.verification.check | The check is run before the agent's "done" claim is promoted to memory |
| Shown proof | rubric.verification.evidence | Captured into the lesson DB; missing proof = no promotion |
| Hard limits | rubric.budget | Tied to budget-guard.js; when the limit is hit, ThumbGate forces a thumbs-down capture and surfaces it for review |

Two layers, same shape: Claude Code's /goal sets the contract at task start. ThumbGate's rubric-engine enforces it at task end. The agent cannot promote a "done" memory unless every rubric field has the proof attached.
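
The promotion rule can be pictured as a gate over a nested record. In the sketch below, only the field names (`goal`, `verification.check`, `verification.evidence`, `budget`) come from the mapping above; the nesting, the value types, and the `can_promote` function are assumptions made for illustration and should not be read as ThumbGate's actual schema or API.

```python
from typing import Any

# Field paths named in the mapping above; everything else here is hypothetical.
REQUIRED_PATHS = ["goal", "verification.check", "verification.evidence", "budget"]


def lookup(rubric: dict[str, Any], dotted_path: str) -> Any:
    """Walk a dotted path through nested dicts; return None if any key is absent."""
    node: Any = rubric
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def can_promote(rubric: dict[str, Any]) -> tuple[bool, list[str]]:
    """Allow promotion only when every required field is present and non-empty."""
    missing = [path for path in REQUIRED_PATHS if not lookup(rubric, path)]
    return (not missing, missing)


claimed_done = {
    "goal": "fix the auth bug",
    "verification": {"check": "auth test suite exits 0", "evidence": ""},
    "budget": {"max_attempts": 3},
}
print(can_promote(claimed_done))

claimed_done["verification"]["evidence"] = "PASS auth"
print(can_promote(claimed_done))
```

In this sketch a "done" claim with empty evidence is rejected and the missing path is named, which mirrors the "missing proof = no promotion" rule.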

What this prevents in practice

Three concrete agent failure modes the paired pattern blocks:

Five minutes to wire it

In your project root:

npx thumbgate init

Then, in any conversation, use the goal-shape phrasing above. ThumbGate's rubric-engine runs as part of the PreToolUse hook chain; no extra config required. The first time the agent tries to promote a "done" claim without proof, you'll see the gate fire in your dashboard.

Pair /goal with a rubric the agent cannot fake.

Works with Claude Code, Cursor, Codex, Gemini, Amp, OpenCode, and any MCP-compatible agent.
