One card per capability that agent-deck promises its users. Each fast-gate capability is verified by a test that performs the real action through the compiled binary on an isolated tmux socket and asserts on the real effect (registry rows or live pane content). Nightly capabilities need real agents, keys, or network access, so they run out of band.
We run the add command to register a new session, then read the saved registry back. We confirm the row exists with the title, tool, group, and folder we asked for.
Pass when: new registry row carries the given title, tool, group, and working directory
Profile: default
TITLE     GROUP    PATH                                       ID
--------------------------------------------------------------------------------------------
cap-add   capgrp   /tmp/TestCapability_Lifecycle_Add3841...   18761039-177
Total: 1 sessions
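The read-back step can be illustrated with a small parser over a listing shaped like the sample above. This is only a sketch: `parse_session_rows` is a hypothetical helper, it assumes no field contains spaces, and the real test may read the registry directly rather than parse this human-facing view (which also omits the tool column).

```python
def parse_session_rows(listing: str) -> list[dict]:
    """Parse data rows from a listing shaped like the sample above (assumed layout)."""
    rows = []
    in_table = False
    for line in listing.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) == {"-"}:
            in_table = True  # the dashed rule separates the header from the rows
            continue
        if stripped.startswith("Total:"):
            break
        if in_table and stripped:
            # Assumption: exactly four whitespace-separated fields per row.
            title, group, path, session_id = stripped.split()
            rows.append({"title": title, "group": group, "path": path, "id": session_id})
    return rows


SAMPLE = """Profile: default
TITLE GROUP PATH ID
--------------------
cap-add capgrp /tmp/TestCapability_Lifecycle_Add3841... 18761039-177
Total: 1 sessions
"""

rows = parse_session_rows(SAMPLE)
```

With the sample listing, `rows` holds a single entry whose title is `cap-add` and whose group is `capgrp`.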
We register a session and start it. We then ask the throwaway tmux server whether a live pane actually appeared, and confirm the registry flips the session to an active state.
Pass when: a real tmux pane appears on the isolated socket and status becomes active
IDLE (1):
○ cap-start shell - ~/project
Total: 1 sessions in profile 'default'
We start a session, then stop it. We confirm the tmux pane is gone and the registry returns the session to the stopped state.
Pass when: tmux pane disappears and registry status returns to stopped
Profile: default
TITLE      GROUP   PATH                                        ID
--------------------------------------------------------------------------------------------
cap-stop   001     /tmp/TestCapability_Lifecycle_Stop156...    5f61729b-177
Total: 1 sessions
We start a session and restart it. We confirm exactly one pane exists afterward (no accidental duplicate) and the session is active again.
Pass when: exactly one pane remains after restart and status is active (guards the #30 double-spawn)
IDLE (1):
○ cap-restart shell - ~/project
Total: 1 sessions in profile 'default'
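The "exactly one pane" assertion could be expressed roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and it presumes the isolated socket hosts only the session under test, so every pane on the server belongs to it. The tmux options used (`-L` for a named socket, `list-panes -a -F '#{pane_id}'`) are standard tmux.

```python
import subprocess


def parse_pane_ids(output: str) -> list[str]:
    """Split `tmux list-panes -F '#{pane_id}'` output into pane ids."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def list_pane_ids(socket_name: str) -> list[str]:
    """Return all pane ids on a named tmux socket (requires tmux on PATH)."""
    result = subprocess.run(
        ["tmux", "-L", socket_name, "list-panes", "-a", "-F", "#{pane_id}"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []  # tmux exits non-zero when no server runs on this socket
    return parse_pane_ids(result.stdout)


def exactly_one_pane(pane_ids: list[str]) -> bool:
    """True only when a single pane survives, which rules out a double spawn."""
    return len(pane_ids) == 1
```

Separating the parsing from the tmux call keeps the counting logic checkable without a running tmux server.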
We confirm a stopped session can be removed and disappears from the registry, and that removing a still-running session is refused unless forced, so it is not destroyed by accident.
Pass when: stopped session leaves the registry; a running session is refused without force
$ agent-deck session remove cap-rm-running
Error: session 'cap-rm-running' is in state 'starting'; only stopped/error sessions may be removed without --force
$ agent-deck list (after stopping and removing cap-rm-stopped)
Profile: default
TITLE            GROUP   PATH                                      ID
--------------------------------------------------------------------------------------------
cap-rm-running   001     /tmp/TestCapability_Lifecycle_Rm21683...  c04ab467-177
Total: 1 sessions
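The refusal rule implied by the error message above can be sketched as a small guard. `check_removable` and `REMOVABLE_STATES` are hypothetical names for illustration; the set of removable states is taken from the message text, not from the agent-deck source.

```python
# Assumption: states removable without force, as named in the error message above.
REMOVABLE_STATES = {"stopped", "error"}


def check_removable(title: str, state: str, force: bool = False):
    """Return None when removal may proceed, otherwise a refusal message."""
    if force or state in REMOVABLE_STATES:
        return None
    return (
        f"session '{title}' is in state '{state}'; "
        "only stopped/error sessions may be removed without --force"
    )
```

For example, `check_removable("cap-rm-running", "starting")` returns the refusal text, while passing `force=True` or a `stopped` state returns `None`.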
We use the single launch command, which creates, starts, and messages a session at once, pointed at the stand-in echo agent. We confirm the registry row exists and the echoed message shows up on screen.
Pass when: one launch command creates the row, starts the pane, and the echoed message appears
ECHOBOT READY
> PINGLAUNCH-cap-e2e-token
ECHO:PINGLAUNCH-cap-e2e-token
Forking is only valid for supported tools with live context (Claude session ID or Pi JSONL). We confirm forking an unsupported session is cleanly refused and creates no orphan child row. The full context-inheriting fork is a documented nightly gap.
Pass when: forking an unsupported session is refused and no child row is created
$ agent-deck session fork cap-fork
Error: session 'cap-fork' is not a Claude session (tool: shell)
We launch a tiny stand-in agent that simply repeats whatever you say. We send it a unique message through the normal send command, then read the screen back. If the screen shows the echoed message we know it reached the agent and a reply came out the other side.
Pass when: the pane shows ECHO:<token> after a real send, proving readiness, send-keys, and capture read-back
ECHOBOT READY
> PING-TestCapability_Agent_EchoRoundTrip
ECHO:PING-TestCapability_Agent_EchoRoundTrip
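The read-back half of this round trip might look like the polling loop below. It is a sketch, not the test's actual code: `wait_for_echo` is a hypothetical helper, and `capture` stands in for whatever returns the current pane text. Matching on the `ECHO:` prefix matters because the typed message itself also contains the token.

```python
import time
from typing import Callable


def wait_for_echo(
    capture: Callable[[], str],
    token: str,
    timeout: float = 5.0,
    interval: float = 0.1,
) -> bool:
    """Poll captured pane text until the agent's ECHO:<token> reply appears."""
    expected = f"ECHO:{token}"  # the reply line, not the keystrokes we sent
    deadline = time.monotonic() + timeout
    while True:
        if expected in capture():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A capture that only ever shows `> PING-x` never satisfies the check, so a send that reached the pane but got no reply is reported as a failure.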
We launch a real Claude session, wait for its prompt, send a fixed instruction, and check the reply. This needs a real API key and network, so it runs nightly, not on every release.
Pass when: a real Claude reply contains the expected token
We fork a live Claude or Pi session and confirm the child inherits the conversation with a distinct id and a parent link. This needs real tool session data from a live transcript, so it runs nightly.
Pass when: child session links the parent and inherits conversation context
We register a stub tool server, attach it to a session, and read the session's .mcp.json file back to confirm the entry was written. We then detach it and confirm the entry is gone. We also confirm attaching an unknown server is refused and never writes a broken config.
Pass when: .mcp.json gains the server entry on attach and loses it on detach; an unknown server is refused
$ agent-deck mcp attached cap-mcp
Session: cap-mcp
LOCAL (~/project/.mcp.json):
• stubmcp
.mcp.json on disk:
{
  "mcpServers": {
    "stubmcp": {
      "type": "stdio",
      "command": "true",
      "args": [
        "--noop"
      ]
    }
  }
}
We attach a tool server and ask a real agent to list its tools, confirming the agent honors the attachment. This needs a real agent to introspect, so it runs nightly.
Pass when: the agent lists the attached MCP server
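For the fast-gate attach/detach card, the file-level assertion on `.mcp.json` can be sketched as below. `attached_servers` is a hypothetical helper, and the two `write_text` calls merely simulate what an attach and a detach would leave on disk; they are not agent-deck calls.

```python
import json
import tempfile
from pathlib import Path


def attached_servers(config_path: Path) -> dict:
    """Return the mcpServers mapping, or {} when the file does not exist."""
    if not config_path.exists():
        return {}
    return json.loads(config_path.read_text()).get("mcpServers", {})


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / ".mcp.json"
    before = attached_servers(config)

    # Simulated result of an attach (same shape as the file shown earlier).
    stub = {"type": "stdio", "command": "true", "args": ["--noop"]}
    config.write_text(json.dumps({"mcpServers": {"stubmcp": stub}}))
    after_attach = attached_servers(config)

    # Simulated result of a detach.
    config.write_text(json.dumps({"mcpServers": {}}))
    after_detach = attached_servers(config)
```

The check passes when `stubmcp` is present in `after_attach` and absent from both `before` and `after_detach`.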
Against a throwaway git repo, we create a session on a new branch in its own worktree and confirm the worktree directory really exists on disk. We then run finish, and confirm the worktree and branch are removed, the session is gone, and the ORIGINAL repository is untouched.
Pass when: the worktree dir is created on its own branch, then finish removes it and the session while leaving the source repo intact (the #1200 data-loss guard)
$ agent-deck worktree info cap-wt (after add --worktree -b)
Session: cap-wt
Branch: feature/capfeature
Worktree Path: ~/repo/.worktrees/feature-capfeature
Main Repo: ~/repo
Status: exists
$ agent-deck list (after worktree finish: session gone, repo intact)
No sessions found in profile 'default'.
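One way to make "the original repository is untouched" checkable is to fingerprint its working files before and after the finish step. The sketch below is an assumption about how such a guard could work, not the test's actual method; `tree_digest` is a hypothetical helper, and it skips `.git` by default because creating and removing a worktree legitimately changes Git's own metadata.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root: Path, exclude: tuple[str, ...] = (".git",)) -> str:
    """Hash relative paths and contents of all files under root, skipping excluded dirs."""
    digest = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if not path.is_file() or any(part in exclude for part in rel.parts):
            continue
        digest.update(rel.as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "README.md").write_text("hello\n")
    (repo / ".git").mkdir()
    (repo / ".git" / "HEAD").write_text("ref: refs/heads/main\n")
    baseline = tree_digest(repo)

    # Metadata churn inside .git does not count as touching the repo.
    (repo / ".git" / "HEAD").write_text("ref: refs/heads/feature\n")
    intact_after_metadata_change = tree_digest(repo) == baseline

    # Clobbering a working file does.
    (repo / "README.md").write_text("clobbered\n")
    detects_data_loss = tree_digest(repo) != baseline
```

Comparing the digest taken before the worktree is created with the one taken after finish would flag any lost or modified working file.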
We create sessions in two different groups and confirm that filtering the registry by group returns exactly that group's members, with no session bleeding across groups.
Pass when: each group lists exactly its own sessions; no cross-group leakage
Groups:
NAME     SESSIONS   STATUS
--------------------------------------------------
alpha    2          ○ 2
beta     1          ○ 1
Total: 2 groups, 3 sessions
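The isolation check amounts to two set comparisons: each group's filtered listing must equal what was created in it, and no title may appear under two groups. A minimal sketch follows; `group_isolation_problems` and the session titles `a1`, `a2`, `b1` are hypothetical, since the output above shows only per-group counts.

```python
def group_isolation_problems(
    expected: dict[str, set[str]], listed: dict[str, set[str]]
) -> list[str]:
    """Return human-readable problems; an empty list means the groups are isolated."""
    problems = []
    for group, want in expected.items():
        got = listed.get(group, set())
        if got != want:
            problems.append(f"group {group}: expected {sorted(want)}, got {sorted(got)}")
    names = sorted(listed)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = listed[first] & listed[second]
            if shared:
                problems.append(f"{sorted(shared)} listed under both {first} and {second}")
    return problems


# Hypothetical titles matching the counts above: two in alpha, one in beta.
expected = {"alpha": {"a1", "a2"}, "beta": {"b1"}}
clean = group_isolation_problems(expected, {"alpha": {"a1", "a2"}, "beta": {"b1"}})
leaky = group_isolation_problems(expected, {"alpha": {"a1", "a2"}, "beta": {"b1", "a2"}})
```

Here `clean` is empty, while `leaky` reports both the unexpected membership of `beta` and the title shared across groups.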
We add one session under the default profile and another under a separate profile, then list each profile. We confirm neither profile can see the other's session.
Pass when: a session in one profile is invisible to the other (no cross-profile data bleed)
$ agent-deck list (default profile)
Profile: default
TITLE        GROUP   PATH                                      ID
--------------------------------------------------------------------------------------------
in-default   001     /tmp/TestCapability_Profiles_Isolatio...  7eca0dea-177
Total: 1 sessions
$ agent-deck -p capalt list (isolated profile)
Profile: capalt
TITLE        GROUP   PATH                                      ID
--------------------------------------------------------------------------------------------
in-capalt    001     /tmp/TestCapability_Profiles_Isolatio...  625039ff-177
Total: 1 sessions
We launch two distinct stand-in tools and confirm each one reaches an active state AND echoes back the unique message we sent it, proving the launch and readiness machinery is not tied to a single tool.
Pass when: two different tools each reach active and echo their token back
tool "echobot" (cap-tool-a):
ECHOBOT READY
> PINGA-cap-e2e
ECHO:PINGA-cap-e2e
tool "parrot" (cap-tool-b):
ECHOBOT READY
> PINGB-cap-e2e
ECHO:PINGB-cap-e2e
A worker prints a completion sentinel on its last turn. We run the real Stop-hook handler and confirm it records a done outcome (status and summary), then run the notifier daemon and confirm it emits a distinct finished signal to the parent instead of an ambiguous waiting. An ordinary turn with no sentinel records no completion.
Pass when: the Stop-hook persists done status=ok, and the daemon emits a finished event carrying that outcome (issue #1186)
hook status file (what the Stop-hook persisted):
{
  "status": "waiting",
  "event": "Stop",
  "done_status": "ok",
  "done_summary": "capability wave2 shipped"
}
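As a rough illustration of the second half of this card, a notifier could map a persisted hook status like the one above to a finished event as sketched here. `finished_event` is a hypothetical function; the field names mirror those shown in this card, but the mapping itself is an assumption rather than the daemon's actual code.

```python
import json


def finished_event(hook_status: dict, child_title: str, child_session_id: str):
    """Build a finished event when a done outcome was recorded, else return None."""
    if "done_status" not in hook_status:
        return None  # an ordinary turn with no sentinel records no completion
    return {
        "child_title": child_title,
        "child_session_id": child_session_id,
        "kind": "finished",
        "done_status": hook_status["done_status"],
        "done_summary": hook_status.get("done_summary", ""),
    }


persisted = json.loads(
    '{"status": "waiting", "event": "Stop", '
    '"done_status": "ok", "done_summary": "capability wave2 shipped"}'
)
event = finished_event(persisted, "cap-child", "0a81e066-1779906947")
no_event = finished_event({"status": "waiting", "event": "Stop"}, "cap-child", "0a81e066-1779906947")
```

Keying on the presence of `done_status` rather than on `status` is what lets the parent distinguish a finished worker from one that is merely waiting.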
finished event emitted by notify-daemon (the [DONE] signal to the parent):
child_title: cap-child
child_session_id: 0a81e066-1779906947
kind: finished
done_status: ok
done_summary: capability wave2 shipped
We make a worker transition once, then run the notifier daemon twice over the same idle worker. We confirm exactly one notification is produced across both passes, proving the de-duplication ledger persists between polls.
Pass when: two daemon passes over the same idle child emit exactly one event (issue #1187 dedup)
two `notify-daemon --once` passes over the same idle child emitted exactly one transition event (the second was deduped):
child_title: dd-child
child_session_id: 5b085245-1779906947
from_status: running
to_status: idle
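The property being tested, a ledger that survives between polls, can be sketched with a JSON file that each pass reads and rewrites. Everything here is illustrative: `poll_once`, the ledger format, and the key shape are assumptions, not the daemon's real storage.

```python
import json
import tempfile
from pathlib import Path


def poll_once(ledger_path: Path, observed: list[dict]) -> list[dict]:
    """Emit only transitions not yet recorded in the on-disk ledger, then persist it."""
    seen = set(json.loads(ledger_path.read_text())) if ledger_path.exists() else set()
    emitted = []
    for transition in observed:
        key = "{child_session_id}:{from_status}->{to_status}".format(**transition)
        if key in seen:
            continue  # already notified in an earlier pass
        seen.add(key)
        emitted.append(transition)
    ledger_path.write_text(json.dumps(sorted(seen)))
    return emitted


transition = {
    "child_title": "dd-child",
    "child_session_id": "5b085245-1779906947",
    "from_status": "running",
    "to_status": "idle",
}

with tempfile.TemporaryDirectory() as tmp:
    ledger = Path(tmp) / "ledger.json"
    first_pass = poll_once(ledger, [transition])
    second_pass = poll_once(ledger, [transition])
```

The first pass emits the transition and the second emits nothing. Keying on the bare transition is a simplification: a real ledger would likely include a timestamp or sequence number so that a later, genuine running-to-idle transition is not suppressed.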