Command blocks for agent terminals
An AI agent's terminal is a wall of text. You can't tell where one command ends and the next begins, and finding an old result is a scroll-and-pray expedition. So we segmented the wall into command blocks — and made detection work even for agents that emit no markers at all.
The wall of output
Run Claude Code or Codex for an afternoon and the terminal becomes one continuous scroll. Tool calls, file diffs, test runs, build logs — all flowing into the same undifferentiated stream. When you want to go back to "that command that failed twenty minutes ago", there are no boundaries to navigate by. You scroll, you squint, you give up and re-run it.
A traditional shell session has natural rhythm: you type a command, you read its output, you type the next one. Even then the scrollback is flat. With an agent driving the terminal the problem is worse — the agent fires off dozens of commands faster than you can read them, and you weren't the one who typed them, so you don't even have muscle memory of what ran.
Why now
This wasn't worth building until two things were true. First, agent sessions got long. Short sessions don't accumulate enough scrollback to hurt; multi-hour autonomous runs do, and those became our normal. Second, we'd already replaced xterm.js with a native Alacritty-based terminal, which gave us direct access to the grid and to the VTE escape-sequence stream. Command blocks need both: a model of the output and a reliable signal for where one block starts and the next ends.
The detection problem
The textbook answer is OSC 133 — the shell-integration protocol where the shell emits escape sequences marking the start of a prompt, the start of a command, and the start of its output. We support it: TUIC injects OSC 133 markers into zsh, bash, and fish via shell integration, so a plain shell session segments cleanly.
But here's the catch that shaped the whole design: AI agents don't emit OSC 133. Claude Code, Codex, and the rest run as Ink-based TUIs. They paint their own output; they don't go through your shell's prompt hooks. If we'd relied on explicit markers alone, the feature would have been empty for the exact case it was built for — an agent session.
Agent-aware detection
So detection runs on two rails. When explicit markers exist (OSC 133 from the shell, or our own OSC 7770;block= protocol for agents that want to cooperate), we use them. When they don't, a heuristic in the output parser reconstructs blocks from the shape of the output itself.
The key signal is the tool-call glyph. Claude Code decorates each tool invocation with a ⏺ bullet (U+23FA) — ⏺ ToolName(args) — and closes diff summaries with ⏺⎿. The parser watches for these, treats a tool call as a block boundary, and synthesizes AgentBlock start/end events that feed the same block model the OSC path produces. The rest of the system doesn't care which rail produced the boundary — it just sees blocks.
That's the insight the feature rests on: don't require the agent to announce its blocks; infer them from what the agent already prints.
What you get once output has structure
Once each prompt-and-output cycle is a first-class object, a pile of navigation falls out almost for free:
- Semantic scrollbar marks. Each block leaves a color-coded tick on the scrollbar — a map of the whole session you can read at a glance and click to jump.
- Block folding.
Cmd+Shift+.collapses a block to its first line. Hide the build logs you've already read; keep the diff you care about. - Gutter selection. Click the gutter next to a block to select its entire output — no more dragging across page breaks to copy one command's result.
- Block-scoped search.
Cmd+Shift+Bscopes find-in-terminal to a single block instead of the whole 10,000-line buffer. - Block navigation.
Cmd+Shift+Up/Downjumps prompt to prompt, skipping the output in between. - Timestamp overlay. Hold
Ctrl+Cmdto show when each block ran.
All of it lives under Settings > Terminal > Blocks.
The tradeoffs we accepted
We considered the clean, marker-only design — trust OSC 133 and OSC 7770;block=, build nothing else. It's simpler and never wrong about a boundary. We rejected it because it would be correct and useless: the agents we care about emit no markers, so the block list would sit empty during the sessions that need it most. Heuristics are messier — a glyph match can occasionally mis-segment — but a feature that works for 90% of real sessions beats a pristine one that works for none of them.
We also capped blocks at 500 per session, evicting the oldest. Keeping every block forever is a slow memory leak in a long-running session; 500 covers far more history than anyone scrolls back to, and the buffer underneath still holds 10,000 lines regardless.
What's next
Blocks are the substrate, not the destination. Now that output is structured, the interesting questions are about what we can do with that structure: per-block status (did this command succeed?), folding entire categories of output at once, and letting agents emit richer OSC 7770 metadata so a tool call carries its own label and outcome. The wall is a list now — and a list is something you can actually work with.