Command blocks for agent terminals

An AI agent's terminal is a wall of text. You can't tell where one command ends and the next begins, and finding an old result is a scroll-and-pray expedition. So we segmented the wall into command blocks — and made detection work even for agents that emit no markers at all.

The wall of output

Run Claude Code or Codex for an afternoon and the terminal becomes one continuous scroll. Tool calls, file diffs, test runs, build logs — all flowing into the same undifferentiated stream. When you want to go back to "that command that failed twenty minutes ago", there are no boundaries to navigate by. You scroll, you squint, you give up and re-run it.

A traditional shell session has natural rhythm: you type a command, you read its output, you type the next one. Even then the scrollback is flat. With an agent driving the terminal the problem is worse — the agent fires off dozens of commands faster than you can read them, and you weren't the one who typed them, so you don't even have muscle memory of what ran.

Why now

This wasn't worth building until two things were true. First, agent sessions got long. Short sessions don't accumulate enough scrollback to hurt; multi-hour autonomous runs do, and those became our normal. Second, we'd already replaced xterm.js with a native Alacritty-based terminal, which gave us direct access to the grid and to the VTE escape-sequence stream. Command blocks need both: a model of the output and a reliable signal for where one block starts and the next ends.

The detection problem

The textbook answer is OSC 133, the shell-integration protocol where the shell emits escape sequences marking the start of a prompt, the start of a command, the start of its output, and the point where the command finishes. We support it: TUIC injects OSC 133 markers into zsh, bash, and fish via shell integration, so a plain shell session segments cleanly.
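To make the marker path concrete, here is a minimal sketch of pulling OSC 133 markers out of a raw output stream. It is not TUIC's parser, which sits on the VTE stream; `Marker` and `find_markers` are names invented for this illustration. It assumes the conventional A/B/C/D marker letters and that each sequence arrives whole rather than split across reads.

```python
import re
from typing import Iterator, NamedTuple, Optional

# An OSC 133 sequence is: ESC ] 133 ; <letter> [; params] <terminator>
# The terminator is either BEL (\x07) or ST (ESC \).
OSC133 = re.compile(r"\x1b\]133;([ABCD])(?:;([^\x07\x1b]*))?(?:\x07|\x1b\\)")

# Conventional meaning of each marker letter.
KINDS = {
    "A": "prompt_start",
    "B": "command_start",
    "C": "output_start",
    "D": "command_end",
}


class Marker(NamedTuple):
    kind: str                 # one of the KINDS values
    offset: int               # where the marker begins in the stream
    exit_code: Optional[int]  # only set when a D marker carries one


def find_markers(stream: str) -> Iterator[Marker]:
    """Yield every OSC 133 marker found in `stream`, in order."""
    for m in OSC133.finditer(stream):
        letter, params = m.group(1), m.group(2)
        exit_code = None
        if letter == "D" and params:
            # The first parameter of D is the exit status, when present.
            first = params.split(";")[0]
            if first.lstrip("-").isdigit():
                exit_code = int(first)
        yield Marker(KINDS[letter], m.start(), exit_code)
```

A block, in these terms, is whatever sits between one `prompt_start` and the next, with `command_end` closing it off and carrying the exit status if the shell reported one.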

But here's the catch that shaped the whole design: AI agents don't emit OSC 133. Claude Code, Codex, and the rest run as TUIs of their own, many of them built on Ink. They paint their own output; they don't go through your shell's prompt hooks. If we'd relied on explicit markers alone, the feature would have been empty for the exact case it was built for: an agent session.

Agent-aware detection

So detection runs on two rails. When explicit markers exist (OSC 133 from the shell, or our own OSC 7770;block= protocol for agents that want to cooperate), we use them. When they don't, a heuristic in the output parser reconstructs blocks from the shape of the output itself.

The key signal is the tool-call glyph. Claude Code decorates each tool invocation with a bullet (U+23FA) — ⏺ ToolName(args) — and closes diff summaries with ⏺⎿. The parser watches for these, treats a tool call as a block boundary, and synthesizes AgentBlock start/end events that feed the same block model the OSC path produces. The rest of the system doesn't care which rail produced the boundary — it just sees blocks.
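The glyph heuristic can be sketched in a few lines. This is an illustration under assumptions, not our output parser: `AgentBlockEvent` and `synthesize_blocks` are hypothetical names, the input is already split into plain-text lines with escape sequences stripped, and a block is assumed to run from one tool call to the line before the next.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

GLYPH = "\u23fa"  # the tool-call bullet described above

# Matches "<glyph> ToolName(args)". Requiring the Name(...) shape keeps
# ordinary prose lines that happen to start with the glyph from matching.
TOOL_CALL = re.compile(r"^\s*" + re.escape(GLYPH) + r"\s+(\w+)\((.*)\)\s*$")


@dataclass(frozen=True)
class AgentBlockEvent:
    kind: str  # "start" or "end"
    line: int  # index of the line the event refers to
    tool: str  # tool name taken from the glyph line


def synthesize_blocks(lines: Iterable[str]) -> Iterator[AgentBlockEvent]:
    """Turn tool-call lines into start/end events, one block per call."""
    open_tool: Optional[str] = None
    last = -1
    for i, text in enumerate(lines):
        last = i
        m = TOOL_CALL.match(text)
        if m is None:
            continue
        if open_tool is not None:
            # A new tool call closes the block that was open before it.
            yield AgentBlockEvent("end", i - 1, open_tool)
        open_tool = m.group(1)
        yield AgentBlockEvent("start", i, open_tool)
    if open_tool is not None:
        # End of input closes whatever block is still open.
        yield AgentBlockEvent("end", last, open_tool)
```

The events deliberately carry nothing glyph-specific, which is what lets a consumer treat them the same as boundaries that came from explicit markers.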

That's the insight the feature rests on: don't require the agent to announce its blocks; infer them from what the agent already prints.

What you get once output has structure

Once each prompt-and-output cycle is a first-class object, a pile of navigation falls out almost for free.

All of it lives under Settings > Terminal > Blocks.

The tradeoffs we accepted

We considered the clean, marker-only design — trust OSC 133 and OSC 7770;block=, build nothing else. It's simpler and never wrong about a boundary. We rejected it because it would be correct and useless: the agents we care about emit no markers, so the block list would sit empty during the sessions that need it most. Heuristics are messier — a glyph match can occasionally mis-segment — but a feature that works for 90% of real sessions beats a pristine one that works for none of them.

We also capped blocks at 500 per session, evicting the oldest. Keeping every block forever is a slow memory leak in a long-running session; 500 covers far more history than anyone scrolls back to, and the buffer underneath still holds 10,000 lines regardless.
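The eviction policy is easy to picture as a bounded queue. The sketch below is illustrative only: `Block` and `BlockStore` are hypothetical names, and the real block model holds more than an id and a label.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional

MAX_BLOCKS = 500  # the per-session cap described above


@dataclass
class Block:
    id: int
    label: str


class BlockStore:
    """Keeps the most recent blocks and drops the oldest past the cap."""

    def __init__(self, max_blocks: int = MAX_BLOCKS) -> None:
        # A deque with maxlen discards from the left on overflow,
        # so the oldest block is evicted automatically.
        self._blocks: Deque[Block] = deque(maxlen=max_blocks)

    def push(self, block: Block) -> None:
        self._blocks.append(block)

    def oldest(self) -> Optional[Block]:
        return self._blocks[0] if self._blocks else None

    def newest(self) -> Optional[Block]:
        return self._blocks[-1] if self._blocks else None

    def __len__(self) -> int:
        return len(self._blocks)
```

Eviction here only drops block metadata; in this sketch nothing touches the underlying line buffer, which matches the point above that the scrollback keeps its own limit.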

What's next

Blocks are the substrate, not the destination. Now that output is structured, the interesting questions are about what we can do with that structure: per-block status (did this command succeed?), folding entire categories of output at once, and letting agents emit richer OSC 7770 metadata so a tool call carries its own label and outcome. The wall is a list now — and a list is something you can actually work with.