6 min read · For teams running multi-agent systems across multiple LLMs
Agent swarms route different parts of a task to different models — one model for front-end work, another for backend, a third for visual reasoning, a fourth for efficient bulk operations. The pitch is real: the right model for the right slice of work, in parallel.
The hidden cost is repetition. Each agent in the swarm has its own context window. When one agent makes a mistake — force-pushes to main, deletes a production file, paginates wrong against a rate-limited API — the other agents in the swarm have no idea. They are statistically likely to make the same mistake, because they were trained on the same internet and given the same prompt scaffolding.
A natural reflex is to copy the same "do not force-push" rule into every agent's system prompt. This fails for three reasons:
The pattern that actually works is one gate layer that every agent consults before any tool call. The gate layer holds a single lesson database, a single set of prevention rules, and a single PreToolUse hook surface.
| Concern | Per-agent prompt rules | Shared gate layer |
|---|---|---|
| New mistake caught once | Other agents still vulnerable | All agents protected immediately |
| Rule consistency | Drifts across agents over time | Single source of truth on disk |
| Enforcement | Best-effort — agent can override | Hard block at the tool boundary |
| Token cost of repeated failures | N× per task (one per agent) | 1× per task (gate refuses) |
ThumbGate is a single MCP server with a PreToolUse hook. Every agent in the swarm — whether it speaks to Claude, GPT, Gemini, or any other model — issues tool calls through the same MCP layer. The hook fires once per tool call, regardless of which agent issued it.
Three concrete properties make this work in a swarm:
.thumbgate/ on disk (or set the THUMBGATE_FEEDBACK_DIR environment variable). All lessons, gates, and feedback land in one place.Honesty matters here. A shared gate layer is not a swarm orchestrator. It does not route work between agents, decide which model is best for a subtask, or balance load. Those are the swarm framework's job.
What the gate layer does is make the swarm cheaper to operate by removing the most expensive failure mode: paying the same mistake-tax N times. If your swarm framework gives you concurrency and the gate layer gives you shared safety memory, the combination is what makes multi-agent systems economically viable for production work.
If your swarm runs locally, point every agent at the same project directory:
cd /path/to/your-project
npx thumbgate init
If your agents run as separate processes or containers, share the feedback dir via env var:
export THUMBGATE_FEEDBACK_DIR=/shared/thumbgate
# every agent in the swarm reads from and writes to this directory
That is the entire integration. The gate layer is now in front of every tool call in the swarm, learning from feedback captured anywhere in the swarm.
Works with Claude Code, Cursor, Codex, Gemini, Amp, OpenCode, and any MCP-compatible agent.