AgentBridge · feat/multi-pair

Single-Machine Multi-Pair Claude + Codex Collaboration

Approach A — Shared-Nothing Multi-Instance
Each Claude↔Codex pair is a fully isolated instance: its own daemon, its own port triple, and its own state directory.

01 What Was Built

Problem

Previously AgentBridge could run only one Claude↔Codex pair per machine: a single daemon on fixed ports 4500 / 4501 / 4502. Running two collaborations at once (say, one writing a feature and one doing review) caused port collisions, and the two instances overwrote each other's state.

Solution

Each pair = one isolated daemon + its own port triple + its own state directory

🔑 Key Insight (runtime-verified)

Claude Code passes inherited environment variables straight through to its plugin MCP server. A pair is therefore selected purely by setting env vars before launch; .mcp.json does not need to change at all.
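The inheritance mechanism can be illustrated with a short Python sketch: a parent sets pair-specific variables, and a child started afterwards sees them without any per-pair configuration file. The child here is a stand-in Python one-liner, not Claude Code or the real MCP server, and `launch_with_pair_env` is a hypothetical helper name. The port values are the slot 1 triple described in this document.

```python
import os
import subprocess
import sys


def launch_with_pair_env(pair_env: dict[str, str]) -> str:
    """Start a child that inherits the parent's environment plus pair_env.

    Returns what the child reports as its AGENTBRIDGE_CONTROL_PORT.
    """
    env = {**os.environ, **pair_env}
    # Stand-in child process: it only reads its own environment.
    child_code = "import os; print(os.environ['AGENTBRIDGE_CONTROL_PORT'])"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


seen = launch_with_pair_env({
    "CODEX_WS_PORT": "4510",
    "CODEX_PROXY_PORT": "4511",
    "AGENTBRIDGE_CONTROL_PORT": "4512",
})
print(seen)
```

The child never receives the port as an argument or reads a config file; it gets the value only because the environment was set before it was spawned.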

02 Architecture

Two pairs side by side, fully isolated:

  abg claude --pair work            abg claude --pair review
  (env: ports + statedir = work)    (env: ports + statedir = review)
           │                                  │
           ▼                                  ▼
  daemon(work)  4500/01/02          daemon(review)  4510/11/12
           │                                  │
           ▼                                  ▼
  codex --pair work                 codex --pair review
    --remote :4501                    --remote :4511

Each pair is fully isolated, with its own daemon.pid / status.json / logs / killed sentinel, all kept under <stateDir>/pairs/<pairId>/.
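As a rough illustration of that layout, the sketch below builds the per-pair paths from a state directory and a pair id. The function name and the example state directory are hypothetical. Only daemon.pid and status.json are modeled, because the document does not give the file names for the logs or the killed sentinel.

```python
from pathlib import Path


def pair_files(state_dir: str, pair_id: str) -> dict[str, Path]:
    """Return the per-pair paths under <stateDir>/pairs/<pairId>/."""
    root = Path(state_dir) / "pairs" / pair_id
    return {
        "dir": root,
        "pid": root / "daemon.pid",
        "status": root / "status.json",
    }


# "/tmp/agentbridge" is a made-up state directory used only for illustration.
work = pair_files("/tmp/agentbridge", "work")
review = pair_files("/tmp/agentbridge", "review")
print(work["pid"])
print(review["pid"])
```

Because every path is rooted at the pair's own directory, two pairs cannot write to the same pid or status file.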

03 Slot → Port Table

Each pair occupies a slot, which determines its port triple. Rule: slot N uses the classic base ports plus N*10.

slot             appPort            proxyPort            controlPort
                 (CODEX_WS_PORT)    (CODEX_PROXY_PORT)   (AGENTBRIDGE_CONTROL_PORT)
0 (first pair)   4500               4501                 4502
1                4510               4511                 4512
2                4520               4521                 4522
N                4500 + N*10        appPort + 1          appPort + 2

📌 Note

slot 0 = the classic ports, so a single-pair user is unaffected: behavior is identical to before the upgrade.
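The slot rule is plain arithmetic, which a few lines of Python make concrete. This is a minimal sketch of the table above, not the project's code; `ports_for_slot`, `PortTriple`, and the constant names are hypothetical.

```python
from typing import NamedTuple

CLASSIC_BASE = 4500  # appPort of slot 0 (the classic port)
SLOT_STRIDE = 10     # each slot is offset by 10 from the previous one


class PortTriple(NamedTuple):
    app_port: int      # exported as CODEX_WS_PORT
    proxy_port: int    # exported as CODEX_PROXY_PORT
    control_port: int  # exported as AGENTBRIDGE_CONTROL_PORT


def ports_for_slot(slot: int) -> PortTriple:
    """Map a slot number to its port triple: base + N*10, then +1 and +2."""
    if slot < 0:
        raise ValueError("slot must be non-negative")
    base = CLASSIC_BASE + slot * SLOT_STRIDE
    return PortTriple(base, base + 1, base + 2)


for n in range(3):
    print(n, ports_for_slot(n))
```

With a stride of 10 and only three ports per slot, the triples of different slots never overlap.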

04 Usage

This is the most important section. Every command below is copy-pasteable.

Start a named pair

Start the Claude side in terminal 1 and the Codex side in terminal 2; the two are matched by passing the same --pair name:

# Terminal 1
abg claude --pair work

# Terminal 2
abg codex --pair work

A second pair, auto-assigned the next slot

# Terminal 3
abg claude --pair review

# Terminal 4
abg codex --pair review

The review pair automatically takes the next free slot (e.g. slot 1 → ports 4510/11/12) and is fully isolated from the work pair.

No flag: pair derived from cwd

abg claude

Without --pair, the pair identity is derived automatically from the current directory (realpath + short hash). Running the command again in the same project directory reconnects to the same pair.
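One way to picture "realpath + short hash" is the sketch below. The document does not specify the hash function, the hash length, or the final id format, so SHA-256, 8 hex characters, and the `<dirname>-<hash>` shape are all assumptions made for illustration; `derive_pair_id` is a hypothetical name.

```python
import hashlib
import os


def derive_pair_id(cwd: str, hash_len: int = 8) -> str:
    """Derive a stable pair id from a directory (assumed format: name-hash)."""
    # realpath resolves symlinks and relative segments, so every spelling
    # of the same directory maps to the same id.
    real = os.path.realpath(cwd)
    digest = hashlib.sha256(real.encode("utf-8")).hexdigest()[:hash_len]
    name = os.path.basename(real) or "root"
    return f"{name}-{digest}"


print(derive_pair_id("."))
print(derive_pair_id(os.getcwd()))
```

Both calls print the same id, which is the property that lets a second invocation from the same project directory find the existing pair instead of allocating a new one.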

List all pairs

abg pairs           # pairId, slot, ports, cwd, running/stopped, pid
abg pairs --json    # machine-readable output for scripting
abg pairs rm <id>   # stop a pair and free its slot

Kill

# Stop ALL pairs (and any legacy single-pair daemon)
abg kill

# Stop only the "work" pair (keeps its registry entry / slot)
abg kill --pair work

Pair identity rules

05 How It Works · env-injection seam

A "pair resolver" runs at the top of every CLI command:

  1. Under a cross-process lock, it allocates or looks up the pair's slot in the registry at <base>/pairs/registry.json.
  2. It then sets these environment variables: AGENTBRIDGE_STATE_DIR (the pair's state directory), AGENTBRIDGE_CONTROL_PORT, CODEX_WS_PORT, and CODEX_PROXY_PORT, plus AGENTBRIDGE_BASE_DIR (the registry base, so that child abg pairs / abg kill processes resolve the correct registry) and AGENTBRIDGE_PAIR_ID (the pair id, used for diagnostics and status.json).
  3. The existing claude / codex / kill code and the daemon already read these values from the environment, so they just work without any changes.
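The resolver flow described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the registry is assumed to be a flat JSON map from pair id to slot, the lock is a plain exclusive-create lock file named registry.lock, and stale-lock recovery is left out. `registry_lock` and `resolve_pair` are hypothetical names.

```python
import json
import os
import tempfile
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def registry_lock(lock_path: Path, timeout: float = 5.0):
    """Cross-process lock via exclusive file creation (no stale-lock handling)."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(0.01)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def resolve_pair(base_dir: str, pair_id: str) -> dict[str, str]:
    """Look up or allocate the pair's slot, then return the env vars to inject."""
    pairs_dir = Path(base_dir) / "pairs"
    pairs_dir.mkdir(parents=True, exist_ok=True)
    registry_path = pairs_dir / "registry.json"
    with registry_lock(pairs_dir / "registry.lock"):
        # Assumed registry shape: {"<pairId>": <slot>, ...}
        registry = json.loads(registry_path.read_text()) if registry_path.exists() else {}
        if pair_id not in registry:
            used = set(registry.values())
            # Lowest free slot; one of 0..len(used) is always unused.
            registry[pair_id] = next(n for n in range(len(used) + 1) if n not in used)
            registry_path.write_text(json.dumps(registry))
        slot = registry[pair_id]
    base_port = 4500 + slot * 10
    return {
        "AGENTBRIDGE_BASE_DIR": base_dir,
        "AGENTBRIDGE_PAIR_ID": pair_id,
        "AGENTBRIDGE_STATE_DIR": str(pairs_dir / pair_id),
        "CODEX_WS_PORT": str(base_port),
        "CODEX_PROXY_PORT": str(base_port + 1),
        "AGENTBRIDGE_CONTROL_PORT": str(base_port + 2),
    }


# Demo in a throwaway directory: two pairs get distinct slots.
with tempfile.TemporaryDirectory() as base:
    work = resolve_pair(base, "work")
    review = resolve_pair(base, "review")
    print(work["CODEX_WS_PORT"], review["CODEX_WS_PORT"])  # prints "4500 4510"
```

A command would then apply the result with something like `os.environ.update(resolve_pair(base, "work"))` before handing control to the unchanged claude / codex / kill logic, which is what keeps the change confined to one seam.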

🧩 The seam

The whole feature hangs on one seam: env-injection, not core-code surgery.

06 Concurrency Safety

07 Migration Note

⚠️ Run once after upgrading

After upgrading to the multi-pair version, if an old (pre-multi-pair) daemon is still running, run this once:

abg kill

The new code detects the legacy-root daemon and guides you through it.

08 Verification

All green.

Item / Status

Unit tests (pair-registry / concurrency / resolver / command)
  ✅ pass, incl. an 8-process + seeded-dead-lock concurrency test, 0 collisions over 20+ stress runs

Type check + full suite + plugin sync (bun run check)
  ✅ 334 pass / 0 fail, typecheck clean, bundles in sync, version-aligned 0.1.6

Real two-pair runtime E2E (Codex sandbox, the repo's own script)
  ✅ pass: pair a = slot 2 (4520/4521/4522, status pairId "a"), pair b = slot 3 (4530/4531/4532, status pairId "b"); ports/state/logs isolated ✓; kill --pair a stops only a while b is still LISTEN ✓; after kill --all, 4520-4532 are all clear ✓

Cross-review (2 fresh Claude reviewers + Codex per round)
  ✅ 8 rounds of iteration converged to 2 consecutive rounds with 0 real issues (rounds 7 and 8), plus a focused re-review of the post-gate packaging delta with 0 issues. Real bugs fixed: cross-process lock race (3 iterations), AGENTBRIDGE_STATE_DIR dual semantics, empty-string env values, kill --help killing daemons by mistake, pair-scoped command hints (incl. the SessionStart hook)