Best roadmap path: keep Claude Code for Claude, add Codex for OpenAI.
source + help verified1. Read This First
Ship path: keep Claude Code as the default and preserve setup-token onboarding. When Right Agent supports two backends, add Codex alongside Claude Code as the OpenAI backend instead of treating Codex as a Claude-token replacement.
Corrected auth result: the real setup token was tested beyond Claude Code after explicit approval. Pi passed through ANTHROPIC_OAUTH_TOKEN. OpenCode and LangChain reached Anthropic only through ANTHROPIC_API_KEY/x-api-key and hit 429 rate-limit responses. Codex ignored Claude envs because its value is a separate OpenAI-auth backend. Rig source has no OAuth env path; a direct first-party Anthropic x-api-key probe produced the same 429 shape.
Best new research finding: generic LangChain is not a drop-in CLI backend, but the LangChain ecosystem now has a concrete Deep Agents layer with SKILL.md, filesystem/shell backends, LangGraph checkpoints, and ACP code.
2. Visual Decision
3. Scoring Model and Full Overview
Scores are Right Agent product-fit scores for external users, not raw library quality. Implementation effort is deliberately excluded. Composite roadmap scores, such as Codex + Claude Code, measure product capability after multi-backend support exists. That means a composite can outrank a single runtime while still carrying integration risk.
Score cells are heatmapped by percentage of each criterion maximum; bar width shows points captured inside that column. Totals are raw sums out of 85.
| Weight | Criterion | What earns points |
|---|---|---|
| 25 | Auth/onboarding | Works with current setup-token/subscription flow, or has a clean replacement for external users. |
| 15 | Skills compatibility | Preserves OpenClaw/ClawHub SKILL.md semantics with minimal adapter work. |
| 15 | Session durability | Resume, fork, background continuation, and inspectable state are native or easy to model. |
| 15 | Sandbox communication | Clean process/API surface for a Rust bot talking to a sandboxed runtime. |
| 15 | Evidence and policy risk | Higher confidence from source/local tests and fewer vendor-policy unknowns. |
| Rank | Platform/library | Auth /25 |
Skills /15 |
State /15 |
Sandbox /15 |
Evidence /15 |
Total /85 |
Lane | Overview verdict |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Codex + Claude Code | 25 | 15 | 15 | 14 | 15 | 84 | Two-backend path | Highest product-capability path once multi-backend support exists: Claude Code keeps Claude subscription/setup-token semantics while Codex adds a native OpenAI backend with exec --json, app-server, skills, resume, and fork. |
| 2 | Claude Code | 25 | 15 | 14 | 14 | 13 | 81 | Default single runtime | Best current single-backend fit because it already matches setup-token auth, skills, MCP, streaming, resume/fork, and current product behavior. It loses a few points for known subscription/billing anecdotes and platform-specific operational rough edges. |
| 3 | Pi | 23 | 14 | 14 | 11 | 12 | 74 | Claude-subscription alternate | Only non-Claude-Code candidate with a clean real-token pass. It beats OpenCode for Claude-subscription replacement; loses sandbox points because the CLI was smoke-tested in OpenShell but the RPC/tool/event contract has source/test proof rather than a full OpenShell e2e run. |
| 4 | Deep Agents + LangGraph | 10 | 13 | 15 | 12 | 15 | 65 | Programmatic harness | Best programmatic path for skills plus durable resume/fork. Low auth score because it is not setup-token native. |
| 5 | ACP | 10 | 8 | 12 | 15 | 15 | 60 | Adapter/protocol | Good bot-to-harness envelope when the backend already owns auth/state. Auth and skills score low because ACP carries these concerns; it does not define them. |
| 6 | OpenCode | 10 | 15 | 13 | 12 | 8 | 58 | CLI/ACP alternate | Strongest alternate CLI surface and native skills/session/ACP shape. It loses auth/evidence points because no successful Claude setup-token pass was observed; API-key-shaped path only reached 429. |
| 7 | Rig.rs | 12 | 0 | 9 | 13 | 12 | 46 | Rust library lane | Rust-native and attractive for long-term ownership, but it lacks native SKILL.md, Claude setup-token auth, and ready-made Right Agent session semantics. |
| 8 | LangSmith Deployment | 5 | 8 | 11 | 2 | 13 | 39 | Deployment product | Useful for managed/self-hosted runtime research, but not a local sandbox subprocess default for Right Agent. |
| 8 | Generic LangChain | 10 | 3 | 4 | 10 | 12 | 39 | Model/tool library | Useful abstraction layer, but insufficient alone as a Right Agent backend without LangGraph/Deep Agents and a custom harness. |
| 10 | LangServe | 4 | 2 | 2 | 6 | 12 | 26 | REST wrapper | Lowest fit for this product decision. It can expose chains, but does not solve sandboxed agent lifecycle, setup-token auth, or resume/fork. |
Best current single backend: setup-token auth, skills, MCP, streaming, and current product parity.
local real-token passBest Claude-subscription alternate: real setup token passed, with native skills and resume/fork controls.
local real-token passBest owned programmatic harness if Right Agent accepts Python/JS runtime ownership.
first-class new laneAdapter fit score only: useful protocol envelope, not a backend or auth/session source.
stable + draft splitBest CLI/ACP alternate: strong run/server/session paths, no successful setup-token pass.
source-provenRust-native provider/tooling library; Right Agent would own most agent runtime semantics.
source-provenTechnical auth success does not prove policy fitness. For external users, subscription/setup-token use needs vendor guidance, revocation handling, and sandbox-only credential injection.
4. Capability Heatmap
Cells are quick-read summaries. Expand the details below for evidence and caveats.
| Candidate | Setup token | Skills | MCP/tools | Resume | Fork/background | Sandbox communication | Right Agent ownership |
|---|---|---|---|---|---|---|---|
| Claude Code | native Real CLAUDE_CODE_OAUTH_TOKEN pass. |
native Current product skill path remains intact. |
native MCP + strict config already integrated. |
native--resume/--continue. |
native--fork-session, background agents. |
Process stdio with stream JSON inside OpenShell. | Adapter + policy normalization. |
| OpenCode | partial Claude/OAuth envs ignored; ANTHROPIC_API_KEY path reached Anthropic 429. |
native Loads .claude/.agents skill roots. |
native Run, server, MCP, ACP surfaces. |
source Session continue/export/import paths. |
source--fork on continued sessions. |
Prefer stdio JSON first; ACP only if event semantics help. | Auth onboarding, schema parity, event mapping. |
| Pi | pass Real setup token passed through ANTHROPIC_OAUTH_TOKEN. |
native Agent Skills standard and --skill. |
adapter Extensions, custom providers, RPC mode. |
native--continue/--resume/--session. |
native--fork exposed in help. |
Prefer --mode rpc or JSON/text subprocess inside sandbox. |
RPC mapping, OAuth policy, tool approval semantics. |
| LangChain/LangGraph/Deep Agents | API key 429 Claude/OAuth envs ignored; setup token as ANTHROPIC_API_KEY reached Anthropic 429. |
Deep Agents Native in Deep Agents; not generic LangChain. |
native LangChain MCP adapters and custom tools. |
native LangGraph threads/checkpointers. |
native Time travel, replay, update-state fork. |
Right Agent-owned Python/JS harness inside OpenShell. | Full runtime harness, auth, prompts, final output, policy. |
| Rig.rs | wire 429 Source uses ANTHROPIC_API_KEY/x-api-key; direct first-party probe returned 429. |
none No native SKILL.md. |
library MCP/tooling primitives, not product harness. |
build Use memory/history primitives. |
build Right Agent defines branch semantics. |
Rust runner in sandbox over stdio JSON-RPC or ACP. | Almost everything above provider/tool abstractions. |
| Codex + Claude Code | separate Claude Code keeps setup-token; Codex uses OpenAI auth. |
native Codex skills loader supports SKILL.md. |
native MCP server/app-server surfaces. |
nativeresume command. |
nativefork command. |
codex exec --json or app-server JSON-RPC inside sandbox. |
Backend router, OpenAI auth onboarding, cost/account labeling. |
| ACP-backed runtime | none Protocol does not define provider auth. |
none Skills live in backend/wrapper. |
protocol Tool calls and permissions are protocol messages. |
optionalsession/load, session/resume gated by capabilities. |
draftsession/fork is RFD/draft, not stable. |
JSON-RPC over stdio preferred. | Wrapper state store, fork/background semantics. |
5. Local Auth/Test Matrix
All tests used disposable directories under /private/tmp or sandbox-local /tmp. The real setup token was never printed. After explicit approval, real-token tests were run through Claude Code, Pi, OpenCode, LangChain Python/JS, and Codex env paths. Rig.rs is a library, not a first-party harness CLI, so the report records source evidence plus a direct first-party Anthropic wire probe matching Rig's x-api-key path.
| Runtime | Install/version | Env tested | Command shape | Observed result | Evidence level |
|---|---|---|---|---|---|
| Claude Code | npx -y @anthropic-ai/claude-code@2.1.141 |
Real CLAUDE_CODE_OAUTH_TOKEN; no ANTHROPIC_API_KEY. |
claude -p "Reply exactly RA_OK" --output-format text --model sonnet --tools "" --no-session-persistence |
PASS: status 0, output RA_OK, no token leak. Dir: /private/tmp/right-agent-auth-claude-cc.JA4uXT. |
local real-token |
| Claude Code negative control | @anthropic-ai/claude-code@2.1.141 |
Real token placed only in ANTHROPIC_OAUTH_TOKEN; no Claude env. |
Same non-interactive command. | FAIL EXPECTED: Not logged in Β· Please run /login. Dir: /private/tmp/right-agent-auth-claude-anthoauth.6m1IGd. |
local real-token negative |
| OpenCode | npx -y opencode-ai@1.14.49 |
Real token in CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_OAUTH_TOKEN, and ANTHROPIC_API_KEY variants. |
opencode run --pure --model anthropic/claude-sonnet-4-5-20250929 |
PARTIAL: Claude/OAuth env variants were ignored; ANTHROPIC_API_KEY enabled Anthropic provider and reached API, then timed out after 429 rate-limit errors in logs. Dir: /private/tmp/right-agent-auth-opencode-anthoapi-model.btqqy1. |
local real-token source |
| Pi | npx -y @earendil-works/pi-coding-agent@0.74.0 |
Real setup token in ANTHROPIC_OAUTH_TOKEN; no ANTHROPIC_API_KEY. |
pi -p --model anthropic/claude-sonnet-4-5-20250929 --no-tools --no-session |
PASS: status 0, output RA_OK. Dir: /private/tmp/right-agent-auth-pi-real.Lu1sds. |
local real-token source |
| Codex + Claude Code strategy | npx -y @openai/codex@0.130.0 |
Real token in CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_OAUTH_TOKEN variants; no OpenAI key. |
codex exec --json; codex app-server --listen stdio://; source protocol scan. |
DUAL-BACKEND FIT: Claude envs are ignored, as expected. Help/source verify exec --json, app-server, mcp-server, resume, fork, skills/list, and app-server JSON-RPC. The correct path is OpenAI auth plus Claude Code, not Claude token translation. |
local help source |
| LangChain Python + LangGraph | uv pip install langchain-anthropic==1.4.3 langgraph==1.2.0 deepagents==0.6.1 langchain-mcp-adapters==0.2.2 |
Real token in CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_OAUTH_TOKEN, and ANTHROPIC_API_KEY variants. |
Construct ChatAnthropic and invoke claude-sonnet-4-5-20250929; separately run local LangGraph checkpoint/fork harness. |
PARTIAL: Claude/OAuth env variants constructed then failed auth resolution; ANTHROPIC_API_KEY path reached Anthropic and returned 429. LangGraph local resume/fork harness still passed. Dirs: /private/tmp/right-agent-auth-langchain-py-real-ANTHOAPI.qvoaJZ, /private/tmp/right-agent-auth-langchain-py.dekdPY. |
local harness source |
| LangChain.js + LangGraph.js | npm install @langchain/anthropic@1.3.29 @langchain/langgraph@1.3.0 deepagents@1.10.2 @langchain/mcp-adapters@1.1.3 |
Real token in CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_OAUTH_TOKEN, and ANTHROPIC_API_KEY variants. |
Construct ChatAnthropic; run local LangGraph.js memory saver harness. |
PARTIAL: Claude/OAuth env variants reported Anthropic API key not found; ANTHROPIC_API_KEY path reached Anthropic and returned 429 with retries disabled. LangGraph.js local harness still passed. Dirs: /private/tmp/right-agent-auth-langchain-js-real-ANTHOAPI-noretry.bCWTp3, /private/tmp/right-agent-auth-langchain-js.NKhQXm. |
local real-token source |
| Rig.rs | rig-core 0.37.0, source tag rig-v0.37.0 |
Source supports only ANTHROPIC_API_KEY; direct first-party probe used real token on matching x-api-key path. |
Source read plus direct curl to Anthropic /v1/messages with x-api-key. |
SOURCE + WIRE: no first-party Rig harness CLI exists to run. Source uses ANTHROPIC_API_KEY/x-api-key; direct Anthropic x-api-key probe returned 429. Source has no CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_OAUTH_TOKEN path. |
source first-party wire test |
6. Resolved Claim Ledger
This section is the audit surface for prior weak claims. "Blocked" means the report records the exact blocker and the next proof path; it is not scored as confirmed evidence.
Claims Cleared
| Claim | Resolution | Evidence | Product meaning |
|---|---|---|---|
| Pi setup-token support | confirmed | Real /tmp/setup-token passed via ANTHROPIC_OAUTH_TOKEN; source resolves that env before ANTHROPIC_API_KEY and sends OAuth bearer auth for sk-ant-oat tokens. | Technically viable Claude-subscription alternate; policy/revocation remains a release gate. |
| OpenCode setup-token support | not confirmed | Real CLAUDE_CODE_OAUTH_TOKEN/ANTHROPIC_OAUTH_TOKEN variants were ignored. ANTHROPIC_API_KEY shape reached Anthropic and returned 429. | Do not advertise Claude setup-token support. Treat as API-key or non-Claude backend until proven otherwise. |
| Rig.rs auth | source + wire | Source reads ANTHROPIC_API_KEY and emits x-api-key. Direct Anthropic x-api-key probe with the setup token returned 429. | No native Claude setup-token/OAuth lane. Rig requires a Right Agent-owned harness and auth decision. |
| ACP resume/fork | split answer | ACP stable docs define optional loadSession and sessionCapabilities.resume; fork is an RFD/draft session/fork. | ACP can carry resume only when the agent advertises it. Fork is backend/wrapper-specific for now. |
| LangChain ecosystem | expanded | LangChain split into generic LangChain, LangGraph, Deep Agents, LangSmith/Agent Server, and LangServe. Deep Agents source/docs show SKILL.md, backends, checkpointers, and ACP. | The viable programmatic lane is Deep Agents + LangGraph, not generic LangChain alone. |
| Codex + Claude Code scoring | clarified | Codex source/help show OpenAI-auth backend surfaces; Claude Code keeps Claude setup-token onboarding. | The composite scores product capability after a backend router exists. It is not a claim that Codex consumes Claude tokens. |
| OpenShell execution | smoke-proven | Inside right-test-shared, isolated temp HOME/cache executed Pi 0.74.0, OpenCode 1.14.49, and Codex 0.130.0. Python exists; Cargo is absent. | Node CLIs are runnable in OpenShell. LangChain can run with Python; Rig runner must be prebuilt/copied or sandbox image must include Cargo. |
Auth and Setup-Token Matrix
| Candidate | Setup-token result | Observed auth surface | Release decision |
|---|---|---|---|
| Claude Code | pass | CLAUDE_CODE_OAUTH_TOKEN official and local pass. | Default Claude backend. |
| Codex standalone | not applicable | OpenAI auth; Claude envs ignored. | OpenAI backend only. |
| Claude Code + Codex | preserved | Claude Code handles Claude setup-token; Codex handles OpenAI auth. | Best two-backend roadmap. |
| OpenCode | no pass | ANTHROPIC_API_KEY shape reached 429; OAuth envs ignored. | Gate behind API-key/non-Claude auth. |
| Pi | pass | ANTHROPIC_OAUTH_TOKEN real-token pass. | Technical alternate, policy-gated. |
| Rig.rs | no OAuth path | ANTHROPIC_API_KEY/x-api-key source path. | Requires custom auth/harness story. |
| LangChain Python/JS | API-key 429 | ANTHROPIC_API_KEY; OAuth envs ignored. | Use API/cloud auth or wrap Claude Code. |
| LangGraph | inherits model auth | Checkpoint/runtime layer, not provider-auth layer. | Pair with selected model adapter. |
| Deep Agents | inherits model auth | Uses LangChain/provider auth. | Good harness layer; auth still ours. |
| LangSmith Agent Server | not setup-token | LangSmith/cloud deployment auth. | Deployment product, not local default. |
| LangServe | not setup-token | App/server-owned provider credentials. | REST wrapper only. |
| ACP protocol | none | Protocol does not define provider auth. | Auth belongs to backend/wrapper. |
State, Skills, and Sandbox Matrix
| Candidate | Skills | Resume/fork/background | Sandbox communication |
|---|---|---|---|
| Claude Code | native SKILL.md | native resume/continue/fork/background agents. | Existing stdio stream JSON inside OpenShell. |
| Codex standalone | native skills/app-server surfaces. | native resume, fork, thread protocol. | OpenShell smoke pass for CLI; use exec --json or app-server JSON-RPC. |
| Claude Code + Codex | preserved | two stores each backend keeps its own session semantics. | Backend router chooses Claude stream or Codex JSON-RPC per session. |
| OpenCode | native Claude/Agents-style skill roots. | CLI continue/session/fork/export/import surfaces. | OpenShell smoke pass for CLI. Use stdio JSON first; ACP/serve after event audit. |
| Pi | native Agent Skills and --skill. | CLI/RPC continue/resume/session/fork; RPC tests cover state, JSONL sessions, bash. | OpenShell smoke pass for CLI. Full RPC/tool e2e in OpenShell remains the next proof. |
| Rig.rs | none | library memory/history primitives; Right Agent defines branches. | Rust runner should run as prebuilt binary in sandbox. Current sandbox lacks Cargo. |
| LangChain Python/JS | library custom code required. | not alone use LangGraph for durable state. | Run Right Agent-owned Python/JS harness in OpenShell. |
| LangGraph | via wrapper | native threads, checkpointers, replay, update-state fork. | Python exists in OpenShell; package/runtime install must be sandbox-controlled. |
| Deep Agents | native SKILL.md progressive disclosure. | LangGraph checkpointer-backed; background depends on deployment/runtime. | Run backend filesystem/shell inside OpenShell; ACP example exists. |
| LangSmith Agent Server | deployment | server background runs/task queues. | External/self-hosted service, not sandbox subprocess by default. |
| LangServe | not enough | not enough wrapper around chains. | HTTP service adds network surface without solving agent lifecycle. |
| ACP protocol | backend-owned | optional/draft resume optional; fork draft. | JSON-RPC over stdio inside sandbox; never expose host fs/terminal. |
7. Detailed Findings
π§© LangChain / LangGraph / Deep Agents ecosystem first-class candidate
What the ecosystem provides
LangChain Python/JS gives model/tool abstractions, custom tools, streaming, provider integrations, and MCP adapters. LangGraph adds durable execution through threads and checkpointers. LangSmith Deployment adds Agent Server, background runs, task queues, streaming, cron jobs, and deployment/runtime APIs. LangServe remains a REST wrapper and should not be the core Right Agent lane for sandboxed bot-harness execution.
Deep Agents is the relevant agent-harness layer. Its docs and source show SKILL.md support, progressive disclosure, filesystem/shell backends, human-in-the-loop gates, CLI/headless mode, persistent local SQLite sessions, MCP, and ACP library code.
langchain-anthropic reads ANTHROPIC_API_KEY. Real CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_OAUTH_TOKEN variants were ignored by Python/JS. Putting the setup token in ANTHROPIC_API_KEY reached Anthropic but returned 429 rate-limit responses. This lane requires explicit API-key/cloud-provider auth or a Right Agent-owned Claude Code subprocess adapter.Resume/fork answer
LangGraph directly supports checkpoint-backed threads, replay, and fork-like branching via get_state_history() and update_state(). Replay re-executes nodes after the checkpoint; it is not a cache read. Fork is a new checkpoint branch from a prior config.
With ACP, LangGraph can preserve Right Agent resume/fork if the wrapper maps ACP session IDs to LangGraph thread/checkpoint IDs and exposes stable session/load/session/resume. Fork needs draft ACP session/fork or a Right Agent custom method/meta convention.
π Claude Code subscription/setup-token official + local pass
Official facts: current Claude Code authentication docs list credential precedence and explicitly define CLAUDE_CODE_OAUTH_TOKEN as a one-year token generated by claude setup-token. The token authenticates with Pro, Max, Team, or Enterprise subscription credentials; --bare does not read it. Docs also say ANTHROPIC_API_KEY takes precedence and can move usage to API billing.
Local fact: clean install of @anthropic-ai/claude-code@2.1.141 with only real CLAUDE_CODE_OAUTH_TOKEN returned RA_OK. The negative-control test with only ANTHROPIC_OAUTH_TOKEN failed with βNot logged inβ.
Policy/business risk: Help Center billing docs describe Pro/Max usage limits and separate API-credit behavior. GitHub issues show unresolved/auth-cost anecdotes, including setup-token not enough in some containers, subscription recognition failures, API fallback/billing confusion, and double-billing allegations. These are not authoritative facts, but they are product-risk signals.
External-user decision: official Claude Code remains the default setup-token path. Pi has a real technical pass through ANTHROPIC_OAUTH_TOKEN, but third-party consumption of subscription tokens requires vendor-policy review, revocation handling, and sandbox-only injection before user-facing release.
π‘ Pi real OAuth pass
Pi now has a real-token pass. Source shows ANTHROPIC_OAUTH_TOKEN takes precedence over ANTHROPIC_API_KEY. Its Anthropic provider treats tokens containing sk-ant-oat as OAuth, sends bearer auth, and includes Claude Code beta/identity headers. Clean local run with the real setup token in ANTHROPIC_OAUTH_TOKEN returned RA_OK.
Skills are native: help exposes --skill, --no-skills, and docs/source show Agent Skills conventions. Session controls are native: --continue, --resume, --session, and --fork.
Remaining risk: Pi is third-party code consuming a Claude subscription token. The technical pass is real, but product policy requires vendor guidance and a revocation/containment story before exposing this to external users.
π¦ OpenCode best CLI alternate
OpenCode remains the best CLI/ACP alternate because source and CLI help show run, serve, acp, session continuation/fork, MCP management, provider auth, and native skill loading. It loads skill roots from Claude/Agents-style locations, which makes current Right Agent user skill preservation realistic. It is not ranked above Pi for Claude-subscription replacement because it does not have a successful setup-token pass.
Setup-token parity did not pass. Source search found no CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_OAUTH_TOKEN Anthropic provider path. Real-token tests with those envs were ignored. Putting the setup token in ANTHROPIC_API_KEY enabled the Anthropic provider and reached Anthropic, but the run timed out after 429 rate-limit errors. That is an attempted real-token path, not a successful Claude-subscription pass.
Recommendation: build OpenCode behind capability gates. Start with its native run/server event path. Use ACP only if it improves lifecycle and permission normalization inside OpenShell.
π¦ Rig.rs programmatic Rust lane
Rig.rs should not be rejected for lacking CLI session identity. The relevant question is whether Right Agent wants to own a Rust runner. Source shows Anthropic provider support, streaming/memory/tool primitives, and MCP examples. Source also shows Anthropic auth is API-key based: Client::from_env() reads ANTHROPIC_API_KEY, and AnthropicKey emits x-api-key. A direct first-party Anthropic x-api-key probe with the real setup token returned 429, matching the OpenCode/LangChain API-key-shaped path.
Rig.rs does not supply native SKILL.md, OpenClaw skill directories, Claude setup-token auth, or a ready Right Agent-style subprocess protocol. If selected, Right Agent builds skills, prompt rendering, session storage, fork/background semantics, MCP lifecycle, and JSON-RPC/ACP transport.
βοΈ Codex + Claude Code two-backend path
Codex should be researched as the OpenAI half of a two-backend Right Agent, paired with Claude Code. Claude Code keeps Claude subscription/setup-token onboarding; Codex adds OpenAI models, OpenAI account auth, native SKILL.md loading, resume/fork, MCP server support, and two integration surfaces: codex exec --json for simple noninteractive turns and experimental codex app-server JSON-RPC for richer thread/control-plane integration.
Clean real-token tests with CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_OAUTH_TOKEN ignored Claude envs and attempted OpenAI auth, returning 401 without OpenAI credentials. That is not a negative for the revised path; it confirms Codex must not be evaluated as a Claude-token replacement.
Right Agent path: add a backend router with explicit provider identity: claude-code sessions use Claude setup-token and current stream normalization; codex sessions use OpenAI auth and Codex event normalization. Keep credentials separate, label billing/account mode in every audit record, and make bot commands expose backend selection before Codex reaches external users.
π ACP capabilities and limits stable vs draft
| ACP capability | Status | Right Agent answer |
|---|---|---|
| Session creation | stable | session/new creates a session if auth allows it. |
| Session listing | stable optional | session/list exists only when the agent advertises capability. |
| Load/resume | stable optional | session/load and session/resume require advertised support and backend persistence. |
| Fork | draft/RFD | session/fork is not portable stable ACP. |
| Background continuation | framework-specific | Use backend run/task queues or custom wrapper methods; ACP does not create background semantics by itself. |
| Turn cancellation | stable | session/cancel is defined. |
| Tools and permissions | stable | Good event shape; requires Right Agent audit mapping. |
| Filesystem/terminal | client capability | Danger: client-side fs/terminal must be inside OpenShell, never the host bot. |
Custom _meta | extension only | Safe for hints/observability; unsafe as portable semantics because ACP says implementations must not assume values. |
Direct answer: ACP supports resume only when the specific agent advertises stable optional sessionCapabilities.resume or loadSession. ACP does not support fork by itself in stable protocol; fork is a draft RFD. ACP plus Rig.rs can preserve resume/fork only if Right Agent builds storage and fork semantics. ACP plus LangGraph can preserve resume/fork more naturally by mapping ACP sessions to LangGraph threads/checkpoints, but the wrapper still owns the product contract.
8. Bot-to-Sandbox Communication
Claude Code
Bot spawns claude -p --output-format stream-json inside OpenShell. Keep credentials outside sandbox until injected. Use existing stream normalizer.
OpenCode
Start with subprocess run/server JSON. Evaluate ACP after event mapping. Do not let ACP fs/terminal reach host.
Pi
Use RPC/text mode inside sandbox. Map sessions/fork explicitly and verify noninteractive tool approval behavior.
Codex + CC
Use a backend router: Claude Code handles Claude/subscription sessions; Codex handles OpenAI sessions through exec --json or app-server JSON-RPC. Keep credentials, billing labels, and session stores separate.
Deep Agents
Run a Right Agent-owned Python/JS harness inside OpenShell. Bot talks stdio JSON-RPC/ACP or sandbox-local HTTP. Keep LangChain tools in sandbox.
Rig.rs
Compile a Right Agent Rust runner inside sandbox. Use stdio JSON-RPC first; ACP wrapper optional.
LangSmith Deployment
Useful for managed runtime research, but external SaaS or self-hosted Agent Server is a deployment product choice, not a local sandbox subprocess by default.
ACP
Good common envelope when native or wrapper-owned. It is not enough for auth, billing, skills, persistence, fork, or sandbox policy.
9. Recommendation
- Keep Claude Code as default. It is the only backend with official setup-token semantics and a clean real-token pass.
- Add a backend capability trait. Include auth mode, skills, MCP, schema output, streaming, resume, fork, background continuation, ACP support, sandbox transport, and policy status.
- Make Codex + Claude Code the first two-backend path. Claude Code remains the Claude/subscription backend; Codex becomes the OpenAI backend through OpenAI auth,
exec --json, and app-server JSON-RPC. Do not spend effort making Codex consume Claude setup tokens. - Keep Pi as the Claude-subscription alternate. It is the only non-Claude-Code runtime with a clean setup-token pass. The open question is product policy, not technical viability.
- Build OpenCode as the CLI/ACP alternate. It has the closest CLI/server/session shape. Do not promise Claude setup-token support unless the 429/API-key path is confirmed under usable limits and vendor policy.
- Promote Deep Agents/LangGraph to the programmatic lane. It can preserve skills and resume/fork if Right Agent owns the wrapper and accepts Python/JS runtime complexity.
- Keep Rig.rs as a Rust-native prototype lane. It is attractive only if owning the full runtime is desirable.
- Use ACP selectively. ACP is useful for bot-harness communication, not a substitute for backend capability detection.
- Separate onboarding by backend. Claude Code gets setup-token onboarding. Pi can technically use setup-token OAuth through
ANTHROPIC_OAUTH_TOKEN. OpenCode/LangChain/Rig need API-key-shaped auth proof under real limits or a wrapper that shells out to official Claude Code. - Treat third-party setup-token support as policy-sensitive. The technical matrix now has real-token evidence, but external-user exposure requires vendor guidance, revocation handling, and sandbox-only credential injection.
10. Evidence Log and Sources
Evidence quality key
Source URLs and local source snapshots
- official Claude Code auth docs: Authentication; Pro/Max billing help: Use Claude Code with Pro or Max; GitHub Actions docs: Claude Code GitHub Actions.
- GitHub Claude Code issues/discussions: #8938 setup-token container auth, #23315 billing allegation, #31012 subscription recognition, #2944 API fallback, Action discussion #428.
- official LangChain docs: Deep Agents skills, LangChain MCP, LangGraph time travel, LangGraph interrupts, LangSmith Deployment, Agent Server, background runs.
- source LangChain clones:
/private/tmp/right-agent-langchain-srccommit9db9628d793909ce8f3eed4e57b5935f2fb0a235;/private/tmp/right-agent-langgraph-srccommit076e2a3627206f5a1aef573aaca4a01e5af897ca;/private/tmp/right-agent-deepagents-srccommitb3c577de9243b1dc9919f5534273667c50fd0c2d. - source OpenCode: local source
/private/tmp/right-agent-opencode-src, tagv1.14.49, commit1a47578445eb2c630976c089b8e62085f5b5cc17; packageopencode-ai@1.14.49. - source Pi: local source
/private/tmp/right-agent-pi-src, tagv0.74.0, commit1eee081e29c1323c40b98db11d0a62b919831881; package@earendil-works/pi-coding-agent@0.74.0. - source Rig.rs: local source
/private/tmp/right-agent-rig-src, tagrig-v0.37.0, commit63d215f14409293c1d68cd43265f5283c3326fae; craterig-core 0.37.0. - source Codex: local source
/private/tmp/right-agent-codex-src, tagrust-v0.130.0, commit58573da43ab697e8b79f152c53df4b42230395a8; package@openai/codex@0.130.0. Re-checked CLI help forexec --json,app-server,mcp-server,resume,fork, and source protocol items includingthread/resume,thread/fork,skills/list, andgetAuthStatus. - official ACP docs: schema, session setup, session list, session fork RFD, file system, terminals.
- local Package metadata checked on 2026-05-14: Claude Code
2.1.141, OpenCode1.14.49, Pi0.74.0, Codex0.130.0,langchain-anthropic 1.4.3,langgraph 1.2.0,deepagents 0.6.1,langchain-mcp-adapters 0.2.2,@langchain/anthropic 1.3.29,@langchain/langgraph 1.3.0,deepagents 1.10.2,@langchain/mcp-adapters 1.1.3,langsmith 0.8.4,langserve 0.3.3. - local Real-token auth retest on 2026-05-14: Pi
ANTHROPIC_OAUTH_TOKENpass; OpenCode/LangChainANTHROPIC_API_KEYpaths reached Anthropic 429; Codex ignored Claude envs; direct first-party Anthropicx-api-keyand bearer OAuth probes both returned 429 after earlier Claude Code/Pi passes. - local OpenShell smoke on 2026-05-14:
right-test-sharedexecuted Pi0.74.0, OpenCode1.14.49, and Codex0.130.0with isolated sandbox-local temp HOME/cache and no secrets;python3exists at/sandbox/.venv/bin/python3;cargois absent.
Redacted command and source appendix
- command Real-token commands used environment variables populated from
/tmp/setup-tokenwith shell tracing off. The token value was never included in command text, stdout, or this report. - command OpenShell smoke shape:
openshell sandbox exec -n right-test-shared --no-tty --timeout 180 -- sh -lc '... npx -y @earendil-works/pi-coding-agent@0.74.0 --version; npx -y opencode-ai@1.14.49 --version; npx -y @openai/codex@0.130.0 --version'. - command Pi real-token shape:
ANTHROPIC_OAUTH_TOKEN=<redacted> npx -y @earendil-works/pi-coding-agent@0.74.0 --provider anthropic --model claude-sonnet-4-5 --mode json --no-tools --no-session -p 'Reply exactly RA_OK'. - source Pi auth:
/private/tmp/right-agent-pi-src/packages/ai/src/env-api-keys.ts:96-99and/private/tmp/right-agent-pi-src/packages/ai/src/providers/anthropic.ts:758-839. - source Pi RPC:
/private/tmp/right-agent-pi-src/packages/coding-agent/test/rpc.test.ts:14-126covers OAuth/API-key gated RPC startup, state, session JSONL, compaction, and bash command execution. - source Rig Anthropic auth:
/private/tmp/right-agent-rig-src/crates/rig-core/src/providers/anthropic/client.rs:55-60emitsx-api-key;:110-116readsANTHROPIC_API_KEY. - source Deep Agents skills:
/private/tmp/right-agent-deepagents-src/action.yml:153-203installs directories containingSKILL.md. - source Deep Agents ACP:
/private/tmp/right-agent-deepagents-src/libs/acp/examples/demo_agent.py:13-19imports Deep Agents, LangGraph checkpointer, andAgentServerACP;:45-78builds the checkpointer-backed agent;:118-119runs ACP. - source Codex app-server protocol:
/private/tmp/right-agent-codex-src/codex-rs/app-server-protocol/schema/typescript/v2/ThreadResumeParams.ts:10-20andThreadForkParams.ts:10-18.
Completion audit against /tmp/goal-unconfirmed
- ACP is not presented as a backend alternative: passed.
- LangChain/LangGraph is researched as first-class: passed, with Deep Agents separated from generic LangChain.
- ACP resume/fork answered directly: passed, stable optional resume/load, draft fork.
- Skills support assessed for every candidate: passed.
- Claude setup-token research includes official docs, GitHub anecdotes, and local test: passed.
- Local tests use clean installs/config: passed for all executed tests.
- Third-party real-token tests: rerun after explicit approval. Pi passed. OpenCode/LangChain API-key-shaped paths reached Anthropic 429. Codex ignored Claude envs. Rig.rs has no first-party harness CLI; source plus direct first-party wire-equivalent probe was recorded.
- OpenShell proof: passed for Pi/OpenCode/Codex CLI smoke; Python present; Cargo absent, so Rig sandbox execution requires prebuilt binary or sandbox image change.
- Weak claims: passed. Remaining partials are listed in the resolved-claim ledger with exact evidence level and next proof path.
- Secret exposure: token was not printed, not embedded in this report, and not found in disposable
/private/tmp/right-agent-auth-*files after the run. - Revised scoring: passed, with implementation effort excluded, weighted fit-only rubric, per-aspect point columns, and arithmetic totals out of 85 for Claude Code, Codex + Claude Code, Pi, OpenCode, Deep Agents/LangGraph, ACP, Rig.rs, LangSmith Deployment, generic LangChain, and LangServe. Local verifier confirmed all ten score rows sum correctly.
- Codex path re-research: passed, Codex is now presented as the OpenAI half of a two-backend
Codex + Claude Codestrategy, not as a Claude setup-token replacement. - Quick comparison without full prose: heatmap, scorecards, filters, and test matrix provided.