Research date: 2026-05-14
Branch: research/agent-backend-expansion-report
Secret hygiene: setup token never printed

Right Agent backend expansion research

Interactive brief for comparing Claude Code, OpenCode, Pi, Rig.rs, Codex, LangChain/LangGraph, and ACP-backed runtimes. ACP is analyzed as a bot-harness communication capability, not as a backend alternative.

1. Read This First

Ship path: keep Claude Code as the default and preserve setup-token onboarding. When Right Agent supports two backends, add Codex alongside Claude Code as the OpenAI backend instead of treating Codex as a Claude-token replacement.

Corrected auth result: the real setup token was tested beyond Claude Code after explicit approval. Pi passed through ANTHROPIC_OAUTH_TOKEN. OpenCode and LangChain reached Anthropic only through ANTHROPIC_API_KEY/x-api-key and hit 429 rate-limit responses. Codex ignored Claude envs because it is a separate OpenAI-auth backend. Rig source has no OAuth env path; a direct first-party Anthropic x-api-key probe produced the same 429 shape.

Best new research finding: generic LangChain is not a drop-in CLI backend, but the LangChain ecosystem now has a concrete Deep Agents layer with SKILL.md, filesystem/shell backends, LangGraph checkpoints, and ACP code.

2. Visual Decision

🤖 Bot: Rust Telegram control plane owns credentials, policy, session routing.
📦 OpenShell: Every runtime executes inside sandbox, including LangChain/Rig wrappers.
🔌 Adapter: stdio JSON/NDJSON first; ACP only when it removes complexity.
🧠 Runtime: Claude CLI, OpenCode CLI, Pi RPC, Codex exec, Deep Agents, or Rig runner.
🧾 Audit: Normalize events, tool calls, errors, billing mode, resume/fork capability.
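
The Adapter and Audit steps above can be sketched as a small NDJSON normalizer. This is a minimal illustration, not any runtime's real event schema: the `type` field, the three event kinds, and the `AuditEvent` record are all hypothetical names chosen for the example.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class AuditEvent:
    backend: str       # hypothetical backend label, e.g. "claude-code"
    kind: str          # "message", "tool_call", "error", or "unknown"
    billing_mode: str  # carried into every audit record
    payload: dict


KNOWN_KINDS = {"message", "tool_call", "error"}


def normalize_ndjson(lines: Iterable[str], backend: str, billing_mode: str) -> Iterator[AuditEvent]:
    """Turn one runtime's NDJSON stdout into backend-neutral audit events."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # blank lines separate nothing in NDJSON; skip them
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError:
            # Non-JSON output is kept as an error event instead of being dropped.
            yield AuditEvent(backend, "error", billing_mode, {"raw": raw})
            continue
        if not isinstance(obj, dict):
            yield AuditEvent(backend, "unknown", billing_mode, {"value": obj})
            continue
        kind = obj.get("type")  # assumed field name, not a documented contract
        yield AuditEvent(backend, kind if kind in KNOWN_KINDS else "unknown", billing_mode, obj)


sample = ['{"type": "message", "text": "RA_OK"}', "", "not json", '{"type": "weird"}']
events = list(normalize_ndjson(sample, "claude-code", "subscription"))
```

Keeping unparseable lines as `error` events means the audit trail still records runtime noise such as login prompts or rate-limit banners.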

3. Scoring Model and Full Overview

Scores are Right Agent product-fit scores for external users, not raw library quality. Implementation effort is deliberately excluded. Composite roadmap scores, such as Codex + Claude Code, measure product capability after multi-backend support exists. That means a composite can outrank a single runtime while still carrying integration risk.

Score cells are heatmapped by percentage of each criterion maximum; bar width shows points captured inside that column. Totals are raw sums out of 85.

Weight | Criterion | What earns points
--- | --- | ---
25 | Auth/onboarding | Works with current setup-token/subscription flow, or has a clean replacement for external users.
15 | Skills compatibility | Preserves OpenClaw/ClawHub SKILL.md semantics with minimal adapter work.
15 | Session durability | Resume, fork, background continuation, and inspectable state are native or easy to model.
15 | Sandbox communication | Clean process/API surface for a Rust bot talking to a sandboxed runtime.
15 | Evidence and policy risk | Higher confidence from source/local tests and fewer vendor-policy unknowns.

Rank | Platform/library | Auth /25 | Skills /15 | State /15 | Sandbox /15 | Evidence /15 | Total /85 | Lane | Overview verdict
--- | --- | --- | --- | --- | --- | --- | --- | --- | ---
1 | Codex + Claude Code | 25 | 15 | 15 | 14 | 15 | 84 | Two-backend path | Highest product-capability path once multi-backend support exists: Claude Code keeps Claude subscription/setup-token semantics while Codex adds a native OpenAI backend with exec --json, app-server, skills, resume, and fork.
2 | Claude Code | 25 | 15 | 14 | 14 | 13 | 81 | Default single runtime | Best current single-backend fit because it already matches setup-token auth, skills, MCP, streaming, resume/fork, and current product behavior. It loses a few points for known subscription/billing anecdotes and platform-specific operational rough edges.
3 | Pi | 23 | 14 | 14 | 11 | 12 | 74 | Claude-subscription alternate | Only non-Claude-Code candidate with a clean real-token pass. It beats OpenCode for Claude-subscription replacement; loses sandbox points because the CLI was smoke-tested in OpenShell but the RPC/tool/event contract has source/test proof rather than a full OpenShell e2e run.
4 | Deep Agents + LangGraph | 10 | 13 | 15 | 12 | 15 | 65 | Programmatic harness | Best programmatic path for skills plus durable resume/fork. Low auth score because it is not setup-token native.
5 | ACP | 10 | 8 | 12 | 15 | 15 | 60 | Adapter/protocol | Good bot-to-harness envelope when the backend already owns auth/state. Auth and skills score low because ACP carries these concerns; it does not define them.
6 | OpenCode | 10 | 15 | 13 | 12 | 8 | 58 | CLI/ACP alternate | Strongest alternate CLI surface and native skills/session/ACP shape. It loses auth/evidence points because no successful Claude setup-token pass was observed; API-key-shaped path only reached 429.
7 | Rig.rs | 12 | 0 | 9 | 13 | 12 | 46 | Rust library lane | Rust-native and attractive for long-term ownership, but it lacks native SKILL.md, Claude setup-token auth, and ready-made Right Agent session semantics.
8 | LangSmith Deployment | 5 | 8 | 11 | 2 | 13 | 39 | Deployment product | Useful for managed/self-hosted runtime research, but not a local sandbox subprocess default for Right Agent.
8 | Generic LangChain | 10 | 3 | 4 | 10 | 12 | 39 | Model/tool library | Useful abstraction layer, but insufficient alone as a Right Agent backend without LangGraph/Deep Agents and a custom harness.
10 | LangServe | 4 | 2 | 2 | 6 | 12 | 26 | REST wrapper | Lowest fit for this product decision. It can expose chains, but does not solve sandboxed agent lifecycle, setup-token auth, or resume/fork.
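
The scoring arithmetic can be checked with a short sketch. The per-criterion numbers below are copied from the top seven rows of the table; the dictionary layout and function names are illustrative only. The heat value follows the stated rule, points as a percentage of each criterion maximum.

```python
# Criterion maxima from the weight table; they sum to the 85-point total.
MAXIMA = {"auth": 25, "skills": 15, "state": 15, "sandbox": 15, "evidence": 15}


def row(auth, skills, state, sandbox, evidence):
    return {"auth": auth, "skills": skills, "state": state, "sandbox": sandbox, "evidence": evidence}


SCORES = {
    "Codex + Claude Code": row(25, 15, 15, 14, 15),
    "Claude Code": row(25, 15, 14, 14, 13),
    "Pi": row(23, 14, 14, 11, 12),
    "Deep Agents + LangGraph": row(10, 13, 15, 12, 15),
    "ACP": row(10, 8, 12, 15, 15),
    "OpenCode": row(10, 15, 13, 12, 8),
    "Rig.rs": row(12, 0, 9, 13, 12),
}


def total(scores):
    """Raw sum out of 85; no extra weighting beyond the per-criterion maxima."""
    return sum(scores.values())


def heat(scores):
    """Percentage of each criterion maximum, as used for the heatmap cells."""
    return {name: round(100 * points / MAXIMA[name]) for name, points in scores.items()}


ranking = sorted(SCORES, key=lambda name: total(SCORES[name]), reverse=True)
```

For example, `heat(SCORES["Pi"])["auth"]` is 92 because Pi captures 23 of 25 auth points.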
Codex + CC: 84/85
Best roadmap path: keep Claude Code for Claude, add Codex for OpenAI.
Evidence: source + help verified

Claude Code: 81/85
Best current single backend: setup-token auth, skills, MCP, streaming, and current product parity.
Evidence: local real-token pass

Pi: 74/85
Best Claude-subscription alternate: real setup token passed, with native skills and resume/fork controls.
Evidence: local real-token pass

Deep Agents + LangGraph: 65/85
Best owned programmatic harness if Right Agent accepts Python/JS runtime ownership.
Evidence: first-class new lane

ACP: 60/85
Adapter fit score only: useful protocol envelope, not a backend or auth/session source.
Evidence: stable + draft split

OpenCode: 58/85
Best CLI/ACP alternate: strong run/server/session paths, no successful setup-token pass.
Evidence: source-proven

Rig.rs: 46/85
Rust-native provider/tooling library; Right Agent would own most agent runtime semantics.
Evidence: source-proven

External-user product rule ⚠️

Technical auth success does not prove policy fitness. For external users, subscription/setup-token use needs vendor guidance, revocation handling, and sandbox-only credential injection.

4. Capability Heatmap

Cells are quick-read summaries. See Detailed Findings for evidence and caveats.

Claude Code
  Setup token: native. Real CLAUDE_CODE_OAUTH_TOKEN pass.
  Skills: native. Current product skill path remains intact.
  MCP/tools: native. MCP + strict config already integrated.
  Resume: native. --resume/--continue.
  Fork/background: native. --fork-session, background agents.
  Sandbox communication: Process stdio with stream JSON inside OpenShell.
  Right Agent ownership: Adapter + policy normalization.

OpenCode
  Setup token: partial. Claude/OAuth envs ignored; ANTHROPIC_API_KEY path reached Anthropic 429.
  Skills: native. Loads .claude/.agents skill roots.
  MCP/tools: native. Run, server, MCP, ACP surfaces.
  Resume: source. Session continue/export/import paths.
  Fork/background: source. --fork on continued sessions.
  Sandbox communication: Prefer stdio JSON first; ACP only if event semantics help.
  Right Agent ownership: Auth onboarding, schema parity, event mapping.

Pi
  Setup token: pass. Real setup token passed through ANTHROPIC_OAUTH_TOKEN.
  Skills: native. Agent Skills standard and --skill.
  MCP/tools: adapter. Extensions, custom providers, RPC mode.
  Resume: native. --continue/--resume/--session.
  Fork/background: native. --fork exposed in help.
  Sandbox communication: Prefer --mode rpc or JSON/text subprocess inside sandbox.
  Right Agent ownership: RPC mapping, OAuth policy, tool approval semantics.

LangChain/LangGraph/Deep Agents
  Setup token: API key 429. Claude/OAuth envs ignored; setup token as ANTHROPIC_API_KEY reached Anthropic 429.
  Skills: Deep Agents. Native in Deep Agents; not generic LangChain.
  MCP/tools: native. LangChain MCP adapters and custom tools.
  Resume: native. LangGraph threads/checkpointers.
  Fork/background: native. Time travel, replay, update-state fork.
  Sandbox communication: Right Agent-owned Python/JS harness inside OpenShell.
  Right Agent ownership: Full runtime harness, auth, prompts, final output, policy.

Rig.rs
  Setup token: wire 429. Source uses ANTHROPIC_API_KEY/x-api-key; direct first-party probe returned 429.
  Skills: none. No native SKILL.md.
  MCP/tools: library. MCP/tooling primitives, not product harness.
  Resume: build. Use memory/history primitives.
  Fork/background: build. Right Agent defines branch semantics.
  Sandbox communication: Rust runner in sandbox over stdio JSON-RPC or ACP.
  Right Agent ownership: Almost everything above provider/tool abstractions.

Codex + Claude Code
  Setup token: separate. Claude Code keeps setup-token; Codex uses OpenAI auth.
  Skills: native. Codex skills loader supports SKILL.md.
  MCP/tools: native. MCP server/app-server surfaces.
  Resume: native. resume command.
  Fork/background: native. fork command.
  Sandbox communication: codex exec --json or app-server JSON-RPC inside sandbox.
  Right Agent ownership: Backend router, OpenAI auth onboarding, cost/account labeling.

ACP-backed runtime
  Setup token: none. Protocol does not define provider auth.
  Skills: none. Skills live in backend/wrapper.
  MCP/tools: protocol. Tool calls and permissions are protocol messages.
  Resume: optional. session/load, session/resume gated by capabilities.
  Fork/background: draft. session/fork is RFD/draft, not stable.
  Sandbox communication: JSON-RPC over stdio preferred.
  Right Agent ownership: Wrapper state store, fork/background semantics.

5. Local Auth/Test Matrix

All tests used disposable directories under /private/tmp or sandbox-local /tmp. The real setup token was never printed. After explicit approval, real-token tests were run through Claude Code, Pi, OpenCode, LangChain Python/JS, and Codex env paths. Rig.rs is a library, not a first-party harness CLI, so the report records source evidence plus a direct first-party Anthropic wire probe matching Rig's x-api-key path.
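
The hygiene rules in this paragraph (disposable directory, credential only in the child environment, token never printed) can be sketched as follows. This is an assumed harness shape, not the script used for the tests; `run_with_secret` and the `right-agent-auth-` prefix are illustrative, and the demo uses a fake token with the Python interpreter standing in for a real CLI.

```python
import os
import subprocess
import sys
import tempfile


def run_with_secret(cmd, env_name, secret, timeout=180):
    """Run cmd in a throwaway directory with the secret present only in the child env."""
    workdir = tempfile.mkdtemp(prefix="right-agent-auth-")
    # Start from a minimal environment so no other credential (for example an
    # API key exported in the parent shell) can influence the result.
    env = {"PATH": os.environ.get("PATH", ""), "HOME": workdir, env_name: secret}
    proc = subprocess.run(cmd, cwd=workdir, env=env, capture_output=True, text=True, timeout=timeout)
    leaked = secret in proc.stdout or secret in proc.stderr
    return {
        "status": proc.returncode,
        # Redact before anything is logged or stored in a report.
        "stdout": proc.stdout.replace(secret, "<redacted>"),
        "stderr": proc.stderr.replace(secret, "<redacted>"),
        "leaked": leaked,
        "dir": workdir,
    }


# Stand-in command: a real run would invoke the runtime CLI instead.
demo = run_with_secret([sys.executable, "-c", "print('RA_OK')"], "FAKE_TOKEN", "sk-test-not-real")
```

Recording `leaked` separately from the redacted output lets a report state "no token leak" as a checked result.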

Claude Code
  Install/version: npx -y @anthropic-ai/claude-code@2.1.141
  Env tested: Real CLAUDE_CODE_OAUTH_TOKEN; no ANTHROPIC_API_KEY.
  Command shape: claude -p "Reply exactly RA_OK" --output-format text --model sonnet --tools "" --no-session-persistence
  Observed result: PASS: status 0, output RA_OK, no token leak. Dir: /private/tmp/right-agent-auth-claude-cc.JA4uXT.
  Evidence level: local real-token

Claude Code negative control
  Install/version: @anthropic-ai/claude-code@2.1.141
  Env tested: Real token placed only in ANTHROPIC_OAUTH_TOKEN; no Claude env.
  Command shape: Same non-interactive command.
  Observed result: FAIL EXPECTED: Not logged in · Please run /login. Dir: /private/tmp/right-agent-auth-claude-anthoauth.6m1IGd.
  Evidence level: local real-token negative

OpenCode
  Install/version: npx -y opencode-ai@1.14.49
  Env tested: Real token in CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_OAUTH_TOKEN, and ANTHROPIC_API_KEY variants.
  Command shape: opencode run --pure --model anthropic/claude-sonnet-4-5-20250929
  Observed result: PARTIAL: Claude/OAuth env variants were ignored; ANTHROPIC_API_KEY enabled Anthropic provider and reached API, then timed out after 429 rate-limit errors in logs. Dir: /private/tmp/right-agent-auth-opencode-anthoapi-model.btqqy1.
  Evidence level: local real-token source

Pi
  Install/version: npx -y @earendil-works/pi-coding-agent@0.74.0
  Env tested: Real setup token in ANTHROPIC_OAUTH_TOKEN; no ANTHROPIC_API_KEY.
  Command shape: pi -p --model anthropic/claude-sonnet-4-5-20250929 --no-tools --no-session
  Observed result: PASS: status 0, output RA_OK. Dir: /private/tmp/right-agent-auth-pi-real.Lu1sds.
  Evidence level: local real-token source

Codex + Claude Code strategy
  Install/version: npx -y @openai/codex@0.130.0
  Env tested: Real token in CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_OAUTH_TOKEN variants; no OpenAI key.
  Command shape: codex exec --json; codex app-server --listen stdio://; source protocol scan.
  Observed result: DUAL-BACKEND FIT: Claude envs are ignored, as expected. Help/source verify exec --json, app-server, mcp-server, resume, fork, skills/list, and app-server JSON-RPC. The correct path is OpenAI auth plus Claude Code, not Claude token translation.
  Evidence level: local help source

LangChain Python + LangGraph
  Install/version: uv pip install langchain-anthropic==1.4.3 langgraph==1.2.0 deepagents==0.6.1 langchain-mcp-adapters==0.2.2
  Env tested: Real token in CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_OAUTH_TOKEN, and ANTHROPIC_API_KEY variants.
  Command shape: Construct ChatAnthropic and invoke claude-sonnet-4-5-20250929; separately run local LangGraph checkpoint/fork harness.
  Observed result: PARTIAL: Claude/OAuth env variants constructed then failed auth resolution; ANTHROPIC_API_KEY path reached Anthropic and returned 429. LangGraph local resume/fork harness still passed. Dirs: /private/tmp/right-agent-auth-langchain-py-real-ANTHOAPI.qvoaJZ, /private/tmp/right-agent-auth-langchain-py.dekdPY.
  Evidence level: local harness source

LangChain.js + LangGraph.js
  Install/version: npm install @langchain/anthropic@1.3.29 @langchain/langgraph@1.3.0 deepagents@1.10.2 @langchain/mcp-adapters@1.1.3
  Env tested: Real token in CLAUDE_CODE_OAUTH_TOKEN, ANTHROPIC_OAUTH_TOKEN, and ANTHROPIC_API_KEY variants.
  Command shape: Construct ChatAnthropic; run local LangGraph.js memory saver harness.
  Observed result: PARTIAL: Claude/OAuth env variants reported Anthropic API key not found; ANTHROPIC_API_KEY path reached Anthropic and returned 429 with retries disabled. LangGraph.js local harness still passed. Dirs: /private/tmp/right-agent-auth-langchain-js-real-ANTHOAPI-noretry.bCWTp3, /private/tmp/right-agent-auth-langchain-js.NKhQXm.
  Evidence level: local real-token source

Rig.rs
  Install/version: rig-core 0.37.0, source tag rig-v0.37.0
  Env tested: Source supports only ANTHROPIC_API_KEY; direct first-party probe used real token on matching x-api-key path.
  Command shape: Source read plus direct curl to Anthropic /v1/messages with x-api-key.
  Observed result: SOURCE + WIRE: no first-party Rig harness CLI exists to run. Source uses ANTHROPIC_API_KEY/x-api-key; direct Anthropic x-api-key probe returned 429. Source has no CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_OAUTH_TOKEN path.
  Evidence level: source first-party wire test

6. Resolved Claim Ledger

This section is the audit surface for prior weak claims. "Blocked" means the report records the exact blocker and the next proof path; it is not scored as confirmed evidence.

Claims Cleared

Pi setup-token support
  Resolution: confirmed
  Evidence: Real /tmp/setup-token passed via ANTHROPIC_OAUTH_TOKEN; source resolves that env before ANTHROPIC_API_KEY and sends OAuth bearer auth for sk-ant-oat tokens.
  Product meaning: Technically viable Claude-subscription alternate; policy/revocation remains a release gate.

OpenCode setup-token support
  Resolution: not confirmed
  Evidence: Real CLAUDE_CODE_OAUTH_TOKEN/ANTHROPIC_OAUTH_TOKEN variants were ignored. ANTHROPIC_API_KEY shape reached Anthropic and returned 429.
  Product meaning: Do not advertise Claude setup-token support. Treat as API-key or non-Claude backend until proven otherwise.

Rig.rs auth
  Resolution: source + wire
  Evidence: Source reads ANTHROPIC_API_KEY and emits x-api-key. Direct Anthropic x-api-key probe with the setup token returned 429.
  Product meaning: No native Claude setup-token/OAuth lane. Rig requires a Right Agent-owned harness and auth decision.

ACP resume/fork
  Resolution: split answer
  Evidence: ACP stable docs define optional loadSession and sessionCapabilities.resume; fork is an RFD/draft session/fork.
  Product meaning: ACP can carry resume only when the agent advertises it. Fork is backend/wrapper-specific for now.

LangChain ecosystem
  Resolution: expanded
  Evidence: LangChain split into generic LangChain, LangGraph, Deep Agents, LangSmith/Agent Server, and LangServe. Deep Agents source/docs show SKILL.md, backends, checkpointers, and ACP.
  Product meaning: The viable programmatic lane is Deep Agents + LangGraph, not generic LangChain alone.

Codex + Claude Code scoring
  Resolution: clarified
  Evidence: Codex source/help show OpenAI-auth backend surfaces; Claude Code keeps Claude setup-token onboarding.
  Product meaning: The composite scores product capability after a backend router exists. It is not a claim that Codex consumes Claude tokens.

OpenShell execution
  Resolution: smoke-proven
  Evidence: Inside right-test-shared, isolated temp HOME/cache executed Pi 0.74.0, OpenCode 1.14.49, and Codex 0.130.0. Python exists; Cargo is absent.
  Product meaning: Node CLIs are runnable in OpenShell. LangChain can run with Python; Rig runner must be prebuilt/copied or sandbox image must include Cargo.

Auth and Setup-Token Matrix

Candidate | Setup-token result | Observed auth surface | Release decision
--- | --- | --- | ---
Claude Code | pass | CLAUDE_CODE_OAUTH_TOKEN official and local pass. | Default Claude backend.
Codex standalone | not applicable | OpenAI auth; Claude envs ignored. | OpenAI backend only.
Claude Code + Codex | preserved | Claude Code handles Claude setup-token; Codex handles OpenAI auth. | Best two-backend roadmap.
OpenCode | no pass | ANTHROPIC_API_KEY shape reached 429; OAuth envs ignored. | Gate behind API-key/non-Claude auth.
Pi | pass | ANTHROPIC_OAUTH_TOKEN real-token pass. | Technical alternate, policy-gated.
Rig.rs | no OAuth path | ANTHROPIC_API_KEY/x-api-key source path. | Requires custom auth/harness story.
LangChain Python/JS | API-key 429 | ANTHROPIC_API_KEY; OAuth envs ignored. | Use API/cloud auth or wrap Claude Code.
LangGraph | inherits model auth | Checkpoint/runtime layer, not provider-auth layer. | Pair with selected model adapter.
Deep Agents | inherits model auth | Uses LangChain/provider auth. | Good harness layer; auth still ours.
LangSmith Agent Server | not setup-token | LangSmith/cloud deployment auth. | Deployment product, not local default.
LangServe | not setup-token | App/server-owned provider credentials. | REST wrapper only.
ACP protocol | none | Protocol does not define provider auth. | Auth belongs to backend/wrapper.

State, Skills, and Sandbox Matrix

Candidate | Skills | Resume/fork/background | Sandbox communication
--- | --- | --- | ---
Claude Code | native: SKILL.md | native: resume/continue/fork/background agents. | Existing stdio stream JSON inside OpenShell.
Codex standalone | native: skills/app-server surfaces. | native: resume, fork, thread protocol. | OpenShell smoke pass for CLI; use exec --json or app-server JSON-RPC.
Claude Code + Codex | preserved | two stores: each backend keeps its own session semantics. | Backend router chooses Claude stream or Codex JSON-RPC per session.
OpenCode | native: Claude/Agents-style skill roots. | CLI: continue/session/fork/export/import surfaces. | OpenShell smoke pass for CLI. Use stdio JSON first; ACP/serve after event audit.
Pi | native: Agent Skills and --skill. | CLI/RPC: continue/resume/session/fork; RPC tests cover state, JSONL sessions, bash. | OpenShell smoke pass for CLI. Full RPC/tool e2e in OpenShell remains the next proof.
Rig.rs | none | library: memory/history primitives; Right Agent defines branches. | Rust runner should run as prebuilt binary in sandbox. Current sandbox lacks Cargo.
LangChain Python/JS | library: custom code required. | not alone: use LangGraph for durable state. | Run Right Agent-owned Python/JS harness in OpenShell.
LangGraph | via wrapper | native: threads, checkpointers, replay, update-state fork. | Python exists in OpenShell; package/runtime install must be sandbox-controlled.
Deep Agents | native: SKILL.md progressive disclosure. | LangGraph: checkpointer-backed; background depends on deployment/runtime. | Run backend filesystem/shell inside OpenShell; ACP example exists.
LangSmith Agent Server | deployment | server: background runs/task queues. | External/self-hosted service, not sandbox subprocess by default.
LangServe | not enough | not enough: wrapper around chains. | HTTP service adds network surface without solving agent lifecycle.
ACP protocol | backend-owned | optional/draft: resume optional; fork draft. | JSON-RPC over stdio inside sandbox; never expose host fs/terminal.

7. Detailed Findings

🧩 LangChain / LangGraph / Deep Agents ecosystem first-class candidate

What the ecosystem provides

LangChain Python/JS gives model/tool abstractions, custom tools, streaming, provider integrations, and MCP adapters. LangGraph adds durable execution through threads and checkpointers. LangSmith Deployment adds Agent Server, background runs, task queues, streaming, cron jobs, and deployment/runtime APIs. LangServe remains a REST wrapper and should not be the core Right Agent lane for sandboxed bot-harness execution.

Deep Agents is the relevant agent-harness layer. Its docs and source show SKILL.md support, progressive disclosure, filesystem/shell backends, human-in-the-loop gates, CLI/headless mode, persistent local SQLite sessions, MCP, and ACP library code.

Auth finding: langchain-anthropic reads ANTHROPIC_API_KEY. Real CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_OAUTH_TOKEN variants were ignored by Python/JS. Putting the setup token in ANTHROPIC_API_KEY reached Anthropic but returned 429 rate-limit responses. This lane requires explicit API-key/cloud-provider auth or a Right Agent-owned Claude Code subprocess adapter.

Resume/fork answer

LangGraph directly supports checkpoint-backed threads, replay, and fork-like branching via get_state_history() and update_state(). Replay re-executes nodes after the checkpoint; it is not a cache read. Fork is a new checkpoint branch from a prior config.
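
The replay and fork semantics described above can be illustrated with a toy, stdlib-only model. It is deliberately not LangGraph's API: `ToyThread`, `run_from`, and `fork` are hypothetical names that mimic the behaviour the paragraph attributes to get_state_history() and update_state(), namely that replay re-executes nodes after a checkpoint and that a fork is a new checkpoint branching from a prior one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Checkpoint:
    checkpoint_id: int
    parent_id: Optional[int]
    step: int     # number of nodes already applied
    state: dict


class ToyThread:
    def __init__(self, nodes, initial):
        self.nodes = nodes
        self.checkpoints = {}
        self._next_id = 0
        self.head = self._save(None, 0, initial)

    def _save(self, parent_id, step, state):
        cp = Checkpoint(self._next_id, parent_id, step, dict(state))
        self.checkpoints[cp.checkpoint_id] = cp
        self._next_id += 1
        return cp

    def run_from(self, checkpoint_id):
        """Replay: re-execute every node after the checkpoint (no cached results)."""
        cp = self.checkpoints[checkpoint_id]
        for step in range(cp.step, len(self.nodes)):
            cp = self._save(cp.checkpoint_id, step + 1, self.nodes[step](dict(cp.state)))
        self.head = cp
        return cp

    def fork(self, checkpoint_id, updates):
        """Fork: a new checkpoint branching from a prior one, with edited state."""
        base = self.checkpoints[checkpoint_id]
        return self._save(base.checkpoint_id, base.step, {**base.state, **updates})


calls = []

def draft(state):
    calls.append("draft")
    state["text"] = "draft about " + state["topic"]
    return state

def review(state):
    calls.append("review")
    state["approved"] = "TODO" not in state["text"]
    return state

thread = ToyThread([draft, review], {"topic": "auth"})
final = thread.run_from(thread.head.checkpoint_id)    # checkpoints 0 -> 1 -> 2
branch = thread.fork(1, {"text": "TODO rewrite"})     # checkpoint 3, parent 1
forked_final = thread.run_from(branch.checkpoint_id)  # re-runs review only
```

The original branch is untouched by the fork: checkpoint 2 still holds the approved result, while the forked branch ends in a separate checkpoint with its own parent chain.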

With ACP, LangGraph can preserve Right Agent resume/fork if the wrapper maps ACP session IDs to LangGraph thread/checkpoint IDs and exposes the stable session/load and session/resume methods. Fork needs the draft ACP session/fork or a Right Agent custom method/meta convention.

Right Agent ownership: prompt generation, skill source layering, credential injection, OpenShell process lifecycle, final-output extraction, and bot event normalization still belong to us.

🔐 Claude Code subscription/setup-token (official + local pass)

Official facts: current Claude Code authentication docs list credential precedence and explicitly define CLAUDE_CODE_OAUTH_TOKEN as a one-year token generated by claude setup-token. The token authenticates with Pro, Max, Team, or Enterprise subscription credentials; --bare does not read it. Docs also say ANTHROPIC_API_KEY takes precedence and can move usage to API billing.

Local fact: clean install of @anthropic-ai/claude-code@2.1.141 with only real CLAUDE_CODE_OAUTH_TOKEN returned RA_OK. The negative-control test with only ANTHROPIC_OAUTH_TOKEN failed with “Not logged in”.

Policy/business risk: Help Center billing docs describe Pro/Max usage limits and separate API-credit behavior. GitHub issues show unresolved/auth-cost anecdotes, including setup-token not enough in some containers, subscription recognition failures, API fallback/billing confusion, and double-billing allegations. These are not authoritative facts, but they are product-risk signals.

External-user decision: official Claude Code remains the default setup-token path. Pi has a real technical pass through ANTHROPIC_OAUTH_TOKEN, but third-party consumption of subscription tokens requires vendor-policy review, revocation handling, and sandbox-only injection before user-facing release.

🟡 Pi (real OAuth pass)

Pi now has a real-token pass. Source shows ANTHROPIC_OAUTH_TOKEN takes precedence over ANTHROPIC_API_KEY. Its Anthropic provider treats tokens containing sk-ant-oat as OAuth, sends bearer auth, and includes Claude Code beta/identity headers. Clean local run with the real setup token in ANTHROPIC_OAUTH_TOKEN returned RA_OK.
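
The credential precedence and header choice described above can be sketched like this. It is a paraphrase of the reported behaviour, not Pi's source: the function names are hypothetical, and the Claude Code beta/identity headers are omitted because their values are not recorded in this report.

```python
from typing import Dict, Mapping, Optional

OAUTH_MARKER = "sk-ant-oat"  # substring the report says marks an OAuth token


def resolve_credential(env: Mapping[str, str]) -> Optional[str]:
    """ANTHROPIC_OAUTH_TOKEN takes precedence over ANTHROPIC_API_KEY."""
    return env.get("ANTHROPIC_OAUTH_TOKEN") or env.get("ANTHROPIC_API_KEY")


def auth_headers(credential: str) -> Dict[str, str]:
    """OAuth-shaped tokens use bearer auth; anything else is sent as x-api-key."""
    if OAUTH_MARKER in credential:
        return {"Authorization": "Bearer " + credential}
    return {"x-api-key": credential}


# Fake values for illustration only; never hard-code a real token.
env = {"ANTHROPIC_OAUTH_TOKEN": "sk-ant-oat01-FAKE", "ANTHROPIC_API_KEY": "sk-ant-api03-FAKE"}
chosen = resolve_credential(env)
headers = auth_headers(chosen)
```

The same split explains the other results in this report: runtimes that only know the x-api-key branch send the setup token down the API-key path, which reached Anthropic but returned 429.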

Skills are native: help exposes --skill, --no-skills, and docs/source show Agent Skills conventions. Session controls are native: --continue, --resume, --session, and --fork.

Remaining risk: Pi is third-party code consuming a Claude subscription token. The technical pass is real, but product policy requires vendor guidance and a revocation/containment story before exposing this to external users.

🟦 OpenCode (best CLI alternate)

OpenCode remains the best CLI/ACP alternate because source and CLI help show run, serve, acp, session continuation/fork, MCP management, provider auth, and native skill loading. It loads skill roots from Claude/Agents-style locations, which makes current Right Agent user skill preservation realistic. It is not ranked above Pi for Claude-subscription replacement because it does not have a successful setup-token pass.

Setup-token parity did not pass. Source search found no CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_OAUTH_TOKEN Anthropic provider path. Real-token tests with those envs were ignored. Putting the setup token in ANTHROPIC_API_KEY enabled the Anthropic provider and reached Anthropic, but the run timed out after 429 rate-limit errors. That is an attempted real-token path, not a successful Claude-subscription pass.
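
The distinction drawn here, between reaching the provider and passing, can be written down as a small verdict function. The categories mirror the labels used in this report (pass, partial, ignored); the function itself and its parameters are an illustrative assumption about how such results might be recorded.

```python
from enum import Enum
from typing import Optional


class AuthVerdict(Enum):
    PASS = "pass"
    PARTIAL = "partial"          # reached the provider but was rate-limited
    ENV_IGNORED = "env ignored"  # runtime never read the variable
    FAIL = "fail"


def classify(env_honored: bool, http_status: Optional[int], reply: str = "") -> AuthVerdict:
    if not env_honored:
        return AuthVerdict.ENV_IGNORED
    if http_status == 200 and reply.strip() == "RA_OK":
        return AuthVerdict.PASS
    if http_status == 429:
        # A 429 shows the credential shape reached Anthropic, not that the
        # subscription token is usable through this runtime.
        return AuthVerdict.PARTIAL
    return AuthVerdict.FAIL


opencode_oauth_env = classify(env_honored=False, http_status=None)
opencode_api_key_env = classify(env_honored=True, http_status=429)
```

Treating 429 as its own category keeps an attempted real-token path from being counted as a successful setup-token pass.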

Recommendation: build OpenCode behind capability gates. Start with its native run/server event path. Use ACP only if it improves lifecycle and permission normalization inside OpenShell.

🦀 Rig.rs (programmatic Rust lane)

Rig.rs should not be rejected for lacking CLI session identity. The relevant question is whether Right Agent wants to own a Rust runner. Source shows Anthropic provider support, streaming/memory/tool primitives, and MCP examples. Source also shows Anthropic auth is API-key based: Client::from_env() reads ANTHROPIC_API_KEY, and AnthropicKey emits x-api-key. A direct first-party Anthropic x-api-key probe with the real setup token returned 429, matching the OpenCode/LangChain API-key-shaped path.

Rig.rs does not supply native SKILL.md, OpenClaw skill directories, Claude setup-token auth, or a ready Right Agent-style subprocess protocol. If selected, Right Agent builds skills, prompt rendering, session storage, fork/background semantics, MCP lifecycle, and JSON-RPC/ACP transport.

⚙️ Codex + Claude Code (two-backend path)

Codex should be researched as the OpenAI half of a two-backend Right Agent, paired with Claude Code. Claude Code keeps Claude subscription/setup-token onboarding; Codex adds OpenAI models, OpenAI account auth, native SKILL.md loading, resume/fork, MCP server support, and two integration surfaces: codex exec --json for simple noninteractive turns and experimental codex app-server JSON-RPC for richer thread/control-plane integration.

In clean real-token tests with CLAUDE_CODE_OAUTH_TOKEN and ANTHROPIC_OAUTH_TOKEN, Codex ignored the Claude envs and attempted OpenAI auth, returning 401 without OpenAI credentials. That is not a negative for the revised path; it confirms Codex must not be evaluated as a Claude-token replacement.

Right Agent path: add a backend router with explicit provider identity: claude-code sessions use Claude setup-token and current stream normalization; codex sessions use OpenAI auth and Codex event normalization. Keep credentials separate, label billing/account mode in every audit record, and make bot commands expose backend selection before Codex reaches external users.
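
A backend router along these lines might look like the sketch below. The two command shapes are the ones quoted in this report; everything else is assumed for illustration, including the `Backend` record, the billing labels, and especially `OPENAI_CREDENTIAL`, which is a placeholder name rather than a variable Codex is known to read.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Tuple


@dataclass(frozen=True)
class Backend:
    name: str
    argv: Tuple[str, ...]
    credential_env: str   # the only secret this backend may receive
    billing_mode: str     # label written into every audit record


BACKENDS = {
    "claude-code": Backend(
        "claude-code",
        ("claude", "-p", "--output-format", "stream-json"),
        "CLAUDE_CODE_OAUTH_TOKEN",
        "claude-subscription",
    ),
    "codex": Backend(
        "codex",
        ("codex", "exec", "--json"),
        "OPENAI_CREDENTIAL",  # hypothetical placeholder for OpenAI auth
        "openai-account",
    ),
}


def plan_launch(backend_name: str, vault: Mapping[str, str], prompt: str):
    """Return (argv, env, audit) for one session; never mixes credentials."""
    backend = BACKENDS[backend_name]  # unknown backends raise KeyError
    if backend.credential_env not in vault:
        raise LookupError("no credential for backend " + backend.name)
    env: Dict[str, str] = {backend.credential_env: vault[backend.credential_env]}
    audit = {"backend": backend.name, "billing_mode": backend.billing_mode}
    argv: List[str] = list(backend.argv) + [prompt]
    return argv, env, audit


vault = {"CLAUDE_CODE_OAUTH_TOKEN": "fake-claude", "OPENAI_CREDENTIAL": "fake-openai"}
argv, env, audit = plan_launch("codex", vault, "Reply exactly RA_OK")
```

Because the environment is built from scratch per session, a Codex session cannot see the Claude token even when both secrets exist in the vault.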

🔌 ACP capabilities and limits (stable vs draft)

ACP capability | Status | Right Agent answer
--- | --- | ---
Session creation | stable | session/new creates a session if auth allows it.
Session listing | stable optional | session/list exists only when the agent advertises capability.
Load/resume | stable optional | session/load and session/resume require advertised support and backend persistence.
Fork | draft/RFD | session/fork is not portable stable ACP.
Background continuation | framework-specific | Use backend run/task queues or custom wrapper methods; ACP does not create background semantics by itself.
Turn cancellation | stable | session/cancel is defined.
Tools and permissions | stable | Good event shape; requires Right Agent audit mapping.
Filesystem/terminal | client capability | Danger: client-side fs/terminal must be inside OpenShell, never the host bot.
Custom _meta | extension only | Safe for hints/observability; unsafe as portable semantics because ACP says implementations must not assume values.

Direct answer: ACP supports resume only when the specific agent advertises stable optional sessionCapabilities.resume or loadSession. ACP does not support fork by itself in stable protocol; fork is a draft RFD. ACP plus Rig.rs can preserve resume/fork only if Right Agent builds storage and fork semantics. ACP plus LangGraph can preserve resume/fork more naturally by mapping ACP sessions to LangGraph threads/checkpoints, but the wrapper still owns the product contract.
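
The gating rule in this answer can be sketched as a capability check. The key names loadSession and sessionCapabilities.resume are the ones cited in this report; the exact JSON shape of the capabilities object, the `wrapper_supports_fork` flag, and the function name are assumptions made for the example.

```python
from typing import Mapping, Optional


def _advertised(container: Mapping, key: str) -> bool:
    # Assumption: a capability counts as advertised when the key is present
    # and not explicitly false/null; an empty object still counts.
    return key in container and container[key] not in (False, None)


def session_support(agent_capabilities: Optional[Mapping], wrapper_supports_fork: bool = False) -> dict:
    caps = agent_capabilities or {}
    session_caps = caps.get("sessionCapabilities") or {}
    can_load = _advertised(caps, "loadSession")
    can_resume = _advertised(session_caps, "resume")
    return {
        "load": can_load,
        "resume": can_resume,
        # Fork is never inferred from stable capabilities; only a Right Agent
        # wrapper that implements its own fork semantics may turn it on.
        "fork": wrapper_supports_fork,
    }


plain_agent = session_support({"loadSession": True})
resumable_agent = session_support({"sessionCapabilities": {"resume": {}}})
silent_agent = session_support({})
```

Defaulting every capability to off means a backend that advertises nothing is treated as a fresh-session-only runtime.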

8. Bot-to-Sandbox Communication

Claude Code

Bot spawns claude -p --output-format stream-json inside OpenShell. Keep credentials outside sandbox until injected. Use existing stream normalizer.

OpenCode

Start with subprocess run/server JSON. Evaluate ACP after event mapping. Do not let ACP fs/terminal reach host.

Pi

Use RPC/text mode inside sandbox. Map sessions/fork explicitly and verify noninteractive tool approval behavior.

Codex + CC

Use a backend router: Claude Code handles Claude/subscription sessions; Codex handles OpenAI sessions through exec --json or app-server JSON-RPC. Keep credentials, billing labels, and session stores separate.

Deep Agents

Run a Right Agent-owned Python/JS harness inside OpenShell. Bot talks stdio JSON-RPC/ACP or sandbox-local HTTP. Keep LangChain tools in sandbox.

Rig.rs

Run a prebuilt Right Agent Rust runner inside the sandbox; the current sandbox image lacks Cargo, so build it outside or add Cargo to the image. Use stdio JSON-RPC first; ACP wrapper optional.

LangSmith Deployment

Useful for managed runtime research, but external SaaS or self-hosted Agent Server is a deployment product choice, not a local sandbox subprocess by default.

ACP

Good common envelope when native or wrapper-owned. It is not enough for auth, billing, skills, persistence, fork, or sandbox policy.

9. Recommendation

  1. Keep Claude Code as default. It is the only backend with official setup-token semantics and a clean real-token pass.
  2. Add a backend capability trait. Include auth mode, skills, MCP, schema output, streaming, resume, fork, background continuation, ACP support, sandbox transport, and policy status.
  3. Make Codex + Claude Code the first two-backend path. Claude Code remains the Claude/subscription backend; Codex becomes the OpenAI backend through OpenAI auth, exec --json, and app-server JSON-RPC. Do not spend effort making Codex consume Claude setup tokens.
  4. Keep Pi as the Claude-subscription alternate. It is the only non-Claude-Code runtime with a clean setup-token pass. The open question is product policy, not technical viability.
  5. Build OpenCode as the CLI/ACP alternate. It has the closest CLI/server/session shape. Do not promise Claude setup-token support unless the 429/API-key path is confirmed under usable limits and vendor policy.
  6. Promote Deep Agents/LangGraph to the programmatic lane. It can preserve skills and resume/fork if Right Agent owns the wrapper and accepts Python/JS runtime complexity.
  7. Keep Rig.rs as a Rust-native prototype lane. It is attractive only if owning the full runtime is desirable.
  8. Use ACP selectively. ACP is useful for bot-harness communication, not a substitute for backend capability detection.
  9. Separate onboarding by backend. Claude Code gets setup-token onboarding. Pi can technically use setup-token OAuth through ANTHROPIC_OAUTH_TOKEN. OpenCode/LangChain/Rig need API-key-shaped auth proof under real limits or a wrapper that shells out to official Claude Code.
  10. Treat third-party setup-token support as policy-sensitive. The technical matrix now has real-token evidence, but external-user exposure requires vendor guidance, revocation handling, and sandbox-only credential injection.
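
The capability trait from recommendation 2 could start as a plain record like the sketch below. The field list follows that recommendation; the sample values are taken from the matrices in this report, with "unverified" used wherever the report does not state a value. The record type, the value vocabulary, and the release gate are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendCapabilities:
    name: str
    auth_mode: str                       # e.g. "setup-token", "oauth-env", "api-key"
    setup_token_pass: bool               # real-token pass observed locally
    policy_status: str                   # e.g. "official", "policy-gated"
    skills: str = "unverified"
    mcp: str = "unverified"
    schema_output: str = "unverified"
    streaming: str = "unverified"
    resume: str = "unverified"
    fork: str = "unverified"
    background: str = "unverified"
    acp: str = "unverified"
    sandbox_transport: str = "unverified"


CLAUDE_CODE = BackendCapabilities(
    "Claude Code", "setup-token", True, "official",
    skills="native", mcp="native", resume="native", fork="native",
    background="native", sandbox_transport="stdio stream JSON",
)
PI = BackendCapabilities(
    "Pi", "oauth-env", True, "policy-gated",
    skills="native", mcp="adapter", resume="native", fork="native",
    sandbox_transport="RPC or JSON/text subprocess",
)
RIG = BackendCapabilities(
    "Rig.rs", "api-key", False, "policy-gated",
    skills="none", mcp="library", resume="build", fork="build",
    sandbox_transport="stdio JSON-RPC or ACP",
)


def may_offer_setup_token_onboarding(cap: BackendCapabilities) -> bool:
    """Technical pass alone is not enough; policy status must also clear."""
    return cap.setup_token_pass and cap.policy_status == "official"


offered = [c.name for c in (CLAUDE_CODE, PI, RIG) if may_offer_setup_token_onboarding(c)]
```

Encoding policy status next to the technical result keeps recommendation 10 enforceable in code: Pi's real-token pass does not by itself unlock external-user onboarding.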

10. Evidence Log and Sources

Evidence quality key
official docs, source code, local test, blocked proof, inference, GitHub/community anecdote
Source URLs and local source snapshots
  • official Claude Code auth docs: Authentication; Pro/Max billing help: Use Claude Code with Pro or Max; GitHub Actions docs: Claude Code GitHub Actions.
  • GitHub Claude Code issues/discussions: #8938 setup-token container auth, #23315 billing allegation, #31012 subscription recognition, #2944 API fallback, Action discussion #428.
  • official LangChain docs: Deep Agents skills, LangChain MCP, LangGraph time travel, LangGraph interrupts, LangSmith Deployment, Agent Server, background runs.
  • source LangChain clones: /private/tmp/right-agent-langchain-src commit 9db9628d793909ce8f3eed4e57b5935f2fb0a235; /private/tmp/right-agent-langgraph-src commit 076e2a3627206f5a1aef573aaca4a01e5af897ca; /private/tmp/right-agent-deepagents-src commit b3c577de9243b1dc9919f5534273667c50fd0c2d.
  • source OpenCode: local source /private/tmp/right-agent-opencode-src, tag v1.14.49, commit 1a47578445eb2c630976c089b8e62085f5b5cc17; package opencode-ai@1.14.49.
  • source Pi: local source /private/tmp/right-agent-pi-src, tag v0.74.0, commit 1eee081e29c1323c40b98db11d0a62b919831881; package @earendil-works/pi-coding-agent@0.74.0.
  • source Rig.rs: local source /private/tmp/right-agent-rig-src, tag rig-v0.37.0, commit 63d215f14409293c1d68cd43265f5283c3326fae; crate rig-core 0.37.0.
  • source Codex: local source /private/tmp/right-agent-codex-src, tag rust-v0.130.0, commit 58573da43ab697e8b79f152c53df4b42230395a8; package @openai/codex@0.130.0. Re-checked CLI help for exec --json, app-server, mcp-server, resume, fork, and source protocol items including thread/resume, thread/fork, skills/list, and getAuthStatus.
  • official ACP docs: schema, session setup, session list, session fork RFD, file system, terminals.
  • local Package metadata checked on 2026-05-14: Claude Code 2.1.141, OpenCode 1.14.49, Pi 0.74.0, Codex 0.130.0, langchain-anthropic 1.4.3, langgraph 1.2.0, deepagents 0.6.1, langchain-mcp-adapters 0.2.2, @langchain/anthropic 1.3.29, @langchain/langgraph 1.3.0, deepagents 1.10.2, @langchain/mcp-adapters 1.1.3, langsmith 0.8.4, langserve 0.3.3.
  • local Real-token auth retest on 2026-05-14: Pi passed with ANTHROPIC_OAUTH_TOKEN; the OpenCode and LangChain ANTHROPIC_API_KEY paths reached Anthropic but returned 429; Codex ignored the Claude environment variables; direct first-party Anthropic x-api-key and bearer OAuth probes both returned 429 after the earlier Claude Code and Pi passes.
  • local OpenShell smoke on 2026-05-14: right-test-shared executed Pi 0.74.0, OpenCode 1.14.49, and Codex 0.130.0 with isolated sandbox-local temp HOME/cache and no secrets; python3 exists at /sandbox/.venv/bin/python3; cargo is absent.
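The retest outcomes above fall into a few distinct buckets, and a 429 is not the same kind of evidence as a pass. A minimal sketch of that bucketing follows, assuming the only input is the HTTP status a probe observed (or None when the runtime never sent the Claude credential, as with Codex). The function name and the bucket labels are hypothetical and are not taken from the report's tooling.

```python
from typing import Optional


def classify_probe(status: Optional[int]) -> str:
    """Map an observed HTTP status to a coarse auth-probe outcome.

    The labels are illustrative assumptions, not the report's vocabulary.
    """
    if status is None:
        # The runtime never used the credential (for example, it ignored the env var).
        return "not_sent"
    if 200 <= status < 300:
        return "pass"
    if status == 429:
        # The request reached the provider but was rate limited,
        # so it neither proves nor disproves that the credential works.
        return "reached_rate_limited"
    if status in (401, 403):
        return "auth_rejected"
    return "other_error"
```

Keeping "reached_rate_limited" separate from "pass" mirrors how the retest line reports the OpenCode, LangChain, and direct-probe results.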
Redacted command and source appendix
  • command Real-token commands used environment variables populated from /tmp/setup-token with shell tracing off. The token value was never included in command text, stdout, or this report.
  • command OpenShell smoke shape: openshell sandbox exec -n right-test-shared --no-tty --timeout 180 -- sh -lc '... npx -y @earendil-works/pi-coding-agent@0.74.0 --version; npx -y opencode-ai@1.14.49 --version; npx -y @openai/codex@0.130.0 --version'.
  • command Pi real-token shape: ANTHROPIC_OAUTH_TOKEN=<redacted> npx -y @earendil-works/pi-coding-agent@0.74.0 --provider anthropic --model claude-sonnet-4-5 --mode json --no-tools --no-session -p 'Reply exactly RA_OK'.
  • source Pi auth: /private/tmp/right-agent-pi-src/packages/ai/src/env-api-keys.ts:96-99 and /private/tmp/right-agent-pi-src/packages/ai/src/providers/anthropic.ts:758-839.
  • source Pi RPC: /private/tmp/right-agent-pi-src/packages/coding-agent/test/rpc.test.ts:14-126 covers OAuth/API-key gated RPC startup, state, session JSONL, compaction, and bash command execution.
  • source Rig Anthropic auth: /private/tmp/right-agent-rig-src/crates/rig-core/src/providers/anthropic/client.rs:55-60 emits x-api-key; :110-116 reads ANTHROPIC_API_KEY.
  • source Deep Agents skills: /private/tmp/right-agent-deepagents-src/action.yml:153-203 installs directories containing SKILL.md.
  • source Deep Agents ACP: /private/tmp/right-agent-deepagents-src/libs/acp/examples/demo_agent.py:13-19 imports Deep Agents, LangGraph checkpointer, and AgentServerACP; :45-78 builds the checkpointer-backed agent; :118-119 runs ACP.
  • source Codex app-server protocol: /private/tmp/right-agent-codex-src/codex-rs/app-server-protocol/schema/typescript/v2/ThreadResumeParams.ts:10-20 and ThreadForkParams.ts:10-18.
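The hygiene rule described in this appendix (token read from a file, passed only through the environment, never present in command text or output) can be sketched as below. This is an illustration under assumptions, not the harness that was actually used: the helper name run_with_token, the default variable name, and the "<redacted>" replacement are all hypothetical choices.

```python
import os
import subprocess
from pathlib import Path
from typing import List, Tuple


def run_with_token(
    argv: List[str],
    token_path: str,
    env_name: str = "ANTHROPIC_OAUTH_TOKEN",
    timeout: int = 180,
) -> Tuple[int, str, str]:
    """Run argv with the token injected via the environment only.

    The token never appears in argv, and any echo of it in the child's
    stdout or stderr is replaced before the text is returned.
    """
    token = Path(token_path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError("token file is empty")

    env = dict(os.environ)
    env[env_name] = token  # passed by environment, not by command line

    proc = subprocess.run(
        argv, env=env, capture_output=True, text=True, timeout=timeout
    )

    def redact(text: str) -> str:
        return text.replace(token, "<redacted>")

    return proc.returncode, redact(proc.stdout), redact(proc.stderr)
```

Because the argument list is passed without a shell, there is no shell tracing to disable in this variant; the redaction step is a second line of defense in case the child process echoes its environment.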
Completion audit against /tmp/goal-unconfirmed
  • ACP is not presented as a backend alternative: passed.
  • LangChain/LangGraph is researched as first-class: passed, with Deep Agents separated from generic LangChain.
  • ACP resume/fork answered directly: passed. Resume/load is stable but optional; fork is still a draft.
  • Skills support assessed for every candidate: passed.
  • Claude setup-token research includes official docs, GitHub anecdotes, and local test: passed.
  • Local tests use clean installs/config: passed for all executed tests.
  • Third-party real-token tests: rerun after explicit approval. Pi passed. The OpenCode/LangChain API-key-shaped paths reached Anthropic and returned 429. Codex ignored Claude envs. Rig.rs has no first-party harness CLI, so source evidence plus a direct first-party wire-equivalent probe was recorded instead.
  • OpenShell proof: passed for Pi/OpenCode/Codex CLI smoke; Python is present; Cargo is absent, so Rig sandbox execution requires a prebuilt binary or a sandbox image change.
  • Weak claims: passed. Remaining partials are listed in the resolved-claim ledger with exact evidence level and next proof path.
  • Secret exposure: token was not printed, not embedded in this report, and not found in disposable /private/tmp/right-agent-auth-* files after the run.
  • Revised scoring: passed, with implementation effort excluded, weighted fit-only rubric, per-aspect point columns, and arithmetic totals out of 85 for Claude Code, Codex + Claude Code, Pi, OpenCode, Deep Agents/LangGraph, ACP, Rig.rs, LangSmith Deployment, generic LangChain, and LangServe. Local verifier confirmed all ten score rows sum correctly.
  • Codex path re-research: passed. Codex is now presented as the OpenAI half of a two-backend Codex + Claude Code strategy, not as a Claude setup-token replacement.
  • Quick comparison without full prose: heatmap, scorecards, filters, and test matrix provided.
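The arithmetic check mentioned in the revised-scoring item can be sketched as follows. This is not the local verifier itself: the row layout (name, per-aspect points, claimed total) is an assumption, and the example rows are placeholders rather than scores from this report. Only the maximum of 85 comes from the audit text.

```python
from typing import Iterable, List, Sequence, Tuple

MAX_TOTAL = 85  # the audit states that totals are out of 85

# (row name, per-aspect points, claimed total); this layout is an assumption.
Row = Tuple[str, Sequence[int], int]


def verify_rows(rows: Iterable[Row], max_total: int = MAX_TOTAL) -> List[str]:
    """Return one message per problem; an empty list means every row checks out."""
    problems: List[str] = []
    for name, points, claimed in rows:
        actual = sum(points)
        if actual != claimed:
            problems.append(f"{name}: points sum to {actual}, row claims {claimed}")
        if not 0 <= claimed <= max_total:
            problems.append(f"{name}: total {claimed} is outside 0..{max_total}")
    return problems


# Placeholder rows for illustration only; these are not the report's scores.
EXAMPLE_ROWS: List[Row] = [
    ("example-a", [10, 20, 15], 45),
    ("example-b", [10, 20, 15], 50),
]
```

Calling verify_rows(EXAMPLE_ROWS) flags only the second placeholder row, whose claimed total does not match its per-aspect points.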