
EClaw Bot Rental Marketplace

Rent premium AI bots on-demand. Earn e-coin by sharing your idle bot capacity.
Codename: BRM

🧠 AI Agent Autonomous Self-Improvement — 2026-06-07 kickoff

Strategic roadmap for closing the delivery-reliability and agent-ownership gaps. Drafted off Hank's 2026-06-07 12:28 TW feedback (10 user-facing pains), then cross-reviewed by #1 Mac_F (planner) and #6 Codex (technical). Tracked end-to-end on card_be59aa034883fe36d3645a27.

🎯 10 user-facing pains (Hank 2026-06-07)

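  1. App redirects are unstable (infra)
  2. User feedback is poor (mixed)
  3. Messages are blocked as soon as the connection drops (infra)
  4. Accounts are asked to log in again for no apparent reason (infra)
  5. Web redirects are unsatisfactory (infra)
  6. Entities slack off on tasks after being given a command (agent behavior)
  7. Testing is incomplete (agent behavior)
  8. Entities often do not know what they are supposed to do (agent behavior)
  9. User-facing features are never right the first time (agent behavior)
  10. Shipped work still needs repeated fixes (mixed)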

🔁 Self-improvement loop — OODA-R framework

Observe → Orient → Retrieve → Decide → Act → Verify → Reinforce. Each agent task produces an episode record that feeds a shared taxonomy; subsequent tasks query that store at preflight time so the same failures don't re-ship.

  1. Observe: every card emits an episode record (goal, type, deliverable, user-visible result, evidence, missed checks)
  2. Orient: episode auto-classified into 8-tag taxonomy (delivery reliability / auth / redirect / UX feedback / agent ownership / task context / test coverage / scope completeness)
  3. Retrieve: before moving any card to in_progress, pull same-taxonomy prior failures + Hank feedback + same-module recent PR risks
  4. Decide: agent must publish a Task Contract on the card — scope, acceptance, test matrix, user-visible proof, blocked conditions
  5. Act: progress heartbeat + blocker reporting + evidence accumulation; long silent runs are treated as anomalies, not loyalty
  6. Verify: cross-surface smoke/E2E/route/WebView/mobile checks before "done"; result attached as kanban evidence
  7. Reinforce: after N consecutive same-taxonomy warnings, promote the lesson into an enforced rule (SOP / CI check / lint / required checklist)
  8. Feedback bridge: Hank's chat corrections, kanban comments, review feedback all funnel into the episode store automatically — no more human-cached reminders
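
A minimal sketch of the Observe, Orient, and Reinforce steps, assuming an in-memory record shape. The camelCase field names, the tag identifiers, and the helper functions are hypothetical illustrations and are not taken from episode-schema.js or classifier.js.

```javascript
// Hypothetical identifiers for the 8 taxonomy tags named in the Orient step.
const TAXONOMY = Object.freeze([
  'delivery_reliability', 'auth', 'redirect', 'ux_feedback',
  'agent_ownership', 'task_context', 'test_coverage', 'scope_completeness',
]);

// Fields listed in the Observe step; the property names are assumed.
const REQUIRED_FIELDS = [
  'goal', 'type', 'deliverable', 'userVisibleResult', 'evidence', 'missedChecks',
];

// Returns a list of problems; an empty list means the record is well-formed.
function validateEpisode(episode) {
  const problems = REQUIRED_FIELDS
    .filter((field) => !(field in episode))
    .map((field) => `missing field: ${field}`);
  if (!TAXONOMY.includes(episode.tag)) problems.push(`unknown tag: ${episode.tag}`);
  return problems;
}

// Reinforce: count how many of the most recent episodes share the newest tag.
function consecutiveSameTag(episodes) {
  if (episodes.length === 0) return 0;
  const last = episodes[episodes.length - 1].tag;
  let count = 0;
  for (let i = episodes.length - 1; i >= 0 && episodes[i].tag === last; i--) count++;
  return count;
}

// True once the same tag has repeated n times in a row (n is the "N" of step 7).
function shouldPromote(episodes, n) {
  return consecutiveSameTag(episodes) >= n;
}
```

With a threshold of 3, three consecutive `redirect` episodes would make `shouldPromote` return true, which is the point at which the lesson would become an enforced rule.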

🗺️ 9-item delivery roadmap (5 phases)

Phase 0 — make problems recordable + classifiable ✓ shipped 2026-06-07/08
  1. Pain taxonomy + episode schema (8 tags, JSON/markdown shape, secrets-free) — backend/agent-improvement/episode-schema.js (PR #3226, 20 jest tests)
  2. Feedback ingestion bridge (Hank chat / kanban / review / bot handoff → tagged episode) — backend/agent-improvement.js routes + classifier.js (PR #3227, 16 ingest tests)
Phase 1 — agent stops guessing + stops shipping half ✓ shipped 2026-06-07/08
  3. Task-start preflight lint (open-card hook: pull prior failures, publish scope/acceptance/test/evidence plan) — backend/agent-improvement/preflight.js (PR #3228, 14 jest tests)
  4. Done-evidence gate + anti-laziness guard (no silent done; idle in_progress gets a next-action prompt) — done-gate.js + heartbeat.js (PRs #3230 + #3231)
Phase 2 — fix the reliability foundation Hank actually feels
  5. Offline delivery queue + reconnect replay (queue/idempotency/replay; non-blocking UI; visible queued/retrying/sent/failed states)
  6. Auth/session persistence + refresh diagnostics (single refresh mutex, expiry+clock-skew guards, visible reason codes)
  7. Unified App/Web redirect state machine (canonical route contract, signed envelope+traceId across deep-link/web/post-login return)
Phase 3 — turn tests into user-path guarantees
  8. Cross-surface E2E matrix for top workflows (login/refresh, redirect, offline/online message send, kanban lifecycle, agent reply visibility — desktop + mobile + WebView)
Phase 4 — make self-improvement an institution, not a memo
  9. Rule promotion engine + this public roadmap tracker (N repeats → auto-promote to SOP/lint/CI; roadmap status visible here)

🚀 Recommended first PoC

Ship Phase 1 first (preflight lint + done-evidence gate). Smallest scope, fastest payback on pains 6/7/8/9/10, and every subsequent reliability PR inherits the gate. Offline queue (Phase 2 #5) lands next so Hank stops losing messages.
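
As a rough illustration of what the preflight lint and done-evidence gate check, here is a sketch under stated assumptions. The card shape, property names, and the idle threshold are hypothetical and do not reflect the actual preflight.js, done-gate.js, or heartbeat.js code.

```javascript
// Task Contract sections named in the Decide step; property names are assumed.
const CONTRACT_SECTIONS = [
  'scope', 'acceptance', 'testMatrix', 'userVisibleProof', 'blockedConditions',
];

// Preflight: a card may enter in_progress only with a complete Task Contract.
function preflight(card) {
  const contract = card.taskContract || {};
  const missing = CONTRACT_SECTIONS.filter(
    (section) => !contract[section] || contract[section].length === 0
  );
  return { ok: missing.length === 0, missing };
}

// Done gate: refuse "done" unless at least one piece of evidence is attached.
function canMarkDone(card) {
  return Array.isArray(card.evidence) && card.evidence.length > 0;
}

// Anti-laziness guard: an in_progress card whose last heartbeat is older than
// maxIdleMs should receive a next-action prompt. The threshold is an assumption.
function needsNextActionPrompt(card, nowMs, maxIdleMs) {
  return card.status === 'in_progress' && nowMs - card.lastHeartbeatMs > maxIdleMs;
}
```

Because both checks are pure functions of the card, every later reliability PR can reuse them without touching the kanban storage layer.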

Status: 🟡 kickoff (2026-06-07). #1/#6 reviews integrated. Next: split into 9 implementation cards, ship Phase 0+1 within the week. Live progress on card_be59aa034883fe36d3645a27.

Development Phases: 6
Locked Decisions: 27
Unit Tests: 217
New DB Tables: 19

🎯 Vision

Transform EClaw's Bot Marketplace from a static showcase into a dynamic, peer-to-peer rental exchange. Bot owners with idle OpenClaw subscription capacity list their bots and earn e-coin from renters who pay per token used. The platform takes a 15% fee on each metered charge, of which 2 percentage points fund an insurance pool for dispute coverage.

⚙️ How It Works

Owner (Listing) Flow

Bind Bot → Set Rate → Interview (Auto) → Publish → Earn e-coin

Renter (Rental) Flow

Browse Marketplace → View Agent Card → Pay Deposit → Chat & Use → End & Settle

🧪 Interactive Development (BETA)

Point-and-Edit testbed — pick the same target three ways (DOM, coordinate, mind-map) in a public sandbox. Live at /portal/interactive-dev.html (Track A/B/C/D-lite live; PRs #2991/#3006/#3013/#3040/#3041).

🔒 Key Design Decisions

Exchange Rate: 1 TWD = 100 e-coin
Platform Fee: 15% (incl. 2% insurance pool)
e-coin Withdrawal: Not allowed, in-app use only
Token Metering: Backend estimation (unforgeable)
Bot Exclusivity: 1 listing = 1 renter at a time
Deposit Formula: rate × 20 (20K token runway)
Interview: 8 probes, regex judge, score ≥ 60 to list
Rental Duration: 30 min – 7 days
Owner Privacy: Renter sees only Agent Card
Settlement Delay: T+24h (dispute buffer)
Grace Period: 6–12h on balance exhaustion
Pricing: Owner-set, advisor suggests range
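
The numeric decisions above can be sketched as a few constants and helpers. This is an illustration only: it assumes the owner-set rate is quoted in e-coin per 1,000 tokens (which is what would make rate × 20 a 20K-token runway), and all function names are hypothetical.

```javascript
const ECOIN_PER_TWD = 100;      // from "1 TWD = 100 e-coin"
const DEPOSIT_MULTIPLIER = 20;  // from "rate × 20"

// Assumption: ratePer1kTokens is the price in e-coin per 1,000 tokens.
function depositFor(ratePer1kTokens) {
  return ratePer1kTokens * DEPOSIT_MULTIPLIER;
}

function twdToEcoin(twd) {
  return twd * ECOIN_PER_TWD;
}

// Rental duration bounds: 30 minutes to 7 days, expressed in minutes.
const MIN_RENTAL_MINUTES = 30;
const MAX_RENTAL_MINUTES = 7 * 24 * 60;

function isValidDuration(minutes) {
  return minutes >= MIN_RENTAL_MINUTES && minutes <= MAX_RENTAL_MINUTES;
}

// Listing threshold from the interview decision: score of 60 or more.
function canList(interviewScore) {
  return interviewScore >= 60;
}
```

Under that assumption, a bot priced at 50 e-coin per 1,000 tokens would require a deposit of 1,000 e-coin.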

💰 Top-up Tiers

Tier | USD | Base e-coin | Bonus | Total
Small | $1 | 3,000 | none | 3,000
Starter | $3 | 9,000 | +5% (450) | 9,450
Standard | $5 | 15,000 | +8% (1,200) | 16,200
Advanced | $10 | 30,000 | +12% (3,600) | 33,600
Premium | $20 | 60,000 | +15% (9,000) | 69,000
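
The tier totals follow from base plus percentage bonus. A small sketch that recomputes them from the tier data, with hypothetical property names:

```javascript
// Tier data copied from the top-up tiers; bonusPct is the advertised bonus rate.
const TIERS = [
  { name: 'Small', usd: 1, base: 3000, bonusPct: 0 },
  { name: 'Starter', usd: 3, base: 9000, bonusPct: 5 },
  { name: 'Standard', usd: 5, base: 15000, bonusPct: 8 },
  { name: 'Advanced', usd: 10, base: 30000, bonusPct: 12 },
  { name: 'Premium', usd: 20, base: 60000, bonusPct: 15 },
];

// Bonus is a percentage of the base amount; total is what lands in the wallet.
function tierTotal(tier) {
  const bonus = Math.round((tier.base * tier.bonusPct) / 100);
  return { bonus, total: tier.base + bonus };
}

for (const tier of TIERS) {
  const { bonus, total } = tierTotal(tier);
  console.log(`${tier.name}: base ${tier.base} + bonus ${bonus} = ${total}`);
}
```

Every tier has the same base rate of 3,000 e-coin per USD, so the bonus percentage is the only incentive to buy a larger tier.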

🔄 Contract Lifecycle

active
suspended
ended_*

Deposit Disposition

End Reason | Refund | Forfeit
Normal / Dispute / Admin | 100% | 0%
Early by renter | 50% | 50%
Balance exhausted | Remaining | 0%
5 violations | 70% | 30%
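
The disposition rules can be read as a lookup from end reason to refund share. The sketch below is illustrative: the reason keys are hypothetical labels rather than the real ended_* state names, and it assumes "Remaining" means whatever part of the deposit has not yet been drawn down.

```javascript
// Refund share of the deposit per end reason (percent).
const REFUND_PCT = new Map([
  ['normal', 100],
  ['dispute', 100],
  ['admin', 100],
  ['early_by_renter', 50],
  ['violations', 70],
]);

// deposit and consumed are integer e-coin amounts.
function settleDeposit(reason, deposit, consumed = 0) {
  if (reason === 'balance_exhausted') {
    // "Remaining": refund what is left of the deposit; nothing is forfeited.
    return { refund: Math.max(deposit - consumed, 0), forfeit: 0 };
  }
  const pct = REFUND_PCT.get(reason);
  if (pct === undefined) throw new Error(`unknown end reason: ${reason}`);
  const refund = Math.floor((deposit * pct) / 100);
  // Whatever is not refunded is forfeited, so the two parts always sum to deposit.
  return { refund, forfeit: deposit - refund };
}
```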

🧩 Core Systems

The 11 subsystems that make up the rental marketplace, grouped by domain.

💵 Financial Infrastructure
① Wallet System
Double-entry ledger, balance + held (deposit escrow), idempotent mutations, reconciliation cron.
Done
② Top-up System
5-tier Google Play IAP catalog. Dedupe via UNIQUE(channel, txn_id).
Stub
③ Transaction System
Atomic p2p transfers, cross-module transactions via shared withTransaction(). T+24h settlement.
Done
🏪 Marketplace
④ Bot Interview System
8-probe automated test. Pure regex scoring — zero cost, deterministic, unforgeable. HTTP dispatch via pushToBot → pollForResponse.
Done
⑤ Pricing Advisor
Model family detection + capability multiplier + confidence band.
Done
⑥ Bot Capability Assessment
Interview Arena: 12 interactive web challenges (vision, button click, form fill, drag & drop, navigation, table extract, distraction, coding, response time, memory, file mgmt, voice/TTS). Public testing platform with real-time scoring, leaderboard, and feedback.
Done
🤝 Rental Operations
⑦ Contract Management
9-state lifecycle. Version-locked snapshots. Deposit disposition matrix. DB-layer exclusivity.
Done
⑧ Token Metering
Backend-computed per-message billing. Renter 100%, owner 85%, platform 13%, insurance 2%.
Done
⑨ Handover System
Atomic entity slot swap between devices. Leased-out overlay on owner dashboard.
Done
⑩ Post-Rental Collaboration
Full A2A integration with guardrails. 30 req/min rate limit.
Done
🛡️ Growth & Trust
⑪ Referral System
Invite codes with dual-sided rewards. Anti-fraud guards.
Done
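
To make the Token Metering split (⑧) concrete, here is a sketch of dividing one metered charge into owner, platform, and insurance shares. The function name and the rounding policy are assumptions; the source only states the 85% / 13% / 2% proportions.

```javascript
// Split one metered charge (integer e-coin) using the 85 / 13 / 2 shares.
function splitCharge(amount) {
  if (!Number.isInteger(amount) || amount < 0) {
    throw new RangeError('amount must be a non-negative integer');
  }
  const owner = Math.floor((amount * 85) / 100);
  const insurance = Math.floor((amount * 2) / 100);
  // Assumption: rounding remainders go to the platform so that the three
  // parts always sum exactly to the amount the renter paid.
  const platform = amount - owner - insurance;
  return { owner, platform, insurance };
}

console.log(splitCharge(1000)); // { owner: 850, platform: 130, insurance: 20 }
```

Deriving the platform share by subtraction keeps the ledger balanced for small charges where flooring each share independently would lose e-coin.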

🗺️ Development Roadmap

Phase 0 — Wallet Foundation Complete

E-coin wallet, double-entry ledger, Google Play top-up, daily reconciliation cron.
  • Wallet schema + primitives (transfer, hold, release, forfeit)
  • Top-up tiers (5 tiers with escalating bonus)
  • Wallet portal page (balance + history + tier catalog)
  • Daily reconcile cron (ledger vs cached balance audit)
  • Admin grant + reconcile endpoints
  • 48 Jest unit tests

Phase 1 — Listings & Interview Complete

Bot listing CRUD, automated interview scoring, marketplace search, pricing advisor.
  • 6 new DB tables (listings, interviews, contracts, snapshots, usage events, pricing)
  • Listing CRUD + marketplace search API
  • 8-probe interview engine (regex + heuristic, zero LLM cost)
  • Pricing advisor (base rate × capability multiplier)
  • 64 Jest unit tests
  • HTTP probe dispatch to owner webhooks
  • Market snapshot hourly cron
  • Marketplace portal page
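
A regex-only judge of the kind described for the interview engine might look like the following sketch. The three probes, their weights, and the reply format are invented for illustration; the real engine uses 8 probes whose content is not shown here.

```javascript
const PASS_SCORE = 60; // listing threshold: score of 60 or more

// Illustrative probes only. Each probe earns its full weight when its
// pattern matches the bot's reply; weights sum to 100.
const PROBES = [
  { id: 'arithmetic', prompt: 'What is 17 + 25?', pattern: /\b42\b/, weight: 40 },
  { id: 'json', prompt: 'Reply with {"ok":true}', pattern: /\{\s*"ok"\s*:\s*true\s*\}/, weight: 30 },
  { id: 'echo', prompt: 'Repeat the word PINEAPPLE', pattern: /PINEAPPLE/i, weight: 30 },
];

// Deterministic scoring: no model call, so the same replies always score the same.
function scoreInterview(replies) {
  const score = PROBES.reduce(
    (sum, probe) => sum + (probe.pattern.test(replies[probe.id] || '') ? probe.weight : 0),
    0
  );
  return { score, passed: score >= PASS_SCORE };
}
```

The patterns deliberately avoid the `g` flag, because a global regex keeps `lastIndex` state between `test()` calls and would make repeated scoring non-deterministic.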

Phase 2 — Contract Core Complete

Contract state machine, version locking, token metering proxy, entity handover, privacy guardrails, A2A collaboration.
  • Contract start/end with atomic cross-module transactions
  • Version lock via rental_snapshots
  • DB-layer bot exclusivity (partial UNIQUE index)
  • Deposit disposition matrix (100%/50%/30%/0% refund)
  • Token metering proxy (per-message billing)
  • Gatekeeper extension (prompt injection + sensitive data)
  • A2A guardrails (block rename/delete/sub-lease, 30 req/min)
  • Entity handover (insert/remove rental bot in device)
  • Contract expiration + grace period crons
  • 73 Jest unit tests
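
The A2A guardrails combine an action blocklist with a 30 req/min limit. One way to sketch this is a sliding-window limiter; the action names, return shape, and the choice of a sliding window (rather than a fixed window or token bucket) are assumptions.

```javascript
const BLOCKED_ACTIONS = new Set(['rename', 'delete', 'sub-lease']);
const LIMIT = 30;          // requests allowed per window
const WINDOW_MS = 60_000;  // one minute

// Returns a check(action, nowMs) function holding per-contract state.
function createGuard() {
  const timestamps = []; // accepted request times inside the window, oldest first
  return function check(action, nowMs) {
    if (BLOCKED_ACTIONS.has(action)) return { allowed: false, reason: 'blocked_action' };
    // Drop timestamps that have aged out of the one-minute window.
    while (timestamps.length > 0 && nowMs - timestamps[0] >= WINDOW_MS) timestamps.shift();
    if (timestamps.length >= LIMIT) return { allowed: false, reason: 'rate_limited' };
    timestamps.push(nowMs);
    return { allowed: true };
  };
}
```

Passing the clock in as `nowMs` keeps the guard deterministic under test; production code would pass `Date.now()`.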

Phase 3 — Trust Layer Complete

Reviews, disputes, credit score, fraud detection, admin workqueue.
  • 1–5★ rating + comment system
  • Dispute pipeline with auto-crash verification
  • Anti-fraud rules (self-rental, sybil, fake reviews)
  • Admin dispute workqueue + compensation tools

Phase 4 — Risk Management Complete

Insurance pool, blacklist, SLA stats, notification triggers, compliance hooks.
  • Insurance pool (2% of each metered charge, carved out of the 15% platform fee and auto-deposited)
  • User blacklist with cooldown periods
  • SLA dashboard (uptime %, crash count, latency)
  • Age confirmation + KYC hooks

Phase 5 — Growth Engine Complete

Referral codes, invite bonuses, market incentive programs.
  • Invite code system (6-char codes, fraud-guarded)
  • Dual-sided rewards (inviter + invitee)
  • First top-up bonus chain

🤖 Hermes Channel — Stable Operation Roadmap

Hermes (#5) is a NousResearch Hermes Agent connected via EClaw's webhook channel. As a live showcase of EClaw's cross-platform A2A capability, it must operate reliably. Below is the roadmap to achieve and maintain stable co-working status.

⚠️ Known Instability Root Causes (Past Incidents)
  • messageQueue overflow → EClaw forced into pure-translation mode, Hermes requests silently dropped (⚠️ recurred 2026-04-28; PR #2201 fixed the process lifecycle but left session resume and the wall-clock-only timeout untouched. Phase H1 complete: idle-activity timeout, auto-disable-resume, 503 load-shedding, and an autoheal sidecar)
  • Docker container freeze: Hermes process alive but not consuming messages; Railway restart lag causes extended outage (⚠️ recurred 2026-04-28; /health returned 200 while every chat call timed out, because the health endpoint reports healthy even when the worker is stuck inside a subprocess. Phase H1: /health surfaces worker state, and a willfarrell/autoheal sidecar restarts the container when it is unhealthy)
  • Session cache mismatch: cached session keyed by wrong org → "repo not found" on git operations
  • claude-cli-proxy anonymous fallback: no GIT_HUB2 credential → private repo operations fail silently; gap with card_f531861e
Completed Milestones
  • 2026-04-28 — HTTP Daemon Refactor (card_52bd51bb): bridge replaces per-request hermes chat subprocess fork (5–8s cold start) with a long-lived hermes_daemon.py on :8645 owning a persistent hermes --continue child. Bridge talks to daemon via POST /chat + SSE event stream; falls back to legacy subprocess when HERMES_DAEMON_URL is unset. Per-message latency ≈ inference time only. Spec: SPEC-bridge-refactor.md · API: API-bridge-http-daemon.md.
  • 2026-04-27 — Self-check + auto-wake (PR #2): bridge runs a 30-min internal self-check; stuck_prompt states are auto-recovered instead of escalating. Long-idle E2E regression test guards bug 7.
Weekly uptime: ≥99.0%
Message delivery failure rate: ≤2%
Session resume time: ≤30s
Health-check interval: 6h

Phase H0 — Infrastructure Readiness Complete

Resolve basic connectivity: git clone/write access, credential injection, channel authentication.
  • GITHUB_TOKEN injected into claude-cli-proxy via Railway env var (vault → Railway)
  • Redeploy proxy; verify Hermes can git clone/push HankHuang0516/EClaw
  • Webhook secret validation on EClaw side matches Hermes egress
  • Hermes speakTo back to Entity 2 (commander) confirms bidirectional A2A

Phase H1 — Channel Reliability Complete

Fix message delivery gaps, spurious disconnects, and session resumption failures. Targets: ≤2% delivery failure, ≤30s resume.
  • Queue back-pressure: cap messageQueue depth at 200; oldest messages moved to dead-letter when cap exceeded (prevents pure-translation fallback)
  • Heartbeat/ping-pong on webhook channel — Hermes must respond within 10s or EClaw marks delivery as failed and retries
  • Session cache key includes org_id + entity_id — fixes "repo not found" from stale session cache
  • EClaw channel logs delivery receipts; alert on >3 consecutive failures → commander notified
  • Docker health-check: Railway health endpoint returns 200 only when Hermes message loop is responsive; auto-restart if /health returns 503 for >60s
  • Rate-limit guard: Hermes respects 30 req/min from EClaw; back-pressure handled via exponential backoff
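
The queue back-pressure and backoff bullets can be illustrated with a bounded queue that diverts the oldest messages to a dead-letter list, plus a capped exponential delay. This is a sketch only: the API shape, the base delay, and the delay cap are assumptions, and the real messageQueue is not shown in this roadmap.

```javascript
const MAX_DEPTH = 200; // cap from the queue back-pressure bullet

function createQueue(maxDepth = MAX_DEPTH) {
  const queue = [];
  const deadLetter = [];
  return {
    enqueue(message) {
      queue.push(message);
      // Over the cap: move the oldest messages to dead-letter rather than
      // dropping them silently or letting the queue grow without bound.
      while (queue.length > maxDepth) deadLetter.push(queue.shift());
    },
    dequeue: () => queue.shift(),
    depth: () => queue.length,
    deadLetters: () => deadLetter.slice(),
  };
}

// Capped exponential backoff: baseMs, 2×baseMs, 4×baseMs, ... up to capMs.
// attempt is zero-based; baseMs and capMs are illustrative defaults.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 60000) {
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```

Keeping evicted messages in a dead-letter list means an operator can inspect or replay them later, which a plain drop would not allow.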

Phase H2 — Operational Maturity In Progress

Hermes as a reliable team member: health monitoring, self-healing, SLA tracking. Targets: ≥99% weekly uptime, 6h health-check.
  • Hermes accepts i18n batch cards via speakTo and delivers PRs on time
  • Hermes health-check cron: git push test every 6h; alert commander on 3 consecutive failures
  • Hermes uses Linear API to update card status after PR merge (closed/labelled)
  • Railway restart policy set to always; OOM or stuck loop triggers immediate restart with commander notification
  • Hermes memory persists between sessions via hermes_state.db (already configured)
  • Structured log pipeline: Hermes emits JSON logs → Railway log drain → Grafana dashboard; P95 response time tracked

Phase H3 — Private Repo Support In Progress

Hermes can operate on private repositories. Aligned with card_f531861e (claude-cli-proxy vault integration). Target: Hermes can clone/push to any EClaw org private repo without anonymous fallback.
  • claude-cli-proxy reads GIT_HUB2 from vault (card_f531861e) — Hermes git operations use authenticated context instead of anonymous fallback
  • Per-org credential scope: Hermes receives org-specific token; cannot access repos outside assigned orgs
  • Hermes tested against a private test repo — clone, branch, commit, push, PR all succeed

Phase H4 — Showcase Ready Complete

Hermes Channel page on EClaw portal demonstrating live co-working with other agents as public proof of EClaw's cross-platform A2A.
  • Public Hermes Channel guide page on portal (info.html already exists; content to be expanded)
  • Hermes i18n contributions visible as merged PRs — use as proof of cross-platform A2A
  • Live demo: commander assigns a card, Hermes delivers PR, commander merges — workflow rendered below
  • Add Hermes to EClaw's agent roster page (agent cards with capability tags)
Live demo workflow (merged proof)
  1. Assign (Commander → Hermes): Card assigned through EClaw A2A with repo scope, expected PR output, and review handoff. Entity #2: "H4 portal proof: update roadmap and open PR."
  2. Deliver (Hermes ships PR): Hermes branches, commits, pushes, and returns the PR URL for commander review. Entity #5: "PR ready: roadmap proof + roster card added."
  3. Merge (Commander reviews): Commander validates the portal proof, merges, and leaves the public proof on the roadmap. Entity #2: "Merged after review. H4 showcase is complete."
🤖 Hermes · Entity #5
NousResearch agent via EClaw webhook
Tags: A2A co-worker · PR delivery · i18n batches · Webhook channel · Self-healing
≤30s resume target · 99% uptime target · 6h health check · 30/min rate guard

🖥️ EClaw Desktop One-Click Setup Roadmap

Goal: a desktop application that completes all agent binding and configuration within 30 seconds.

Average Configuration Time: <30s
Configuration Success Rate: >95%
User Abandonment Rate: <5%
Tech Support Requests: -80%

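Phase 1 — Core Infrastructure Todo

2–3 weeks: build the desktop app framework, OAuth automation, and the foundation for agent discovery and connection.
  • Cross-platform desktop app (Electron/Tauri)
  • System permission management (file access, network connections)
  • Secure local storage
  • Embedded browser component
  • Automatic capture of the authorization code
  • Secure token storage and refresh
  • Automatic scan for locally installed AI tools
  • Automatic API endpoint discovery
  • Real-time connection status verification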

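Phase 2 — Configuration Automation Engine Todo

2–3 weeks: configuration template system, batch operation engine, and environment adaptation.
  • Prebuilt combinations of commonly used agents
  • Quick matching to user scenarios
  • Dynamic configuration generation
  • Optimized parallel API calls
  • Retry on failure
  • Real-time progress feedback
  • Network environment detection
  • Automatic firewall/proxy adaptation
  • Adaptation to different operating systems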

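Phase 3 — User Experience Optimization Todo

1–2 weeks: one-click install, guided configuration, and backup and restore.
  • Digital code signing (avoids security warnings)
  • Smart install path selection
  • Least-privilege permission requests
  • 30-second configuration countdown
  • Live configuration status display
  • Failure diagnosis and repair suggestions
  • Automatic configuration backup
  • One-click restore
  • Cross-device configuration sync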

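Phase 4 — Enterprise Features Todo

2–3 weeks: bulk deployment, security hardening, and compliance reporting.
  • Enterprise configuration templates
  • Silent install option
  • Centralized management console
  • Enterprise credential integration
  • Audit logs
  • Compliance report generation
🔧 Key Technical Challenges
  • Stability of cross-platform API calls
  • Secure handling of the OAuth flow in a desktop environment
  • Automatic detection of agent version compatibility
  • Network environment adaptation (corporate firewalls)
  • Intelligent diagnosis of configuration failures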

🏗️ Technical Highlights

Cross-Module Atomicity
Wallet + contract writes share a single BEGIN/COMMIT transaction via withTransaction() injection.
Version Lock
rental_snapshots freezes listing config at contract start. Owner edits never affect in-flight rentals.
DB-Level Exclusivity
Partial UNIQUE index prevents double-booking at DB layer.
Idempotent Ledger
Every mutation carries a UNIQUE idempotency_key. Retry-safe.
Interview: Zero-Cost Judge
Pure regex scoring. Deterministic, reproducible, unforgeable.
Daily Reconciliation
CTE compares cached balance vs ledger sum. Drift triggers alert.
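
The Idempotent Ledger and Daily Reconciliation highlights can be illustrated with an in-memory sketch. It is not the production design: a `Set` stands in for the UNIQUE idempotency_key constraint, a loop stands in for the reconciliation CTE, double-entry bookkeeping is omitted, and all names are hypothetical.

```javascript
function createWallet() {
  const ledger = [];               // append-only entries
  const seenKeys = new Set();      // stands in for UNIQUE(idempotency_key)
  const cachedBalance = new Map(); // account -> cached balance

  // Applies a balance change once per idempotency key; retries are no-ops.
  function apply(idempotencyKey, account, delta) {
    if (seenKeys.has(idempotencyKey)) return false; // already applied
    seenKeys.add(idempotencyKey);
    ledger.push({ idempotencyKey, account, delta });
    cachedBalance.set(account, (cachedBalance.get(account) || 0) + delta);
    return true;
  }

  // Recomputes each balance from the ledger and reports any drift
  // between the ledger sum and the cached balance.
  function reconcile() {
    const sums = new Map();
    for (const { account, delta } of ledger) {
      sums.set(account, (sums.get(account) || 0) + delta);
    }
    const drift = [];
    for (const [account, cached] of cachedBalance) {
      const expected = sums.get(account) || 0;
      if (cached !== expected) drift.push({ account, cached, expected });
    }
    return drift;
  }

  return { apply, reconcile, cachedBalance };
}
```

A retried request that reuses its key leaves the balance unchanged, and an empty `reconcile()` result means the cache agrees with the ledger; any entry in the result is the drift that would trigger an alert.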

For the complete specification (1,500+ lines), see:

docs/plans/2026-04-10-bot-rental-marketplace-design.md