经 AI Skill Hub 精选评估,开源AI工作流 获评「强烈推荐」。这款Agent工作流在功能完整性、社区活跃度和易用性方面表现出色,AI 评分 8.0 分,适合有一定技术背景的用户使用。
开源AI工作流 是一套完整的 AI Agent 自动化工作流方案。通过可视化的节点编排,将复杂的多步骤任务拆解为清晰的自动化流程,实现全程无人值守的智能处理。支持与数百种外部服务和 API 无缝集成,适合构建数据处理管线、业务自动化和 AI 辅助决策系统。
开源AI工作流 是一套完整的 AI Agent 自动化工作流方案。通过可视化的节点编排,将复杂的多步骤任务拆解为清晰的自动化流程,实现全程无人值守的智能处理。支持与数百种外部服务和 API 无缝集成,适合构建数据处理管线、业务自动化和 AI 辅助决策系统。
# 方式一:cargo install(推荐) cargo install opencrabs # 方式二:从源码编译 git clone https://github.com/adolfousier/opencrabs cd opencrabs cargo build --release # 二进制在 ./target/release/opencrabs
# 查看帮助 opencrabs --help # 基本运行 opencrabs [options] <input> # 详细使用说明请查阅文档 # https://github.com/adolfousier/opencrabs
# opencrabs 配置说明 # 查看配置选项 opencrabs --config-example > config.yml # 常见配置项 # output_dir: ./output # log_level: info # workers: 4 # 环境变量(覆盖配置文件) export OPENCRABS_CONFIG="/path/to/config.yml"
| Feature | Description |
|---|---|
| **Full Terminal Access** | 30+ built-in tools (file I/O, glob, grep, web search, code execution, image gen/analysis, memory search, cron jobs) plus **any CLI tool on your system** via bash — GitHub CLI, Docker, SSH, Python, Node, ffmpeg, curl, and everything else just work |
| **RTK Token Savings** | Automatic bash output optimization via [RTK](https://github.com/rtk-ai/rtk) integration — enabled by default, zero config. Prepends rtk to supported commands (git, cargo, npm, pnpm, yarn, docker, kubectl, grep, find, ls, tree, curl, and 100+ more) to filter noise from command output. Reduces token usage on bash commands by 60-90% without losing critical information. Check savings with /rtk command. RTK binary bundled with OpenCrabs releases; for source builds, auto-downloads to ~/.local/bin/rtk on first use |
| **Per-Session Isolation** | Each session is an independent agent with its own provider, model, context, and tool state. Sessions can run tasks in parallel against different providers — ask Claude a question in one session while Kimi works on code in another |
| **Self-Healing** | Detects and recovers from phantom tool calls, gaslighting preambles, text repetition loops, XML tool call failures, and provider errors. Short-circuits repeated failing bash commands and rejects interactive commands that would hang. Automatic context compaction at 65% (soft) and 90% (hard). Sticky fallback promotion when primary recovers |
| **Self-Sustaining** | Agent can modify its own source, build, test, and hot-restart via Unix exec() |
| **Self-Improving** | Learns from experience — saves reusable workflows as custom commands, writes lessons learned to memory, updates its own brain files. All local, no data leaves your machine |
| **Dynamic Tools** | Define custom tools at runtime via ~/.opencrabs/tools.toml — the agent can call them autonomously like built-in tools. HTTP and shell executors, template parameters ({{param}}), enable/disable without restart. The tool_manage meta-tool lets the agent create, remove, and reload tools on the fly |
| **Skills (cross-harness)** | Multi-stage workflow templates in the de-facto SKILL.md format used by Claude Code, Anthropic managed agents, and OpenClaw. Drop a SKILL.md under ~/.opencrabs/skills/<name>/ and it auto-registers as /<name> — no commands.toml entry needed. Works in the TUI **and** every connected channel (Telegram, Discord, Slack, WhatsApp). Built-ins ship with the binary (always version-matched); user skills override by file presence. Two built-ins out of the box: /security-audit (language-agnostic CVE & static-analysis audit, scores 0-100) and /cost-estimate (codebase valuation with AI-assisted ROI). Same SKILL.md is portable across harnesses |
| **Mission Control** | Full-screen /mission-control dialog showing every actionable artifact in one place: pending RSI proposals (inbox cards), recent RSI activity (improvements log feed), and the schedule queue (cron jobs + paused/active state). Apply or reject inbox proposals inline with a / r — same machinery as the agent's rsi_proposals tool, byte-identical install. Tab between panels, j/k to navigate, Enter for the detail popup, Esc to close. Cron paused jobs flag in orange, active in teal — at-a-glance state |
| **Skills picker** | Full-screen /skills dialog with a live filter input — start typing to narrow the list (case-insensitive on name + description), Tab / Shift-Tab cycle the filtered cards (wraps at the edges), Enter runs the selected skill (sends its body as a prompt to the agent), Esc closes. Built-in skills badge orange; user-installed skills badge teal. When the filter narrows to a single match, Enter just fires it — fastest path to launch a skill |
| **Browser Automation** | Native browser control via CDP (Chrome DevTools Protocol). Auto-detects your default Chromium-based browser (Chrome, Brave, Edge, Arc, Vivaldi, Opera, Chromium) and uses its profile — your logins, cookies, and extensions carry over. 7 browser tools: navigate, click, type, screenshot, eval JS, extract content, wait for elements. Headed or headless mode with display auto-detection. **Note:** Firefox is not supported (no CDP) — if Firefox is your default, OpenCrabs falls back to the first available Chromium browser. Feature-gated under browser (included by default) |
| **Natural Language Commands** | Tell OpenCrabs to create slash commands — it writes them to commands.toml autonomously via the config_manager tool |
| **Live Settings** | Agent can read/write config.toml at runtime; Settings TUI screen (press S) shows current config; approval policy persists across restarts. Default: auto-approve (use /approve to change) |
| **Web Search** | DuckDuckGo (built-in, no key needed) + EXA AI (neural, free via MCP) by default; Brave Search optional (key in keys.toml) |
| **Debug Logging** | --debug flag enables file logging; DEBUG_LOGS_LOCATION env var for custom log directory |
| **Agent-to-Agent (A2A)** | HTTP gateway implementing A2A Protocol RC v1.0 — peer-to-peer agent communication via JSON-RPC 2.0. Supports message/send, message/stream (SSE), tasks/get, tasks/cancel. Built-in a2a_send tool lets the agent proactively call remote A2A agents. Optional Bearer token auth. Includes multi-agent debate (Bee Colony) with confidence-weighted consensus. Task persistence across restarts |
| **Profiles** | Run multiple isolated instances from the same installation. Each profile gets its own config, keys, memory, sessions, and database. Create with opencrabs profile create <name>, switch with -p <name>. Migrate config between profiles with profile migrate. Export/import for sharing. Token-lock isolation prevents two profiles from using the same bot credential |
opencrabs -p hermes service install opencrabs -p hermes service start
/onboard:image in chat (or go through onboarding Advanced mode) to configurekeys.toml:[image]
api_key = "AIza..."
And config.toml:
[image.generation]
enabled = true
model = "gemini-3.1-flash-image-preview"
[image.vision]
enabled = true
model = "gemini-3.1-flash-image-preview"
```bash
https://github.com/user-attachments/assets/7f45c5f8-acdf-48d5-b6a4-0e4811a9ee23
---
models = ["qwen2.5-coder-7b-instruct", "llama-3-8B", "mistral-7B-instruct"]
> **Local LLMs (Ollama, LM Studio):** No API key needed — just set `base_url` and `default_model`.
>
> **Remote APIs (Groq, Together, etc.):** Add the key in `keys.toml` using the same name:
> toml > [providers.custom.groq] > api_key = "your-api-key" >
> **Note:** `/chat/completions` is auto-appended to base URLs that don't include it.
> **Local reasoning models (`enable_thinking`):** when a custom provider's `base_url` points at a local host (`localhost`, `127.0.0.1`, `*.local`, or an RFC1918 private IP like `192.168.x.x` / `10.x.x.x` / `172.16.x.x`–`172.31.x.x`), OpenCrabs injects `chat_template_kwargs: {"enable_thinking": true}` into every request. This mirrors `llama-server --jinja --chat-template-kwargs '{"enable_thinking":true}'` — what Unsloth Studio launches with by default — so Qwen3 / Kimi / DeepSeek-R1 templates render `<tool_call>` tags and reasoning blocks correctly, and tool calls actually execute instead of being hallucinated as text. Set `enable_thinking = false` in the provider block to disable (falls back to fast, non-thinking mode). Cloud providers are unaffected.
>
> toml > [providers.custom.lm_studio] > enabled = true > base_url = "http://localhost:1234/v1" > default_model = "qwen3-30b-a3b" > enable_thinking = false # optional — default is true for local providers >
> **Qwen / Alibaba cache (zero-config):** when a custom provider's `base_url` points at a known Qwen / Alibaba endpoint (`dashscope.aliyuncs.com`, `dashscope-intl.aliyuncs.com`, `aliyun.com`, `dialagram.me`) or the request model name starts with `qwen-`, OpenCrabs auto-injects `cache_control: {"type": "ephemeral"}` markers on the system message, the last message (streaming), and the last tool definition. This unlocks Alibaba's [explicit context cache](https://www.alibabacloud.com/help/en/model-studio/explicit-cache-best-practice) — cache hits bill input tokens at 10% of the standard price (≈90% off), with a 25% surcharge on first creation and a 5-minute TTL auto-renewed on every hit. The detection runs per-request so a provider routing between Qwen and non-Qwen models only marks the Qwen ones. Non-Qwen backends ignore the marker (it's an unknown JSON field), so the only cost on a mismatch is a few wasted bytes per request.
>
> toml > [providers.custom.dashscope] > enabled = true > base_url = "https://dashscope.aliyuncs.com/compatible-mode/v1" > default_model = "qwen3-max" > # No cache flags needed — auto-enabled on URL match. >
**Multiple custom providers** coexist — define as many as you need with different names and switch between them via `/models`:
toml [providers.custom.lm_studio] enabled = true base_url = "http://localhost:1234/v1" default_model = "qwen2.5-coder-7b-instruct"
[providers.custom.ollama] enabled = false base_url = "http://localhost:11434/v1" default_model = "mistral"
The name after `custom.` is a label you choose (e.g. `lm_studio`, `nvidia`, `groq`). The one with `enabled = true` is active. Keys go in `keys.toml` using the same label. All configured custom providers persist — switching via `/models` just toggles `enabled`.
#### Free Prototyping with NVIDIA API + Kimi K2.5
[Kimi K2.5](https://build.nvidia.com/moonshotai/kimi-k2.5) is a frontier-scale multimodal Mixture-of-Experts (MoE) model available **for free** on the NVIDIA API Catalog — no billing setup or credit card required. It handles complex reasoning and image/video understanding, making it a strong free alternative to paid models like Claude or Gemini for experimentation and agentic workflows.
**Tested and verified** with OpenCrabs Custom provider setup.
**Quick start:**
1. Sign up at the [NVIDIA API Catalog](https://build.nvidia.com/) and verify your account
2. Go to the [Kimi K2.5 model page](https://build.nvidia.com/moonshotai/kimi-k2.5) and click **Get API Key** (or "View Code" to see an auto-generated key)
3. Configure in OpenCrabs via `/models` or `config.toml`:
toml [providers.custom.nvidia] enabled = true base_url = "https://integrate.api.nvidia.com/v1" default_model = "moonshotai/kimi-k2.5" toml
Grab a pre-built binary from GitHub Releases — available for Linux (amd64/arm64), macOS (amd64/arm64), and Windows.
```bash
| Endpoint | Method | Description |
|---|---|---|
/.well-known/agent.json | GET | Agent Card — discover skills, capabilities, supported content types |
/a2a/v1 | POST | JSON-RPC 2.0 — message/send, message/stream (SSE), tasks/get, tasks/cancel |
/a2a/health | GET | Health check |
| Feature | Description |
|---|---|
| **Telegram Bot** | Full-featured Telegram bot — owner DMs share TUI session, groups get isolated per-group sessions (keyed by chat ID). Photo/voice support (STT transcribes incoming voice notes; TTS replies as OGG/Opus voice notes via send_voice when input was audio). Allowed user IDs, allowed chat/group IDs, respond_to filter (all/dm_only/mention). Passive group message capture — all messages stored for context even when bot isn't mentioned |
| **WhatsApp** | Connect via QR code pairing at runtime or from onboarding wizard. Text + image + voice (STT transcribes incoming voice notes; TTS replies as voice notes when input was audio and tts_enabled=true). Shared session with TUI, phone allowlist (allowed_phones), session persists across restarts |
| **Discord** | Full Discord bot — text + image + voice. Owner DMs share TUI session, guild channels get isolated per-channel sessions. Allowed user IDs, allowed channel IDs, respond_to filter. Full proactive control via discord_send (17 actions): send, reply, react, unreact, edit, delete, pin, unpin, create_thread, send_embed, get_messages, list_channels, add_role, remove_role, kick, ban, send_file. Generated images sent as native Discord file attachments |
| **Slack** | Full Slack bot via Socket Mode — owner DMs share TUI session, channels get isolated per-channel sessions. Text + image + voice (STT transcribes incoming audio attachments; TTS replies upload an OGG/Opus audio file via files.upload — renders inline with waveform UI — when input was audio and tts_enabled=true). Allowed user IDs, allowed channel IDs, respond_to filter. Full proactive control via slack_send (17 actions): send, reply, react, unreact, edit, delete, pin, unpin, get_messages, get_channel, list_channels, get_user, list_members, kick_user, set_topic, send_blocks, send_file. Generated images sent as native Slack file uploads. Bot token + app token from api.slack.com/apps (Socket Mode required). **Required Bot Token Scopes:** chat:write, channels:history, groups:history, im:history, mpim:history, users:read, files:read, files:write, reactions:write, app_mentions:read |
| **Trello** | Tool-only by default — the AI acts on Trello only when explicitly asked via trello_send. Opt-in polling via poll_interval_secs in config; when enabled, only @bot_username mentions from allowed users trigger a response. Full card management via trello_send (22 actions): add_comment, create_card, move_card, find_cards, list_boards, get_card, get_card_comments, update_card, archive_card, add_member_to_card, remove_member_from_card, add_label_to_card, remove_label_from_card, add_checklist, add_checklist_item, complete_checklist_item, list_lists, get_board_members, search, get_notifications, mark_notifications_read, add_attachment. API Key + Token from trello.com/power-ups/admin, board IDs and member-ID allowlist configurable |
When users send files, images, or documents across any channel, the agent receives the content automatically — no manual forwarding needed. Example: a user uploads a dashboard screenshot to a Trello card with the comment "I'm seeing this error" — the agent fetches the attachment, passes it through the vision pipeline, and responds with full context.
| Channel | Images (in) | Text files (in) | Documents (in) | Audio (in) | Audio reply (out) | Image gen (out) |
|---|---|---|---|---|---|---|
| **Telegram** | ✅ vision pipeline | ✅ extracted inline | ✅ / PDF note | ✅ STT | ✅ TTS via send_voice (OGG/Opus) | ✅ native photo |
| **WhatsApp** | ✅ vision pipeline | ✅ extracted inline | ✅ / PDF note | ✅ STT | ✅ TTS via upload + audio_message (OGG/Opus, ptt=true) | ✅ native image |
| **Discord** | ✅ vision pipeline | ✅ extracted inline | ✅ / PDF note | ✅ STT | ✅ TTS as response.ogg attachment | ✅ file attachment |
| **Slack** | ✅ vision pipeline | ✅ extracted inline | ✅ / PDF note | ✅ STT | ✅ TTS via files.upload (OGG/Opus, inline waveform) | ✅ file upload |
| **Trello** | ✅ card attachments → vision | ✅ extracted inline | — | — | — | ✅ card attachment + embed |
| **TUI** | ✅ paste path → vision | ✅ paste path → inline | — | ✅ STT | — (terminal has no native audio) | ✅ [IMG: name] display |
Images are passed to the active model's vision pipeline if it supports multimodal input, or routed to the analyze_image tool (Google Gemini vision) otherwise. Text files (.txt, .md, .json, .csv, source code, etc.) are extracted as UTF-8 and included inline up to 8 000 characters — in the TUI simply paste or type the file path.
Videos uploaded on any channel (mp4, m4v, mov, webm, mkv, avi, 3gp, flv) auto-route to analyze_video when image.vision.enabled = true with a Gemini API key. The TUI also detects pasted video paths and labels them Video #N in the attachment indicator. Provider-side limits to keep in mind: Gemini's inline-bytes mode caps at ~20 MB (we use ≤18 MB), and the resumable Files API supports up to 2 GB / ~1 hour videos. Channel-side limits are tighter — Telegram's Bot API hard-caps getFile downloads at 20 MB even though chats accept larger uploads, so videos over that size will get a friendly "compress to under 20 MB and resend" reply. Slack file downloads use the bot token (files:read scope) and inherit the workspace's per-file upload cap. Frame-extraction fallback for non-Gemini providers is not yet wired — without a Gemini key, video uploads return an "unsupported" notice.
| **OpenCrabs** (Rust) | **Node.js Frameworks** (e.g. Open Claw) | |
|---|---|---|
| **Binary size** | **26–29 MB** single binary, zero dependencies | **1 GB+** node_modules with hundreds of transitive packages |
| **Runtime** | None — runs natively | Requires Node.js runtime + npm install |
| **Attack surface** | Zero network listeners. Outbound HTTPS only | Server infrastructure: open ports, auth layers, middleware |
| **API key security** | Keys on your machine only. zeroize clears them from RAM on drop, [REDACTED] in all debug output | Keys in env vars or config. GC doesn't guarantee memory clearing. Heap dumps can leak secrets |
| **Data residency** | 100% local — SQLite DB, embeddings, brain files, all in ~/.opencrabs/ | Server-side storage, potential multi-tenant data, network transit |
| **Supply chain** | Single compiled binary. Rust's type system prevents buffer overflows, use-after-free, data races at compile time | npm ecosystem: typosquatting, dependency confusion, prototype pollution |
| **Memory safety** | Compile-time guarantees — no GC, no null pointers, no data races | GC-managed, prototype pollution, type coercion bugs |
| **Concurrency** | tokio async + Rust ownership = zero data races guaranteed | Single-threaded event loop, worker threads share memory unsafely |
| **Native TTS/STT** | Built-in local speech-to-text (whisper.cpp) and text-to-speech — ~130 MB total stack, fully offline | No native voice. Requires external APIs (Google, AWS, Azure) or heavy Python dependencies (PyTorch, ~5 GB+) |
| **Telemetry** | Zero. No analytics, no tracking, no remote logging | Server infra typically includes monitoring, logging pipelines, APM |
高质量的AI工作流项目,值得关注
AI Skill Hub 为第三方内容聚合平台,本页面信息基于公开数据整理,不对工具功能和质量作任何法律背书。
建议在沙箱或测试环境中充分验证后,再部署至生产环境,并做好必要的安全评估。
✅ MIT 协议 — 最宽松的开源协议之一,可自由商用、修改、分发,仅需保留版权声明。
AI Skill Hub 点评:开源AI工作流 的核心功能完整,质量优秀。对于自动化工程师和运维人员来说,这是一个值得纳入个人工具库的选择。建议先在非生产环境试用,再逐步推广。
| 原始名称 | opencrabs |
| 原始描述 | 开源AI工作流:The self-improving all channels AI agent. Self-healing. Fully autonomous. Single。⭐770 · Rust |
| Topics | agent-orchestrationagentic-aiautonomous-agents |
| GitHub | https://github.com/adolfousier/opencrabs |
| License | MIT |
| 语言 | Rust |
收录时间:2026-05-30 · 更新时间:2026-05-30 · License:MIT · AI Skill Hub 不对第三方内容的准确性作法律背书。
选择 Agent 类型,复制安装指令后粘贴到对应客户端