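Following a curated evaluation by AI Skill Hub, Open Computer Use MCP is rated "Strongly Recommended". The tool scores well on feature completeness, community activity, and ease of use, with an AI score of 8.2, and suits users with some technical background.
Open Computer Use MCP is an AI tool extension that follows the MCP (Model Context Protocol) standard. Through MCP, AI clients such as Claude and Cursor can access and operate external tools, data sources, and services directly. File operations, database queries, and API calls can all be triggered in natural language from within an AI conversation.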
# Option 1: one-step install via the Claude Code CLI
claude skill install https://github.com/Wide-Moat/open-computer-use
# Option 2: configure claude_desktop_config.json manually
{
  "mcpServers": {
    "open-computer-use": {
      "command": "npx",
      "args": ["-y", "open-computer-use"]
    }
  }
}
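# Config file location
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
# After installing, use it directly in a Claude conversation
# Example:
User: Please use Open Computer Use MCP to carry out the following task...
Claude: [automatically calls the Open Computer Use MCP tool to handle the request]
# List the available tools
# In Claude, type: "List all available MCP tools"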
// claude_desktop_config.json example
{
  "mcpServers": {
    "open-computer-use": {
      "command": "npx",
      "args": ["-y", "open-computer-use"],
      "env": {
        // "API_KEY": "your-api-key-here"
      }
    }
  }
}
// Save the file, then restart Claude Desktop for the change to take effect
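JSON itself does not allow `//` comments, so a hand-edited config that keeps them may fail to parse. The sketch below is one way to add a server entry programmatically and get valid JSON out. It is a minimal illustration, not a tool shipped by the project: the function name `merge_mcp_server` and the server label `open-computer-use` are assumptions chosen for this example, and the `npx` command and arguments are simply copied from the listing above.

```python
import json
from typing import Dict, List, Optional


def merge_mcp_server(config: dict, name: str, command: str, args: List[str],
                     env: Optional[Dict[str, str]] = None) -> dict:
    """Return a copy of config with one entry added under "mcpServers".

    The input dict is not modified, so existing servers are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    entry = {"command": command, "args": list(args)}
    if env:  # only emit "env" when there is something to put in it
        entry["env"] = dict(env)
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
updated = merge_mcp_server(existing, "open-computer-use", "npx",
                           ["-y", "open-computer-use"])
print(json.dumps(updated, indent=2))
```

In practice you would load the file from the config location for your platform with `json.load`, call the helper, and write the result back with `json.dump`.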
MCP server that gives any LLM its own computer — managed Docker workspaces with live browser, terminal, code execution, document skills, and autonomous sub-agents. Self-hosted, open-source, pluggable into any model.
Online demo: chat.yambr.com (Open WebUI with Computer Use already set up; sign in with GitHub or Google). More ways to try it below.
See it in action: Demo course on docs.yambr.com, with eight live scenarios captured from the chat above (pitch deck, Word doc, Excel, PDF invoice, data chart, live-rendered landing page, web scrape, building a custom skill). Real prompts, real screenshots, copy-pasteable.
If any of this looks useful, a ⭐ on the repo really helps. Thanks!

| Category | Tools |
|---|---|
| **Languages** | Python 3.12, Node.js 22, Java 21, Bun |
| **Documents** | LibreOffice, Pandoc, python-docx, python-pptx, openpyxl |
| **PDF** | pypdf, pdf-lib, reportlab, tabula-py, ghostscript |
| **Images** | Pillow, OpenCV, ImageMagick, sharp, librsvg |
| **Web** | Playwright (Chromium), Mermaid CLI |
| **AI** | Claude Code CLI, Playwright MCP |
| **OCR** | Tesseract (configurable languages) |
| **Media** | FFmpeg |
| **Diagrams** | Graphviz, Mermaid |
| **Dev** | TypeScript, tsx, git |
If you run Open WebUI outside the stock docker-compose.webui.yml — your own compose, Kubernetes, Portainer, or a downstream repo — there are four traps that will silently break Computer Use. All four hit us in production. Check in this order.
Build from openwebui/Dockerfile, don't pull upstream
Pulling ghcr.io/open-webui/open-webui:vX.Y.Z gives you a stock image without any of this repo's patches. Four of them are critical for UX:
| Patch | Without it |
|---|---|
| `fix_artifacts_auto_show` | HTML/iframe renders as raw text in chat body instead of the artifacts panel |
| `fix_preview_url_detection` | Preview iframe is never auto-inserted after file links |
| `fix_tool_loop_errors` | Raw exceptions instead of banners; `MCP call failed: Session terminated` appears unwrapped |
| `fix_large_tool_results` | `TOOL_RESULT_MAX_CHARS` stops truncating and the large-result upload path (via `ORCHESTRATOR_URL`) becomes a no-op; large outputs wreck the model context |
Only CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES keeps working on an upstream image (it's a stock Open WebUI env) — which creates a false "everything is configured" feeling.
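To make the `fix_large_tool_results` row concrete, here is a rough sketch of what capping an oversized tool result can look like. This is not the patch's actual code: the function name and the wording of the truncation note are invented for illustration, the two parameters only mirror the `TOOL_RESULT_MAX_CHARS` and `TOOL_RESULT_PREVIEW_CHARS` settings by name, and the upload of the full result via `ORCHESTRATOR_URL` is deliberately left out.

```python
def cap_tool_result(result: str, max_chars: int, preview_chars: int) -> str:
    """Return result unchanged if it fits, else a short preview plus a note.

    max_chars     -- size above which the result is considered too large
    preview_chars -- how many leading characters to keep in that case
    """
    if len(result) <= max_chars:
        return result
    omitted = len(result) - preview_chars
    # Hypothetical note format; the real patch may word this differently.
    return result[:preview_chars] + f"\n[truncated: {omitted} more characters omitted]"


small = cap_tool_result("ok", max_chars=50, preview_chars=10)
large = cap_tool_result("a" * 100, max_chars=50, preview_chars=10)
print(small)
print(large)
```

The point of the cap is that the model only ever sees a bounded amount of text per tool call, which is what protects the context window when a command prints far more than expected.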
Use build: in your downstream compose, mirroring docker-compose.webui.yml:11-15:
services:
  open-webui:
    build:
      context: ./openwebui # path into this repo
      dockerfile: Dockerfile
      args:
        OPENWEBUI_VERSION: "0.9.2"
    image: open-webui-with-cu-patches:latest # local tag, do not pull
Verify the patches are baked into the running container:
docker exec open-webui bash -c \
'grep -rl "FIX_ARTIFACTS_AUTO_SHOW" /app/build/_app/immutable/chunks/ >/dev/null \
&& echo "patches applied" || echo "MISSING — you are on upstream image"'
The FIX_ARTIFACTS_AUTO_SHOW JS comment marker is injected by fix_artifacts_auto_show.py at build time as a version-stable identifier — it does not depend on minified Svelte variable names, which change with every Open WebUI release.
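The same marker check can be expressed in Python, which may be handy in a CI step that inspects an extracted image filesystem. This is a sketch under assumptions: the function name `find_marker` is made up here, and the directory you point it at must be a copy of (or a mount of) the container's `/app/build/_app/immutable/chunks/` path.

```python
import os
from typing import List


def find_marker(root: str, marker: str) -> List[str]:
    """Return sorted paths of files under root whose bytes contain marker."""
    needle = marker.encode("utf-8")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if needle in fh.read():
                        hits.append(path)
            except OSError:
                # Unreadable files are skipped rather than failing the check.
                continue
    return sorted(hits)
```

An empty list corresponds to the "MISSING" branch of the shell check; any non-empty list means the marker was baked into at least one chunk.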
fix_preview_url_detection is now fully host-agnostic. The injected JS reads the origin directly from the matched URL at runtime (_pm[1] captures the full https://host:port prefix), so the patch requires no build-time host configuration. The COMPUTER_USE_SERVER_URL build-arg has been removed from openwebui/Dockerfile.
No action needed — the patch works automatically regardless of whether you use localhost:8081, a public domain, or Docker internal DNS. The preview iframe src is always reconstructed from the URL the model wrote into the message, which in turn comes from the server's PUBLIC_BASE_URL env var.
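The idea of reading the origin from the matched URL, instead of configuring a host at build time, can be illustrated with a small regular expression. The real patch is injected JavaScript and its pattern is not shown in this document, so the regex, the function name, and the example paths below are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# Group 1 captures the scheme://host[:port] prefix; group 2 is the rest of the URL.
ORIGIN_RE = re.compile(r"^(https?://[^/\s]+)(/\S*)?$")


def split_origin(url: str) -> Optional[Tuple[str, str]]:
    """Split a URL into (origin, path), or return None if it does not match."""
    match = ORIGIN_RE.match(url)
    if match is None:
        return None
    return match.group(1), match.group(2) or "/"


# Hypothetical file links; the origin is recovered without any configuration.
print(split_origin("http://localhost:8081/files/abc/index.html"))
print(split_origin("https://cu.example.com/files/abc/index.html"))
```

Because the origin comes from the URL itself, the same logic works for localhost, a public domain, or an internal DNS name.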
Verify the patch is applied:
```bash
docker exec open-webui bash -c \
  'grep -rl "FIX_PREVIEW_URL_DETECTION" /app/build/_app/immutable/chunks/ >/dev/null \
  && echo "patches applied" || echo "MISSING: fix_preview_url_detection not baked in"'
```

To start the bundled stack, clone the repository, create `.env`, build the sandbox image, and bring the stack up:

```bash
git clone https://github.com/Wide-Moat/open-computer-use.git
cd open-computer-use
cp .env.example .env
docker build --platform linux/amd64 -t open-computer-use:latest .
docker compose up --build
```

If you run Open WebUI separately, you need to manually:
- Create a tool in Workspace > Tools with the contents of openwebui/tools/computer_use_tools.py
- Set the Tool ID to ai_computer_use (required for filter to work)
- Set the ORCHESTRATOR_URL Valve = internal URL of your Computer Use Server (http://computer-use-server:8081 for Docker compose)
- Share the tool with other users (group:* and user:* wildcards); otherwise only your admin account sees the tool and non-admin users get an empty tool list with no error
- Install the filter from openwebui/functions/computer_link_filter.py
- Set Function Calling = Native and Stream Chat Response = On. Or set them globally once in Admin → Settings → Models → Advanced Params (function_calling: native, stream_response: true); that becomes DEFAULT_MODEL_PARAMS for every model.

The docker-compose stack handles all of this automatically.
After adding a model in Open WebUI, go to Model Settings and set:
| Setting | Value | Why |
|---|---|---|
| **Function Calling** | Native | Required for Computer Use tools to work |
| **Stream Chat Response** | On | Enables real-time output streaming |
Without Function Calling: Native, the model won't invoke Computer Use tools.
All settings via .env:
| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | - | LLM API key (any OpenAI-compatible) |
| `OPENAI_API_BASE_URL` | - | Custom API base URL (OpenRouter, etc.) |
| `MCP_API_KEY` | - | Bearer token for MCP endpoint |
| `DOCKER_IMAGE` | `open-computer-use:latest` | Sandbox container image |
| `COMMAND_TIMEOUT` | 120 | Bash tool timeout (seconds) |
| `SUB_AGENT_TIMEOUT` | 3600 | Sub-agent timeout (seconds) |
| `SINGLE_USER_MODE` | - | true = one container, no chat ID needed; false = require X-Chat-Id; unset = lenient |
| `PUBLIC_BASE_URL` | `http://computer-use-server:8081` | Browser-reachable URL of the Computer Use server. Baked into `/system-prompt` and returned to the Open WebUI filter in the `X-Public-Base-URL` response header; the **single source of truth** for the public URL. See [Open WebUI filter URL requirements](docs/openwebui-filter.md#two-url-roles--public-server-env-and-internal-filtertool-valve). |
| `CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES`, `ORCHESTRATOR_URL`, `TOOL_RESULT_MAX_CHARS`, `TOOL_RESULT_PREVIEW_CHARS` | - | Settings on the **open-webui container** (not CU-server). Required when embedding; see [Required setup when embedding Open WebUI](#required-setup-when-embedding-open-webui-into-your-own-stack). |
| `POSTGRES_PASSWORD` | openwebui | PostgreSQL password |
| `VISION_API_KEY` | - | Vision API key (for describe-image) |
| `ANTHROPIC_AUTH_TOKEN` | - | Anthropic key (for Claude Code sub-agent) |
| `MCP_TOKENS_URL` | - | Settings Wrapper URL (optional, see below) |
| `MCP_TOKENS_API_KEY` | - | Settings Wrapper auth key |
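`SINGLE_USER_MODE` is a three-state setting (true, false, unset), which is easy to get wrong if it is parsed as a plain boolean. The sketch below shows one way such a setting could be interpreted. It is not the server's implementation: the function names and the `DEFAULT_CHAT_ID` value are hypothetical, and "lenient" is read here as "use `X-Chat-Id` when present, otherwise fall back to a shared default", which is an assumption.

```python
from typing import Mapping, Optional

DEFAULT_CHAT_ID = "default"  # hypothetical fallback container key


def parse_single_user_mode(env: Mapping[str, str]) -> Optional[bool]:
    """Map the env var to True, False, or None (unset or unrecognised)."""
    raw = env.get("SINGLE_USER_MODE", "").strip().lower()
    if raw == "true":
        return True
    if raw == "false":
        return False
    return None


def resolve_chat_id(mode: Optional[bool], header_chat_id: Optional[str]) -> str:
    """Pick the chat ID that selects the sandbox container for a request."""
    if mode is True:
        # One container for everyone; the header is not needed.
        return DEFAULT_CHAT_ID
    if mode is False:
        # Strict mode: every request must identify its chat.
        if not header_chat_id:
            raise ValueError("X-Chat-Id header is required when SINGLE_USER_MODE=false")
        return header_chat_id
    # Unset: lenient (assumed meaning, see the note above).
    return header_chat_id or DEFAULT_CHAT_ID


print(resolve_chat_id(parse_single_user_mode({"SINGLE_USER_MODE": "true"}), None))
print(resolve_chat_id(parse_single_user_mode({}), "chat-42"))
```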
By default, all 13 built-in skills are available to everyone. For per-user skill access and custom skills, deploy the Settings Wrapper — see settings-wrapper/README.md.
Personal Access Tokens (PATs): The settings wrapper can also store encrypted per-user PATs for external services (GitLab, Confluence, Jira, etc.). The server fetches them by user email and injects into the sandbox — so each user's AI has access to their repos/docs without sharing credentials. The server-side code for token injection is implemented (docker_manager.py), but the Open WebUI tool doesn't pass the required headers yet. This is on the roadmap — if you need PAT management, open an issue.
docker exec open-webui env | grep -E 'CHAT_RESPONSE_MAX_TOOL_CALL_RETRIES|TOOL_RESULT_|ORCHESTRATOR_URL'
docker exec computer-use-server env | grep ^PUBLIC_BASE_URL=
The server speaks standard MCP over Streamable HTTP. Point any MCP client at it — hosted or self-hosted.
- Hosted: https://api.yambr.com/mcp/computer_use with Authorization: Bearer <key from app.yambr.com>. Client configs and full reference live on docs.yambr.com.
- Self-hosted: http://localhost:8081/mcp. Quick sanity check:
curl -X POST http://localhost:8081/mcp \
-H "Content-Type: application/json" \
-H "X-Chat-Id: test" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'
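For clients written in Python, the same `initialize` call can be built with the standard library. This is a minimal sketch that mirrors the curl command above field for field; the function name is invented for this example, and the request is only constructed here, not sent, so it runs without a server.

```python
import json
import urllib.request


def build_initialize_request(url: str, chat_id: str) -> urllib.request.Request:
    """Build (but do not send) an MCP initialize request as a JSON-RPC POST."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "test", "version": "1.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Chat-Id": chat_id},
        method="POST",
    )


req = build_initialize_request("http://localhost:8081/mcp", "test")
print(req.get_method(), req.full_url)
```

To actually send it against a running self-hosted server, pass the request to `urllib.request.urlopen(req, timeout=10)` and read the response body. If `MCP_API_KEY` is set on the server, an `Authorization: Bearer ...` header would presumably be needed as well.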
Full self-host integration guide (LiteLLM, Claude Desktop, custom clients): docs/MCP.md. The per-chat system prompt rides six redundant MCP-native channels (tool descriptions, /home/assistant/README.md in the sandbox, InitializeResult.instructions, resources/list for uploaded files, plus an HTTP /system-prompt endpoint for legacy integrations) — full map in docs/system-prompt.md.
The Computer Use Server speaks standard MCP over Streamable HTTP — any MCP-compatible client can connect. Open WebUI is the primary tested frontend, but not the only option.
| Client | Self-hosted URL | Hosted URL | Status |
|---|---|---|---|
| [**Open WebUI**](https://github.com/open-webui/open-webui) | Docker Compose stack included, auto-configured | n/a — use [chat.yambr.com](https://chat.yambr.com) directly (pointing your own Open WebUI at the hosted API isn't a documented path) | Tested in production |
| [**Claude Desktop**](https://claude.ai/download) | http://localhost:8081/mcp — see [docs/MCP.md](docs/MCP.md) | https://api.yambr.com/mcp/computer_use — see [docs/CLOUD.md](docs/CLOUD.md) | Works |
| [**n8n**](https://n8n.io) | MCP Tool node → http://computer-use-server:8081/mcp | MCP Tool node → https://api.yambr.com/mcp/computer_use | Works |
| [**LiteLLM**](https://github.com/BerriAI/litellm) | MCP proxy config — see [docs/MCP.md](docs/MCP.md) | MCP proxy → https://api.yambr.com/mcp/computer_use | Works |
| **Custom client** | Any HTTP client with MCP JSON-RPC — see curl examples in [docs/MCP.md](docs/MCP.md) | Same, with Authorization: Bearer sk-... (key from [app.yambr.com](https://app.yambr.com)) | Works |
Open WebUI is an extensible, self-hosted AI interface. We use it as the primary frontend because it supports tool calling, function filters, and artifacts — everything needed for Computer Use.
Compatibility: This build is built and verified strictly against Open WebUI 0.9.2. The first 3 segments of our build version (v0.9.2.X) always match the Open WebUI base version it targets. If you run a different Open WebUI version, pick the Open Computer Use build whose first 3 version segments match yours. For example, for Open WebUI 0.8.12 use a v0.8.12.Y build.
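The matching rule is simple enough to automate when choosing a build. The sketch below assumes build versions are strings of the form `vA.B.C.N`, as in the examples in the paragraph; the function names and the sample build list are hypothetical.

```python
from typing import List


def base_version(build_version: str) -> str:
    """Return the first three segments of a build version, without the 'v' prefix."""
    segments = build_version.lstrip("v").split(".")
    if len(segments) < 3:
        raise ValueError(f"expected at least 3 version segments: {build_version!r}")
    return ".".join(segments[:3])


def matching_builds(open_webui_version: str, builds: List[str]) -> List[str]:
    """Keep only the builds whose base version equals the Open WebUI version."""
    return [b for b in builds if base_version(b) == open_webui_version]


# Hypothetical list of available builds.
available = ["v0.9.2.1", "v0.8.12.3", "v0.8.12.4"]
print(matching_builds("0.8.12", available))
print(matching_builds("0.9.2", available))
```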
Why not a fork? We intentionally did not fork Open WebUI. Instead, everything is bolted on via the official plugin API (tools + functions) and build-time patches for missing features. This means you can use stock Open WebUI 0.9.2 with this build (the version that the first 3 segments of our build version v0.9.2.X match) — just install the tool and filter. Patches are applied at Docker build time; strongly recommended — 4 of them affect user-visible UX (artifacts panel, preview iframe, error banners, large tool-result handling). Pulling ghcr.io/open-webui/open-webui directly skips all of them — see Required setup when embedding Open WebUI for the full checklist.
Running Claude Code through a corporate gateway (LiteLLM, Azure, Bedrock)? See docs/claude-code-gateway.md for the three-path operator recipe.
The openwebui/ directory contains:
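Open Computer Use is an automation tool that gives AI models the ability to operate a computer. It lets a model carry out complex computer tasks inside a controlled sandbox, from document processing to web operations.
The project provides a full-featured sandbox with a rich built-in toolchain. It supports several programming languages, including Python 3.12, Node.js 22, and Java 21; document processing (LibreOffice, Pandoc, python-docx, and others) and PDF handling (pypdf, tabula-py, and others); image libraries such as Pillow and OpenCV; and Playwright-based web automation.
If you plan to embed Open WebUI in your own stack (your own Docker Compose, Kubernetes, or Portainer), watch for four traps that can break Computer Use. In production these failures tend to be silent, so check your configuration in the recommended order.
You can start the project with `docker compose up --build`; the Workspace image is built on first run. If you deploy manually (without Docker Compose), create a new tool under Workspace > Tools in Open WebUI, paste in the contents of `computer_use_tools.py`, set the Tool ID to `ai_computer_use`, and point the `ORCHESTRATOR_URL` Valve at your Computer Use Server.
To get started, clone the repository with Git, enter the project directory, and copy `.env.example` to `.env`. Then configure your API keys as needed.
All configuration is managed through the `.env` file. Set `OPENAI_API_KEY` and, if you use a third-party service, `OPENAI_API_BASE_URL`. Note that in Open WebUI's Model Settings, Function Calling must be set to `Native` and Stream Chat Response must be on; otherwise the model will not call the Computer Use tools correctly.
The project implements standard MCP (Model Context Protocol) integration over Streamable HTTP, so any MCP-compatible client can connect to the server. It has been tested in depth with Open WebUI and also supports remote connections through the hosted endpoint (Hosted URL).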
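An innovative MCP implementation that gives LLMs computer-control capabilities. The architecture is clear and the Docker isolation design is sound; it has good practical value and room to grow.
The tool's license is listed as NOASSERTION. For commercial use, read the license terms carefully and seek legal advice if needed.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.
Validate the tool thoroughly in a sandbox or test environment before deploying it to production, and carry out the necessary security assessment.
📄 NOASSERTION: see the original license terms for specific usage restrictions.
AI Skill Hub review: Open Computer Use MCP has complete core functionality and excellent quality. For Claude Desktop / Claude Code users it is worth adding to a personal toolkit. Try it in a non-production environment first, then roll it out gradually.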
| Field | Value |
|---|---|
| Original name | open-computer-use |
| Original description | Open-source MCP tool: MCP server that gives any LLM its own computer: managed Docker workspaces with … ⭐78 · Python |
| Topics | computer control, MCP server, Claude Code, Docker isolation, LLM agent |
| GitHub | https://github.com/Wide-Moat/open-computer-use |
| License | NOASSERTION |
| Language | Python |
Listed: 2026-05-20 · Updated: 2026-05-30 · License: NOASSERTION · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.