能力标签
🔌
MCP工具

快速爬虫工具

基于 Rust · 让 AI 助手直接操作你的系统与工具
英文名:crw
⭐ 113 Stars 🍴 9 Forks 💻 Rust 📄 AGPL-3.0 🏷 AI 7.5分
7.5AI 综合评分
crawlerdata-extractionrust
✦ AI Skill Hub 推荐

快速爬虫工具 是 AI Skill Hub 本期精选MCP工具之一。综合评分 7.5 分,整体质量较高。我们推荐使用将其纳入你的 AI 工具库,帮助提升工作效率。

📚 深度解析
快速爬虫工具 是一款基于 MCP(Model Context Protocol)标准协议的 AI 工具扩展。MCP 协议由 Anthropic 开发并开源,旨在建立 AI 模型与外部工具之间的标准化通信接口,目前已被 Claude Desktop、Claude Code、Cursor 等主流 AI 工具采纳。

通过安装 快速爬虫工具,你的 AI 助手将获得额外的工具调用能力,可以用自然语言直接操控该工具的功能,无需学习复杂的命令行语法。MCP 工具的核心价值在于"一次配置,永久增强"——配置完成后,每次与 AI 对话时都可以无缝调用这些工具。

在技术实现上,MCP 工具通过标准的 JSON-RPC 协议与 AI 客户端通信,工具的功能以"工具列表"的形式暴露给 AI 模型,AI 可以按需调用。快速爬虫工具 提供了结构化的工具调用接口,使 AI 模型能够精确地理解和使用每个功能点,显著降低 AI 在工具使用上的错误率。

与传统的 API 集成相比,MCP 工具的优势在于无需编写代码——用户只需在配置文件中添加几行 JSON,即可让 AI 获得全新能力。AI Skill Hub 将 快速爬虫工具 评为 AI 评分 7.5 分,属于同类工具中的优质选择。
📋 工具概览

快速爬虫工具 是一款遵循 MCP(Model Context Protocol)标准协议的 AI 工具扩展。通过 MCP 协议,它可以让 Claude、Cursor 等主流 AI 客户端直接访问和操作外部工具、数据源和服务,实现 AI 能力的无缝扩展。无论是文件操作、数据库查询还是 API 调用,都可以通过自然语言在 AI 对话中直接触发,极大提升生产效率。

GitHub Stars
⭐ 113
开发语言
Rust
支持平台
Windows / macOS / Linux
维护状态
轻量级项目,按需更新
开源协议
AGPL-3.0
AI 综合评分
7.5 分
工具类型
MCP工具
Forks
9
📖 中文文档
以下内容由 AI Skill Hub 根据项目信息自动整理,如需查看完整原始文档请访问底部「原始来源」。

快速爬虫工具 是一款遵循 MCP(Model Context Protocol)标准协议的 AI 工具扩展。通过 MCP 协议,它可以让 Claude、Cursor 等主流 AI 客户端直接访问和操作外部工具、数据源和服务,实现 AI 能力的无缝扩展。无论是文件操作、数据库查询还是 API 调用,都可以通过自然语言在 AI 对话中直接触发,极大提升生产效率。

📌 核心特色
  • 通过标准 MCP 协议与 Claude、Cursor 等主流 AI 客户端深度集成
  • 提供结构化工具调用接口,显著降低 AI 集成复杂度
  • 支持 Claude Desktop 和 Claude Code 无缝接入,开箱即用
  • 可与其他 MCP 工具组合叠加,构建完整 AI 工作站
  • 轻量无侵入设计,不影响现有系统架构
🎯 主要使用场景
  • 在 Claude Desktop 对话中直接调用本地工具,实现 AI 与系统的深度联动
  • 通过自然语言驱动复杂的多步骤自动化任务,代替繁琐手动操作
  • 将多个 MCP 工具组合使用,构建个人专属 AI 工作站
以下安装命令基于项目开发语言和类型自动生成,实际以官方 README 为准。
安装命令
# 方式一:通过 Claude Code CLI 一键安装
claude skill install https://github.com/us/crw

# 方式二:手动配置 claude_desktop_config.json
{
  "mcpServers": {
    "------": {
      "command": "npx",
      "args": ["-y", "crw"]
    }
  }
}

# 配置文件位置
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
📋 安装步骤说明
  1. 确认已安装 Node.js(v18 或以上版本)
  2. 打开 Claude Desktop 或 Claude Code 的 MCP 配置文件
  3. 按「交给 Agent 安装 → Claude Desktop」标签中的 JSON 配置填入 mcpServers 字段
  4. 保存配置文件并重启 Claude 客户端
  5. 重启后,在对话中即可使用本工具
以下用法示例由 AI Skill Hub 整理,涵盖最常见的使用场景。
常用命令 / 代码示例
# 安装后在 Claude 对话中直接使用
# 示例:
用户: 请帮我用 快速爬虫工具 执行以下任务...
Claude: [自动调用 快速爬虫工具 MCP 工具处理请求]

# 查看可用工具列表
# 在 Claude 中输入:"列出所有可用的 MCP 工具"
以下配置示例基于典型使用场景生成,具体参数请参照官方文档调整。
配置示例
// claude_desktop_config.json 配置示例
{
  "mcpServers": {
    "______": {
      "command": "npx",
      "args": ["-y", "crw"],
      "env": {
        // "API_KEY": "your-api-key-here"
      }
    }
  }
}

// 保存后重启 Claude Desktop 生效
📑 README 深度解析 真实文档 完整度 81/100 查看 GitHub 原文 →
以下内容由系统直接从 GitHub README 解析整理,保留代码块、表格与列表结构。

简介

<a name="readme-top"></a> <p align="center"> <a href="https://fastcrw.com"> <img src="docs/logo.png" alt="fastCRW" height="120" /> </a> <p align="center">The web scraper built for AI agents. Single binary. Zero config.</p> <p align="center"> <a href="https://crates.io/crates/crw-server"><img src="https://img.shields.io/crates/v/crw-server.svg" alt="crates.io"></a> <a href="https://github.com/us/crw/actions"><img src="https://github.com/us/crw/workflows/CI/badge.svg" alt="CI"></a> <a href="LICENSE"><img src="https://img.shields.io/badge/license-AGPL--3.0-blue.svg" alt="License"></a> <a href="https://github.com/us/crw/stargazers"><img src="https://img.shields.io/github/stars/us/crw?style=social" alt="GitHub Stars"></a> <a href="https://fastcrw.com"><img src="https://img.shields.io/badge/Managed%20Cloud-fastcrw.com-blueviolet" alt="fastcrw.com"></a> </p> <p align="center"> <a href="https://twitter.com/fastcrw"> <img src="https://img.shields.io/badge/Follow%20on%20X-000000?style=for-the-badge&logo=x&logoColor=white" alt="Follow on X" /> </a> <a href="https://www.linkedin.com/company/fastcrw"> <img src="https://img.shields.io/badge/Follow%20on%20LinkedIn-0077B5?style=for-the-badge&logo=linkedin&logoColor=white" alt="Follow on LinkedIn" /> </a> <a href="https://discord.gg/kkFh2SC8"> <img src="https://img.shields.io/badge/Join%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Join our Discord" /> </a> </p> <p align="center"> <a href="https://www.producthunt.com/products/fastcrw?utm_source=badge-featured&utm_medium=badge&utm_campaign=badge-fastcrw" target="_blank" rel="noopener noreferrer"><img src="https://api.producthunt.com/widgets/embed-image/v1/featured.svg?post_id=1116966&theme=light&t=1775671073751" alt="fastCRW - Search + scrape live web results for AI agents | Product Hunt" width="250" height="54" /></a> </p> <p align="center"> Works with: Claude Code · Cursor · Windsurf · Cline · Copilot · Continue.dev · Codex · Gemini CLI </p> <p align="center"> <a href="#-quick-start">Quick Start</a> &bull; <a href="#-connect-to-ai-agents">AI Agents</a> &bull; <a href="#-benchmark">Benchmarks</a> &bull; <a href="https://docs.fastcrw.com/#rest-api">API Reference</a> &bull; <a href="https://fastcrw.com">Cloud</a> &bull; <a href="https://discord.gg/kkFh2SC8">Discord</a> </p> <p align="center"> <b>English</b> | <a href="README.zh-CN.md">中文</a> </p> </p>

<p align="center"> <img src="docs/demo.gif" alt="fastCRW Demo" width="700" /> </p>

---

What's New

Features

  • detector: add vendor-specific anti-bot block markers (c88c508)
  • renderer: add chrome_proxy as 4th fallback tier (b4da4f7)
  • renderer: per-request country via CDP proxy auth (11b4d32)

Install:

curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw sh

Self-hosted (default Docker compose stack)

client = CrwClient(api_url="http://localhost:3000") results = client.search("open source web scraper 2026", limit=10)

📦 Install

One-line install (auto-detects OS & arch):

curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw sh

One-line install:

curl -fsSL https://raw.githubusercontent.com/us/crw/main/install.sh | CRW_BINARY=crw-server sh

Docker:

docker run -p 3000:3000 ghcr.io/us/crw


Custom port:
bash CRW_SERVER__PORT=8080 crw-server # env var docker run -p 8080:8080 -e CRW_SERVER__PORT=8080 ghcr.io/us/crw # Docker

**Docker Compose** ships with `lightpanda` enabled by default; `chrome` is opt-in to keep small VPS deploys lean (~500MB image + 1GB resident):
bash

Setup Wizard (CLI)

The CLI includes an interactive setup wizard for easy configuration:

crw setup

The wizard guides you through: - Cloud vs Local mode selection - Browser engine setup (LightPanda or Chrome for JS rendering) - Search engine setup (SearXNG via Docker) - LLM provider configuration (BYOK for AI features) - Shell configuration (auto-adds to .zshrc/.bashrc)

Options: - crw setup --cloud — skip to cloud setup - crw setup --local — skip to local setup - crw setup --no-color — disable colored output (accessibility) - Press ESC to cancel gracefully at any prompt

See the self-hosting guide for production hardening, auth, reverse proxy, and resource tuning.

---

🚀 Quick Start

```bash

Example Domain

This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination.


#### Renderer selection & response metadata

CRW picks between three rendering backends per request:

- **`http`** (1 credit) — plain HTTP fetch. Used for static pages.
- **`lightpanda`** (1 credit) — lightweight JS renderer for most SPAs.
- **`chrome`** (2 credits) — full Chromium for sites where LightPanda's hydration crashes (e.g. some Next.js App Router pages).

By default the engine auto-selects, learns per-host preferences after repeated failures, and falls over chrome → lightpanda → http transparently. Pass `"renderer"` to pin one of `auto | http | lightpanda | chrome` (Firecrawl's `engine` is also accepted as an alias).

Every successful response includes routing metadata so callers can audit and debug:
jsonc { "data": { "markdown": "...", "renderDecision": { "kind": "failover", // autoDefault | autoPromoted | userPinned | failover | breakerSkipped "chain": ["lightpanda", "chrome"], // renderers actually attempted "reason": "nextJsClientError" // why the chain advanced }, "creditCost": 2, "warnings": [ "lightpanda returned a failed render (nextjs_client_error)" ], "metadata": { "renderedWith": "chrome", // } } } ```

When you hard-pin a renderer that fails (e.g. "renderer":"lightpanda" on a hydration-crashing page), success stays true for protocol compatibility — but data.warnings[] carries an actionable hint suggesting renderer="chrome" or auto mode. Clients should surface the warnings array.

Configuration

Layered TOML config with environment variable overrides:

  1. config.default.toml — built-in defaults
  2. config.local.toml — local overrides (or CRW_CONFIG=myconfig)
  3. Environment variables — CRW_ prefix, __ separator (e.g. CRW_SERVER__PORT=8080)

```toml [server] host = "0.0.0.0" port = 3000 rate_limit_rps = 10

[renderer] mode = "auto" # auto | lightpanda | playwright | chrome | none

[crawler] max_concurrency = 10 requests_per_second = 10.0 respect_robots_txt = true

[auth]

fastCRW — Open Source Web Scraping API for AI Agents

Power AI agents with clean web data. Single Rust binary, zero config, Firecrawl-compatible API. The open-source Firecrawl alternative you can self-host for free — or use our managed cloud.

Don't want to self-host? Sign up free → — managed cloud with global proxy network, web search, and dashboard. Same API, zero infra. 500 free credits, no credit card required.

---

Web Scraping & Crawling Features

Core

FeatureDescription
[**Scrape**](#scrape)Convert any URL to markdown, HTML, JSON, or links
[**Crawl**](#crawl)Async BFS website crawler with rate limiting
[**Map**](#map)Discover all URLs on a site instantly
[**Search**](#search)Web search + content scraping — bundled SearXNG sidecar, free Tavily alternative

More

FeatureDescription
[**LLM Extraction**](#llm-structured-extraction)Send a JSON schema, get validated structured data back
**LLM Summary**formats: ["summary"] on /v1/scrape — clean prose digest of any page. BYOK (Anthropic / OpenAI / Azure / DeepSeek / any OpenAI-compatible).
**LLM Search Answer**answer: true / summarizeResults: true on /v1/search — synthesized answer with citations or per-result summaries
[**JS Rendering**](#js-rendering)Auto-detect SPAs, render via LightPanda or Chrome
[**CLI**](#cli)Scrape any URL from your terminal — no server needed
[**MCP Server**](#mcp-server-for-ai-agents)Built-in stdio + HTTP transport for any AI agent

Use Cases: RAG pipelines · AI agent web access · content monitoring · data extraction · HTML to markdown conversion · web archiving

---

client = CrwClient(api_url="https://fastcrw.com/api", api_key="YOUR_KEY")


<details>
<summary><b>cURL</b></summary>
bash

API Endpoints

MethodEndpointDescription
POST/v1/scrapeScrape a single URL, optionally with LLM extraction
POST/v1/crawlStart async BFS crawl (returns job ID)
GET/v1/crawl/:idCheck crawl status and retrieve results
DELETE/v1/crawl/:idCancel a running crawl job
POST/v1/mapDiscover all URLs on a site
POST/v1/searchWeb search via SearXNG sidecar, with optional content scraping
GET/healthHealth check (no auth required)
POST/mcpStreamable HTTP MCP transport

Full API reference →

---

Local (embedded — no server, no API key):

claude mcp add crw -- npx crw-mcp

CLI (`crw`) — scrape URLs from your terminal

```bash brew install us/crw/crw

API Server (`crw-server`) — Firecrawl-compatible REST API

For serving multiple apps, other languages (Node.js, Go, Java), or as a shared microservice.

```bash brew install us/crw/crw-server

SDKs

Community SDKs

Node.js: No official SDK yet — use the REST API directly or npx crw-mcp for MCP. SDK examples →

---

api_keys = ["fc-key-1234"]

```

See full configuration reference.

---

stealth tier — browserless/chromium with anti-fingerprint plugin

Integrations

Frameworks: CrewAI · LangChain · Agno · Dify

Platforms: n8n · Flowise

Missing your favorite tool? Open an issue → · All integrations →

---

[0.10.0](https://github.com/us/crw/compare/v0.9.1...v0.10.0) (2026-05-20)

Why CRW? — Firecrawl & Crawl4AI Alternative

  • Single binary, 6 MB RAM — no Redis, no Node.js, no containers. Firecrawl needs 5 containers and 4 GB+. Crawl4AI requires Python + Playwright
  • 5.5x faster than Firecrawl — 833ms avg vs 4,600ms (see benchmarks). P50 at 446ms
  • 73/100 search win rate — beats Firecrawl (25/100) and Tavily (2/100) in head-to-head benchmarks
  • Free self-hosting — $0/1K scrapes vs Firecrawl's $0.83–5.33. No infra, no cold starts (85ms). No API key required for local mode
  • Agent ready — add to any MCP client in one command. Embedded mode: no server needed
  • Firecrawl-compatible API — drop-in replacement. Same /v1/scrape, /v1/crawl, /v1/map endpoints. HTML to markdown, structured data extraction, website crawler — all built-in
  • Built for RAG pipelines — clean LLM-ready markdown output for vector databases and AI data ingestion
  • Open source — AGPL-3.0, developed transparently. Join our community
MetricCRW (self-hosted)fastcrw.com (cloud)FirecrawlTavilyCrawl4AI
**Coverage (1K URLs)****92.0%****92.0%**77.2%
**Avg Scrape Latency****833ms****833ms**4,600ms
**Avg Search Latency****880ms****880ms**954ms2,000ms
**Search Win Rate****73/100****73/100**25/1002/100
**Idle RAM**6.6 MB0 (managed)~500 MB+— (cloud)
**Cold start**85 ms0 (always-on)30–60 s
**Self-hosting****Single binary**Multi-containerNoPython + Playwright
**Cost / 1K scrapes****$0** (self-hosted)From $13/mo$0.83–5.33$0
**License**AGPL-3.0ManagedAGPL-3.0ProprietaryApache-2.0

---

Search — CRW vs Firecrawl vs Tavily (100 queries, concurrent)

MetricCRWFirecrawlTavily
**Avg Latency****880ms**954ms2,000ms
**Median Latency****785ms**932ms1,724ms
**Win Rate****73/100**25/1002/100

CRW is 2.3x faster than Tavily and won 73% of latency races. Full search benchmark →

Scrape — CRW vs Firecrawl (1,000 URLs, JS rendering enabled)

Tested on Firecrawl's scrape-content-dataset-v1:

MetricCRWFirecrawl v2.5
**Coverage****92.0%**77.2%
**Avg Latency****833ms**4,600ms
**P50 Latency****446ms**
**Noise Rejection****88.4%**noise 6.8%
**Idle RAM****6.6 MB**~500 MB+
**Cost / 1K scrapes****$0** (self-hosted)$0.83–5.33

<details> <summary><b>Resource comparison</b></summary>

MetricCRWFirecrawl
Min RAM~7 MB4 GB
Recommended RAM~64 MB (under load)8–16 GB
Docker imagessingle ~8 MB binary~2–3 GB total
Cold start85 ms30–60 seconds
Containers needed1 (+optional sidecar)5

</details>

Full benchmark details →

Run the benchmark yourself:

pip install datasets aiohttp
python bench/run_bench.py

---

Open Source vs Cloud

Self-hosted (free)[fastcrw.com](https://fastcrw.com) Cloud
Core scraping
JS rendering✅ (LightPanda/Chrome)
Web search✅ (bundled SearXNG sidecar)✅ (managed)
Global proxy network
Dashboard
Commercial use without open-sourcingRequires AGPL compliance✅ Included
Cost$0From $13/mo
Sign up free →500 free credits, no credit card required.

---

🎯 aiskill88 AI 点评 A 级 2026-05-25

快速、轻量级的爬虫工具,值得使用

⚡ 核心功能
👥 适合人群
Claude Desktop / Claude Code 用户AI 工具开发者需要扩展 AI 能力的专业人士自动化工程师
🎯 使用场景
  • 在 Claude Desktop 对话中直接调用本地工具,实现 AI 与系统的深度联动
  • 通过自然语言驱动复杂的多步骤自动化任务,代替繁琐手动操作
  • 将多个 MCP 工具组合使用,构建个人专属 AI 工作站
⚖️ 优点与不足
✅ 优点
  • +标准化 MCP 协议,生态互联性强
  • +与 Claude 官方生态无缝对接
  • +即插即用,配置简单快捷
⚠️ 不足
  • 依赖 Claude 客户端,非 Claude 用户无法使用
  • MCP 协议仍在持续演进,接口可能变更
  • 需要一定的配置步骤
⚠️ 使用须知

该工具使用 AGPL-3.0 协议,商用场景请仔细阅读协议条款,必要时咨询法律意见。

AI Skill Hub 为第三方内容聚合平台,本页面信息基于公开数据整理,不对工具功能和质量作任何法律背书。

建议在沙箱或测试环境中充分验证后,再部署至生产环境,并做好必要的安全评估。

📄 License 说明

⚠️ AGPL 3.0 — 最严格的 Copyleft,网络服务端使用也需开源,SaaS 使用受限。

🔗 相关工具推荐
🧩 你可能还需要
基于当前 Skill 的能力图谱,自动补全的工具组合
❓ 常见问题 FAQ
crw 是一款Rust开发的AI辅助工具。开源MCP工具:Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search A。⭐113 · Rust 主要应用场景包括:快速爬取和提取网页数据。
💡 AI Skill Hub 点评

经综合评估,快速爬虫工具 在MCP工具赛道中表现稳健,质量良好。如果你已有明确的使用需求,可以直接上手体验;如果还在评估阶段,建议对比同类工具后再做决策。

⬇️ 获取与下载
⬇ 下载源码(GPL)
⚠️ 本工具使用 AGPL-3.0 协议。您可以自由下载和使用,但衍生作品必须以相同协议开源,不可商业闭源。使用前请确认符合协议要求。
📚 深入学习 快速爬虫工具
查看分步骤安装教程和完整使用指南,快速上手这款工具
🌐 原始信息
原始名称 crw
原始描述 开源MCP工具:Fast, lightweight Firecrawl alternative in Rust. Web scraper, crawler & search A。⭐113 · Rust
Topics crawlerdata-extractionrust
GitHub https://github.com/us/crw
License AGPL-3.0
语言 Rust
🔗 原始来源
🐙 GitHub 仓库  https://github.com/us/crw 🌐 官方网站  https://fastcrw.com

收录时间:2026-05-25 · 更新时间:2026-05-26 · License:AGPL-3.0 · AI Skill Hub 不对第三方内容的准确性作法律背书。