SearxNcrawl

Built with Python · Lets AI assistants work directly with your systems and tools

English name: searxNcrawl

⭐ 105 Stars 🍴 9 Forks 💻 Python 📄 MIT 🏷 AI score 7.5

Topics: mcp · python · web crawler

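✦ AI Skill Hub Recommendation

SearxNcrawl is one of AI Skill Hub's featured MCP tools for this issue. With an overall score of 7.5, it rates well on overall quality. We recommend adding it to your AI toolkit to improve your productivity.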

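📚 In-Depth Analysis

SearxNcrawl is an AI tool extension built on the MCP (Model Context Protocol) standard. MCP was developed and open-sourced by Anthropic to establish a standardized communication interface between AI models and external tools, and it has been adopted by AI tools such as Claude Desktop, Claude Code, and Cursor.

Once SearxNcrawl is installed, your AI assistant gains additional tool-calling capabilities and can drive the tool's functions in natural language, without you having to learn command-line syntax. The core value of an MCP tool is "configure once, use in every session": after configuration, the tools can be called from any conversation with the AI.

Technically, an MCP tool communicates with the AI client over the standard JSON-RPC protocol. Its functions are exposed to the AI model as a tool list, which the model can call as needed. SearxNcrawl provides a structured tool-calling interface so that the model can understand and use each function precisely, which helps reduce errors in tool use.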
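To make the JSON-RPC exchange concrete, here is a minimal sketch of the request messages an MCP client might build to list and call tools. The `tools/list` and `tools/call` method names come from the MCP specification, and the `search` tool name is taken from the README further down this page. The `query` argument name and the helper functions are assumptions for illustration and are not part of searxNcrawl.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request as a plain dict."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request():
    """Request the tool list that the server exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(name, arguments):
    """Request one tool invocation; `arguments` is tool-specific."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    # "query" is a hypothetical argument name, used here only as an example.
    print(json.dumps(call_tool_request("search", {"query": "MCP protocol"})))
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for reading protocol logs.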

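Compared with traditional API integration, the advantage of MCP tools is that no code has to be written: users add a few lines of JSON to a configuration file to give the AI a new capability. AI Skill Hub gives SearxNcrawl an AI score of 7.5, rating it among the better tools in its category.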

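📋 Tool Overview

SearxNcrawl is an AI tool extension that follows the MCP (Model Context Protocol) standard. Through MCP, AI clients such as Claude and Cursor can access and operate external tools, data sources, and services directly. Operations such as file handling, database queries, and API calls can then be triggered in natural language from within an AI conversation.

GitHub Stars	⭐ 105
Language	Python
Supported platforms	Windows / macOS / Linux
Maintenance status	Lightweight project, updated as needed
License	MIT
AI overall score	7.5
Tool type	MCP tool
Forks	9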

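📖 Documentation

The following content was compiled automatically by AI Skill Hub from project information. For the complete original documentation, see "Original Source" at the bottom of the page.

📌 Key Features

Integrates with AI clients such as Claude and Cursor through the standard MCP protocol
Provides a structured tool-calling interface that reduces AI integration complexity
Works with Claude Desktop and Claude Code out of the box
Can be combined with other MCP tools to build a complete AI workstation
Lightweight and non-invasive; does not affect existing system architecture

🎯 Main Use Cases

Call local tools directly from a Claude Desktop conversation, linking the AI with your system
Drive complex multi-step automation tasks in natural language instead of tedious manual work
Combine multiple MCP tools to build a personal AI workstation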

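The installation commands below follow the project README, which remains the authoritative source.

Installation Commands

# Step 1: install from source (run from the directory that contains your checkout)
cd searxNcrawl
python -m venv .venv
source .venv/bin/activate
pip install -e .
playwright install chromium

# Step 2: add the server to claude_desktop_config.json (uv variant from the README)
{
  "mcpServers": {
    "crawler": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888"
      }
    }
  }
}

# Config file locations
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%\Claude\claude_desktop_config.json

📋 Installation Steps

Confirm that Python and uv are installed
Install the project from source and install the Playwright Chromium browser (Step 1 above)
Open the MCP configuration file for Claude Desktop or Claude Code
Add the JSON configuration from Step 2 to the mcpServers field
Save the configuration file and restart the Claude client
After restarting, the tool is available in conversations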

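The usage examples below were compiled by AI Skill Hub and cover the most common scenarios.

Common Commands / Code Examples

# After installation, use the tool directly in a Claude conversation
# Example:
User: Please use SearxNcrawl to carry out the following task...
Claude: [automatically calls the SearxNcrawl MCP tools to handle the request]

# View the list of available tools
# In Claude, type: "List all available MCP tools"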

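The configuration example below is based on the MCP client configuration and environment variables documented in the README; adjust the parameters to match the official documentation.

Configuration Example

claude_desktop_config.json example with optional SearXNG basic auth credentials (omit the username and password if your instance does not require them):

{
  "mcpServers": {
    "crawler": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888",
        "SEARXNG_USERNAME": "your-username",
        "SEARXNG_PASSWORD": "your-password"
      }
    }
  }
}

Restart Claude Desktop after saving for the change to take effect.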

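📑 README Deep Dive · Documentation completeness: 79/100

The following content was parsed directly from the GitHub README by the system, preserving code blocks, tables, and list structure.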

searxNcrawl

searxNcrawl is a minimal MCP server and CLI toolkit for search and crawling, built on top of Crawl4AI and SearXNG.

This project is published as searxNcrawl at https://github.com/DasDigitaleMomentum/searxNcrawl and is maintained by DDM – Das Digitale Momentum GmbH & Co KG. It is the successor to searxng-mcp https://github.com/tisDDM/searxng-mcp (which should be marked deprecated).

Compared to plain Crawl4AI usage, searxNcrawl provides a proven, production-tested crawl configuration for documentation-heavy sites, optimized for clean, model-ready Markdown with less noise and better token efficiency.

It also includes built-in markdown deduplication and early support for authenticated crawling (WIP) via Playwright storage state — including a practical CDP export flow for real Chrome/Chromium login sessions.

Features

# Enable CORS for specific origins (required for browser-based MCP clients)
python -m crawler.mcp_server --transport http --cors-origins "http://localhost:3000,https://myapp.com"

Dependencies

Minimal dependencies:

crawl4ai>=0.7.4 - The underlying crawler engine
tldextract>=5.1.2 - Domain parsing for site crawls
playwright>=1.40.0 - Browser automation
fastmcp>=2.0.0 - MCP server framework
httpx>=0.27.0 - HTTP client for SearXNG

Installation

```bash
cd searxNcrawl
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Install playwright browsers (required!)
playwright install chromium
```

# Or via installed script
crawl-mcp --transport http --port 8000

Running with Docker Compose

Create a .env file (see .env.example) and run:

docker compose up --build

The MCP HTTP port is configurable via MCP_PORT in .env. Default is 9555, so the server is available at http://localhost:9555/mcp.
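As a small illustration of the port rule above, the following sketch derives the endpoint URL from `MCP_PORT`, falling back to the documented default of 9555. The `mcp_endpoint` helper is hypothetical and not part of searxNcrawl, and it reads the process environment rather than parsing the `.env` file itself.

```python
import os


def mcp_endpoint(env=None, host="localhost"):
    """Return the MCP HTTP endpoint URL, honoring MCP_PORT (default 9555)."""
    env = os.environ if env is None else env
    port = int(env.get("MCP_PORT", "9555"))
    return f"http://{host}:{port}/mcp"


if __name__ == "__main__":
    # Prints the URL an HTTP-capable MCP client would be pointed at.
    print(mcp_endpoint())
```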

To run real‑world checks against the Docker setup (crawl, crawl_site, search), use:

scripts/test-realworld.sh

For extended tests including new features (remove_links, Unicode handling, schema validation):

scripts/test-extended.sh

# Option 1: Copy example to user config
mkdir -p ~/.config/searxncrawl
cp .env.example ~/.config/searxncrawl/.env

```
# https://example.com/page1

Crawled: 2025-01-09 12:00:00 UTC

[Page content as markdown...]

---

# https://example.com/page2

Crawled: 2025-01-09 12:00:01 UTC

[Page content as markdown...]
```

CLI Usage

After installation (pip install -e .), the crawl and search commands are available globally.

```bash
# Linux example
google-chrome --remote-debugging-port=9222 --user-data-dir="$HOME/.chrome-cdp-searxncrawl"
```

2) Log in manually to your target app in that browser.

3) List selectable sessions:

```bash
crawl-capture --cdp-url http://127.0.0.1:9222 --list-sessions
```

4) Export by explicit session index:

```bash
crawl-capture \
  --cdp-url http://127.0.0.1:9222 \
  --cdp-session 2 \
  --output ./state.json
```

Or let CLI selection guide you interactively:

```bash
crawl-capture \
  --cdp-url http://127.0.0.1:9222 \
  --list-sessions \
  --select \
  --output ./state.json
```

After capture/export, use the file for authenticated crawling:

```bash
crawl https://example.com/private --storage-state ./state.json
```

Explicit outcomes:
- success (exit 0): storage state written.
- timeout (exit 2): completion condition not reached in time (manual flow only).
- abort (exit 130): browser/session closed before completion (manual flow only).

Safety notes:
- Keep storage_state files out of version control.
- Capture/export is intentionally isolated from standard crawl / crawl_site execution paths.
- If multiple tabs share one browser context/profile, they share the same exported session state.
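The documented exit codes make `crawl-capture` straightforward to script. The sketch below maps the three codes to outcome labels and wraps the command with `subprocess`; `describe_exit` and `run_capture` are hypothetical helper names, and any exit code not listed in the README is reported as unknown rather than guessed at.

```python
import subprocess

# Exit codes as documented in the README; anything else is reported as unknown.
OUTCOMES = {0: "success", 2: "timeout", 130: "abort"}


def describe_exit(code):
    """Translate a crawl-capture exit code into an outcome label."""
    return OUTCOMES.get(code, f"unknown (exit {code})")


def run_capture(args):
    """Run crawl-capture with the given arguments and return the outcome label."""
    result = subprocess.run(["crawl-capture", *args])
    return describe_exit(result.returncode)
```

For example, `run_capture(["--cdp-url", "http://127.0.0.1:9222", "--cdp-session", "2", "--output", "./state.json"])` would return `"success"` once the storage state has been written.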

Environment Variables

Variable	Default	Description
`SEARXNG_URL`	`http://localhost:8888`	SearXNG instance URL
`SEARXNG_USERNAME`	(none)	Optional basic auth username
`SEARXNG_PASSWORD`	(none)	Optional basic auth password
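A minimal sketch of how a script could read these three variables with the documented defaults. The `SearxngSettings` class and `load_settings` function are hypothetical names, not the project's own loader. Treating credentials as set only when both username and password are present is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearxngSettings:
    url: str
    username: Optional[str] = None
    password: Optional[str] = None

    @property
    def auth(self):
        """Return a (username, password) tuple only when both are set."""
        if self.username and self.password:
            return (self.username, self.password)
        return None


def load_settings(env=None):
    """Read SearXNG settings from the environment, applying the table's defaults."""
    env = os.environ if env is None else env
    return SearxngSettings(
        url=env.get("SEARXNG_URL", "http://localhost:8888"),
        username=env.get("SEARXNG_USERNAME") or None,
        password=env.get("SEARXNG_PASSWORD") or None,
    )
```

The `(username, password)` tuple is the shape that `httpx` accepts for basic auth through its `auth` parameter.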

SearXNG Instance Requirements

SearXNG is a privacy-respecting metasearch engine that aggregates results from multiple search engines without tracking users. To use the search functionality of searxNcrawl, you need access to a SearXNG instance with:

JSON output format enabled – The instance must have JSON format enabled in its configuration (this is typically set in settings.yml under search.formats).
Network accessibility – The instance must be reachable from where you run searxNcrawl.

You can either self-host a SearXNG instance or use a public one. For reliable results, self-hosting is recommended as public instances may have rate limits or restricted API access.
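To check whether an instance meets the JSON requirement, you can request its `/search` endpoint with `format=json`. The sketch below only builds that URL using the standard library; `build_search_url` is a hypothetical helper, and it makes no claim about how searxNcrawl itself constructs its requests.

```python
from urllib.parse import urlencode, urljoin


def build_search_url(base_url, query, **params):
    """Build a SearXNG search URL that requests JSON output."""
    query_string = urlencode({"q": query, "format": "json", **params})
    # Normalize the trailing slash so instances hosted under a sub-path also work.
    return urljoin(base_url.rstrip("/") + "/", "search") + "?" + query_string


if __name__ == "__main__":
    print(build_search_url("http://localhost:8888", "mcp server"))
```

Opening the printed URL in a browser or HTTP client shows whether the instance returns JSON or rejects the format.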

Configuration File Search Order

The CLI tools (crawl, search) look for .env files in this order:

Current directory - ./.env
User config - ~/.config/searxncrawl/.env

If no .env is found and .env.example exists in the package, it will be automatically copied to ~/.config/searxncrawl/.env as a starting point.
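The lookup order can be sketched in a few lines. `find_env_file` is a hypothetical helper that mirrors only the two documented locations; it does not model the automatic copy of `.env.example`.

```python
from pathlib import Path


def find_env_file(cwd=None, home=None):
    """Return the first existing .env in the documented search order, or None."""
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    candidates = [
        cwd / ".env",                               # 1. current directory
        home / ".config" / "searxncrawl" / ".env",  # 2. user config
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Because the current directory is checked first, a project-local `.env` overrides the user-level file.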

Quick setup for global CLI usage:

```bash
# Option 2: Export environment variable
export SEARXNG_URL=http://your-searxng:8888
```

CORS Configuration (HTTP Transport)

When using HTTP transport, browser-based MCP clients may need CORS (Cross-Origin Resource Sharing) headers. Use `--cors-origins` to enable them:

python -m crawler.mcp_server --transport http --cors-origins "http://localhost:3000,https://myapp.com"
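In the README's example, the `--cors-origins` value is a comma-separated list of origins. The sketch below shows one way such a value could be split and sanity-checked before it is passed to the server. `parse_cors_origins` is a hypothetical helper, and the validation rules (http/https scheme, no path) are assumptions rather than the server's actual behavior.

```python
from urllib.parse import urlsplit


def parse_cors_origins(value):
    """Split a comma-separated origins string and reject malformed entries."""
    origins = []
    for raw in value.split(","):
        origin = raw.strip()
        if not origin:
            continue  # tolerate stray commas and whitespace
        parts = urlsplit(origin)
        has_path = parts.path not in ("", "/")
        if parts.scheme not in ("http", "https") or not parts.netloc or has_path:
            raise ValueError(f"not a valid origin: {origin!r}")
        origins.append(f"{parts.scheme}://{parts.netloc}")
    return origins
```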

MCP Harness Configuration

Add to your MCP client configuration (examples include Zed, opencode, antigravity, VS Code, Claude Code, Codex, OpenClaw, etc.):

{
  "mcpServers": {
    "crawler": {
      "command": "python",
      "args": ["-m", "crawler.mcp_server"],
      "cwd": "/path/to/searxNcrawl",
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888"
      }
    }
  }
}

Or with uv:

{
  "mcpServers": {
    "crawler": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888"
      }
    }
  }
}
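If you already have other servers configured, the entry should be merged into the existing `mcpServers` object rather than replacing it. The following sketch does that on plain dicts; `add_mcp_server` is a hypothetical helper, and reading or writing the actual client config file is left out because its location differs between clients.

```python
import json


def add_mcp_server(config, name, command, args, env=None):
    """Return a copy of `config` with one server entry added under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    entry = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    servers[name] = entry
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # "other" stands in for any server that is already configured.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    updated = add_mcp_server(
        existing,
        "crawler",
        "uv",
        ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
        env={"SEARXNG_URL": "http://your-searxng-instance:8888"},
    )
    print(json.dumps(updated, indent=2))
```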

OpenClaw Configuration

OpenClaw is a popular autonomous AI agent (150k+ GitHub stars) that supports MCP natively. To integrate searxNcrawl with OpenClaw, add the following to your OpenClaw MCP config file (~/.clawdbot/mcp.json or openclaw.json):

Python with venv:

{
  "searxNcrawl": {
    "command": "python",
    "args": ["-m", "crawler.mcp_server"],
    "cwd": "/path/to/searxNcrawl",
    "env": {
      "SEARXNG_URL": "http://your-searxng-instance:8888"
    }
  }
}

With uv (no manual venv needed):

{
  "searxNcrawl": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
    "env": {
      "SEARXNG_URL": "http://your-searxng-instance:8888"
    }
  }
}

Docker HTTP endpoint:

If you prefer running searxNcrawl via Docker, start the server with:

docker compose up --build

Then configure OpenClaw to connect to the HTTP endpoint at http://localhost:9555/mcp.

Once configured, OpenClaw will have access to the crawl, crawl_site, and search tools.

Configuration

The default configuration is optimized for documentation sites. For advanced customization:

```python
import asyncio

from crawler import crawl_page_async, build_markdown_run_config, RunConfigOverrides

# Custom configuration
config = build_markdown_run_config(
    RunConfigOverrides(
        delay_before_return_html=1.0,  # Wait longer for JS
        mean_delay=1.0,  # Delay between requests
        scan_full_page=True,
    )
)


async def main():
    # crawl_page_async is a coroutine, so it has to be awaited inside an event loop
    return await crawl_page_async("https://example.com", config=config)


doc = asyncio.run(main())
```

CLI Tools

crawl - Crawl pages from the command line
crawl-capture - Manual login capture + CDP session list/select/export
search - Search the web via SearXNG
Global installation - Available system-wide after pip install -e .

Python API

# Output as JSON (includes metadata and references)
crawl https://example.com --json

# STDIO transport (for MCP harnesses such as Zed, opencode, antigravity, VS Code, Claude Code, Codex, OpenClaw, etc.)
python -m crawler.mcp_server

🎯 aiskill88 AI Review · Grade A · 2026-05-28

Feature-complete, with good code quality.

👥 Who It's For

Claude Desktop / Claude Code users
AI tool developers
Professionals who need to extend AI capabilities
Automation engineers

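⚖️ Pros and Cons

✅ Pros

+ MIT license, free for commercial use
+ Standardized MCP protocol with strong ecosystem interoperability
+ Integrates with the official Claude ecosystem
+ Plug and play, with quick and simple configuration

⚠️ Cons

− Requires an MCP-capable client (the README lists Zed, opencode, antigravity, VS Code, Claude Code, Codex, and OpenClaw, among others)
− The MCP protocol is still evolving, so interfaces may change
− Requires some configuration steps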

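⚠️ Usage Notice

AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute a legal endorsement of the tool's functionality or quality.

We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and performing any necessary security assessment.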


❓ FAQ

How do I use searxNcrawl?

Refer to the project documentation and example code.

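💡 AI Skill Hub Review

Overall, SearxNcrawl is a solid, good-quality entry among MCP tools. If you already have a clear use case, you can start using it right away; if you are still evaluating, compare it with similar tools before deciding.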


🌐 Original Information

Original name	`searxNcrawl`
Original description	Open-source MCP tool: MCP Server and CLI Tools for searxing and fetching websites. ⭐105 · Python
Topics	`mcp`, `python`, `web crawler`
GitHub	https://github.com/DasDigitaleMomentum/searxNcrawl
License	MIT
Language	Python

🔗 Original Source

🐙 GitHub repository: https://github.com/DasDigitaleMomentum/searxNcrawl

Indexed: 2026-05-28 · Updated: 2026-05-28 · License: MIT · AI Skill Hub does not legally vouch for the accuracy of third-party content.
