
SearxNcrawl

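Python-based · Lets AI assistants work directly with your system and tools
English name: searxNcrawl
⭐ 105 Stars 🍴 9 Forks 💻 Python 📄 MIT 🏷 AI score 7.5
Tags: mcp · python · web crawler
✦ AI Skill Hub recommendation

SearxNcrawl is one of the MCP tools featured by AI Skill Hub in this issue. It has an overall score of 7.5, which indicates good overall quality. We recommend adding it to your AI toolkit to improve your productivity.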

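📚 In-depth analysis

SearxNcrawl is an AI tool extension built on the MCP (Model Context Protocol) standard. MCP was developed and open-sourced by Anthropic to provide a standardized communication interface between AI models and external tools, and it has been adopted by mainstream AI tools such as Claude Desktop, Claude Code, and Cursor.

After installing SearxNcrawl, your AI assistant gains additional tool-calling capabilities and can drive the tool's functions in natural language, with no command-line syntax to learn. The core value of an MCP tool is "configure once, benefit in every session": once configured, the tools can be called in every conversation with the AI.

On the technical side, MCP tools communicate with the AI client over standard JSON-RPC. The tool's functions are exposed to the AI model as a tool list, and the model calls them as needed. SearxNcrawl provides a structured tool-calling interface, which helps the model understand each function and call it correctly.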
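To make the JSON-RPC exchange described above more concrete, here is a minimal sketch of the two messages an MCP client typically sends: `tools/list` to discover tools and `tools/call` to invoke one. The tool name `search` comes from the README further down; the `query` argument name is an assumption made for illustration, and the real argument schema is whatever the server reports in its tool list.

```python
import json
from itertools import count

_ids = count(1)  # each JSON-RPC request in a session needs its own id


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dict."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool. "query" is a hypothetical argument name, not taken from the project.
call_search = jsonrpc_request(
    "tools/call",
    {"name": "search", "arguments": {"query": "model context protocol"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_search, indent=2))
```

In practice an MCP client library builds and sends these messages for you; the sketch only shows their shape.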

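Compared with a traditional API integration, the advantage of an MCP tool is that no integration code has to be written: adding a few lines of JSON to a configuration file is enough to give the AI a new capability. AI Skill Hub gives SearxNcrawl an AI score of 7.5, which places it among the better choices in its category.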

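📋 Tool overview

SearxNcrawl is an AI tool extension that follows the MCP (Model Context Protocol) standard. Through MCP, AI clients such as Claude and Cursor can reach external tools, data sources, and services directly from a conversation. According to the project README, searxNcrawl exposes three tools, crawl, crawl_site, and search, which cover web search through SearXNG and page crawling through Crawl4AI. Each can be triggered in natural language during a conversation.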

GitHub Stars: ⭐ 105
Language: Python
Supported platforms: Windows / macOS / Linux
Maintenance status: lightweight project, updated as needed
License: MIT
AI overall score: 7.5
Tool type: MCP tool
Forks: 9

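📖 Documentation

The following content was compiled automatically by AI Skill Hub from the project information. For the complete original documentation, see "Original source" at the bottom of the page.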


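📌 Key features
  • Integrates with mainstream AI clients such as Claude and Cursor through the standard MCP protocol
  • Provides a structured tool-calling interface that reduces the complexity of AI integration
  • Works with Claude Desktop and Claude Code once the server is configured
  • Can be combined with other MCP tools to build a complete AI workstation
  • Lightweight and non-intrusive; it does not affect the existing system architecture
🎯 Main use cases
  • Call local tools directly from a Claude Desktop conversation, linking the AI with your system
  • Drive complex multi-step automation tasks in natural language instead of doing them by hand
  • Combine several MCP tools to build a personal AI workstation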
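The installation commands below follow the project README reproduced further down; if they differ from the official README, the README takes precedence.
Installation commands
# Step 1: install from a local checkout (searxNcrawl is a Python project, so no npm package or npx is involved)
git clone https://github.com/DasDigitaleMomentum/searxNcrawl
cd searxNcrawl
python -m venv .venv
source .venv/bin/activate
pip install -e .
playwright install chromium

# Step 2: register the server in claude_desktop_config.json
{
  "mcpServers": {
    "searxncrawl": {
      "command": "python",
      "args": ["-m", "crawler.mcp_server"],
      "cwd": "/path/to/searxNcrawl",
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888"
      }
    }
  }
}

# Configuration file location
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
📋 Installation steps
  1. Make sure Python and pip are installed (Node.js is not required)
  2. Install the package and the Playwright Chromium browser as shown in step 1 above
  3. Open the MCP configuration file of Claude Desktop or Claude Code
  4. Add the JSON configuration above under the mcpServers field
  5. Save the configuration file and restart the Claude client
  6. After the restart, the crawl, crawl_site, and search tools are available in conversations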
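The usage examples below were compiled by AI Skill Hub and cover the most common scenarios.
Common commands / code examples
# After installation, use the tool directly in a Claude conversation
# Example:
User: Please use SearxNcrawl to carry out the following task...
Claude: [calls the SearxNcrawl MCP tool automatically to handle the request]

# List the available tools
# In Claude, type: "List all available MCP tools"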
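The configuration example below uses the launch command and environment variables documented in the README; adjust the values for your own setup.
Configuration example
claude_desktop_config.json with optional SearXNG basic auth credentials:
{
  "mcpServers": {
    "searxncrawl": {
      "command": "python",
      "args": ["-m", "crawler.mcp_server"],
      "cwd": "/path/to/searxNcrawl",
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888",
        "SEARXNG_USERNAME": "your-username",
        "SEARXNG_PASSWORD": "your-password"
      }
    }
  }
}

Restart Claude Desktop after saving for the change to take effect.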
📑 README in depth · original documentation · completeness 79/100
The following content was parsed directly from the GitHub README and keeps its code blocks, tables, and list structure.

searxNcrawl

searxNcrawl is a minimal MCP server and CLI toolkit for search and crawling, built on top of Crawl4AI and SearXNG.

This project is published as searxNcrawl at https://github.com/DasDigitaleMomentum/searxNcrawl and is maintained by DDM – Das Digitale Momentum GmbH & Co KG. It is the successor to searxng-mcp https://github.com/tisDDM/searxng-mcp (which should be marked deprecated).

Compared to plain Crawl4AI usage, searxNcrawl provides a proven, production-tested crawl configuration for documentation-heavy sites, optimized for clean, model-ready Markdown with less noise and better token efficiency.

It also includes built-in markdown deduplication and early support for authenticated crawling (WIP) via Playwright storage state — including a practical CDP export flow for real Chrome/Chromium login sessions.
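The README does not describe how the built-in markdown deduplication works. As a rough illustration of the general idea only, the sketch below drops blocks (paragraphs separated by blank lines) that already appeared on an earlier page, which is one common way to remove repeated footers or navigation text. The function name and the hashing approach are assumptions for illustration, not the project's implementation.

```python
import hashlib


def dedupe_markdown_blocks(pages):
    """Remove blocks that were already seen on an earlier page.

    `pages` is a list of markdown strings; a block is text between blank lines.
    Hypothetical helper, not part of searxNcrawl.
    """
    seen = set()
    result = []
    for page in pages:
        kept = []
        for block in page.split("\n\n"):
            # Normalize whitespace and case so trivially different copies match.
            normalized = " ".join(block.split()).lower()
            if not normalized:
                continue
            digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
            if digest in seen:
                continue
            seen.add(digest)
            kept.append(block.strip())
        result.append("\n\n".join(kept))
    return result


pages = [
    "# Guide\n\nInstall the package.\n\nCopyright Example Corp",
    "# API\n\nCall the function.\n\nCopyright Example Corp",
]
cleaned = dedupe_markdown_blocks(pages)
print(cleaned[1])
```

The repeated copyright line survives on the first page and is dropped from the second, which is the token-saving effect the README alludes to.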

Features

Enable CORS for specific origins (required for browser-based MCP clients)

python -m crawler.mcp_server --transport http --cors-origins "http://localhost:3000,https://myapp.com"

Dependencies

Minimal dependencies:

  • crawl4ai>=0.7.4 - The underlying crawler engine
  • tldextract>=5.1.2 - Domain parsing for site crawls
  • playwright>=1.40.0 - Browser automation
  • fastmcp>=2.0.0 - MCP server framework
  • httpx>=0.27.0 - HTTP client for SearXNG

Installation

```bash
cd searxNcrawl
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Install playwright browsers (required!)
playwright install chromium
```

Or via installed script

crawl-mcp --transport http --port 8000

Running with Docker Compose

Create a .env file (see .env.example) and run:

docker compose up --build

The MCP HTTP port is configurable via MCP_PORT in .env. Default is 9555, so the server is available at http://localhost:9555/mcp.
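As a small illustration of how the endpoint follows from `MCP_PORT`, the sketch below derives the URL a client would connect to, falling back to the documented default of 9555. The helper name and the `host` parameter are assumptions for illustration.

```python
import os


def mcp_endpoint(env=None, host="localhost"):
    """Return the MCP HTTP endpoint URL, using MCP_PORT or the default 9555."""
    env = os.environ if env is None else env
    port = int(env.get("MCP_PORT", "9555"))
    return f"http://{host}:{port}/mcp"


print(mcp_endpoint({}))
print(mcp_endpoint({"MCP_PORT": "8000"}))
```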

To run real‑world checks against the Docker setup (crawl, crawl_site, search), use:

scripts/test-realworld.sh

For extended tests including new features (remove_links, Unicode handling, schema validation):

scripts/test-extended.sh

Option 1: Copy example to user config

mkdir -p ~/.config/searxncrawl
cp .env.example ~/.config/searxncrawl/.env

https://example.com/page1

Crawled: 2025-01-09 12:00:00 UTC

[Page content as markdown...]

---

https://example.com/page2

Crawled: 2025-01-09 12:00:01 UTC

[Page content as markdown...]

CLI Usage

After installation (pip install -e .), the crawl and search commands are available globally.

1) Start Chrome with remote debugging enabled (Linux example):

google-chrome --remote-debugging-port=9222 --user-data-dir="$HOME/.chrome-cdp-searxncrawl"

2) Log in manually to your target app in that browser.

3) List selectable sessions:

crawl-capture --cdp-url http://127.0.0.1:9222 --list-sessions

4) Export by explicit session index:

crawl-capture \
  --cdp-url http://127.0.0.1:9222 \
  --cdp-session 2 \
  --output ./state.json

Or let CLI selection guide you interactively:

crawl-capture \
  --cdp-url http://127.0.0.1:9222 \
  --list-sessions \
  --select \
  --output ./state.json

After capture/export, use the file for authenticated crawling:

crawl https://example.com/private --storage-state ./state.json

Explicit outcomes:

  • success (exit 0): storage state written.
  • timeout (exit 2): completion condition not reached in time (manual flow only).
  • abort (exit 130): browser/session closed before completion (manual flow only).
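When the capture command is driven from a script, the exit codes listed above can be mapped back to outcomes. The following is a minimal sketch; the helper names are hypothetical, and the demo runs a stand-in Python command that exits with code 2 rather than `crawl-capture` itself, so it works without the tool installed.

```python
import subprocess
import sys

# Exit codes as listed in the README.
CAPTURE_OUTCOMES = {0: "success", 2: "timeout", 130: "abort"}


def describe_capture_exit(returncode):
    """Translate a capture exit code into the outcome name from the README."""
    return CAPTURE_OUTCOMES.get(returncode, f"unexpected exit code {returncode}")


def run_and_classify(cmd):
    """Run a command (a list of arguments) and classify its exit code."""
    completed = subprocess.run(cmd)
    return describe_capture_exit(completed.returncode)


# Stand-in for a capture run that timed out.
outcome = run_and_classify([sys.executable, "-c", "raise SystemExit(2)"])
print(outcome)
```

In a real script, the stand-in command would be replaced by the `crawl-capture` invocation shown in the export steps.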

Safety notes:

  • Keep storage_state files out of version control.
  • Capture/export is intentionally isolated from standard crawl / crawl_site execution paths.
  • If multiple tabs share one browser context/profile, they share the same exported session state.
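Because a storage state file contains live session cookies, it can help to inspect what was exported without printing any secret values. The sketch below assumes the usual Playwright storage state layout (a top-level `cookies` list and an `origins` list); the function name is hypothetical, and the demo writes a throwaway sample file instead of reading a real export.

```python
import json
import os
import tempfile
from collections import Counter


def summarize_storage_state(path):
    """Summarize a Playwright storage state file without exposing cookie values."""
    with open(path, encoding="utf-8") as fh:
        state = json.load(fh)
    cookies = state.get("cookies", [])
    origins = state.get("origins", [])
    return {
        "cookie_count": len(cookies),
        "cookies_per_domain": dict(Counter(c.get("domain", "") for c in cookies)),
        "origins": [o.get("origin", "") for o in origins],
    }


# Throwaway sample in the assumed layout; the values are made up.
sample = {
    "cookies": [{"name": "sid", "value": "secret", "domain": "example.com", "path": "/"}],
    "origins": [{"origin": "https://example.com", "localStorage": []}],
}
with tempfile.TemporaryDirectory() as tmp:
    state_path = os.path.join(tmp, "state.json")
    with open(state_path, "w", encoding="utf-8") as fh:
        json.dump(sample, fh)
    summary = summarize_storage_state(state_path)

print(summary)
```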

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| SEARXNG_URL | http://localhost:8888 | SearXNG instance URL |
| SEARXNG_USERNAME | (none) | Optional basic auth username |
| SEARXNG_PASSWORD | (none) | Optional basic auth password |

SearXNG Instance Requirements

SearXNG is a privacy-respecting metasearch engine that aggregates results from multiple search engines without tracking users. To use the search functionality of searxNcrawl, you need access to a SearXNG instance with:

  • JSON output format enabled – The instance must have JSON format enabled in its configuration (this is typically set in settings.yml under search.formats).
  • Network accessibility – The instance must be reachable from where you run searxNcrawl.

You can either self-host a SearXNG instance or use a public one. For reliable results, self-hosting is recommended as public instances may have rate limits or restricted API access.
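To check that an instance meets these requirements, it can be queried directly with `format=json`. The sketch below builds such a request from the three environment variables using only the standard library; it does not send anything. It talks to the SearXNG search endpoint itself and is not how searxNcrawl implements its search tool (the project lists httpx as its HTTP client). The function name is an assumption.

```python
import base64
import os
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(query, env=None):
    """Build (but do not send) a SearXNG JSON search request."""
    env = os.environ if env is None else env
    base = env.get("SEARXNG_URL", "http://localhost:8888").rstrip("/")
    url = f"{base}/search?{urlencode({'q': query, 'format': 'json'})}"
    headers = {"Accept": "application/json"}
    user = env.get("SEARXNG_USERNAME")
    password = env.get("SEARXNG_PASSWORD")
    if user and password:
        # Optional HTTP basic auth, only when both values are set.
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        headers["Authorization"] = f"Basic {token}"
    return Request(url, headers=headers)


req = build_search_request("crawl4ai docs", env={})
print(req.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the query. If JSON output is disabled on the instance, that call is expected to fail with an HTTP error rather than return results.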

Configuration File Search Order

The CLI tools (crawl, search) look for .env files in this order:

  1. Current directory - ./.env
  2. User config - ~/.config/searxncrawl/.env

If no .env is found and .env.example exists in the package, it will be automatically copied to ~/.config/searxncrawl/.env as a starting point.
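The lookup order above can be sketched as a small function that returns the first existing candidate. This is an illustration of the documented order only, not the project's code; the function name and its `cwd`/`home` parameters are assumptions, and the automatic copy of .env.example is left out.

```python
from pathlib import Path


def find_env_file(cwd=None, home=None):
    """Return the first .env file in the documented search order, or None."""
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path.home() if home is None else Path(home)
    candidates = [
        cwd / ".env",                               # 1. current directory
        home / ".config" / "searxncrawl" / ".env",  # 2. user config
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


print(find_env_file())
```

A project-local .env therefore overrides the user-level one, which is convenient when different projects point at different SearXNG instances.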

Quick setup for global CLI usage:

```bash
# Option 2: Export environment variable
export SEARXNG_URL=http://your-searxng:8888
```


CORS Configuration (HTTP Transport)

When using HTTP transport, browser-based MCP clients may need CORS (Cross-Origin Resource Sharing) headers. Use `--cors-origins` with a comma-separated list of allowed origins to enable them.
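The README example passes the allowed origins as one comma-separated string ("http://localhost:3000,https://myapp.com"). How the server parses that value is not documented here, so the sketch below is only an assumption about a reasonable way to validate such a string before passing it on. The function name is hypothetical.

```python
from urllib.parse import urlsplit


def parse_cors_origins(value):
    """Split a comma-separated origin list and reject entries that are not http(s) origins."""
    origins = []
    for item in value.split(","):
        origin = item.strip().rstrip("/")
        if not origin:
            continue
        parts = urlsplit(origin)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not a valid origin: {origin!r}")
        origins.append(origin)
    return origins


origins = parse_cors_origins("http://localhost:3000,https://myapp.com")
print(origins)
```

An entry without a scheme, such as "myapp.com", raises a ValueError, which catches a common mistake: browsers send the full origin including the scheme.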

MCP Harness Configuration

Add to your MCP client configuration (examples include Zed, opencode, antigravity, VS Code, Claude Code, Codex, OpenClaw, etc.):

{
  "mcpServers": {
    "crawler": {
      "command": "python",
      "args": ["-m", "crawler.mcp_server"],
      "cwd": "/path/to/searxNcrawl",
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888"
      }
    }
  }
}

Or with uv:

{
  "mcpServers": {
    "crawler": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
      "env": {
        "SEARXNG_URL": "http://your-searxng-instance:8888"
      }
    }
  }
}
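If the same entry has to be written for several clients or machines, generating it avoids hand-editing JSON. The sketch below reproduces the two variants shown above (plain python with cwd, or uv with --directory); the function name and parameters are assumptions for illustration.

```python
import json


def mcp_server_entry(project_dir, searxng_url, use_uv=False):
    """Build the mcpServers entry from the README for either launch variant."""
    if use_uv:
        entry = {
            "command": "uv",
            "args": ["run", "--directory", project_dir, "python", "-m", "crawler.mcp_server"],
        }
    else:
        entry = {
            "command": "python",
            "args": ["-m", "crawler.mcp_server"],
            "cwd": project_dir,
        }
    entry["env"] = {"SEARXNG_URL": searxng_url}
    return {"mcpServers": {"crawler": entry}}


config = mcp_server_entry(
    "/path/to/searxNcrawl", "http://your-searxng-instance:8888", use_uv=True
)
print(json.dumps(config, indent=2))
```

Using `json.dumps` also guarantees valid JSON, for example no trailing commas or comments, which MCP client configuration files generally do not accept.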

OpenClaw Configuration

OpenClaw is a popular autonomous AI agent (150k+ GitHub stars) that supports MCP natively. To integrate searxNcrawl with OpenClaw, add the following to your OpenClaw MCP config file (~/.clawdbot/mcp.json or openclaw.json):

Python with venv:

{
  "searxNcrawl": {
    "command": "python",
    "args": ["-m", "crawler.mcp_server"],
    "cwd": "/path/to/searxNcrawl",
    "env": {
      "SEARXNG_URL": "http://your-searxng-instance:8888"
    }
  }
}

With uv (no manual venv needed):

{
  "searxNcrawl": {
    "command": "uv",
    "args": ["run", "--directory", "/path/to/searxNcrawl", "python", "-m", "crawler.mcp_server"],
    "env": {
      "SEARXNG_URL": "http://your-searxng-instance:8888"
    }
  }
}

Docker HTTP endpoint:

If you prefer running searxNcrawl via Docker, start the server with:

docker compose up --build

Then configure OpenClaw to connect to the HTTP endpoint at http://localhost:9555/mcp.

Once configured, OpenClaw will have access to the crawl, crawl_site, and search tools.

Configuration

The default configuration is optimized for documentation sites. For advanced customization:

```python
import asyncio

from crawler import crawl_page_async, build_markdown_run_config, RunConfigOverrides

# Custom configuration
config = build_markdown_run_config(
    RunConfigOverrides(
        delay_before_return_html=1.0,  # Wait longer for JS
        mean_delay=1.0,  # Delay between requests
        scan_full_page=True,
    )
)


async def main():
    # crawl_page_async is awaited, so it has to run inside an event loop
    return await crawl_page_async("https://example.com", config=config)


doc = asyncio.run(main())
```

CLI Tools

  • crawl - Crawl pages from the command line
  • crawl-capture - Manual login capture + CDP session list/select/export
  • search - Search the web via SearXNG
  • Global installation - Available system-wide after pip install -e .

Python API

Output as JSON (includes metadata and references)

crawl https://example.com --json

STDIO transport (for MCP harnesses such as Zed, opencode, antigravity, VS Code, Claude Code, Codex, OpenClaw, etc.)

python -m crawler.mcp_server

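🎯 aiskill88 AI review · Grade A · 2026-05-28

Feature-complete, with fairly good code quality

👥 Who it is for

  • Claude Desktop / Claude Code users
  • AI tool developers
  • Professionals who need to extend AI capabilities
  • Automation engineers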


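⚖️ Pros and cons

✅ Pros
  • MIT license, free for commercial use
  • Standard MCP protocol, so it interoperates with the wider MCP ecosystem
  • Works with the official Claude clients as well as other MCP harnesses listed in the README (Zed, opencode, VS Code, Codex, OpenClaw, and others)
  • Simple configuration once the package is installed
⚠️ Cons
  • Requires an MCP-capable client, and the search tool requires access to a SearXNG instance with JSON output enabled
  • The MCP protocol is still evolving, so interfaces may change
  • Some configuration steps are required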
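⚠️ Usage notice

AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.

We recommend validating the tool thoroughly in a sandbox or test environment and carrying out the necessary security review before deploying it to production.

📄 License

✅ MIT license: one of the most permissive open-source licenses. It allows commercial use, modification, and distribution, provided the copyright notice is retained.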

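💡 AI Skill Hub review

Overall, SearxNcrawl is a solid, good-quality entry among MCP tools. If you already have a clear use case, you can start using it right away. If you are still evaluating, compare it with similar tools before deciding.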

🌐 Original information
Original name: searxNcrawl
Original description: Open-source MCP tool: MCP Server and CLI Tools for searxing and fetching websites. ⭐105 · Python
Topics: mcp · python · web crawler
GitHub: https://github.com/DasDigitaleMomentum/searxNcrawl
License: MIT
Language: Python
🔗 Original source
🐙 GitHub repository: https://github.com/DasDigitaleMomentum/searxNcrawl

Listed: 2026-05-28 · Updated: 2026-05-28 · License: MIT · AI Skill Hub does not legally endorse the accuracy of third-party content.