
Capability tags

🔌 MCP 🤖 Agent 🔄 Workflow 👁 OCR 🐳 Docker 💻 CLI 🔗 REST API 🧬 Embedding 📚 RAG 🖼 Vision

🔌 MCP tool

Archive Agent

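Q: How do I install and start using Archive-Agent?

Visit the Archive-Agent GitHub repository or the official website and follow the steps in the README. Per the README, install Docker (for running the Qdrant server) and Python >= 3.10 before proceeding.

Q: Is Archive-Agent free? What is its license?

Archive-Agent is free and open source under the GPL-3.0 license. Anyone may use, modify, and distribute it under the terms of that license.

Q: Who is Archive-Agent for?

Archive-Agent is aimed mainly at users with some technical background, such as developers, data analysts, and AI engineers.

Q: How active is the community, and how well is the project maintained?

Archive-Agent has 59 Stars on GitHub. This page lists it as a lightweight project that is updated as needed.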

Built with Python · Lets AI assistants work directly with your system and tools

English name: Archive-Agent

⭐ 59 Stars 🍴 9 Forks 💻 Python 📄 GPL-3.0 🏷 AI score 7.5

Topics: lm-studio, mcp, ocr, ollama


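✦ Recommended by AI Skill Hub

AI Skill Hub gives Archive Agent an overall AI score of 7.5 as an MCP tool, a steady result among comparable tools. It is worth a closer look if you need an MCP tool for this kind of task.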

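📚 In-depth analysis

Archive Agent is an AI tool extension built on the Model Context Protocol (MCP). MCP was developed and open-sourced by Anthropic to standardize communication between AI models and external tools, and it has been adopted by AI clients such as Claude Desktop, Claude Code, and Cursor.

With Archive Agent installed, your AI assistant gains additional tool-calling capabilities: you can drive the tool's functions in natural language without learning command-line syntax. The main value of an MCP tool is that it is configured once and is then available in every later conversation with the AI.

Technically, an MCP tool communicates with the AI client over JSON-RPC. The tool's functions are exposed to the model as a tool list, and the model calls them as needed. Archive Agent provides a structured tool-calling interface so that the model can identify each function and its parameters, which reduces tool-use errors.

Compared with a traditional API integration, an MCP tool needs no integration code: the user adds a few lines of JSON to a configuration file. AI Skill Hub gives Archive Agent an AI score of 7.5.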
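The JSON-RPC exchange described in this section can be sketched in a few lines of Python. This is a minimal illustration, not Archive Agent's own code: `tools/list` and `tools/call` are the MCP method names, while the tool name `search_files` and its `question` argument are hypothetical placeholders.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # each JSON-RPC request in a session needs its own id


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request envelope of the kind MCP clients send."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Call one tool by name. "search_files" and "question" are hypothetical.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "search_files", "arguments": {"question": "Which files mention donuts?"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

In practice the AI client builds and sends these messages itself; the sketch only shows what "exposing a tool list" and "calling a tool" look like on the wire.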

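📋 Tool overview

Archive Agent is an AI tool extension that follows the Model Context Protocol (MCP). Through MCP, AI clients such as Claude and Cursor can access external tools, data sources, and services. Operations such as file access, database queries, or API calls can then be triggered in natural language from the AI conversation.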

Item	Value
GitHub Stars	⭐ 59
Forks	9
Language	Python
Supported platforms	Windows / macOS / Linux
Maintenance status	Lightweight project, updated as needed
License	GPL-3.0
AI overall score	7.5
Tool type	MCP tool

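📖 Documentation

The content below was compiled automatically by AI Skill Hub from project information. For the complete original documentation, see "Original source" at the bottom of the page.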

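📌 Key features

Integrates with AI clients such as Claude and Cursor through the standard MCP protocol
Provides a structured tool-calling interface that reduces integration effort
Works with Claude Desktop and Claude Code
Can be combined with other MCP tools
Lightweight and non-invasive: does not change your existing system architecture

🎯 Main use cases

Call local tools directly from a Claude Desktop conversation
Drive multi-step automation tasks in natural language instead of doing them by hand
Combine several MCP tools into a personal AI workstation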

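The installation commands below were generated automatically from the project's language and type and are not taken from the README. The project is written in Python, and its README lists Docker and Python >= 3.10 as requirements, so verify the claude skill install command and the npx package name against the official README before use.

Installation commands

# Option 1: one-step install via the Claude Code CLI
claude skill install https://github.com/shredEngineer/Archive-Agent

# Option 2: edit claude_desktop_config.json manually
{
  "mcpServers": {
    "----": {
      "command": "npx",
      "args": ["-y", "archive-agent"]
    }
  }
}

# Config file locations
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json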

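📋 Installation steps

Confirm that Node.js (v18 or later) is installed if you use the auto-generated npx configuration; the README itself lists only Docker and Python >= 3.10 as requirements
Open the MCP configuration file of Claude Desktop or Claude Code
Copy the JSON configuration from the "Install via Agent → Claude Desktop" tab into the mcpServers field
Save the configuration file and restart the Claude client
After the restart, the tool is available in conversations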

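The usage examples below were compiled by AI Skill Hub and cover the most common scenarios.

Common commands / code examples

# After installation, use the tool directly in a Claude conversation
# Example:
User: Please use Archive Agent to carry out the following task...
Claude: [calls the Archive Agent MCP tool automatically to handle the request]

# List the available tools
# In Claude, type: "List all available MCP tools"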

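The configuration example below was generated from a typical usage scenario; adjust the parameters according to the official documentation.

Configuration example

// Example claude_desktop_config.json
{
  "mcpServers": {
    "____": {
      "command": "npx",
      "args": ["-y", "archive-agent"],
      "env": {
        // "API_KEY": "your-api-key-here"
      }
    }
  }
}

// Remove the // comment lines before saving (JSON does not allow comments), then restart Claude Desktop for the change to take effect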
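Editing the configuration file by hand risks overwriting entries for other MCP servers. The following sketch, offered as one possible approach rather than an official installer, merges a single entry into an existing `mcpServers` object. The entry name `archive-agent` is a hypothetical choice (the page leaves the name as a placeholder), and the `npx` entry is the page's auto-generated one, which is not confirmed by the README.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other entries."""
    config = {}
    if config_path.exists():
        # The file must be strict JSON: // comments would make this call fail.
        config = json.loads(config_path.read_text(encoding="utf-8"))
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


if __name__ == "__main__":
    # Demonstrate on a throwaway file instead of the real Claude Desktop config.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "claude_desktop_config.json"
        path.write_text('{"mcpServers": {"other": {"command": "x"}}}', encoding="utf-8")
        entry = {"command": "npx", "args": ["-y", "archive-agent"]}  # unverified entry
        merged = add_mcp_server(path, "archive-agent", entry)
        print(sorted(merged["mcpServers"]))
```

To use it on a real installation, pass the config file location for your platform as `config_path` and back up the file first.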

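📑 README deep dive · documentation completeness 64/100 · includes workflow diagram

The content below was parsed directly from the GitHub README, preserving code blocks, tables, and lists.

Introduction

---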

Install Archive Agent

Please install these requirements before proceeding:

Docker (for running Qdrant server)
Python >= 3.10 (core runtime) (usually already installed)

AI provider setup

Archive Agent lets you choose between different AI providers:

Remote APIs (higher performance and cost, less privacy):
OpenAI: Requires an OpenAI API key.
OpenRouter: Requires an OpenRouter API key. Access to 400+ models.

Local APIs (lower performance and cost, best privacy):
Ollama: Requires Ollama running locally.
LM Studio: Requires LM Studio running locally.

💡 Good to know: You will be prompted to choose an AI provider at startup; see: Run Archive Agent.

📌 Note: You can customize the specific models used by the AI provider in the Archive Agent settings. However, you cannot change the AI provider of an existing profile, as the embeddings will be incompatible; to choose a different AI provider, create a new profile instead.

OpenAI provider setup

If the OpenAI provider is selected, Archive Agent requires the OpenAI API key.

To export your OpenAI API key, replace sk-... with your actual key and run this once:

echo "export OPENAI_API_KEY='sk-...'" >> ~/.bashrc && source ~/.bashrc

This will persist the export for the current user.

💡 Good to know: OpenAI won't use your data for training.

OpenRouter provider setup

If the OpenRouter provider is selected, Archive Agent requires an OpenRouter API key.

OpenRouter provides a unified API to access 400+ models from many providers (OpenAI, Google, Anthropic, Meta, and more) through a single endpoint.

To export your OpenRouter API key, replace sk-or-... with your actual key and run this once:

echo "export OPENROUTER_API_KEY='sk-or-...'" >> ~/.bashrc && source ~/.bashrc

This will persist the export for the current user.
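Before starting Archive Agent with a remote provider, it can help to confirm that the exported key is visible to the current process. The sketch below is only an illustration: the variable names come from the export commands above, while the provider labels `openai` and `openrouter` are hypothetical identifiers chosen for this example.

```python
import os
from typing import Mapping, Optional

# Variable names as used in the README's export commands.
PROVIDER_KEYS = {"openai": "OPENAI_API_KEY", "openrouter": "OPENROUTER_API_KEY"}


def missing_key(provider: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the unset variable for a remote provider, else None."""
    variable = PROVIDER_KEYS.get(provider)
    if variable is None:
        return None  # local providers (Ollama, LM Studio) need no API key
    return None if environ.get(variable) else variable


if __name__ == "__main__":
    for provider in PROVIDER_KEYS:
        name = missing_key(provider)
        print(provider, "ok" if name is None else f"missing {name}")
```

If a key is reported as missing right after running the export command, open a new shell or run `source ~/.bashrc` again so the variable is loaded.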

With the default Archive Agent Settings, these OpenRouter models are used:

Task	Default Model	Input/Output Cost
Chunk	`google/gemini-2.5-flash-lite`	$0.10 / $0.40 per M tokens
Rerank	`google/gemini-2.5-flash-lite`	$0.10 / $0.40 per M tokens
Query	`google/gemini-2.5-flash`	$0.30 / $2.50 per M tokens
Vision	`google/gemini-2.5-flash`	$0.30 / $2.50 per M tokens
Embed	`openai/text-embedding-3-large`	$0.13 per M tokens

💡 Good to know: You can customize the models in the Archive Agent settings. OpenRouter supports structured outputs, embeddings, and vision across many models. Browse all available models at openrouter.ai/models.
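The prices in the table above can be turned into a rough cost estimate. This is a back-of-the-envelope sketch: the prices are copied from the table and may change, the token counts in the example are made up, and the embedding row is treated as input-only because the table lists a single price for it.

```python
# USD per million tokens (input, output), copied from the table above.
PRICES = {
    "chunk": (0.10, 0.40),
    "rerank": (0.10, 0.40),
    "query": (0.30, 2.50),
    "vision": (0.30, 2.50),
    "embed": (0.13, 0.0),  # the table lists one price; assumed to apply to input
}


def estimate_cost(task: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Estimated cost in USD for one task at the listed default-model prices."""
    price_in, price_out = PRICES[task]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: chunk 2M tokens (0.5M output), then embed 2M tokens.
    print(f"chunk: ${estimate_cost('chunk', 2_000_000, 500_000):.2f}")  # $0.40
    print(f"embed: ${estimate_cost('embed', 2_000_000):.2f}")          # $0.26
```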

Ollama provider setup

If the Ollama provider is selected, Archive Agent requires Ollama running at http://localhost:11434.

How to install Ollama.

With the default Archive Agent Settings, these Ollama models are expected to be installed:

ollama pull llama3.1:8b             # for chunk/rerank/query
ollama pull llava:7b-v1.6           # for vision
ollama pull nomic-embed-text:v1.5   # for embed

💡 Good to know: Ollama also works without a GPU. At least 32 GiB RAM is recommended for smooth performance.
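To check whether the expected models are already installed, you can compare them with the model list that Ollama serves at `/api/tags`. The sketch below assumes the names reported by Ollama match the tags used in the pull commands exactly; it is a convenience check, not part of Archive Agent.

```python
import json
from urllib.request import urlopen

# Model tags from the pull commands above.
REQUIRED = ["llama3.1:8b", "llava:7b-v1.6", "nomic-embed-text:v1.5"]


def missing_models(tags: dict, required=REQUIRED) -> list:
    """Compare the required model tags with an /api/tags response body."""
    installed = {model["name"] for model in tags.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of locally installed models from a running Ollama server."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    try:
        print("missing:", missing_models(fetch_tags()))
    except OSError as error:  # server not running or not reachable
        print("Ollama not reachable:", error)
```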

LM Studio provider setup

If the LM Studio provider is selected, Archive Agent requires LM Studio running at http://localhost:1234.

How to install LM Studio.

With the default Archive Agent Settings, these LM Studio models are expected to be installed:

meta-llama-3.1-8b-instruct              # for chunk/rerank/query
llava-v1.5-7b                           # for vision
text-embedding-nomic-embed-text-v1.5    # for embed

💡 Good to know: LM Studio also works without a GPU. At least 32 GiB RAM is recommended for smooth performance.

---

Quickstart on the command line (CLI)

For example, to track your documents and images, run this:

archive-agent include "~/Documents/**" "~/Images/**"
archive-agent update

To start the GUI, run this:

archive-agent

Or, to ask questions from the command line:

archive-agent query "Which files mention donuts?"
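If you want to script these commands from Python, a thin wrapper around `subprocess` is enough. The sketch below only uses the `include`, `update`, and `query` commands shown above; the helper names are hypothetical, and the patterns are passed literally (as in the quoted shell form), leaving `~` expansion to Archive Agent.

```python
import shlex
import subprocess


def archive_agent_argv(command: str, *args: str) -> list:
    """Build the argument vector for one archive-agent CLI call."""
    return ["archive-agent", command, *args]


def run(argv: list) -> str:
    """Run the CLI and return its standard output; raises if the command fails."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


include = archive_agent_argv("include", "~/Documents/**", "~/Images/**")
update = archive_agent_argv("update")
query = archive_agent_argv("query", "Which files mention donuts?")

# Print the equivalent shell command lines without executing anything.
for argv in (include, update, query):
    print(shlex.join(argv))
```

Call `run(query)` only on a machine where Archive Agent is installed and a provider is configured.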

---

Developer's guide

Archive Agent was written from scratch for educational purposes (on either end of the software).

💡 Good to know: Tracking the test_data/ folder gives you some test data to start with.

Open current profile config in nano

To open the current profile's config (JSON) in the nano editor, run this:

archive-agent config

See Archive Agent settings for details.

Archive Agent settings

Archive Agent settings are organized as profile folders in ~/.archive-agent-settings/.

E.g., the default profile is located in ~/.archive-agent-settings/default/.

The currently used profile is stored in ~/.archive-agent-settings/profile.json.

📌 Note: To delete a profile, simply delete the profile folder. This will not delete the Qdrant collection (see Qdrant database).

Profile configuration

The profile configuration is contained in the profile folder as config.json.
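The folder layout described above can be inspected with a short script. This is a sketch under the assumption that every profile folder holds a `config.json`; it does not read `profile.json`, whose structure is not described here, and the helper names are hypothetical.

```python
import json
from pathlib import Path

SETTINGS_ROOT = Path.home() / ".archive-agent-settings"


def list_profiles(root: Path = SETTINGS_ROOT) -> list:
    """Return the names of profile folders that contain a config.json."""
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if (p / "config.json").is_file())


def load_profile_config(name: str, root: Path = SETTINGS_ROOT) -> dict:
    """Load the config.json of one profile as a dictionary."""
    return json.loads((root / name / "config.json").read_text(encoding="utf-8"))


if __name__ == "__main__":
    for profile in list_profiles():
        config = load_profile_config(profile)
        print(profile, config.get("ai_provider"), config.get("qdrant_collection"))
```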

💡 Good to know: Use the config CLI command to open the current profile's config (JSON) in the nano editor (see Open current profile config in nano).

💡 Good to know: Use the switch CLI command to switch to a new or existing profile (see Create or switch profile).

Key	Description
`config_version`	Config version
`mcp_server_host`	MCP server host (default `http://127.0.0.1`; set to `http://0.0.0.0` to expose in LAN)
`mcp_server_port`	MCP server port (default `8008`)
`ocr_strategy`	OCR strategy in [`DecoderSettings.py`](archive_agent/config/DecoderSettings.py)
`ocr_auto_threshold`	Minimum number of characters for `auto` OCR strategy to resolve to `relaxed` instead of `strict`
`image_ocr`	Image handling: `true` enables OCR, `false` disables it.
`image_entity_extract`	Image handling: `true` enables entity extraction, `false` disables it.
`chunk_lines_block`	Number of lines per block for chunking
`chunk_words_target`	Target number of words per chunk
`qdrant_server_url`	URL of the Qdrant server
`qdrant_collection`	Name of the Qdrant collection
`retrieve_score_min`	Minimum similarity score of retrieved chunks (`0`...`1`)
`retrieve_chunks_max`	Maximum number of retrieved chunks
`retrieve_knee_enable`	Adaptive cutoff for retrieval (`true` enables knee-based cutoff, `false` disables it)
`retrieve_knee_sensitivity`	Knee detection sensitivity (Kneedle `S` parameter; higher = more conservative)
`retrieve_knee_min_chunks`	Minimum number of chunks to keep when adaptive cutoff is applied
`rerank_chunks_max`	Number of top chunks to keep after reranking
`expand_chunks_radius`	Number of preceding and following chunks to prepend and append to each reranked chunk
`max_workers_ingest`	Maximum number of files to process in parallel, creating one thread for each file
`max_workers_vision`	Maximum number of parallel vision requests per file, creating one thread per request
`max_workers_embed`	Maximum number of parallel embedding requests per file, creating one thread per request
`ai_provider`	AI provider in [`ai_provider_registry.py`](archive_agent/ai_provider/ai_provider_registry.py)
`ai_server_url`	AI server URL
`ai_model_chunk`	AI model used for chunking
`ai_model_embed`	AI model used for embedding
`ai_model_rerank`	AI model used for reranking
`ai_model_query`	AI model used for queries
`ai_model_vision`	AI model used for vision (`""` disables vision)
`ai_vector_size`	Vector size of embeddings (used for Qdrant collection)
`ai_temperature_query`	Temperature of the query model (ignored for GPT-5)

📌 Note: When using GPT-5 (default as of Archive Agent v14.0.0), ai_temperature_query is ignored. GPT-5 reasoning effort and verbosity are currently not available in the configuration, but may be customized directly inside OpenAiProvider.py.

📌 Note: Since max_workers_vision and max_workers_embed requests are processed in parallel per file, and max_workers_ingest files are processed in parallel, the total number of requests multiplies quickly. Adjust according to your system resources and in alignment with your AI provider's rate limits.
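The multiplication mentioned in the note can be made explicit. The sketch below computes an upper bound on simultaneous requests, assuming every file being ingested saturates its per-file worker pools at the same time; the worker counts in the example are hypothetical, not Archive Agent defaults.

```python
def peak_requests(max_workers_ingest: int, max_workers_vision: int, max_workers_embed: int) -> dict:
    """Upper bound on simultaneous requests per request type across all files."""
    return {
        "vision": max_workers_ingest * max_workers_vision,
        "embed": max_workers_ingest * max_workers_embed,
    }


def max_ingest_for_limit(per_file_workers: int, provider_limit: int) -> int:
    """Largest max_workers_ingest that keeps one request type within a provider limit."""
    return max(1, provider_limit // per_file_workers)


if __name__ == "__main__":
    print(peak_requests(4, 4, 8))        # {'vision': 16, 'embed': 32}
    print(max_ingest_for_limit(8, 20))   # 2
```

Real load is usually lower than this bound, because not every file contains images or needs all workers at once.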

How chunk references work

To ensure that every chunk can be traced back to its origin, Archive Agent maps the text contents of each chunk to the corresponding line numbers or page numbers of the source file.

Line-based files (e.g., .txt) use the range of line numbers as reference.
Page-based files (e.g., .pdf) use the range of page numbers as reference.

📌 Note: References are only approximate due to paragraph/sentence splitting/joining in the chunking process.
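A chunk reference of this kind can be modeled as a small record holding the source file and a line or page range. The class and helper below are hypothetical illustrations of the idea, not names from the Archive Agent code base, and the extension check covers only the two examples given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRef:
    file_path: str
    unit: str    # "line" for line-based files, "page" for page-based files
    first: int   # first line or page covered by the chunk
    last: int    # last line or page covered by the chunk

    def label(self) -> str:
        """Human-readable reference such as 'notes.txt (lines 10-14)'."""
        if self.first == self.last:
            return f"{self.file_path} ({self.unit} {self.first})"
        return f"{self.file_path} ({self.unit}s {self.first}-{self.last})"


def unit_for(file_path: str) -> str:
    """Pick the reference unit; only .pdf is treated as page-based in this sketch."""
    return "page" if file_path.lower().endswith(".pdf") else "line"


if __name__ == "__main__":
    print(ChunkRef("notes.txt", unit_for("notes.txt"), 10, 14).label())
    print(ChunkRef("paper.pdf", unit_for("paper.pdf"), 3, 3).label())
```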

---

CLI command reference

Important modules

To get started, check out these epic modules:

Files are processed in archive_agent/data/FileData.py
The app context is initialized in archive_agent/core/ContextManager.py
The default config is defined in archive_agent/config/ConfigManager.py
The CLI commands are defined in archive_agent/__main__.py
The commit logic is implemented in archive_agent/core/CommitManager.py
The CLI verbosity is handled in archive_agent/core/CliManager.py
The GUI is implemented in archive_agent/core/GuiManager.py
The AI API prompts for chunking, embedding, vision, and querying are defined in archive_agent/ai/AiManager.py
The AI provider registry is located in archive_agent/ai_provider/ai_provider_registry.py

If you miss something or spot bad patterns, feel free to contribute and refactor!

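📚 Practical guide (long-tail questions)

Who it is for

AI engineers who need Claude or Cursor to operate local tools
Agent developers building multi-agent systems
Teams building enterprise knowledge bases or RAG retrieval applications
Document automation scenarios that need text extracted from images and PDFs

Best practices

When configuring an MCP server, prefer stdio transport with JSON-RPC and avoid exposing the server to the public internet
For production deployments, use Docker Compose to isolate dependencies and mount a volume to persist data
For local deployments, prefer GGUF quantized models to save GPU memory and keep responses fast
Aim for chunk sizes of 256-512 tokens; pgvector and Qdrant are good vector store choices (Archive Agent uses Qdrant)
For agent tasks, do a dry run to validate the tool-call chain before enabling autonomous execution

Common mistakes

Committing an API key to a git repository (use a .env file and add it to .gitignore)
Mistyping the MCP config path or lacking file permissions; changes take effect only after restarting Claude Desktop
Trying to reach the host's localhost from inside a container; use host.docker.internal instead
Using a different embedding model at query time than at indexing time, which breaks retrieval
Running out of GPU memory (OOM); reduce the context size or switch to a smaller quantized model first
Python dependency conflicts; isolate the environment with venv or uv

Deployment options

Docker: the README lists Docker as a requirement for running the Qdrant server; check the README before assuming that an official Archive-Agent image or a docker compose setup exists
CLI: install as described in the README, then call the archive-agent command from the command line
Local deployment: for the local providers (Ollama, LM Studio), the README recommends at least 32 GiB RAM; both also work without a GPU
Cloud hosting: PaaS platforms such as Vercel, Railway, or Fly.io are generic options; the README excerpt on this page does not cover cloud deployment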


👥 Intended users

Claude Desktop / Claude Code users
AI tool developers
Professionals who need to extend AI capabilities
Automation engineers


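⚖️ Pros and cons

✅ Pros

+ Licensed under GPL-3.0: free to use, including commercially, provided derivative works stay open source under the same license
+ Standardized MCP protocol with good interoperability across the ecosystem
+ Integrates with the Claude client ecosystem
+ Little configuration is needed once the requirements are installed

⚠️ Cons

− The MCP integration requires an MCP-capable client such as Claude Desktop; the CLI and GUI described in the README can be used without one
− MCP is still evolving, so interfaces may change
− Requires some configuration steps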

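⚠️ Usage notice

AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.

Validate the tool thoroughly in a sandbox or test environment before deploying it to production, and carry out the necessary security review.

📄 License

⚠️ GPL 3.0: strong copyleft. Derivative works must be open source, the license includes patent protection terms, and derivative works cannot be distributed as closed source.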


❓ FAQ

What is Archive-Agent?

Archive-Agent is an open-source MCP tool written in Python. Its repository description reads: "Find your files with natural language and ask questions." Its main use case is finding files quickly.


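💡 AI Skill Hub review

Overall, Archive Agent is a good-quality MCP tool that is competitive among comparable tools. AI Skill Hub will keep tracking its updates. Consider bookmarking it and adopting it when it fits your own use case.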

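⬇️ Download

⚠️ This tool is licensed under GPL-3.0. You may download and use it freely, but derivative works must be released as open source under the same license and cannot be distributed as closed-source commercial products. Confirm that your use complies with the license.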


🌐 Original information

Original name	`Archive-Agent`
Original description	Open-source MCP tool: Find your files with natural language and ask questions. ⭐59 · Python
Topics	`lm-studio`, `mcp`, `ocr`, `ollama`
GitHub	https://github.com/shredEngineer/Archive-Agent
License	GPL-3.0
Language	Python

🔗 Original source

🐙 GitHub repository: https://github.com/shredEngineer/Archive-Agent
🌐 Official website: https://wilhelm.dev

Listed: 2026-05-27 · Updated: 2026-05-30 · License: GPL-3.0 · AI Skill Hub does not legally vouch for the accuracy of third-party content.
