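AI Skill Hub recommendation: Archive Agent is a solid MCP tool. Its AI composite score is 7.5, a steady result among comparable tools. If you are looking for a reliable MCP tool, it is worth a closer look.
Archive Agent is an AI tool extension that follows the MCP (Model Context Protocol) standard. Through MCP, AI clients such as Claude and Cursor can access and operate external tools, data sources, and services directly. File operations, database queries, and API calls can all be triggered from natural language inside an AI conversation.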
# Option 1: one-step install via the Claude Code CLI
claude skill install https://github.com/shredEngineer/Archive-Agent
# Option 2: manually edit claude_desktop_config.json
{
  "mcpServers": {
    "----": {
      "command": "npx",
      "args": ["-y", "archive-agent"]
    }
  }
}
# Config file locations
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
# After installing, use the tool directly in a Claude conversation
# Example:
# User: Please use Archive Agent to carry out the following task...
# Claude: [automatically calls the Archive Agent MCP tool to handle the request]
# List the available tools
# In Claude, type: "List all available MCP tools"
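// Example claude_desktop_config.json
// JSON does not allow comments, so keep notes like these out of the real file.
// If the server needs credentials, add entries such as "API_KEY": "your-api-key-here" to "env".
{
  "mcpServers": {
    "____": {
      "command": "npx",
      "args": ["-y", "archive-agent"],
      "env": {}
    }
  }
}
// Restart Claude Desktop after saving for the change to take effect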
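Editing the config by hand is easy to get wrong when the file already lists other servers. The following is a minimal sketch, not part of Archive Agent or Claude Desktop, that merges one entry into the "mcpServers" object while leaving existing entries untouched. The function name `add_mcp_server` and the server name used in the usage note are hypothetical; the listing above leaves the server name blank. It assumes the existing file, if any, is plain JSON without comments.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Merge one server entry into the "mcpServers" object of a JSON config file.

    Hypothetical helper: existing servers are kept, and an entry with the
    same name is overwritten. Returns the configuration that was written.
    """
    config = {}
    if config_path.exists():
        # Fails with json.JSONDecodeError if the file contains comments.
        config = json.loads(config_path.read_text(encoding="utf-8"))

    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}

    # Create the parent folder if this is the first time the file is written.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

A call could look like `add_mcp_server(Path("claude_desktop_config.json"), "example-server", "npx", ["-y", "archive-agent"])`, with the path replaced by the platform-specific location listed earlier.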
---
Please install these requirements before proceeding:
Archive Agent lets you choose between different AI providers:
💡 Good to know: You will be prompted to choose an AI provider at startup; see: Run Archive Agent.
📌 Note: You can customize the specific models used by the AI provider in the Archive Agent settings. However, you cannot change the AI provider of an existing profile, as the embeddings will be incompatible; to choose a different AI provider, create a new profile instead.
If the OpenAI provider is selected, Archive Agent requires the OpenAI API key.
To export your OpenAI API key, replace sk-... with your actual key and run this once:
echo "export OPENAI_API_KEY='sk-...'" >> ~/.bashrc && source ~/.bashrc
This will persist the export for the current user.
💡 Good to know: OpenAI won't use your data for training.
If the OpenRouter provider is selected, Archive Agent requires an OpenRouter API key.
OpenRouter provides a unified API to access 400+ models from many providers (OpenAI, Google, Anthropic, Meta, and more) through a single endpoint.
To export your OpenRouter API key, replace sk-or-... with your actual key and run this once:
echo "export OPENROUTER_API_KEY='sk-or-...'" >> ~/.bashrc && source ~/.bashrc
This will persist the export for the current user.
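Before starting a long run, it can help to confirm that the exported key is visible to the process. The sketch below only checks whether the environment variable named in the text is set. The provider labels `"openai"` and `"openrouter"` and the function name `missing_api_key` are assumptions made for illustration, not identifiers taken from Archive Agent.

```python
import os
from typing import Mapping, Optional

# Variable names as given in the text. The provider labels used as keys
# are hypothetical and only serve this example.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}


def missing_api_key(provider: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset key variable, or None if nothing is missing."""
    env = os.environ if env is None else env
    var = PROVIDER_KEY_VARS.get(provider)
    if var is None:
        return None  # no key variable is listed in the text for this provider
    # An empty string counts as missing.
    return None if env.get(var) else var
```

For example, `missing_api_key("openrouter")` returns `"OPENROUTER_API_KEY"` when the variable has not been exported in the current shell.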
With the default Archive Agent Settings, these OpenRouter models are used:
| Task | Default Model | Input/Output Cost |
|---|---|---|
| Chunk | google/gemini-2.5-flash-lite | $0.10 / $0.40 per M tokens |
| Rerank | google/gemini-2.5-flash-lite | $0.10 / $0.40 per M tokens |
| Query | google/gemini-2.5-flash | $0.30 / $2.50 per M tokens |
| Vision | google/gemini-2.5-flash | $0.30 / $2.50 per M tokens |
| Embed | openai/text-embedding-3-large | $0.13 per M tokens |
💡 Good to know: You can customize the models in the Archive Agent settings. OpenRouter supports structured outputs, embeddings, and vision across many models. Browse all available models at openrouter.ai/models.
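To get a rough feel for what the default models cost, the per-million-token prices from the table can be turned into a small estimator. This is only a sketch: the prices are the ones quoted above and may change, the token counts in the usage note are made-up inputs, and `estimate_cost_usd` is a hypothetical helper rather than part of Archive Agent.

```python
# (input, output) prices in USD per million tokens, copied from the table above.
# Embeddings have no output price, so 0.0 is used there.
PRICES_PER_M_TOKENS = {
    "chunk": (0.10, 0.40),
    "rerank": (0.10, 0.40),
    "query": (0.30, 2.50),
    "vision": (0.30, 2.50),
    "embed": (0.13, 0.0),
}


def estimate_cost_usd(task: str, input_tokens: int, output_tokens: int = 0) -> float:
    """Estimate the cost of one task from its input and output token counts."""
    price_in, price_out = PRICES_PER_M_TOKENS[task]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input and 0.5M output tokens for chunking.
    print(f"{estimate_cost_usd('chunk', 2_000_000, 500_000):.2f}")  # prints 0.40
```

With these prices, output tokens dominate the cost of the query and vision models, so long answers cost noticeably more than long prompts.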
If the Ollama provider is selected, Archive Agent requires Ollama running at http://localhost:11434.
With the default Archive Agent Settings, these Ollama models are expected to be installed:
ollama pull llama3.1:8b # for chunk/rerank/query
ollama pull llava:7b-v1.6 # for vision
ollama pull nomic-embed-text:v1.5 # for embed
💡 Good to know: Ollama also works without a GPU. At least 32 GiB RAM is recommended for smooth performance.
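Whether the expected models are already pulled can be checked against Ollama's `/api/tags` endpoint, which lists locally installed models. The sketch below is an illustration under assumptions: it compares names by exact string match against the three default models named above, and the helper names `missing_models` and `fetch_installed` are hypothetical.

```python
import json
from urllib.request import urlopen

# Default models from the text above.
REQUIRED_MODELS = ["llama3.1:8b", "llava:7b-v1.6", "nomic-embed-text:v1.5"]


def missing_models(tags_payload: dict, required=REQUIRED_MODELS) -> list:
    """Return the required model names absent from an /api/tags response."""
    installed = {m.get("name", "") for m in tags_payload.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_installed(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> dict:
    """Query a running Ollama server for its locally installed models."""
    with urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
        return json.load(response)
```

With Ollama running, `missing_models(fetch_installed())` returns the models that still need an `ollama pull`; an empty list means all three defaults are present.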
If the LM Studio provider is selected, Archive Agent requires LM Studio running at http://localhost:1234.
With the default Archive Agent Settings, these LM Studio models are expected to be installed:
meta-llama-3.1-8b-instruct # for chunk/rerank/query
llava-v1.5-7b # for vision
text-embedding-nomic-embed-text-v1.5 # for embed
💡 Good to know: LM Studio also works without a GPU. At least 32 GiB RAM is recommended for smooth performance.
---
For example, to track your documents and images, run this:
archive-agent include "~/Documents/**" "~/Images/**"
archive-agent update
To start the GUI, run this:
archive-agent
Or, to ask questions from the command line:
archive-agent query "Which files mention donuts?"
---
Archive Agent was written from scratch for educational purposes (on either end of the software).
💡 Good to know: Tracking the test_data/ folder gives you some test data to get started with.
To open the current profile's config (JSON) in the nano editor, run this:
archive-agent config
See Archive Agent settings for details.
Archive Agent settings are organized as profile folders in ~/.archive-agent-settings/.
E.g., the default profile is located in ~/.archive-agent-settings/default/.
The currently used profile is stored in ~/.archive-agent-settings/profile.json.
📌 Note: To delete a profile, simply delete the profile folder. This will not delete the Qdrant collection (see Qdrant database).
The profile configuration is contained in the profile folder as config.json.
💡 Good to know: Use the config CLI command to open the current profile's config (JSON) in the nano editor (see Open current profile config in nano).
💡 Good to know: Use the switch CLI command to switch to a new or existing profile (see Create or switch profile).
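The folder layout described above can also be inspected from a script. The following sketch relies only on what the text states: profiles are folders under ~/.archive-agent-settings/, and each contains a config.json. It deliberately does not read profile.json, because its structure is not described here. The helper names are hypothetical.

```python
import json
from pathlib import Path

# Settings root as described in the text.
SETTINGS_ROOT = Path.home() / ".archive-agent-settings"


def list_profiles(root: Path = SETTINGS_ROOT) -> list:
    """Return the names of all sub-folders that contain a config.json."""
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if (p / "config.json").is_file())


def load_profile_config(profile: str = "default", root: Path = SETTINGS_ROOT) -> dict:
    """Load the config.json of one profile as a dictionary."""
    return json.loads((root / profile / "config.json").read_text(encoding="utf-8"))
```

For instance, `load_profile_config()["qdrant_collection"]` would give the collection name of the default profile, using one of the keys from the table that follows.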
| Key | Description |
|---|---|
| config_version | Config version |
| mcp_server_host | MCP server host (default http://127.0.0.1; set to http://0.0.0.0 to expose on the LAN) |
| mcp_server_port | MCP server port (default 8008) |
| ocr_strategy | OCR strategy in [DecoderSettings.py](archive_agent/config/DecoderSettings.py) |
| ocr_auto_threshold | Minimum number of characters for auto OCR strategy to resolve to relaxed instead of strict |
| image_ocr | Image handling: true enables OCR, false disables it. |
| image_entity_extract | Image handling: true enables entity extraction, false disables it. |
| chunk_lines_block | Number of lines per block for chunking |
| chunk_words_target | Target number of words per chunk |
| qdrant_server_url | URL of the Qdrant server |
| qdrant_collection | Name of the Qdrant collection |
| retrieve_score_min | Minimum similarity score of retrieved chunks (0...1) |
| retrieve_chunks_max | Maximum number of retrieved chunks |
| retrieve_knee_enable | Adaptive cutoff for retrieval (true enables knee-based cutoff, false disables it) |
| retrieve_knee_sensitivity | Knee detection sensitivity (Kneedle S parameter; higher = more conservative) |
| retrieve_knee_min_chunks | Minimum number of chunks to keep when adaptive cutoff is applied |
| rerank_chunks_max | Number of top chunks to keep after reranking |
| expand_chunks_radius | Number of preceding and following chunks to prepend and append to each reranked chunk |
| max_workers_ingest | Maximum number of files to process in parallel, creating one thread for each file |
| max_workers_vision | Maximum number of parallel vision requests **per file**, creating one thread per request |
| max_workers_embed | Maximum number of parallel embedding requests **per file**, creating one thread per request |
| ai_provider | AI provider in [ai_provider_registry.py](archive_agent/ai_provider/ai_provider_registry.py) |
| ai_server_url | AI server URL |
| ai_model_chunk | AI model used for chunking |
| ai_model_embed | AI model used for embedding |
| ai_model_rerank | AI model used for reranking |
| ai_model_query | AI model used for queries |
| ai_model_vision | AI model used for vision ("" disables vision) |
| ai_vector_size | Vector size of embeddings (used for Qdrant collection) |
| ai_temperature_query | Temperature of the query model (ignored for GPT-5) |
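To illustrate how the retrieval keys might interact, here is a simplified sketch of a score floor, a chunk cap, and neighbor expansion. It is not Archive Agent's implementation: the knee-based cutoff and the reranking step are left out, and the function names are hypothetical. Only the parameter names `retrieve_score_min`, `retrieve_chunks_max`, and the idea behind `expand_chunks_radius` come from the table.

```python
def select_chunks(scores, retrieve_score_min, retrieve_chunks_max):
    """Return chunk indices, best first, that pass the score floor, capped at the maximum."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = [i for i in ranked if scores[i] >= retrieve_score_min]
    return kept[:retrieve_chunks_max]


def expand_indices(index, radius, total):
    """Return a chunk index plus up to `radius` neighbors on each side, clipped to the file."""
    return list(range(max(0, index - radius), min(total, index + radius + 1)))


if __name__ == "__main__":
    scores = [0.9, 0.2, 0.75, 0.6]       # made-up similarity scores
    print(select_chunks(scores, 0.5, 2))  # prints [0, 2]
    print(expand_indices(2, 1, 4))        # prints [1, 2, 3]
```

Raising the score floor trims weak matches before the cap applies, while a larger radius trades a longer context for more surrounding text per hit.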
📌 Note: When using GPT-5 (default as of Archive Agent v14.0.0), ai_temperature_query is ignored. GPT-5 reasoning effort and verbosity are currently not available in the configuration, but may be customized directly inside OpenAiProvider.py.
📌 Note: Since max_workers_vision and max_workers_embed requests are processed in parallel per file, and max_workers_ingest files are processed in parallel, the total number of requests multiplies quickly. Adjust according to your system resources and in alignment with your AI provider's rate limits.
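The multiplication mentioned in the note can be made concrete with a small calculation. This sketch computes upper bounds per request type; whether vision and embedding requests actually overlap in time is not stated here, so the two figures are reported separately. The function name and the example values are hypothetical.

```python
def peak_parallel_requests(max_workers_ingest, max_workers_vision, max_workers_embed):
    """Upper bounds on simultaneous requests: files in parallel times requests per file."""
    return {
        "vision": max_workers_ingest * max_workers_vision,
        "embed": max_workers_ingest * max_workers_embed,
    }


if __name__ == "__main__":
    # Made-up settings: 4 files in parallel, 4 vision and 8 embedding requests per file.
    print(peak_parallel_requests(4, 4, 8))  # prints {'vision': 16, 'embed': 32}
```

Comparing these bounds with the rate limits of the chosen AI provider gives a first indication of which setting to lower.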
To ensure that every chunk can be traced back to its origin, Archive Agent maps the text contents of each chunk to the corresponding line numbers or page numbers of the source file.
- Line-based files (e.g., .txt) use the range of line numbers as reference.
- Page-based files (e.g., .pdf) use the range of page numbers as reference.

📌 Note: References are only approximate due to paragraph/sentence splitting/joining in the chunking process.
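The idea of carrying a reference range along with each chunk can be sketched with a small data structure. This example uses fixed-size line blocks purely as a stand-in; Archive Agent's chunking is AI-assisted and works differently, which is also why its references are only approximate. All names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    first_line: int  # 1-based, inclusive
    last_line: int   # 1-based, inclusive


def chunk_by_lines(lines, lines_per_chunk):
    """Split lines into fixed-size chunks and record each chunk's line range."""
    chunks = []
    for start in range(0, len(lines), lines_per_chunk):
        block = lines[start:start + lines_per_chunk]
        chunks.append(Chunk("\n".join(block), start + 1, start + len(block)))
    return chunks


if __name__ == "__main__":
    for chunk in chunk_by_lines(["a", "b", "c", "d", "e"], 2):
        print(chunk.first_line, chunk.last_line)  # prints 1 2, then 3 4, then 5 5
```

For page-based files the same structure applies, with page numbers in place of line numbers.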
---
To get started, check out these epic modules:
- archive_agent/data/FileData.py
- archive_agent/core/ContextManager.py
- archive_agent/config/ConfigManager.py
- archive_agent/__main__.py
- archive_agent/core/CommitManager.py
- archive_agent/core/CliManager.py
- archive_agent/core/GuiManager.py
- archive_agent/ai/AiManager.py
- archive_agent/ai_provider/ai_provider_registry.py

If you miss something or spot bad patterns, feel free to contribute and refactor!
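AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute a legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and carrying out any necessary security assessment.
⚠️ GPL 3.0: strong copyleft. Derivative works that are distributed must be released as open source under the same license. The license includes patent protection terms and does not permit closed-source redistribution.
Overall, Archive Agent is a good-quality MCP tool that holds its own among comparable tools. AI Skill Hub will keep tracking its updates. Consider bookmarking it and adopting it when it fits your own use case.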
| Field | Value |
|---|---|
| Original name | Archive-Agent |
| Original description | Open-source MCP tool: Find your files with natural language and ask questions. ⭐59 · Python |
| Topics | lm-studio, mcp, ocr, ollama |
| GitHub | https://github.com/shredEngineer/Archive-Agent |
| License | GPL-3.0 |
| Language | Python |

Indexed: 2026-05-27 · Updated: 2026-05-30 · License: GPL-3.0 · AI Skill Hub does not legally vouch for the accuracy of third-party content.