主权AI系统 是 AI Skill Hub 本期精选MCP工具之一。综合评分 7.5 分,整体质量较高。我们推荐使用将其纳入你的 AI 工具库,帮助提升工作效率。
主权AI系统 是一款遵循 MCP(Model Context Protocol)标准协议的 AI 工具扩展。通过 MCP 协议,它可以让 Claude、Cursor 等主流 AI 客户端直接访问和操作外部工具、数据源和服务,实现 AI 能力的无缝扩展。无论是文件操作、数据库查询还是 API 调用,都可以通过自然语言在 AI 对话中直接触发,极大提升生产效率。
主权AI系统 是一款遵循 MCP(Model Context Protocol)标准协议的 AI 工具扩展。通过 MCP 协议,它可以让 Claude、Cursor 等主流 AI 客户端直接访问和操作外部工具、数据源和服务,实现 AI 能力的无缝扩展。无论是文件操作、数据库查询还是 API 调用,都可以通过自然语言在 AI 对话中直接触发,极大提升生产效率。
# 方式一:通过 Claude Code CLI 一键安装
claude skill install https://github.com/h3rb3rn/moe-sovereign
# 方式二:手动配置 claude_desktop_config.json
{
"mcpServers": {
"--ai--": {
"command": "npx",
"args": ["-y", "moe-sovereign"]
}
}
}
# 配置文件位置
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
# 安装后在 Claude 对话中直接使用 # 示例: 用户: 请帮我用 主权AI系统 执行以下任务... Claude: [自动调用 主权AI系统 MCP 工具处理请求] # 查看可用工具列表 # 在 Claude 中输入:"列出所有可用的 MCP 工具"
// claude_desktop_config.json 配置示例
{
"mcpServers": {
"__ai__": {
"command": "npx",
"args": ["-y", "moe-sovereign"],
"env": {
// "API_KEY": "your-api-key-here"
}
}
}
}
// 保存后重启 Claude Desktop 生效
| Resource | Minimum (solo) | Recommended (team) |
|---|---|---|
| OS | Debian 11+ / Ubuntu 22.04+ | Debian 13 (trixie) |
| RAM | 8 GB | 16 GB+ |
| CPU | 4 cores | 8 cores+ |
| Disk | 60 GB | 200 GB+ |
| GPU | None (API-only mode) | NVIDIA with CUDA, ≥ 8 GB VRAM |
| Docker | CE 24+ | Docker CE 27+ |
**+ Enterprise Stack (moe-codex)** | **+ 4 cores, + 8 GB RAM** | **+ 8 cores, + 16 GB RAM** |
The orchestrator runs on CPU. GPU VRAM is only needed on inference nodes (Ollama). The moe-codex Enterprise Data Stack (NiFi 4 GB, lakeFS 512 MB, Marquez 1.5 GB + 2× Postgres) adds significant overhead — plan a dedicated host or at least 8 GB additional RAM.
---
curl -sSL https://moe-sovereign.org/install.sh | bash
git clone https://github.com/h3rb3rn/moe-sovereign.git
cd moe-sovereign
cp .env.example .env
nano .env # Set credentials and inference server URLs
sudo docker compose up -d
curl http://localhost:8002/v1/models
| Endpoint | URL |
|---|---|
| **API** (OpenAI-compatible) | http://<host>:8002/v1 |
| **API** (Anthropic/Claude Code) | http://<host>:8002/v1/messages |
| **Admin UI** | http://<host>:8088 |
---
| Target | Status | Profile | Command |
|---|---|---|---|
| Docker Compose | **Tested** | team | docker compose up -d |
| LXC / Proxmox | **Tested** | solo | deploy/lxc/setup.sh |
| Podman (rootless) | **Tested** | team | curl -sSL https://raw.githubusercontent.com/h3rb3rn/moe-sovereign/main/install.sh \| bash |
| K3s / Kubernetes | Planned | enterprise | helm install moe charts/moe-sovereign |
| OpenShift | Untested | enterprise | helm install with openshift.enabled=true |
All targets use the same OCI image --- no code forks, no feature loss.
---
| Stage | Description |
|---|---|
| **1. Cache** | L0 query-hash (Valkey, 30 min TTL), L1 semantic similarity (ChromaDB, cosine < 0.15), and a conservative **knowledge-bypass** tier: similar-but-not-exact queries skip the LLM when the prior answer was high-confidence and still fresh (cosine < 0.25, confidence ≥ 0.85, within TTL) |
| **2. Planner** | Decomposes request into 1--4 subtasks with expert category assignment |
| **3. Experts** | T1 models (≤20B) screen with confidence gating; T2 (24--80B) engage only on low confidence |
| **4. Tools** | 28 MCP precision tools (math, subnet, date, legal, PPTX) via AST-whitelist --- zero hallucination |
| **5. GraphRAG** | Neo4j context enrichment with domain-scoped entity filters and trust-score decay. CAG layer intercepts static compliance domains (BAIT, VAIT, DORA, KRITIS) before the Neo4j query and injects pre-loaded authoritative text directly. Corrective RAG gate (Yan et al. 2024) scores each retrieved entity for query relevance and discards low-signal results before injection. Episode hints from past similar tasks are appended as routing context |
| **6. Judge** | Synthesises expert outputs, evaluates quality, retries on failure (up to 3 attempts) |
| **7. Agentic Re-Plan** | Lightweight gap detector checks completeness; if unresolved, injects findings into a new planner round (up to 3 agentic iterations) |
| **8. Ingest** | Validated knowledge flows back into Neo4j via Kafka for graph accumulation acceleration |
The orchestrator codebase is organised into focused packages. main.py is a thin entry point (~1 500 LOC) holding the FastAPI app, lifespan, middleware, and graph wiring. All domain logic lives in dedicated packages:
moe-infra/
├── main.py # FastAPI app, lifespan, middleware, graph wiring (~1 500 LOC)
├── config.py # All os.getenv() — typed config constants
├── state.py # Shared mutable globals (redis_client, _userdb_pool, …)
├── prompts.py # Static prompt text + routing detection regexes
├── metrics.py # Single Prometheus registry
├── parsing.py # Stateless parsers: JSON extraction, confidence, history truncation
├── context_budget.py # Per-model context-window estimation
│
├── routes/ # FastAPI APIRouters (one per concern)
│ ├── health.py # /health, /metrics
│ ├── watchdog.py # /api/watchdog/*, Starfleet feature toggles
│ ├── mission_context.py # /api/mission-context
│ ├── graph.py # /graph/*
│ ├── feedback.py # /v1/feedback, /v1/memory/ingest
│ ├── admin_*.py # Benchmark, ontology, stats admin endpoints
│ ├── models.py # /v1/models
│ ├── ollama_compat.py # /api/* (Ollama protocol)
│ └── anthropic_compat.py # /v1/messages, /v1/responses, /v1/chat/completions
│
├── services/ # Business logic — no FastAPI imports
│ ├── auth.py # OIDC + API key validation + budget enforcement
│ ├── tracking.py # Usage logging, request lifecycle, budget counters
│ ├── routing.py # Expert template + per-template prompt resolution
│ ├── templates.py # Expert template + Claude Code profile loading
│ ├── llm_instances.py # ChatOpenAI singletons (judge, planner, ingest, search)
│ ├── inference.py # Node selection, fallback chain, Thompson sampling
│ ├── helpers.py # Progress reports, semantic memory, self-evaluation
│ ├── skills.py # Server-side skill resolution + ADMIN_APPROVED hard-lock
│ ├── healer.py # Ontology gap-healer (one-shot + dedicated subprocess)
│ ├── kafka.py # Fire-and-forget Kafka publish helper
│ └── pipeline/ # OpenAI / Anthropic / Ollama / Responses API handlers
│ ├── chat.py # OpenAI chat completions
│ ├── anthropic.py # Anthropic Messages API + tool/MoE/reasoning handlers
│ ├── ollama.py # Ollama-protocol streaming wrappers
│ └── responses.py # OpenAI Responses API
│
├── graph/ # LangGraph node implementations
│ ├── router_nodes.py # cache_lookup, semantic_router, fuzzy_router, _route_cache
│ ├── tool_nodes.py # mcp_node, graph_rag_node, math_node_wrapper
│ ├── planner.py # planner_node + plan sanitization + topological levels
│ ├── expert.py # expert_worker (parallel expert execution)
│ ├── research.py # research_node + research_fallback + domain extraction
│ └── synthesis.py # merger_node, thinking_node, resolve_conflicts_node, critic_node
│
├── pipeline/
│ ├── __init__.py # LangGraph graph builder — assembles nodes into the pipeline DAG
│ └── state.py # AgentState TypedDict (67 fields across 3 categories)
│
├── web_search.py # SearXNG integration with domain-reliability scoring
├── math_node.py # SymPy-backed math node (solve, integrate, differentiate)
├── graph_rag/ # GraphRAG query, entity linking, ontology, corrections
├── federation/ # Push / pull federation client to MoE Libris hubs
├── mcp_server/ # 28 MCP precision tools (AST-whitelisted)
├── admin_ui/ # Admin backend: experts, users, budgets, cleanup manager
├── prompts/systemprompt/ # 15 expert system prompts (English, "Respond in German.")
├── tests/ # 195 unit + integration + smoke tests (all green)
└── benchmarks/ # Overnight benchmark suite, GAIA runner, result injection
The orchestrator started as an 11 190-line monolith in main.py. A 14-phase split (Q2 2026) decomposed it into the structure above without a single behavioural change — every phase ended with the full test suite green. See docs/ARCHITECTURE.md for the detailed module map.
---
This feature group requires the optionalmoe-codexenterprise stack (Apache NiFi, Marquez/OpenLineage, lakeFS). It is not part of themoe-sovereigncore and is deployed as a separate compose stack. See the moe-codex repository for setup instructions.
| Capability | Description | |
|---|---|---|
| **29** | OpenLineage Data Lineage (Marquez) | Five pipeline hook points (/v1/chat/completions, /v1/messages, /v1/responses, merger_node, kafka_ingest) emit OpenLineage 2.0.2 START/COMPLETE/FAIL events to a Marquez backend — fire-and-forget, no-op when MARQUEZ_URL is empty. Palantir Foundry-comparable lineage visibility for every MoE pipeline run |
| **30** | Enterprise Stack Dashboard | Admin UI /enterprise page surfaces NiFi, Marquez and lakeFS reachability with live latency probes, plus the most recent OpenLineage runs from Marquez. Auto-refreshes every 30 s; gracefully hides when INSTALL_ENTERPRISE_DATA_STACK=false |
| **31** | lakeFS Bundle Versioning | Every successful /graph/knowledge/import archives the JSON-LD bundle as a content-addressed commit on the moe-knowledge lakeFS repository — git-style audit log queryable via /api/enterprise/versioning/log, point-in-time bundle download via services.versioning.get_bundle_at() for rollback. Fire-and-forget; no-op when LAKEFS_ENDPOINT is empty |
| **32** | NiFi ETL Submission | Knowledge events (Kafka ingest + bundle import) are forwarded to a configurable NiFi ListenHTTP processor (NIFI_INGEST_URL), so downstream NiFi flows can fan out to S3/Solr/Elastic/Snowflake without orchestrator changes. JSON in body, MoE metadata as X-MoE-* FlowFile attributes; admin dashboard surfaces NiFi system diagnostics (uptime, heap, threads, version) at /api/enterprise/etl/status |
| **33** | Unified Data Catalog | Admin UI /catalog page aggregates datasets across all three back-ends in one searchable, source-filterable table — Marquez datasets per namespace, Neo4j entity-domain breakdown (entities/relations/syntheses), and lakeFS repositories with commit counts. Foundry-Catalog-equivalent cross-source browsing without leaving the admin UI |
| **34** | Branch-based Approval Workflow | POST /v1/graph/knowledge/import/pending stages a bundle on a lakeFS pending/<tag>-<ts> branch instead of Neo4j; admins review pending bundles in /approval, then approve (= Neo4j import + lakeFS merge to main) or reject (= branch delete). Adds an explicit gate before any external knowledge enters the live graph |
| **35** | Read-only Cypher Explorer | Admin UI /explorer page exposes an in-page Cypher editor restricted to read mode: regex-blacklist rejects CREATE/DELETE/SET/MERGE/REMOVE/DROP/ALTER/GRANT/REVOKE/FOREACH before the query reaches Neo4j, plus the driver runs in READ_ACCESS mode. Includes preset queries and a deep-link to the standalone Neo4j Browser |
| **36** | Data Health Drift Detection | Every successful knowledge-bundle import is wrapped in a stats snapshot — services/data_health.compute_drift() flags entity_dedup_suppressed, zero_entities_added, entity_count_shrank, entity_overshoot, relation_overshoot, relation_to_entity_explosion. Events land in Redis moe:data_health:events (capped 500) and surface on the Enterprise dashboard with severity pills (ok / info / warn / crit). Threshold tunable via DATA_HEALTH_DRIFT_THRESHOLD (default 0.3) |
| **37** | Embedded JupyterLite Notebook | Admin UI /notebook embeds JupyterLite (browser-only WebAssembly Jupyter) with JUPYTERLITE_URL configurable for self-hosted deployments. Includes copy-paste-ready snippets for the orchestrator API (export, pending-import, search, Cypher, lineage runs) — power-users can prototype against the live graph without installing a Python kernel anywhere |
| **38** | User Conversation Audit Log | Every authenticated API request is appended as a JSONL entry to ${MOE_DATA_ROOT}/user-audit-logs/{user_id}.jsonl — full prompt text, full response, routing metadata (model, mode, expert domains, cache hit, latency). Users access their own log via /user/audit-log with date/search filters, full-text expand, and CSV/JSON export. Retention is configurable per user (default 90 days, max 365 days); daily logrotate rotation with dateext; automatic cleanup via daily background job in moe-admin. |
| **39** | Learned Routing Gate | The retrieval gates (web research / knowledge graph) are decided by a contextual Thompson bandit (services/routing_bandit.py) instead of fixed fuzzy thresholds. Context = complexity level + discretised t-norm band; reward = request adequacy (an expert "cannot access the web" disclaimer marks research as needed; a judge-refined category marks weak graph grounding). A cost prior biases ties toward skipping retrieval to save inference. The fuzzy/complexity heuristic survives as both the context features and the cold-start fallback — until both arms of a (gate, context) reach ROUTING_BANDIT_MIN_DATAPOINTS, the heuristic decision is used unchanged, so routing never regresses below the fuzzy baseline while it learns. Metric: moe_routing_bandit_total{gate,action,source}. |
---
| Agent | Endpoint | Configuration |
|---|---|---|
| **Claude Code** | /v1/messages | export ANTHROPIC_BASE_URL=https://your-server |
| **Codex CLI** | /v1/responses | export OPENAI_BASE_URL=https://your-server |
| **OpenCode** | /v1/chat/completions | Provider config in config.toml |
| **Aider** | /v1/chat/completions | export OPENAI_BASE_URL=https://your-server/v1 |
| **Continue.dev** | /v1/chat/completions or /v1/responses | Add in .continue/config.json |
| **Open WebUI** | /v1/chat/completions | Add as OpenAI-compatible connection |
---
高质量的开源MCP工具,具有较高的实用价值
AI Skill Hub 为第三方内容聚合平台,本页面信息基于公开数据整理,不对工具功能和质量作任何法律背书。
建议在沙箱或测试环境中充分验证后,再部署至生产环境,并做好必要的安全评估。
✅ Apache 2.0 — 宽松开源协议,可商用,需保留版权声明和 NOTICE 文件,含专利授权条款。
经综合评估,主权AI系统 在MCP工具赛道中表现稳健,质量良好。如果你已有明确的使用需求,可以直接上手体验;如果还在评估阶段,建议对比同类工具后再做决策。
| 原始名称 | moe-sovereign |
| 原始描述 | 开源MCP工具:Self-hosted Compound AI System for sovereign environments. Features token-saving。⭐6 · Python |
| Topics | aimcpdigital-sovereignty |
| GitHub | https://github.com/h3rb3rn/moe-sovereign |
| License | Apache-2.0 |
| 语言 | Python |
收录时间:2026-06-02 · 更新时间:2026-06-02 · License:Apache-2.0 · AI Skill Hub 不对第三方内容的准确性作法律背书。
选择 Agent 类型,复制安装指令后粘贴到对应客户端