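CocoIndex Intelligent Workflow Engine

Python-based · Build complete AI automation flows without writing code
English name: cocoindex
⭐ 9.9k Stars 🍴 773 Forks 💻 Python 📄 Apache-2.0 🏷 AI score 8.5
Tags: AI agents · Workflow engine · Incremental computation · Code intelligence · Data capture
✦ AI Skill Hub recommendation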

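AI Skill Hub strongly recommends CocoIndex Intelligent Workflow Engine as a high-quality Agent workflow tool. It has 9.9k GitHub stars and an overall AI score of 8.5, a solid showing among comparable tools. If you are looking for a dependable Agent workflow solution, it is worth a closer look.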

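📚 In-depth analysis

CocoIndex Intelligent Workflow Engine is a complete AI Agent automation workflow solution. As AI capabilities improve, Agent-based automated workflows are becoming a core way to raise individual and team productivity. Unlike traditional RPA automation, which simulates mouse and keyboard actions, AI Agent workflows understand task intent and plan execution paths dynamically, so they can handle more complex, unstructured tasks.

The workflow design follows a "minimal configuration, maximum reuse" principle: the core logic is already packaged, so users only need to supply their own API key and business parameters to get started. Built-in error handling and retry mechanisms keep the workflow running through network fluctuations or API rate limiting, which makes it suitable as automation infrastructure in production.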
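
The retry behaviour described above can be pictured with a small, generic sketch. This is not CocoIndex's implementation: `call_with_retry`, its parameters, and the jitter factor are hypothetical names and values chosen for illustration, using only the Python standard library.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # The delay doubles on each attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # A little random jitter (up to 10%) avoids synchronized retries.
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting; in normal use the default `time.sleep` applies.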

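For real deployments, run the workflow 3-5 times in a test environment first and verify that each stage's output matches expectations before moving to production. AI Skill Hub rates it 8.5, a featured pick among comparable Agent workflows.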

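📋 Tool overview

CocoIndex Intelligent Workflow Engine is a complete AI Agent automation workflow solution. Through visual node orchestration, it breaks complex multi-step tasks into clear automated flows that run fully unattended. It integrates with hundreds of external services and APIs and is suited to building data processing pipelines, business automation, and AI-assisted decision systems.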

GitHub Stars: ⭐ 9.9k
Language: Python
Supported platforms: Windows / macOS / Linux
Maintenance status: Actively maintained, updated regularly
License: Apache-2.0
Overall AI score: 8.5
Tool type: Agent workflow
Forks: 773

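📖 Documentation

The following content was compiled automatically by AI Skill Hub from project information. For the complete original documentation, see "Original source" at the bottom.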

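📌 Key features
  • Visual Agent workflow orchestration, with no complex code to write
  • Multi-step automated task chains that run the full flow unattended
  • Integration with external APIs, databases, and third-party services
  • Built-in error handling and automatic retries for stable operation
  • Reusable automation templates for fast rollout in similar scenarios
🎯 Main use cases
  • Automate routine repetitive work and focus effort on creative tasks
  • Build complete collection → processing → output automation pipelines
  • Move data and coordinate business processes across platforms and systems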
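The install commands below were auto-generated from the project's language and type; the official README takes precedence.
Install commands
# Option 1: install with pip (recommended)
pip install cocoindex

# Option 2: install into a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install cocoindex

# Option 3: install from source (for the latest features)
git clone https://github.com/cocoindex-io/cocoindex
cd cocoindex
pip install -e .

# Verify the installation
python -c "import cocoindex; print('installation succeeded')"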
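
The one-line import check above only confirms that the module loads. As a minimal sketch, the standard library's `importlib.metadata` can also report which version is installed; `installed_version` is a hypothetical helper name, not part of cocoindex.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("cocoindex")
    print(f"cocoindex {version}" if version else "cocoindex is not installed")
```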
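📋 Installation steps
  1. Visit the GitHub repository to obtain the workflow file
  2. In your platform (Dify / Flowise / Make, etc.), find the "Import workflow" feature
  3. Upload the workflow file
  4. Configure the required environment variables and API keys as prompted
  5. Run a test to confirm the flow works, then put it into use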
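The usage examples below were compiled by AI Skill Hub and cover the most common scenarios.
Common commands / code examples
# Command-line usage
cocoindex --help

# Basic usage (generic placeholder; check `cocoindex --help` for the actual subcommands)
cocoindex input_file -o output_file

# Calling from Python code
import cocoindex

# Example (generic placeholder; `process` does not appear in the README excerpt on this page)
result = cocoindex.process("input")
print(result)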
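The configuration example below was generated from typical usage scenarios; adjust the parameters according to the official documentation.
Configuration example
# Example cocoindex configuration file (config.yml)
app:
  name: "cocoindex"
  debug: false
  log_level: "INFO"

# Specify the configuration file at run time
cocoindex --config config.yml

# Or configure through environment variables
export COCOINDEX_API_KEY="your-key"
export COCOINDEX_OUTPUT_DIR="./output"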
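📑 README deep dive · Original documentation · Completeness 8/100 · View the original on GitHub →
The following content was parsed directly from the GitHub README by the system, preserving code blocks, tables, and list structure.

Introduction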

<h1 align="center">Your agents deserve <em>fresh context.</em></h1>

<p align="center"> <a href="https://github.com/cocoindex-io/cocoindex">Star CocoIndex on GitHub</a> &nbsp;·&nbsp; <a href="https://cocoindex.io">cocoindex.io</a> &nbsp;·&nbsp; <a href="https://cocoindex.io/docs">Documentation</a> &nbsp;·&nbsp; <a href="https://discord.com/invite/zpA9S2DR7s">Discord</a> </p>

<p align="center">CocoIndex turns codebases, meeting notes, inboxes, Slack, PDFs, and videos into live, continuously fresh context for your AI agents and LLM apps to reason over effectively — with minimal incremental processing. Get your production AI agent ready in 10 minutes with reliable, continuously fresh data — no stale batches, no context gap </p> <p align="center"> <b>Incremental</b> · only the delta &nbsp;·&nbsp; <b>Any scale</b> · parallel by default &nbsp;·&nbsp; <b>Declarative</b> · Python, 5 min </p>


</div>


<br/>


</div>

<br/><br/>

Built with CocoIndex ❤️

<p align="center"> <a href="https://cocoindex.io/cocoindex-code"><b>CocoIndex-code</b></a>: flagship MCP server for AI coding agents. An AST-aware, incremental, semantic code index that lets Claude Code, Cursor, and other MCP-aware agents see your whole repository. Supports Python, TypeScript, Rust, and Go. </p>

<p align="center"><a href="examples"><b>See all 20+ examples · updated every week →</b></a></p>

<br/>

Get started

pip install -U cocoindex

Declare what should be in your target — CocoIndex keeps it in sync forever, recomputing only the Δ.

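import cocoindex as coco
from cocoindex.connectors import localfs, postgres
from cocoindex.ops.text import RecursiveSplitter

# Note: `embed` and `PG` are used below but are not defined in this excerpt;
# define them before running.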

@coco.fn(memo=True)                          # ← cached by hash(input) + hash(code)
async def index_file(file, table):
    for chunk in RecursiveSplitter().split(await file.read_text()):
        table.declare_row(text=chunk.text, embedding=embed(chunk.text))

@coco.fn
async def main(src):
    table = await postgres.mount_table_target(PG, table_name="docs")
    table.declare_vector_index(column="embedding")
    await coco.mount_each(index_file, localfs.walk_dir(src).items(), table)

coco.App(coco.AppConfig(name="docs"), main, src="./docs").update_blocking()

<p align="center">Run once to backfill. Re-run anytime — only the changed files re-embed.</p>
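
The `memo=True` comment above says results are cached by hash(input) + hash(code). The following is a rough, standard-library-only sketch of that idea, not the CocoIndex engine itself: `Memo` and `code_fingerprint` are hypothetical names, and fingerprinting a function's bytecode is an assumed stand-in for however the real engine hashes code.

```python
import hashlib
from typing import Callable, Dict


def code_fingerprint(fn: Callable[[str], str]) -> str:
    """Hash a plain Python function's bytecode, constants, and referenced names."""
    code = fn.__code__
    payload = repr((code.co_code, code.co_consts, code.co_names)).encode()
    return hashlib.sha256(payload).hexdigest()


class Memo:
    """Cache results keyed by a hash of the input and a hash of the code."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}
        self.misses = 0  # number of times fn actually had to run

    def run(self, fn: Callable[[str], str], data: str) -> str:
        input_hash = hashlib.sha256(data.encode()).hexdigest()
        key = input_hash + ":" + code_fingerprint(fn)
        if key not in self.cache:
            # Either the input or the code changed, so recompute.
            self.misses += 1
            self.cache[key] = fn(data)
        return self.cache[key]
```

With this keying, re-running the same function on unchanged input is a cache hit, while editing either the input or the function body produces a new key and triggers recomputation.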

<p align="center"> Building with an AI coding agent?<br/> Drop in our <a href="skills/cocoindex/"><b>CocoIndex skill</b></a> so your agent writes correct v1 code — concepts, APIs, patterns, all in one file.<br/> <sub>See <a href="https://cocoindex.io/docs/getting_started/ai_coding_agents/">Use with AI coding agents</a> for install steps.</sub> </p>

<p align="center"> <a href="https://cocoindex.io/docs/getting_started/quickstart"><b>Full quickstart</b></a> &nbsp;·&nbsp; <a href="https://cocoindex.io/docs/programming_guide/core_concepts"><b>Learn the core concepts</b></a> </p>


<br/><br/>

React — for data engineering

<p align="center">[Figure: React for data engineering. The CocoIndex mental model is Target = F(Source): you declare the desired target state and the engine keeps it in sync with the latest source data and code. Four core properties: Python rather than a DAG, declared target state, end-to-end lineage, and incremental processing at any scale.]</p>

<p align="center">[Figure: What happens when either side changes. Source change: one file (b.md) is edited and only one target row re-syncs. Code change: the transformation F is rewritten from v1 to v2 and only the outputs that depend on the changed code re-run.]</p>

<p align="center"><a href="https://cocoindex.io/react-cocoindex"><b>See the React ↔ CocoIndex mental model →</b></a></p>

<br/><br/>

Incremental engine for long-horizon agents

<p align="center"> Data transformation for any engineer, designed for AI workloads —<br/> with a smart incremental engine for <em>always-fresh, explainable data.</em> </p>


<p align="center">[Figure: Python-native transformation flows connect 8 source categories (Codebases, Meeting Notes, Web · APIs, File System · Blob Stores, Databases, Message Queues, Images · Video, Voice · Transcripts) through the incremental engine to 6 target stores (Relational DB, Data Warehouse, Vector DB, Graph DB, Message Queue, Feature Store). Unchanged sources hit the cache; changed sources are re-split and re-embedded. The control plane covers live caching, pipeline catalog, version tracking, continuous learning, lineage, task scheduling, metrics collection, and failure management.]</p>

<br/><br/>

Why incremental?

<p align="center">Your agents are only as good as the data they see.<br/>Batch pipelines drift stale. CocoIndex stays live — and only runs the Δ.</p>

<p align="center">[Figure: Why incremental? Four benefits. Sub-second fresh: source changes propagate to the target in under a second. 10× cheaper at scale: in a 10,000-row corpus only a thin Δ of 0.1% re-runs and 99.9% stays cached. Explainable by default: every vector, row, or graph node in the target traces back to its exact source byte (for example handbook.md L42). Production-grade: Rust core with retries, exponential back-off, dead-letter queues, and no-data-loss guarantees.]</p>
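
To make the lineage claim concrete, here is a minimal sketch of chunking that records where each chunk came from. It is an illustration only, not CocoIndex's data model: `Chunk` and `chunk_with_lineage` are hypothetical names, and fixed-size byte windows are a deliberately simple stand-in for a real splitter.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Chunk:
    source: str  # path of the originating file
    start: int   # byte offset where the chunk begins (inclusive)
    end: int     # byte offset where the chunk ends (exclusive)
    text: str    # decoded chunk content


def chunk_with_lineage(source: str, data: bytes, size: int) -> List[Chunk]:
    """Split data into fixed-size byte windows, keeping offsets for traceability."""
    if size <= 0:
        raise ValueError("size must be positive")
    chunks = []
    for start in range(0, len(data), size):
        end = min(start + size, len(data))
        # errors="replace" keeps decoding safe if a window cuts a multi-byte character.
        text = data[start:end].decode("utf-8", errors="replace")
        chunks.append(Chunk(source, start, end, text))
    return chunks
```

Because every chunk carries `(source, start, end)`, any row or vector derived from it can be traced back to the exact byte range that produced it.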

<br/><br/>

What can you build?


<p align="center"><b>Working starters from <a href="examples">the examples tree</a> — clone, plug your source, ship.</b></p>

<p align="center"> <a href="examples/code_embedding"><b>Real-time code index</b></a>: walk a git repo, chunk source files with an AST-aware splitter, embed with sentence-transformers, and upsert to pgvector / LanceDB. Fully incremental: only files touched by the latest commit re-embed. </p>

<p align="center"> <a href="examples/pdf_embedding"><b>PDF → RAG index</b></a>: ingest PDFs from local / S3 / Google Drive, extract text, chunk with a recursive splitter, embed each chunk, and upsert into pgvector / LanceDB with a vector index. Incremental: only edited PDFs re-embed. </p>

<p align="center"> <a href="examples/hn_trending_topics"><b>HN trending topics</b></a>: fetch Hacker News threads via the Algolia API, recursively pull nested comments, LLM-extract typed topic lists per message with Gemini 2.5 Flash, and rank topics by weighted mention count (thread = 5 points, comment = 1 point). </p>

<p align="center"> <a href="examples/conversation_to_knowledge"><b>Conversation → knowledge graph</b></a>: pull people, topics, decisions, and action items out of meeting transcripts, Slack, podcasts, or support calls with an LLM extractor, and upsert into Neo4j or Kuzu. Incremental: only changed turns re-extract. </p>

<p align="center"> <a href="examples/multi_codebase_summarization"><b>Multi-repo summarization</b></a>: walk N git repositories, extract READMEs / public APIs / modules, LLM-summarize each one, and roll up into a single top-level summary. Incremental: only repos with new commits re-run. </p>

<p align="center"> <a href="examples/patient_intake_extraction_baml"><b>Structured extraction</b></a>: read messy forms, PDFs, invoices, or free text and extract typed, schema-validated fields with BAML or DSPy, then write rows into Postgres or a warehouse. Incremental: only changed documents re-extract. </p>

<p align="center"> <a href="examples/conversation_to_knowledge"><b>Podcast → knowledge graph</b></a>: download YouTube podcast audio, transcribe with speaker diarization (Whisper / AssemblyAI), LLM-extract structured statements and entities per speaker, resolve duplicates across episodes with embeddings, and store the graph (speakers, statements, topics) in SurrealDB or Neo4j. Incremental. </p>

<p align="center"> <a href="examples/csv_to_kafka"><b>CSV → Kafka live</b></a>: watch a folder of CSV files (local or S3) and publish each row as a JSON message keyed by its primary key to a Kafka topic on StreamNative / Confluent / self-hosted. Sub-second incremental: only changed rows publish. </p>

<br/>
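
The HN trending topics example above ranks topics by weighted mention count, with a thread mention worth 5 points and a comment mention worth 1. A small sketch of that scoring rule follows; only the two weights come from the description, while `rank_topics` and the `(topic, kind)` input shape are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Weights quoted in the example description: thread = 5 points, comment = 1 point.
WEIGHTS = {"thread": 5, "comment": 1}


def rank_topics(mentions: Iterable[Tuple[str, str]]) -> List[Tuple[str, int]]:
    """Rank topics from (topic, kind) pairs, where kind is 'thread' or 'comment'."""
    scores: Counter = Counter()
    for topic, kind in mentions:
        scores[topic] += WEIGHTS[kind]
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

For instance, a topic mentioned in one thread and one comment scores 6 and outranks a topic mentioned only in a single thread (5).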


<p align="center">Building something with CocoIndex? <b>We want to see it.</b><br/>Tag <a href="https://x.com/cocoindex_io" title="Tag @cocoindex_io on X to showcase your CocoIndex project">@cocoindex_io</a> on X or drop a link in <a href="https://discord.com/invite/zpA9S2DR7s" title="Share your project in the CocoIndex Discord #showcase channel">#showcase</a> on Discord. We'll boost it. 🥥</p>

<br/><br/>

Community

Join the CocoIndex Discord community: live chat with maintainers and users, showcase your projects, get help building RAG pipelines and knowledge graphs.
Subscribe to the CocoIndex YouTube channel: video tutorials, live demos, architecture deep dives, and AI agent recipes.
Read the CocoIndex blog: engineering deep dives, release notes, RAG and knowledge graph tutorials, and case studies.
Follow @cocoindex_io on X (formerly Twitter) for release notes, demos, launches, and AI data pipeline updates.

<br/><br/>


<p align="center"> <b>We are <em>so</em> excited to meet you.</b><br/> Every typo fix, new connector, doc tweak, or full-on rewrite makes CocoIndex better.<br/> Come hang out — big PRs and small ones, both welcome. </p>

<p align="center"> 📝 <a href="https://cocoindex.io/docs/contributing/guide"><b>Read the contributing guide</b></a> &nbsp;·&nbsp; 🐛 <a href="https://github.com/cocoindex-io/cocoindex/labels/good%20first%20issue"><b>good first issues</b></a> &nbsp;·&nbsp; 💬 <a href="https://discord.com/invite/zpA9S2DR7s"><b>Say hi on Discord</b></a> </p>

<br/><br/>

CocoIndex Enterprise

<p align="center">[Figure: CocoIndex Enterprise, built for enterprise scale. Four headline stats: PB-scale corpus incrementally indexed, 10× fewer LLM embedding calls vs. full recompute, 100% lineage coverage with every byte traceable, and Δ-only processing. A corpus matrix shows a single Δ slice re-running while the other 99.9% stays cached.]</p>

Large corpus — built for enterprise scale.

<p align="center"> Incremental compute is the only way to keep large corpora fresh without re-embedding them every cycle.<br/> CocoIndex scales from a single repo to petabyte-scale stores — parallel by default, delta-only by design. </p>

<br/>

Process once. Reconcile forever.

<p align="center"> When a source changes, CocoIndex identifies the affected records, propagates the change<br/> across joins and lookups, updates the target, and retires stale rows —<br/> without touching anything that didn't change. </p>
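
The reconcile step described above (update what changed, retire stale rows, leave everything else alone) can be sketched as a diff between the declared target state and the current one. This is a simplified illustration with in-memory dictionaries, not the engine's actual algorithm; `reconcile` and `apply_changes` are hypothetical names.

```python
from typing import Dict, List, Tuple


def reconcile(
    desired: Dict[str, str], current: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (rows to upsert, keys to delete) that turn current into desired."""
    # Upsert rows that are new or whose value differs; untouched rows are skipped.
    upserts = {
        key: value
        for key, value in desired.items()
        if key not in current or current[key] != value
    }
    # Retire rows that no longer appear in the declared target state.
    deletes = sorted(key for key in current if key not in desired)
    return upserts, deletes


def apply_changes(
    current: Dict[str, str], upserts: Dict[str, str], deletes: List[str]
) -> None:
    """Apply the computed delta to the target in place."""
    current.update(upserts)
    for key in deletes:
        del current[key]
```

Only the delta is written: if one row changes and one disappears, the result is one upsert and one delete regardless of how large the target is.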

<br/>

Built on a Rust engine.

<p align="center"> The core is Rust — production-grade from day zero.<br/> Parallel chunking, zero-copy transforms where possible, and failure isolation<br/> so one bad record doesn't stall the flow. </p>
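
Failure isolation, where one bad record does not stall the flow, can be illustrated in a few lines. The sketch below is plain Python rather than the Rust engine, and `process_isolated` with its dead-letter list is a hypothetical helper showing the general pattern.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

R = TypeVar("R")
O = TypeVar("O")


def process_isolated(
    records: Iterable[R], handler: Callable[[R], O]
) -> Tuple[List[O], List[Tuple[R, str]]]:
    """Run handler on every record, collecting failures instead of aborting."""
    results: List[O] = []
    dead_letters: List[Tuple[R, str]] = []
    for record in records:
        try:
            results.append(handler(record))
        except Exception as exc:
            # Park the failing record with its error so the rest can proceed.
            dead_letters.append((record, repr(exc)))
    return results, dead_letters
```

The dead-letter list can then be inspected or retried separately, while all healthy records have already been processed.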

<br/><br/>

<p align="center"> <a href="https://cocoindex.io/enterprise/"><b>Explore CocoIndex Enterprise</b></a> </p>

<br/><br/>

<p align="center"><sub>Apache 2.0 · © CocoIndex contributors 🥥</sub></p>


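🎯 aiskill88 AI review · Grade A · 2026-05-20

CocoIndex is a mature open-source AI workflow solution, and its 9.9k stars reflect strong recognition. The incremental engine design is distinctive and the code-intelligence features are cutting-edge. Documentation is complete, the community is active, and the framework is production-grade.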

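📚 Practical guide (long-tail questions)
Who it is for
  • AI engineers who need Claude / Cursor to operate local tools
  • Agent developers building multi-agent collaboration systems
  • Teams building enterprise knowledge bases / RAG retrieval applications
  • Cross-border business and multilingual content operations teams
  • Developers building voice AI products
Best practices
  • When configuring an MCP server, prefer stdio transport + JSON-RPC and avoid exposing it to the public internet
  • Use chunk sizes of 256-512 tokens; pgvector or Qdrant are preferred vector stores
  • Dry-run Agent tasks first to validate the tool-call chain before enabling autonomous execution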
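
The chunk-size guidance above can be tried out with a short sketch. It assumes tokens are already available as a list (splitting on whitespace is a crude approximation of a real tokenizer), and the `overlap` parameter is an addition for illustration that the guidance itself does not mention; `chunk_tokens` is a hypothetical helper name.

```python
from typing import List


def chunk_tokens(
    tokens: List[str], size: int = 512, overlap: int = 0
) -> List[List[str]]:
    """Split a token list into windows of at most `size` tokens.

    Consecutive windows share `overlap` tokens.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks
```

A typical call under these assumptions would be `chunk_tokens(text.split(), size=256, overlap=32)`.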
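Common mistakes
  • Committing API keys directly to a git repository (use .env and add it to .gitignore)
  • Mistyped MCP config paths or insufficient permissions; changes only take effect after restarting Claude Desktop
  • Using different embedding models for indexing and querying, which breaks retrieval
  • Python dependency conflicts: isolate the environment with venv / uv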
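
As a minimal sketch of keeping API keys out of source control, the code can read the key from the environment and fail loudly when it is missing. `require_env` and the variable name `MY_SERVICE_API_KEY` are hypothetical, and the sketch assumes the variable has already been exported into the process environment (Python does not read a .env file on its own).

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it or load it from your .env file")
    return value


# Example usage (hypothetical variable name):
# api_key = require_env("MY_SERVICE_API_KEY")
```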
Deployment options
  • Cloud hosting: can be deployed on PaaS platforms such as Vercel / Railway / Fly.io

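👥 Target audience

  • Automation engineers and operations staff
  • Project managers and business analysts
  • Professionals who want to cut down on repetitive work
  • Digital transformation teams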

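⚖️ Pros and cons

✅ Pros
  • 9.9k GitHub stars, widely recognized by the community
  • Apache-2.0 license, free for commercial use
  • Greatly reduces repetitive manual work
  • Visual flows that are clear and intuitive
  • Highly extensible, supports complex scenarios
⚠️ Cons
  • Initial configuration and debugging take some time
  • Strongly dependent on the stability of external services
  • Complex scenarios require some technical background
⚠️ Usage notice

AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.

Validate thoroughly in a sandbox or test environment before deploying to production, and carry out any necessary security assessment.

📄 License

✅ Apache 2.0: a permissive open-source license that allows commercial use, requires retaining the copyright notice and NOTICE file, and includes a patent grant.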

❓ FAQ

Suited to long-running AI agent tasks, especially applications that involve code analysis and incremental data processing.
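💡 AI Skill Hub review

Overall, CocoIndex Intelligent Workflow Engine is a high-quality Agent workflow tool that is competitive among comparable tools. AI Skill Hub will keep tracking its updates; consider bookmarking it and adopting it when it fits your own scenario.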

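🌐 Original information
Original name: cocoindex
Original description: Open-source AI workflow: Incremental engine for long horizon agents 🌟 Star if you like it! ⭐ 9.9k · Python
Topics: AI agents · Workflow engine · Incremental computation · Code intelligence · Data capture
GitHub: https://github.com/cocoindex-io/cocoindex
License: Apache-2.0
Language: Python
🔗 Original source
🐙 GitHub repository: https://github.com/cocoindex-io/cocoindex
🌐 Official website: https://cocoindex.io

Listed: 2026-05-19 · Updated: 2026-05-30 · License: Apache-2.0 · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.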
