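AI Skill Hub highly recommends the cascadeflow n8n workflow, a high-quality Agent workflow. It has earned 1.3k GitHub stars and an overall AI score of 8.2, a solid showing among comparable tools. If you are looking for a reliable Agent workflow solution, it is worth a closer look.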
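The cascadeflow n8n workflow is a complete AI Agent automation workflow solution. Through visual node orchestration, it breaks complex multi-step tasks into clear automated flows that run fully unattended. It integrates with hundreds of external services and APIs, and suits data-processing pipelines, business automation, and AI-assisted decision systems.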
# Option 1: install with pip (recommended)
pip install cascadeflow
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install cascadeflow
# Option 3: install from source (for the latest features)
git clone https://github.com/lemony-ai/cascadeflow
cd cascadeflow
pip install -e .
# Verify the installation
python -c "import cascadeflow; print('Installation successful')"
# Command-line usage
cascadeflow --help
# Basic usage
cascadeflow input_file -o output_file
# Calling from Python code
import cascadeflow
# Example
result = cascadeflow.process("input")
print(result)
# cascadeflow configuration file example (config.yml)
app:
  name: "cascadeflow"
  debug: false
  log_level: "INFO"
# Specify the configuration file at runtime
cascadeflow --config config.yml
# Or configure through environment variables
export CASCADEFLOW_API_KEY="your-key"
export CASCADEFLOW_OUTPUT_DIR="./output"
<picture> <source media="(prefers-color-scheme: dark)" srcset="./.github/assets/CF_logo_bright.svg"> <source media="(prefers-color-scheme: light)" srcset="./.github/assets/CF_logo_dark.svg"> <img alt="cascadeflow Logo" src="./.github/assets/CF_logo_dark.svg" width="80%" style="margin: 20px auto;"> </picture>
Built with ❤️ by Lemony Inc. and the cascadeflow Community
One cascade. Hundreds of specialists.
New York | Zurich
⭐ Star us on GitHub if cascadeflow helps you save money!
| **Feature** | **Benefit** |
|---|---|
| 🎯 **Speculative Cascading** | Tries cheap models first, escalates intelligently |
| 💰 **40-85% Cost Savings** | Research-backed, proven in production |
| ⚡ **2-10x Faster** | Small models respond in <50ms vs 500-2000ms |
| ⚡ **Low Latency** | Sub-2ms framework overhead, negligible performance impact |
| 🔄 **Mix Any Providers** | OpenAI, Anthropic, Groq, Ollama, vLLM, Together + LiteLLM (optional) + LangChain integration |
| 👤 **User Profile System** | Per-user budgets, tier-aware routing, enforcement callbacks |
| ✅ **Quality Validation** | Automatic checks + semantic similarity (optional ML, ~80MB, CPU) |
| 🎨 **Cascading Policies** | Domain-specific pipelines, multi-step validation strategies |
| 🧠 **Domain Understanding** | 15 domains auto-detected (code, medical, legal, finance, math, etc.), routes to specialists |
| 🤖 **Drafter/Validator Pattern** | 20-60% savings for agent/tool systems |
| 🔧 **Tool Calling Support** | Universal format, works across all providers |
| 📊 **Cost Tracking** | Built-in analytics + OpenTelemetry export (vendor-neutral) |
| 🚀 **3-Line Integration** | Zero architecture changes needed |
| 🔁 **Agent Loops** | Multi-turn tool execution with automatic tool call, result, re-prompt cycles |
| 🧭 **Hermes Agent Routing** | Per-skill, task-complexity, and topic-aware subagent routing with observe-mode rollout |
| 📋 **Message & Tool Call Lists** | Full conversation history with tool_calls and tool_call_id preservation across turns |
| 🪝 **Hooks & Callbacks** | Telemetry callbacks, cost events, and streaming hooks for observability |
| 🏭 **Production Ready** | Streaming, batch processing, tool handling, reasoning model support, caching, error recovery, anomaly detection |
| 💳 **Budget Enforcement** | Per-run and per-user budget caps with automatic stop actions when limits are exceeded |
| 🔒 **Compliance Gating** | GDPR, HIPAA, PCI, and strict model allowlists — block non-compliant models before execution |
| 📊 **KPI-Weighted Routing** | Inject business priorities (quality, cost, latency, energy) as weights into every model decision |
| 🌱 **Energy Tracking** | Deterministic compute-intensity coefficients for carbon-aware AI operations |
| 🔍 **Decision Traces** | Full per-step audit trail: action, reason, model, cost, budget state, enforcement status |
| ⚙️ **Harness Modes** | off / observe / enforce — roll out safely with observe, then switch to enforce when ready |
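
To make the KPI-weighted routing row above more concrete, here is a minimal sketch of weighted model selection. It assumes every candidate has a score in [0, 1] per dimension, where higher is better. The `Candidate` type, the model names, and the scores are hypothetical illustrations, not cascadeflow's API or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-model scores, normalized to [0, 1]; higher is better."""
    name: str
    quality: float
    cost: float     # 1.0 means cheapest
    latency: float  # 1.0 means fastest


def pick_model(candidates, kpi_weights):
    """Return the candidate with the highest weighted KPI score."""
    def score(candidate):
        return sum(weight * getattr(candidate, kpi)
                   for kpi, weight in kpi_weights.items())
    return max(candidates, key=score)


candidates = [
    Candidate("small-model", quality=0.70, cost=0.95, latency=0.90),
    Candidate("flagship-model", quality=0.95, cost=0.20, latency=0.40),
]

cost_first = pick_model(candidates, {"quality": 0.2, "cost": 0.6, "latency": 0.2})
quality_first = pick_model(candidates, {"quality": 0.9, "cost": 0.05, "latency": 0.05})
print(cost_first.name, quality_first.name)
```

With cost-heavy weights the small model wins; with quality-heavy weights the flagship model wins. A real router would also need a way to estimate the per-model scores, which this sketch takes as given.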
---
n8n community node: `@cascadeflow/n8n-nodes-cascadeflow`

<img src=".github/assets/CF_ts_color.svg" width="18" height="18" alt="TypeScript" style="vertical-align: middle;"/> TypeScript
npm install @cascadeflow/langchain @langchain/core @langchain/openai
<img src=".github/assets/CF_python_color.svg" width="18" height="18" alt="Python" style="vertical-align: middle;"/> Python
pip install cascadeflow langchain-openai
Migrate in about 5 minutes from a direct provider implementation to a cascade, gaining cost savings along with full cost control and transparency.
Cost: $0.000113, Latency: 850ms
<details open> <summary><b><img src=".github/assets/CF_ts_color.svg" width="18" height="18" alt="TypeScript" style="vertical-align: middle;"/> TypeScript - Drop-in replacement for any LangChain chat model</b></summary>

```typescript
import { ChatOpenAI } from '@langchain/openai';
import { ChatAnthropic } from '@langchain/anthropic';
import { withCascade } from '@cascadeflow/langchain';

const cascade = withCascade({
  drafter: new ChatOpenAI({ model: 'nous/hermes-flash' }), // $0.15/$0.60 per 1M tokens
  verifier: new ChatAnthropic({ model: 'claude-sonnet-4-5' }), // $3/$15 per 1M tokens
  qualityThreshold: 0.8, // 80% queries use drafter
});

// Use like any LangChain chat model
const result = await cascade.invoke('Explain quantum computing');

// Optional: Enable LangSmith tracing (see https://smith.langchain.com)
// Set LANGSMITH_API_KEY, LANGSMITH_PROJECT, LANGSMITH_TRACING=true

// Or with LCEL chains
const chain = prompt.pipe(cascade).pipe(new StringOutputParser());
```

</details>
<details> <summary><b><img src=".github/assets/CF_python_color.svg" width="18" height="18" alt="Python" style="vertical-align: middle;"/> Python - Drop-in replacement for any LangChain chat model</b></summary>

```python
from langchain_openai import ChatOpenAI
from langchain_anthropic import ChatAnthropic
from cascadeflow.integrations.langchain import CascadeFlow

cascade = CascadeFlow(
    drafter=ChatOpenAI(model="nous/hermes-flash"),  # $0.15/$0.60 per 1M tokens
    verifier=ChatAnthropic(model="claude-sonnet-4-5"),  # $3/$15 per 1M tokens
    quality_threshold=0.8,  # 80% queries use drafter
)
```

</details>
<img src=".github/assets/CF_python_color.svg" width="20" height="20" alt="Python" style="vertical-align: middle;"/> Python Examples:
<details open> <summary><b>Basic Examples</b> - Get started quickly</summary>
| Example | Description | Link |
|---|---|---|
| **Basic Usage** | Simple cascade setup with OpenAI models | [View](./examples/basic_usage.py) |
| **Preset Usage** | Use built-in presets for quick setup | [View](https://docs.cascadeflow.ai/developers/providers-and-presets) |
| **Tool Execution** | Function calling and tool usage | [View](./examples/tool_execution.py) |
| **Streaming Text** | Stream responses from cascade agents | [View](./examples/streaming_text.py) |
| **Cost Tracking** | Track and analyze costs across queries | [View](./examples/cost_tracking.py) |
| **Agentic Multi-Agent** | Multi-turn tool loops & agent-as-a-tool delegation | [View](./examples/agentic_multi_agent.py) |
| **Multi-Step Cascade** | Multi-step agent loops with tool calls | [View](./examples/multi_step_cascade.py) |
</details>
<details> <summary><b>Harness & Enforcement</b> - Budget, compliance, and agent governance</summary>
| Example | Description | Link |
|---|---|---|
| **Budget Enforcement** | Budget caps with stop actions in enforce mode | [View](./examples/enforcement/basic_enforcement.py) |
| **User Budget Tracking** | Per-user budget enforcement and tracking | [View](./examples/user_budget_tracking.py) |
| **Guardrails** | Safety and content guardrails | [View](./examples/guardrails_usage.py) |
| **Rate Limiting** | Rate limiting for cascades | [View](./examples/rate_limiting_usage.py) |
| **User Profile Usage** | User-specific routing and configurations | [View](./examples/user_profile_usage.py) |
| **Stripe Integration** | Billing integration with budget enforcement | [View](./examples/enforcement/stripe_integration.py) |
</details>
<details> <summary><b>Framework Integrations</b> - Harness with LangChain, OpenAI Agents, CrewAI, PydanticAI, Google ADK, Hermes Agent</summary>
| Example | Description | Link |
|---|---|---|
| **LangChain Harness** | cascadeflow harness with LangChain callback handler | [View](./examples/integrations/langchain_harness.py) |
| **OpenAI Agents Harness** | cascadeflow harness with OpenAI Agents SDK | [View](./examples/integrations/openai_agents_harness.py) |
| **CrewAI Harness** | cascadeflow harness with CrewAI hooks | [View](./examples/integrations/crewai_harness.py) |
| **PydanticAI Harness** | cascadeflow cascade Model with PydanticAI agents | [View](./examples/integrations/pydantic_ai_harness.py) |
| **Google ADK Harness** | cascadeflow harness with Google ADK plugin | [View](./examples/integrations/google_adk_harness.py) |
| **LangChain Basic** | Simple LangChain cascade setup | [View](./examples/langchain_basic_usage.py) |
| **LangChain LCEL Pipeline** | LCEL chains with cascade routing | [View](./examples/langchain_lcel_pipeline.py) |
| **LangGraph Multi-Agent** | LangGraph multi-agent orchestration | [View](./examples/langchain_langgraph_multi_agent.py) |
</details>
<details> <summary><b>Advanced Examples</b> - Production, providers & customization</summary>
| Example | Description | Link |
|---|---|---|
| **Production Patterns** | Best practices for production deployments | [View](./examples/production_patterns.py) |
| **Multi-Provider** | Mix multiple AI providers in one cascade | [View](./examples/multi_provider.py) |
| **Reasoning Models** | Use reasoning models (o1/o3, Claude Sonnet 4, DeepSeek-R1) | [View](./examples/reasoning_models.py) |
| **Streaming Tools** | Stream tool calls and responses | [View](./examples/streaming_tools.py) |
| **Batch Processing** | Process multiple queries efficiently | [View](./examples/batch_processing.py) |
| **FastAPI Integration** | Integrate cascades with FastAPI | [View](./examples/fastapi_integration.py) |
| **Edge Device** | Run cascades on edge devices with local models | [View](./examples/edge_device.py) |
| **vLLM Example** | Use vLLM for local model deployment | [View](./examples/vllm_example.py) |
| **Multi-Instance Ollama** | Run draft/verifier on separate Ollama instances | [View](./examples/multi_instance_ollama.py) |
| **Custom Cascade** | Build custom cascade strategies | [View](./examples/custom_cascade.py) |
| **Custom Validation** | Implement custom quality validators | [View](./examples/custom_validation.py) |
| **Semantic Quality Detection** | ML-based domain and quality detection | [View](./examples/semantic_quality_domain_detection.py) |
| **Cost Forecasting** | Forecast costs and detect anomalies | [View](./examples/cost_forecasting_anomaly_detection.py) |
</details>
<img src=".github/assets/CF_ts_color.svg" width="20" height="20" alt="TypeScript" style="vertical-align: middle;"/> TypeScript Examples:
<details open> <summary><b>Basic Examples</b> - Get started quickly</summary>
| Example | Description | Link |
|---|---|---|
| **Basic Usage** | Simple cascade setup (Node.js) | [View](./packages/core/examples/nodejs/basic-usage.ts) |
| **Tool Calling** | Function calling with tools (Node.js) | [View](./packages/core/examples/nodejs/tool-calling.ts) |
| **Multi-Provider** | Mix providers in TypeScript (Node.js) | [View](./packages/core/examples/nodejs/multi-provider.ts) |
| **Reasoning Models** | Use reasoning models (o1/o3, Claude Sonnet 4, DeepSeek-R1) | [View](./packages/core/examples/nodejs/reasoning-models.ts) |
| **Cost Tracking** | Track and analyze costs across queries | [View](./packages/core/examples/nodejs/cost-tracking.ts) |
| **Semantic Quality** | ML-based semantic validation with embeddings | [View](./packages/core/examples/nodejs/semantic-quality.ts) |
| **Streaming** | Stream responses in TypeScript | [View](./packages/core/examples/streaming.ts) |
| **Tool Execution** | Tool execution engine and result handling | [View](./packages/core/examples/nodejs/tool-execution.ts) |
| **Streaming Tools** | Stream tool calls with event detection | [View](./packages/core/examples/nodejs/streaming-tools.ts) |
| **Agentic Multi-Agent** | Multi-turn tool loops & multi-agent orchestration | [View](./packages/core/examples/nodejs/agentic-multi-agent.ts) |
</details>
<details> <summary><b>Advanced Examples</b> - Production, edge & LangChain</summary>
| Example | Description | Link |
|---|---|---|
| **Production Patterns** | Production best practices (Node.js) | [View](./packages/core/examples/nodejs/production-patterns.ts) |
| **Multi-Instance Ollama** | Run draft/verifier on separate Ollama instances | [View](./packages/core/examples/nodejs/multi-instance-ollama.ts) |
| **Multi-Instance vLLM** | Run draft/verifier on separate vLLM instances | [View](./packages/core/examples/nodejs/multi-instance-vllm.ts) |
| **Browser/Edge** | Vercel Edge runtime example | [View](./packages/core/examples/browser/vercel-edge/) |
| **LangChain Basic** | Simple LangChain cascade setup | [View](./packages/langchain-cascadeflow/examples/basic-usage.ts) |
| **LangChain Cross-Provider** | Haiku → GPT-5 with PreRouter | [View](./packages/langchain-cascadeflow/examples/cross-provider-escalation.ts) |
| **LangChain LangSmith** | Cost tracking with LangSmith | [View](./packages/langchain-cascadeflow/examples/langsmith-tracing.ts) |
| **LangChain Cost Tracking** | Compare cascadeflow vs LangSmith cost tracking | [View](./packages/langchain-cascadeflow/examples/cost-tracking-providers.ts) |
| **LangGraph Multi-Agent** | LangGraph multi-agent orchestration | [View](./packages/langchain-cascadeflow/examples/langgraph-multi-agent.ts) |
| **LangChain Tool Risk Gating** | Tool routing based on risk and complexity | [View](./packages/langchain-cascadeflow/examples/tool-risk-gating.ts) |
</details>
📂 View All Python Examples → | View All TypeScript Examples →
Three tiers of integration — zero-change observability to full policy control:
**Tier 1: Zero-change observability**

```python
import cascadeflow

cascadeflow.init(mode="observe")
```

**Tier 2: Scoped runs with budget**

```python
with cascadeflow.run(budget=0.50, max_tool_calls=10) as session:
    result = await agent.run("Analyze this dataset")
    print(session.summary())  # cost, latency, energy, steps, tool calls
    print(session.trace())    # full decision audit trail
```

**Tier 3: Decorated agents with policy**

```python
@cascadeflow.agent(budget=0.20, compliance="gdpr", kpi_weights={"quality": 0.6, "cost": 0.3, "latency": 0.1})
async def my_agent(query: str):
    return await llm.complete(query)
```
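
To illustrate what a scoped run with a budget cap involves, here is a self-contained sketch of a context manager that accumulates cost and stops once a limit would be breached. `BudgetSession`, `BudgetExceeded`, and `scoped_run` are hypothetical names used for illustration only; this is not cascadeflow's implementation.

```python
from contextlib import contextmanager


class BudgetExceeded(RuntimeError):
    """Raised when a step would push the run past a limit (hypothetical)."""


class BudgetSession:
    """Tracks spend and tool calls for one scoped run (hypothetical)."""

    def __init__(self, budget, max_tool_calls):
        self.budget = budget
        self.max_tool_calls = max_tool_calls
        self.spent = 0.0
        self.tool_calls = 0
        self.steps = []  # minimal per-step audit trail

    def record(self, model, cost, is_tool_call=False):
        """Record one step, or stop if it would breach a limit."""
        over_budget = self.spent + cost > self.budget
        over_tools = is_tool_call and self.tool_calls >= self.max_tool_calls
        if over_budget or over_tools:
            reason = "budget" if over_budget else "max_tool_calls"
            self.steps.append({"model": model, "action": "stop", "reason": reason})
            raise BudgetExceeded(reason)
        self.spent += cost
        self.tool_calls += int(is_tool_call)
        self.steps.append({"model": model, "action": "allow", "cost": cost})

    def summary(self):
        return {"spent": round(self.spent, 6), "tool_calls": self.tool_calls,
                "steps": len(self.steps)}


@contextmanager
def scoped_run(budget, max_tool_calls):
    """Yield a fresh session for the duration of the with-block."""
    yield BudgetSession(budget, max_tool_calls)


with scoped_run(budget=0.50, max_tool_calls=10) as session:
    session.record("small-model", 0.10)
    session.record("flagship-model", 0.30, is_tool_call=True)
    try:
        session.record("flagship-model", 0.30)  # would reach 0.70 > 0.50
    except BudgetExceeded:
        pass

print(session.summary())
```

The third step is rejected before any cost is added, so the summary reports only the two allowed steps' spend while the audit trail keeps all three entries.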
---
| Dimension | External Proxy | cascadeflow Harness |
|---|---|---|
| **Scope** | HTTP request boundary | Inside agent execution loop |
| **Dimensions** | Cost only | Cost + quality + latency + budget + compliance + energy |
| **Latency overhead** | 10-50ms network RTT | <5ms in-process |
| **Business logic** | None | KPI weights and targets |
| **Enforcement** | None (observe only) | stop, deny_tool, switch_model |
| **Auditability** | Request logs | Per-step decision traces |
cascadeflow is a library and agent harness: an intelligent AI model cascading package that dynamically selects the optimal model for each query or tool call through speculative execution. It builds on research showing that 40-70% of queries don't require slow, expensive flagship models, and that domain-specific smaller models often outperform large general-purpose models on specialized tasks. For the remaining queries that need advanced reasoning, cascadeflow automatically escalates to flagship models.
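
The cascading idea can be sketched with stub models. This is an illustration only: `cascade`, `draft_model`, `flagship_model`, and the confidence-based quality check are hypothetical stand-ins, not cascadeflow's actual validation logic.

```python
from typing import Callable, Tuple

# A model is any callable returning (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade(query: str, drafter: Model, verifier: Model,
            quality_threshold: float = 0.8) -> Tuple[str, str]:
    """Try the cheap drafter first; escalate only if the draft looks weak."""
    draft, confidence = drafter(query)
    if confidence >= quality_threshold:
        return draft, "drafter"
    answer, _ = verifier(query)
    return answer, "verifier"


def draft_model(query: str) -> Tuple[str, float]:
    # Stub: confident on short queries, unsure on long ones.
    confidence = 0.9 if len(query.split()) <= 5 else 0.4
    return f"draft:{query}", confidence


def flagship_model(query: str) -> Tuple[str, float]:
    # Stub: always answers with high confidence.
    return f"flagship:{query}", 0.99


easy = cascade("What is 2 + 2?", draft_model, flagship_model)
hard = cascade("Prove the spectral theorem for compact self-adjoint operators",
               draft_model, flagship_model)
print(easy[1], hard[1])
```

The short query is answered by the drafter alone, while the long one is escalated. In practice the quality check is the hard part; a word-count heuristic is used here only to keep the sketch runnable.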
<details> <summary><b>Use Cases</b></summary>
`allow`, `switch_model`, `deny_tool`, `stop`, based on current context and policy state. Closes the gap between analytics and execution.

ℹ️ Note: SLMs (under 10B parameters) are sufficiently powerful for 60-70% of agentic AI tasks. (Research paper)
</details>
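
A rough sketch of how a per-step policy decision with `off` / `observe` / `enforce` modes might look. The action names come from the text above; the `decide` function, its parameters, and the trace fields are assumptions made for illustration.

```python
def decide(mode, model, est_cost, budget_left, allowlist, cheaper_model=None):
    """Return a decision-trace entry for one step (hypothetical policy)."""
    if mode == "off":
        return None  # harness disabled: no decision, no trace

    if model not in allowlist:
        action, reason = "stop", "model not on compliance allowlist"
    elif est_cost > budget_left and cheaper_model in allowlist:
        action, reason = "switch_model", "estimated cost exceeds remaining budget"
    elif est_cost > budget_left:
        action, reason = "stop", "budget exhausted"
    else:
        action, reason = "allow", "within policy"

    # In observe mode the decision is recorded but never applied.
    enforced = mode == "enforce" and action != "allow"
    return {"action": action, "reason": reason, "model": model,
            "budget_left": budget_left, "enforced": enforced}


allowed_models = {"small-model", "flagship-model"}
observed = decide("observe", "flagship-model", 0.30, 0.10, allowed_models,
                  cheaper_model="small-model")
enforced = decide("enforce", "flagship-model", 0.30, 0.10, allowed_models,
                  cheaper_model="small-model")
print(observed["action"], observed["enforced"])
print(enforced["action"], enforced["enforced"])
```

Both calls reach the same `switch_model` decision; only the `enforce` call marks it as applied. That separation is what makes a staged rollout possible: review the observed decisions first, then turn enforcement on.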
---
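cascadeflow is a high-performance AI model orchestration framework built by Lemony Inc. together with the community. Through its cascading mechanism, developers can reach hundreds of specialized models through a single entry point, with the aim of making complex AI workflows smarter and more economical. Whether in New York or Zurich, cascadeflow helps developers manage models efficiently.
The core feature of cascadeflow is its Speculative Cascading technique: the system tries cheaper, lightweight models first and escalates to higher-performance models only when needed. According to the project, this mechanism has been validated in production and can save developers 40-85% in costs, while the fast responses of small models (<50 ms) make responses 2-10x faster and substantially reduce latency.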
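
As a back-of-the-envelope illustration of where such savings can come from, the sketch below computes the blended cost per query. It uses the per-1M-token prices quoted in the LangChain example on this page and rests on three assumptions that are not measurements: every query first pays for a draft, escalated queries additionally pay for the verifier, and each query uses 500 input and 500 output tokens.

```python
def cost_per_query(price_in, price_out, tokens_in=500, tokens_out=500):
    """Cost in USD for one query, given prices per 1M tokens."""
    return (tokens_in * price_in + tokens_out * price_out) / 1_000_000


def cascade_savings(drafter_cost, verifier_cost, drafter_share):
    """Fractional saving versus sending every query to the verifier."""
    blended = drafter_cost + (1 - drafter_share) * verifier_cost
    return 1 - blended / verifier_cost


drafter = cost_per_query(0.15, 0.60)    # $0.000375 per query
verifier = cost_per_query(3.00, 15.00)  # $0.009 per query
savings = cascade_savings(drafter, verifier, drafter_share=0.8)
print(f"{savings:.1%}")  # prints 75.8%
```

Under these assumptions, accepting 80% of drafts saves roughly three quarters of the flagship-only cost. The figure is sensitive to the acceptance rate: if only 50% of drafts are accepted, the same formula gives about 46%.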
cascadeflow supports several integration paths. n8n users can install it by searching for `@cascadeflow/n8n-nodes-cascadeflow` under Settings → Community Nodes. Developers using TypeScript can install the `@cascadeflow/langchain` packages via npm; in Python, installing `cascadeflow` and `langchain-openai` with pip is enough to get started.
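
After installing the Python packages, one quick way to confirm they can be found is to probe for them without importing them. This sketch uses only the standard library; `missing_modules` is a hypothetical helper name, and the module names are the import names that appear in the code samples on this page.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(["cascadeflow", "langchain_openai"])
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All packages found")
```

Note that the pip package `langchain-openai` is imported as `langchain_openai`, so the check uses the underscore form.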
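cascadeflow offers a minimal migration path. Using the `withCascade` wrapper, developers can replace an existing LangChain chat model with a cascadeflow version in about 5 minutes, gaining cost control, better transparency, and faster responses without changing existing business logic.
Users can improve observability through environment variables. For example, setting `LANGSMITH_API_KEY`, `LANGSMITH_PROJECT`, and `LANGSMITH_TRACING` enables LangSmith tracing for in-depth monitoring and debugging of AI call chains.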
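
A small sketch of setting these variables from Python before any traced code runs. The variable names are the ones listed above; the `enable_langsmith` helper and the project name are hypothetical, and the demo writes to a plain dict so the real environment is left untouched.

```python
import os


def enable_langsmith(project, env=None):
    """Set the tracing variables; fail fast if the API key is missing."""
    env = os.environ if env is None else env
    if not env.get("LANGSMITH_API_KEY"):
        raise RuntimeError("LANGSMITH_API_KEY is not set")
    env["LANGSMITH_TRACING"] = "true"
    env["LANGSMITH_PROJECT"] = project
    return env


# Demo with a placeholder key in a plain dict instead of os.environ.
demo_env = enable_langsmith("cascadeflow-demo", env={"LANGSMITH_API_KEY": "your-key"})
print(demo_env["LANGSMITH_TRACING"], demo_env["LANGSMITH_PROJECT"])
```

In a real service the API key would normally come from a secret store or the deployment environment rather than from source code.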
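cascadeflow's Harness API offers three tiers of integration depth: Tier 1 provides an observe mode with zero code changes that automatically tracks all OpenAI/Anthropic SDK calls; Tier 2 provides scoped runs with a budget, letting developers set `budget` and `max_tool_calls` and retrieve detailed cost, latency, and decision audit trails; Tier 3 provides full policy control.
cascadeflow is an innovative framework in the AI workflow space whose core selling point is the cost-quality trade-off. Its code quality is good and it is actively maintained. The architecture is sound and offers practical value for building efficient agent systems.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
We recommend validating thoroughly in a sandbox or test environment before deploying to production, and carrying out any necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. It allows free commercial use, modification, and distribution; only the copyright notice must be retained.
Overall, the cascadeflow n8n workflow is a high-quality Agent workflow that is reasonably competitive among comparable tools. AI Skill Hub will continue to track its updates; consider bookmarking it and adopting it when it fits your own use case.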
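| Field | Value |
|---|---|
| Original name | cascadeflow |
| Original description | Open-source n8n workflow: Cascading runtime for AI agents. Optimize cost, latency, quality, and policy dec. ⭐1.3k · Python |
| Topics | AI agents, workflow orchestration, cost optimization, n8n integration, Python framework |
| GitHub | https://github.com/lemony-ai/cascadeflow |
| License | MIT |
| Language | Python |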
Listed: 2026-05-13 · Updated: 2026-05-16 · License: MIT · AI Skill Hub does not legally endorse the accuracy of third-party content.