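rag-from-scratch (AI Agent workflow tutorial, Chinese edition) is one of AI Skill Hub's featured Agent workflows for this issue. It has 1.4k GitHub stars and an overall score of 8.5, which indicates high overall quality. We strongly recommend adding it to your AI toolkit to help improve your productivity.
rag-from-scratch (AI Agent workflow tutorial, Chinese edition) is a complete AI Agent automation workflow solution. Through visual node orchestration, it breaks complex multi-step tasks into clear automated flows for fully unattended processing. It supports integration with hundreds of external services and APIs, and suits data-processing pipelines, business automation, and AI-assisted decision systems.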
# Option 1: global install with npm
npm install -g rag-from-scratch
# Option 2: run directly with npx (no install needed)
npx rag-from-scratch --help
# Option 3: install as a project dependency
npm install rag-from-scratch
# Option 4: run from source
git clone https://github.com/pguso/rag-from-scratch
cd rag-from-scratch
npm install
npm start
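# Command-line usage
rag-from-scratch --help
# Basic usage
rag-from-scratch [options] <input>
// Using it from Node.js code
const rag_from_scratch = require('rag-from-scratch');
const options = {}; // placeholder: fill in your own options
// `await` needs an async context in CommonJS, so wrap the call
(async () => {
  const result = await rag_from_scratch.run(options);
  console.log(result);
})();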
# rag-from-scratch configuration
# Show the configuration options
rag-from-scratch --config-example > config.yml
# Common configuration options
# output_dir: ./output
# log_level: info
# workers: 4
# Environment variable (overrides the config file)
export RAG_FROM_SCRATCH_CONFIG="/path/to/config.yml"
Demystify Retrieval-Augmented Generation (RAG) by building it yourself - step by step. No black boxes. No cloud APIs. Just clear explanations, simple examples, and local code you fully understand.
This project follows the same philosophy as AI Agents from Scratch: make advanced AI concepts approachable for developers through minimal, well-explained, real code.
---
Retrieval-Augmented Generation (RAG) enhances language models by giving them access to external knowledge. Instead of asking the model to “remember” everything, you let it retrieve relevant context before generating a response.
Pipeline:
1. Knowledge Requirements: define questions and data needs.
2. Data Loading: import and structure your documents.
3. Text Splitting & Chunking: divide data into manageable pieces.
4. Embedding: turn chunks into numerical vectors.
5. Vector Store: save and index embeddings for fast retrieval.
6. Retrieval: fetch the most relevant context for a given query.
7. Post-Retrieval Re-Ranking: re-order results to prioritize the best context.
8. Query Preprocessing & Embedding Normalization: clean and standardize input vectors for consistency.
9. Augmentation: merge retrieved context into the model’s prompt.
10. Generation: produce grounded answers using a local LLM.
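A minimal sketch of steps 3 to 6 and 9 of the pipeline above, in plain Node.js with no dependencies. Everything here is an illustrative assumption rather than the project's code: the function names (`chunkText`, `embed`, `cosine`, `buildStore`, `retrieve`, `buildPrompt`) are hypothetical, and the word-count "embedding" is a stand-in for a real embedding model. Generation (step 10) is left out because it needs a local LLM.

```javascript
// Toy RAG pipeline: chunk -> embed -> store -> retrieve -> augment.
// All names are hypothetical; the word-count embedding stands in for a real model.

// Step 3: split text into chunks of `size` words, sharing `overlap` words.
function chunkText(text, size = 8, overlap = 2) {
  if (overlap >= size) throw new Error('overlap must be smaller than size');
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(' '));
    if (start + size >= words.length) break; // last chunk reached the end
  }
  return chunks;
}

// Step 4: toy embedding, a sparse word-count vector stored in a Map.
function embed(text) {
  const vec = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vec.set(word, (vec.get(word) ?? 0) + 1);
  }
  return vec;
}

// Cosine similarity between two sparse vectors.
function cosine(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, x) => s + x * x, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

// Step 5: in-memory vector store, one entry per chunk.
function buildStore(docs) {
  return docs.flatMap((doc) =>
    chunkText(doc).map((chunk) => ({ chunk, vector: embed(chunk) }))
  );
}

// Step 6: score every chunk against the query and keep the top k.
function retrieve(store, query, k = 2) {
  const queryVector = embed(query);
  return store
    .map((entry) => ({ chunk: entry.chunk, score: cosine(queryVector, entry.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}

// Step 9: merge the retrieved chunks into the prompt.
function buildPrompt(query, hits) {
  const context = hits.map((hit, i) => `[${i + 1}] ${hit.chunk}`).join('\n');
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}\nAnswer:`;
}

const docs = [
  'Embeddings turn text into numerical vectors so similar passages end up close together.',
  'A vector store indexes embeddings and returns the nearest neighbours for a query vector.',
];
const question = 'What does a vector store do?';
const store = buildStore(docs);
const hits = retrieve(store, question);
const prompt = buildPrompt(question, hits);
console.log(prompt);
```

The resulting prompt string is what would be handed to the language model in the generation step. Swapping `embed` for a real embedding model should leave the rest of the flow unchanged, provided `cosine` is adapted to dense vectors.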
---
The following core components and examples are currently available:
Examples and tutorials:
- 00_how_rag_works - Minimal RAG simulation to understand the concept
- 01_intro_to_llms - Getting started with local LLMs (node-llama-cpp basics, building LLM wrapper)
- 02_data_loading - Loading and preprocessing raw text data
- 03_text_splitting_and_chunking - Splitting long text into manageable chunks
- 04_intro_to_embeddings - Text similarity basics and generating embeddings
- 05_building_vector_store - In-memory store, nearest neighbor search, metadata filtering
- 06_retrieval_strategies/01_basic_retrieval - Basic retrieval and similarity scoring
- 06_retrieval_strategies/02_query_preprocessing - Query normalization and cleaning before retrieval
- 06_retrieval_strategies/03_hybrid_search - Combining vector and keyword (e.g. BM25) search
- 06_retrieval_strategies/04_multi_query_retrieval - Query decomposition (LLM), parallel retrieval, RRF and weighted fusion, deduplication
- 06_retrieval_strategies/05_query_rewriting - Normalization, heuristic and LLM rewrite (Qwen 3-1 via node-llama-cpp), intent classification
Library: Loaders, text splitters, embeddings, vector stores, retrievers, chains, prompts (see Project Structure).
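The multi-query and hybrid search examples listed above mention RRF (Reciprocal Rank Fusion) with deduplication. As a rough illustration of the idea, not the repository's implementation, the sketch below fuses several ranked lists of document IDs by summing `1 / (k + rank)` for each document. The function name `reciprocalRankFusion`, the sample IDs, and the default `k = 60` (a commonly used constant) are assumptions for this example.

```javascript
// Reciprocal Rank Fusion (RRF): each list contributes 1 / (k + rank) per document,
// where rank is 1-based. Documents that appear in several lists accumulate score,
// so deduplication falls out of using a Map keyed by document ID.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical results from a vector search and a keyword search.
const vectorHits = ['doc-a', 'doc-b', 'doc-c'];
const keywordHits = ['doc-b', 'doc-d', 'doc-a'];

const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused.map((result) => result.id));
// -> [ 'doc-b', 'doc-a', 'doc-d', 'doc-c' ]
```

RRF uses only rank positions, so it can combine rankers whose raw scores are not comparable, such as cosine similarity and BM25. A weighted variant would multiply each list's contribution by a per-list weight.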
The following topics will be added step by step in the coming weeks and months:
Retrieval strategies:
- Result ranking and scoring
- Post-retrieval reranking

Prompt engineering for RAG:
- Context stuffing techniques
- Citation and source attribution prompts
- Context compression

RAG in production:
- Error handling and fallbacks
- Streaming responses
- End-to-end RAG pipeline examples

Evaluation and optimization:
- Retrieval metrics (precision, recall, MRR)
- Generation quality metrics
- End-to-end evaluation frameworks

Advanced features:
- Observability and performance monitoring
- Caching strategies for repeated queries
- Metadata and structured data handling
- Graph database integration (e.g. kuzu)
- Multi-modal RAG

Templates and guides:
- Complete starter templates (simple RAG, API server, chatbot)
- Higher-level tutorials and best practices
Note: This is an educational project focused on building understanding from the ground up. Each new topic will be introduced with clear explanations, minimal examples, and thoroughly commented code. The goal is not to rush through features, but to ensure every concept is deeply understood before moving to the next.
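Among the planned topics, the retrieval metrics (precision, recall, MRR) are simple enough to sketch ahead of the official tutorial. The code below is one possible formulation under stated assumptions: the function names and the sample chunk IDs are hypothetical, retrieved lists are assumed to contain no duplicates, and precision@k divides by `k` (some definitions divide by the number of results actually returned).

```javascript
// Precision@k: fraction of the top-k results that are relevant.
function precisionAtK(retrieved, relevant, k) {
  if (k <= 0) return 0;
  const hits = retrieved.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / k;
}

// Recall@k: fraction of all relevant items that appear in the top-k results.
function recallAtK(retrieved, relevant, k) {
  if (relevant.size === 0) return 0;
  const hits = retrieved.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / relevant.size;
}

// Mean Reciprocal Rank: average of 1 / (rank of the first relevant result),
// counting 0 for queries where nothing relevant was retrieved.
function meanReciprocalRank(queries) {
  if (queries.length === 0) return 0;
  const total = queries.reduce((sum, { retrieved, relevant }) => {
    const index = retrieved.findIndex((id) => relevant.has(id));
    return sum + (index === -1 ? 0 : 1 / (index + 1));
  }, 0);
  return total / queries.length;
}

// Hypothetical evaluation set: retrieved chunk IDs in ranked order,
// plus the set of chunk IDs judged relevant for each query.
const queries = [
  { retrieved: ['c1', 'c2', 'c3', 'c4'], relevant: new Set(['c2', 'c4']) },
  { retrieved: ['c5', 'c6', 'c7', 'c8'], relevant: new Set(['c5']) },
];

const p2 = precisionAtK(queries[0].retrieved, queries[0].relevant, 2);
const r2 = recallAtK(queries[0].retrieved, queries[0].relevant, 2);
const mrr = meanReciprocalRank(queries);
console.log({ p2, r2, mrr }); // -> { p2: 0.5, r2: 0.5, mrr: 0.75 }
```

For the first query, one of the top two results is relevant (precision 0.5) and one of the two relevant chunks was found (recall 0.5). The first relevant results sit at ranks 2 and 1, so MRR is (1/2 + 1) / 2 = 0.75.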
---
Related tools: node-llama-cpp, kuzu.
Install dependencies:
npm install
node 00_how_rag_works/example.js
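AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute a legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and carrying out the necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. It allows free commercial use, modification, and distribution, provided the copyright notice is retained.
Based on our overall evaluation, rag-from-scratch (AI Agent workflow tutorial, Chinese edition) performs solidly in the Agent workflow category and is of excellent quality. If you already have a clear use case, you can start using it right away; if you are still evaluating, we suggest comparing it with similar tools before deciding.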
| Original name | rag-from-scratch |
| Original description | Demystify RAG by building it from scratch. Local LLMs, no black boxes - real understanding of embeddings, vector search, retrieval, and context-augmented generation. |
| Topics | agents, ai-agents, educational, llm, node-llama-cpp, nodejs, rag |
| GitHub | https://github.com/pguso/rag-from-scratch |
| License | MIT |
| Language | JavaScript |
Indexed: 2026-05-22 · Updated: 2026-05-22 · License: MIT · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.