
V5.1 Simplified: Python First, LLM Last

AUTOMATE EVERYTHING DETERMINISTIC · LLM ONLY FOR GENUINE INTELLIGENCE

1
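Ingestion layer: web scraping / video transcription, 100% automated
LLM cost → $0
✕ V5.0: manual / plugins
Manually trigger Web Clipper
Manually launch YouTube Transcriber
✓ Python automation
requests + BS4: scheduled scraping of RSS feeds and web pages
yt-dlp + Whisper: local video transcription
watchdog: file-change monitoring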
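The scheduled RSS step could look roughly like the sketch below. It is an illustration only, not the author's script: it parses with the standard library's `xml.etree.ElementTree` as a stand-in for BS4 so that it runs without dependencies, and `parse_rss` and the sample feed are hypothetical. In a real run the XML string would come from something like `requests.get(url).text` on a timer.

```python
import xml.etree.ElementTree as ET


def parse_rss(xml_text):
    """Return a list of {title, link} dicts, one per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items


# Hypothetical sample feed; a real run would download this text instead.
SAMPLE = """<rss version="2.0"><channel>
  <item><title>Post A</title><link>https://example.com/a</link></item>
  <item><title>Post B</title><link>https://example.com/b</link></item>
</channel></rss>"""

items = parse_rss(SAMPLE)
```

Each parsed item can then be handed to the scrubbing and funnel stages without any LLM call.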
P
PII scrubbing: rules-only engine (Microsoft Presidio)
LLM cost → $0
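# pii_scrub.py: no LLM calls
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

# Note: the default AnalyzerEngine ships with English recognizers only, so
# language="zh" assumes a Chinese NLP engine and recognizers have been configured.
analyzer, anonymizer = AnalyzerEngine(), AnonymizerEngine()

def scrub_pii(text):
    results = analyzer.analyze(
        text=text,
        language="zh",
        # EMAIL_ADDRESS is Presidio's built-in entity name (not EMAIL);
        # ID_NUMBER is not built in and assumes a custom recognizer.
        entities=["PHONE_NUMBER", "EMAIL_ADDRESS", "PERSON", "ID_NUMBER"],
    )
    return anonymizer.anonymize(text=text, analyzer_results=results).text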
Knowledge funnel: F1+F2 use vectors instead of LLM classification, eliminating LLM calls
Saves ~60% of total cost
# funnel.py: no LLM, pure vector math
from datetime import datetime

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# nomic-embed-text-v1 ships custom model code, so trust_remote_code=True is needed to load it
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

# Vectors for the three main goals (generated once, reused indefinitely)
GOAL_VECTORS = model.encode([
    "property maintenance B2B SaaS Hasiki",
    "Spain Barcelona real estate investment",
    "financial alpha investment edge market signal",
])

def score_item(item):
    # F3 freshness: datetime comparison, pure Python
    age = (datetime.now() - item["ts"]).days
    if age > 7:
        return 0.0
    # F1+F2 vector similarity: 0 LLM calls
    vec = model.encode([item["title"] + " " + item["text"][:500]])
    return float(cosine_similarity(vec, GOAL_VECTORS).max())
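To make the scoring gate concrete, here is a small worked sketch of the max-cosine-similarity step using toy 3-dimensional vectors in place of real embeddings. The vectors, the `THRESHOLD` value, and the helper names are illustrative assumptions; the source does not state which cutoff it uses.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def max_goal_similarity(vec, goal_vectors):
    """Score an item by its best match against any goal vector."""
    return max(cosine(vec, g) for g in goal_vectors)


# Toy stand-ins for the three goal embeddings (hypothetical values).
GOALS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
THRESHOLD = 0.8  # hypothetical cutoff, not taken from the source

on_topic = [0.9, 0.1, 0.0]   # points almost entirely along the first goal
off_topic = [1.0, 1.0, 1.0]  # equally distant from every goal

passes = max_goal_similarity(on_topic, GOALS) >= THRESHOLD
rejected = max_goal_similarity(off_topic, GOALS) < THRESHOLD
```

Taking the maximum rather than the mean means an item only has to match one goal strongly to pass, which fits a funnel that serves three unrelated topics.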
Automatic Index maintenance + Vector DB: watchdog, ChromaDB
LLM cost → $0
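# vault_watcher.py: watch the Obsidian vault for changes
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

# coll (a ChromaDB collection) and rebuild_index() are assumed to be defined elsewhere.
class VaultHandler(FileSystemEventHandler):
    def on_modified(self, event):
        if event.src_path.endswith(".md"):
            path = event.src_path
            with open(path, encoding="utf-8") as f:
                text = f.read()
            coll.upsert(ids=[path], documents=[text])  # vectorize
            rebuild_index()  # rebuild the Index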
🧠
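Three jobs that stay with the LLM (irreplaceable)
Targeted calls, no waste
① Incremental build
Distill the ~12% of content that passes the funnel
One daily batch, Haiku
② Deep research outputs
Business plans / investment models / decisions
On demand, Sonnet/Opus
③ Structured linking
Semantic connections across notes
One weekly batch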
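The three cadences above can be written down as a small job table. This is a sketch under assumptions: the job names, the `LLMJob` type, and the choice of Sunday for the weekly batch are hypothetical, and the third job's model is left as `None` because the source does not name one.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class LLMJob:
    name: str
    cadence: str           # "daily", "weekly", or "on_demand"
    model: Optional[str]   # None where the source names no model


JOBS = [
    LLMJob("incremental_build", "daily", "Haiku"),
    LLMJob("deep_research", "on_demand", "Sonnet/Opus"),
    LLMJob("structured_linking", "weekly", None),
]

WEEKLY_WEEKDAY = 6  # assumption: weekly jobs run on Sunday (Monday == 0)


def jobs_due(today: date):
    """Return the names of scheduled jobs due on the given day; on-demand jobs are never scheduled."""
    due = []
    for job in JOBS:
        if job.cadence == "daily":
            due.append(job.name)
        elif job.cadence == "weekly" and today.weekday() == WEEKLY_WEEKDAY:
            due.append(job.name)
    return due
```

A daily cron entry calling `jobs_due(date.today())` would be enough to drive both batches, with deep research triggered by hand.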
$
Token cost tracking: Python decorator
Logged automatically, no extra cost
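# cost_tracker.py: log every LLM call automatically
import csv
import functools

MONTHLY_BUDGET = 20.0

# PRICE (model -> USD per million tokens) and sum_monthly_cost() are assumed to be defined elsewhere.
def track_cost(model):  # decorator factory
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            resp = fn(*args, **kwargs)
            # Input-token cost only, as in the original; output tokens are not counted here
            usd = (resp.usage.input_tokens / 1e6) * PRICE[model][0]
            with open("cost_log.csv", "a", newline="") as f:
                csv.writer(f).writerow([...])  # row fields elided in the source
            if sum_monthly_cost() > MONTHLY_BUDGET * 0.8:
                print(f"⚠ Monthly cost has passed 80% of the {MONTHLY_BUDGET} budget")
            return resp
        return wrapper
    return decorator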
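The decorator relies on a `sum_monthly_cost()` helper that the source does not show. One possible shape for it is sketched below. The CSV row layout (ISO timestamp, model name, cost in USD) is an assumption, since the logged fields are elided in the original, and the `path` and `now` parameters are added here only to make the function easy to test.

```python
import csv
from datetime import datetime


def sum_monthly_cost(path="cost_log.csv", now=None):
    """Sum the USD column for rows logged in the current calendar month.

    Assumed (hypothetical) row layout: ISO timestamp, model name, cost in USD.
    """
    now = now or datetime.now()
    total = 0.0
    try:
        with open(path, newline="") as f:
            for row in csv.reader(f):
                if len(row) < 3:
                    continue  # skip blank or malformed rows
                ts = datetime.fromisoformat(row[0])
                if (ts.year, ts.month) == (now.year, now.month):
                    total += float(row[2])
    except FileNotFoundError:
        return 0.0  # nothing logged yet
    return total
```

Rereading the log on every call keeps the tracker stateless; with one daily and one weekly batch the file stays small enough that this costs little.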