Ingestion layer: web scraping / video transcription | 100% automated | LLM cost → $0
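✕ V5.0: manual / plugin-based
Trigger Web Clipper manually
Launch YouTube Transcriber manually
→
✓ Python automation
requests + BS4: scheduled fetching of RSS feeds and web pages
yt-dlp + Whisper: local video transcription
watchdog: file-change monitoring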
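The scheduled RSS fetch can be sketched roughly as follows. To stay dependency-free and runnable offline, this sketch parses a feed string with the standard library's `xml.etree.ElementTree` rather than requests + BS4; `SAMPLE_FEED`, `parse_rss`, and `new_items` are illustrative names that do not come from the original pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real run would download this text on a schedule.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>First post</title><link>https://example.com/1</link></item>
  <item><title>Second post</title><link>https://example.com/2</link></item>
</channel></rss>"""

def parse_rss(xml_text):
    """Return a list of {'title', 'link'} dicts, one per <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items

def new_items(items, seen_links):
    """Keep only items whose link has not been seen yet, and remember them."""
    fresh = [it for it in items if it["link"] not in seen_links]
    seen_links.update(it["link"] for it in fresh)
    return fresh
```

Keeping a `seen_links` set between runs means a repeated poll of the same feed passes nothing new downstream.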
watchdog 文件变更监听
PII redaction: pure rule engine | Microsoft Presidio | LLM cost → $0
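# pii_scrub.py: no LLM calls
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

# Note: the default AnalyzerEngine only loads English; language="zh" assumes a
# Chinese NLP model has been configured. "ID_NUMBER" is not a built-in entity
# and assumes a custom recognizer is registered.
analyzer, anonymizer = AnalyzerEngine(), AnonymizerEngine()

def scrub_pii(text):
    results = analyzer.analyze(text=text, language="zh",
        entities=["PHONE_NUMBER", "EMAIL_ADDRESS", "PERSON", "ID_NUMBER"])
    return anonymizer.anonymize(text=text, analyzer_results=results).text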
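To illustrate what a "pure rule engine" means in practice, here is a deliberately simplified stand-in that uses only regular expressions. It is not Presidio and makes no claim to match its recognizers; the two patterns (a loose email pattern and an 11-digit mobile-number pattern) and the name `scrub_with_rules` are assumptions made for the example.

```python
import re

# Illustrative patterns only; production recognizers add validation and context.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
}

def scrub_with_rules(text):
    """Replace every match of each rule with a <LABEL> placeholder."""
    for label, pattern in RULES.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```

Every step is deterministic pattern matching, which is why this stage costs nothing in LLM tokens.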
Knowledge funnel: F1+F2 replace LLM classification with vectors | eliminates LLM calls | saves ~60% of total cost
# funnel.py: no LLM, pure vector math
from datetime import datetime
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# nomic-embed-text-v1 ships custom model code, so it needs trust_remote_code=True
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

# Vectors for the three main goals (computed once, reused indefinitely)
GOAL_VECTORS = model.encode([
    "property maintenance B2B SaaS Hasiki",
    "Spain Barcelona real estate investment",
    "financial alpha investment edge market signal",
])

def score_item(item):
    # F3 recency: plain datetime comparison, pure Python
    age = (datetime.now() - item["ts"]).days
    if age > 7:
        return 0.0
    # F1+F2 vector similarity: 0 LLM calls
    vec = model.encode([item["title"] + " " + item["text"][:500]])
    return float(cosine_similarity(vec, GOAL_VECTORS).max())
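The scoring logic above can be traced by hand with toy vectors. The sketch below reimplements the same two steps (recency gate, then maximum cosine similarity against the goal vectors) in pure Python; the 2-D "embeddings", the fixed dates, and the names `cosine` and `funnel_score` are hypothetical and stand in for real model output.

```python
import math
from datetime import datetime, timedelta

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def funnel_score(item_vec, goal_vecs, ts, now, max_age_days=7):
    """Recency gate first, then the best cosine match against any goal."""
    if (now - ts).days > max_age_days:
        return 0.0
    return max(cosine(item_vec, g) for g in goal_vecs)

# Toy 2-D "embeddings" (hypothetical values, for illustration only).
GOALS = [[1.0, 0.0], [0.0, 1.0]]
now = datetime(2024, 1, 10)
fresh = funnel_score([3.0, 4.0], GOALS, now - timedelta(days=2), now)
stale = funnel_score([3.0, 4.0], GOALS, now - timedelta(days=30), now)
```

For the vector (3, 4), the similarities to the two goals are 3/5 and 4/5, so the fresh item scores 0.8; the 30-day-old item is cut by the recency gate and scores 0.0 regardless of its content.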
Automatic index maintenance + vector DB | watchdog | ChromaDB | LLM cost → $0
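# vault_watcher.py: watch the Obsidian vault for changes
from pathlib import Path
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

# coll (a ChromaDB collection) and rebuild_index() are assumed to be defined elsewhere
class VaultHandler(FileSystemEventHandler):
    def on_modified(self, event):
        path = event.src_path
        if not event.is_directory and path.endswith(".md"):
            text = Path(path).read_text(encoding="utf-8")
            coll.upsert(ids=[path], documents=[text])  # re-embed the note
            rebuild_index()  # rebuild the index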
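The source does not show what `rebuild_index()` does. One plausible reading, offered purely as an assumption, is that it regenerates an index note listing every Markdown file in the vault. The `vault_dir` parameter, the `Index.md` file name, and the wikilink format are all hypothetical choices for this sketch.

```python
from pathlib import Path

def rebuild_index(vault_dir, index_name="Index.md"):
    """Rewrite the index note as a sorted list of wikilinks to every other note."""
    vault = Path(vault_dir)
    notes = sorted(
        p.relative_to(vault).with_suffix("").as_posix()
        for p in vault.rglob("*.md")
        if p.name != index_name  # never list the index inside itself
    )
    lines = ["# Index", ""] + [f"- [[{name}]]" for name in notes]
    (vault / index_name).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return notes
```

Because the index is itself a `.md` file inside the vault, a file-change handler that calls this function would need to ignore events for the index file, otherwise each rebuild would trigger another one.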
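Three jobs that stay with the LLM (not replaceable) | targeted calls, no waste
① Incremental build
Distill the ~12% of content that passes the funnel
One daily batch, Haiku
② Deep research output
Business plans / investment models / decisions
On demand, Sonnet/Opus
③ Structured linking
Semantic connections across notes
One weekly batch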
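The three cadences above could be captured in a small routing table. This is a sketch under stated assumptions: the task keys, the `tasks_due` helper, and the convention that the weekly job runs when `day_index % 7 == 0` are all hypothetical, and the model for the weekly job is left as `None` because the source does not name one.

```python
# Hypothetical routing table; tiers and cadences follow the list above.
LLM_TASKS = {
    "incremental_build": {"cadence": "daily", "model": "haiku"},
    "deep_research": {"cadence": "on_demand", "model": "sonnet_or_opus"},
    "structured_linking": {"cadence": "weekly", "model": None},  # model not stated in the source
}

def tasks_due(day_index, on_demand=()):
    """Return the task names to run on a given day (day_index counts from 0)."""
    due = []
    for name, cfg in LLM_TASKS.items():
        if cfg["cadence"] == "daily":
            due.append(name)
        elif cfg["cadence"] == "weekly" and day_index % 7 == 0:
            due.append(name)
        elif cfg["cadence"] == "on_demand" and name in on_demand:
            due.append(name)
    return due
```

Keeping the schedule in data rather than scattered through scripts makes it easy to see at a glance how many LLM calls a normal week involves.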
Token cost tracking | Python decorator logs automatically, no extra cost
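# cost_tracker.py: log every LLM call automatically
import csv
import functools

MONTHLY_BUDGET = 20.0  # USD
# PRICE (per-model prices in USD per 1M tokens) and sum_monthly_cost()
# are assumed to be defined elsewhere.

def track_cost(model):  # decorator factory
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            resp = fn(*args, **kwargs)
            # Input tokens only, as in the original; output tokens are not counted here
            usd = (resp.usage.input_tokens / 1e6) * PRICE[model][0]
            with open("cost_log.csv", "a", newline="") as f:
                csv.writer(f).writerow([...])  # row fields elided in the source
            if sum_monthly_cost() > MONTHLY_BUDGET * 0.8:
                print(f"⚠ Monthly cost has passed 80% of the ${MONTHLY_BUDGET} budget")
            return resp
        return wrapper
    return decorator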
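A self-contained variant of the tracker, with the missing pieces filled in as assumptions, might look like the sketch below. The `PRICE` values are made-up placeholders rather than real rates, the `(timestamp, model, usd)` row layout and the `sum_monthly_cost` implementation are guesses at what the source elides, `output_tokens` is assumed to exist alongside `input_tokens`, and `fake_call` stands in for a real LLM client.

```python
import csv
import functools
import os
from datetime import datetime
from types import SimpleNamespace

MONTHLY_BUDGET = 20.0
LOG_PATH = "cost_log.csv"
# Hypothetical prices in USD per 1M tokens: (input, output). Not real rates.
PRICE = {"demo-model": (1.0, 5.0)}

def sum_monthly_cost(path=LOG_PATH, now=None):
    """Sum the usd column for rows logged in the current calendar month."""
    month = (now or datetime.now()).strftime("%Y-%m")
    if not os.path.exists(path):
        return 0.0
    with open(path, newline="") as f:
        return sum(float(usd) for ts, _model, usd in csv.reader(f) if ts.startswith(month))

def track_cost(model, path=LOG_PATH):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            resp = fn(*args, **kwargs)
            in_price, out_price = PRICE[model]
            usd = (resp.usage.input_tokens * in_price
                   + resp.usage.output_tokens * out_price) / 1e6
            with open(path, "a", newline="") as f:
                csv.writer(f).writerow([datetime.now().isoformat(), model, f"{usd:.6f}"])
            if sum_monthly_cost(path) > MONTHLY_BUDGET * 0.8:
                print(f"Warning: monthly cost passed 80% of ${MONTHLY_BUDGET:.2f}")
            return resp
        return wrapper
    return decorator

def fake_call(n_in, n_out):
    """Stand-in for an LLM client call; returns an object with a .usage attribute."""
    return SimpleNamespace(usage=SimpleNamespace(input_tokens=n_in, output_tokens=n_out))
```

Wrapping a call is then a one-liner, for example `tracked = track_cost("demo-model")(fake_call)`; with the placeholder prices, a call using 1,000,000 input and 200,000 output tokens logs a cost of 2.0 USD.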