Capability tags

🔄 Workflow 💻 CLI 🔗 REST API 🧬 Embedding 📚 RAG 🎙 STT ✨ GPT

🛠 AI tool

Hoovik (霍维克)

JavaScript-based · Free and open source, deployable locally, with full control over your own data

⭐ 6 Stars 🍴 21 Forks 💻 JavaScript 📄 MIT 🏷 AI score 8.0

AI overall score: 8.0

Tags: AI · distributed systems · emotion recognition · FastAPI · LLM · JavaScript


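✦ AI Skill Hub recommendation

After a curated evaluation, AI Skill Hub rates Hoovik "Highly recommended". The tool scores well on feature completeness, community activity, and ease of use, earning an AI score of 8.0, and it suits users with some technical background.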

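📚 In-depth analysis

Hoovik is an open-source tool built on JavaScript, with 6 stars on GitHub, in the AI, distributed systems, emotion recognition, and FastAPI space. The main advantage of an open-source tool is that the code is fully transparent: you can audit every line for security, and you can extend or customize it to fit your own needs.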

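**Why use an open-source tool rather than a commercial SaaS?**
For individual developers and users with privacy requirements, a locally deployed open-source tool means data never leaves the machine and is not subject to a third-party provider's data policies. Open-source tools also usually have no usage caps or monthly fees: install once and use it long term. For high-frequency use, the total cost of ownership (TCO) is far lower than that of a subscription-based commercial tool.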

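**Installation and environment setup**
Hoovik needs a Node.js runtime for its React frontend and Express backend, and Python for its two FastAPI services. Manage Node.js versions with nvm and Python versions with pyenv to avoid polluting the global environment. For the Python services, new users should first create a virtual environment (python -m venv venv && source venv/bin/activate) and then install dependencies; if something goes wrong, you can delete the virtual environment and start over without affecting the rest of the system.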

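**Community and maintenance**
GitHub Issues and Discussions are the fastest way to get help. Before asking, check the closed issues, since most common problems already have an answer there. When reporting a bug, include the output of npm ls (or pip list for the Python services), the full error stack trace, and a minimal reproducible example; this noticeably speeds up the maintainers' response. AI Skill Hub will keep tracking Hoovik releases and report important feature changes.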

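📋 Tool overview

A distributed meeting-intelligence platform with WebRTC peer-to-peer video

Hoovik is an open-source tool developed in JavaScript that focuses on AI, distributed systems, and emotion recognition. As a GitHub open-source project, it has active community support and ongoing releases, its code is fully transparent and auditable, and it supports local deployment to protect data privacy. It offers a stable, reliable solution for both personal use and integration into enterprise workflows.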

GitHub Stars: ⭐ 6
Language: JavaScript
Supported platforms: Windows / macOS / Linux
Maintenance status: Lightweight project, updated as needed
License: MIT
AI overall score: 8.0
Tool type: AI tool
Forks: 21

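📖 Documentation

The content below was compiled automatically by AI Skill Hub from the project information. For the complete original documentation, see "Original source" at the bottom of the page.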


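📌 Core features

Free and open source, deployable locally, with full control over your own data
Active GitHub open-source community with ongoing releases
Detailed documentation and usage examples, friendly to newcomers
Custom configuration to fit different environments
Usable as a base component in an existing stack or as a starting point for further development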

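🎯 Main use cases

Running locally to protect data privacy and meet compliance requirements
Custom integration into existing systems to extend the stack's capabilities
Serving as an open-source base component for commercial derivative development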

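The install commands below were generated automatically from the project's language and type; the official README takes precedence.

Install commands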

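# Options 1-3 are auto-generated and assume a published `hoovik` npm package,
# which the README does not confirm.

# Option 1: global npm install
npm install -g hoovik

# Option 2: run directly with npx (no install needed)
npx hoovik --help

# Option 3: install as a project dependency
npm install hoovik

# Option 4: run from source
# (the README's own quick start launches all four services with ./dev.sh)
git clone https://github.com/AnupamKumar-1/Hoovik
cd Hoovik
npm install
npm start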

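📋 Installation steps

Visit the GitHub repository page
Install dependencies as described in the README
Complete the initial configuration for your system environment
Follow the official examples or documentation to get started
If you run into problems, search GitHub Issues for answers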

The usage examples below were compiled by AI Skill Hub and cover the most common scenarios.

Common commands / code examples

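# Command-line usage (auto-generated; assumes a `hoovik` CLI exists,
# which the README does not confirm)
hoovik --help

# Basic usage
hoovik [options] <input>

// Using it from Node.js (hypothetical API: `run` and `options` are placeholders)
const hoovik = require('hoovik');

// `await` is not allowed at the top level of a CommonJS module,
// so the call is wrapped in an async function.
async function main() {
  const options = {}; // fill in according to the official documentation
  const result = await hoovik.run(options);
  console.log(result);
}

main().catch(console.error);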

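The configuration example below was generated from typical usage; adjust the parameters according to the official documentation.

Configuration example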

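# hoovik configuration notes (auto-generated; the flag, keys, and variable
# below are not confirmed by the README)
# Dump the available configuration options
hoovik --config-example > config.yml

# Common configuration keys
# output_dir: ./output
# log_level: info
# workers: 4

# Environment variable (overrides the configuration file)
export HOOVIK_CONFIG="/path/to/config.yml"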

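📑 README deep dive · Actual documentation completeness: 52/100 · Includes workflow diagram

The content below was parsed directly from the GitHub README by the system, preserving code blocks, tables, and list structure.

Introduction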

If Hoovik has been useful, please give it a ⭐ — it takes 2 seconds and means the world.

Hoovik demo

</div>

---

🔧 Key Technical Highlights

Area	What was built
WebRTC signalling	SDP/ICE relay over Socket.IO; Redis adapter fans events across 3 pm2 processes; distributed join lock (`SET NX PX 10000` + Lua CAS) serialises concurrent joins
Multimodal emotion inference	MediaPipe (136 landmarks + blendshapes + head pose) + Wav2Vec2 → `EmotionTransformer` + XGBoost (temp-calibrated) + per-modality IsolationForest anomaly detection → EMA (α=0.65); graceful `both/audio_only/video_only` modality fallback; ~300–500 ms P50
Browser media pipeline	`AudioWorklet` + `AnalyserNode` for RMS-gated noise detection; `MediaRecorder` per participant; SSRC-based active speaker with RMS fallback
Async transcript pipeline	HTTP 202 immediately; background: ffmpeg → Whisper (`small`) → segment merging → DistilRoBERTa per-segment emotion → `build_intelligent_summary` → HTTP POST callback (3 retries: 5 s → 15 s → 30 s on network/5xx; 4xx not retried)
Multi-process backend	3 pm2 instances via `@socket.io/redis-adapter`; participant map as Redis Hash (`HSET`/`HDEL` per event); no in-process room state
Auth & rate limiting	JWT + HttpOnly refresh token rotation; Redis Lua INCR+EXPIRE per-IP and per-username; account lockout after 10 failed logins (900 s TTL); uniform `401` prevents username enumeration
AI summary	`generateAiSummaryService` accepts `emotionData`/`emotionNames` from browser; `buildGroqPrompt` annotates each Whisper segment with matched live facial/audio emotion via `buildSpeakerLiveMap`; returns `discrepancies[]` and `live_dominant_emotion` per speaker; Groq model `llama-3.1-8b-instant`; rate-limited 2× per 2 hours
RAG pipeline	Transcripts chunked (segment-based or sliding-window, 600 tokens, 100 overlap) → Nomic `nomic-embed-text-v1.5` embeddings cached in Redis (7-day TTL) → BullMQ background indexing → MongoDB `$vectorSearch` + MMR reranking (`λ=0.6`, top-5) → Groq `llama-3.3-70b-versatile` with 30-message session history; SSE streaming
Redis test suite	25 tests covering distributed cache, locks, rate limiting, pub/sub, batch ops, reconnection recovery; CI runs 20 via `npm run test:redis:ci`
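
The RAG row above mentions MMR reranking with `λ=0.6` and top-5 selection. As a rough illustration of what that step does, here is a minimal sketch of Maximal Marginal Relevance over cosine similarity. It is not taken from the Hoovik codebase: the function names `cosine` and `mmrRerank` and the `{ id, embedding }` candidate shape are assumptions made for this example.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Maximal Marginal Relevance: greedily pick the candidate that maximises
//   lambda * sim(query, doc) - (1 - lambda) * max sim(doc, alreadySelected)
// `candidates` is an array of { id, embedding } (a hypothetical shape).
function mmrRerank(queryEmbedding, candidates, lambda = 0.6, topK = 5) {
  const remaining = candidates.map((c) => ({
    ...c,
    relevance: cosine(queryEmbedding, c.embedding),
  }));
  const selected = [];
  while (selected.length < topK && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    remaining.forEach((cand, i) => {
      // Redundancy is the highest similarity to anything already chosen.
      const redundancy = selected.length === 0
        ? 0
        : Math.max(...selected.map((s) => cosine(cand.embedding, s.embedding)));
      const score = lambda * cand.relevance - (1 - lambda) * redundancy;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = i;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

With `lambda = 1` the function degenerates to plain relevance ranking; lowering it toward 0.6 trades some relevance for diversity, so near-duplicate transcript chunks are less likely to crowd out the top results.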

---

Quick start

chmod +x dev.sh   # one-time
./dev.sh          # starts all 4 services with colour-coded output

Prefix	Service	Port
`FRONTEND`	React SPA	`3000`
`BACKEND`	Node.js / Express	`8000`
`EMOTION`	FastAPI emotion inference	`5002`
`TRANSCRIPT`	FastAPI transcription	`5001`

Start MongoDB and Redis first. Python venvs must exist at emotion_service/venv and transcript_service/venv, because dev.sh invokes them directly via ./emotion_service/venv/bin/python and ./transcript_service/venv/bin/python.

Ctrl+C sends SIGINT and kills all child processes cleanly.

Windows: dev.sh is a bash script. Use WSL2 (recommended), Git Bash, or start each service manually in four separate terminals; see docs/CONTRIBUTING.md for the PowerShell commands.
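
Since dev.sh calls the two venv interpreters by path, a quick preflight check can catch a missing venv before launch. The sketch below is an illustration only and is not part of the repository; the helper name `findMissingInterpreters` is hypothetical, and only the two interpreter paths come from the README.

```javascript
const fs = require('fs');
const path = require('path');

// Interpreter paths that dev.sh invokes directly, per the README.
const REQUIRED_INTERPRETERS = [
  'emotion_service/venv/bin/python',
  'transcript_service/venv/bin/python',
];

// Returns the interpreter paths that are missing under `repoRoot`.
// `exists` is injectable so the check can be exercised without a real checkout.
function findMissingInterpreters(repoRoot, exists = fs.existsSync) {
  return REQUIRED_INTERPRETERS.filter(
    (relPath) => !exists(path.join(repoRoot, relPath))
  );
}
```

Calling `findMissingInterpreters(process.cwd())` from the repository root returns an empty array when both venvs are in place.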

Health endpoints

GET /health   → 200 OK if all models loaded
GET /ready    → 200 OK if service is accepting connections
GET /stats    → live performance dashboard (browser)
GET /stats/json → machine-readable P50/P90/P95 + participant count
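
A small probe can turn the first two endpoints into a readiness check for scripts or CI. This is a sketch under assumptions: the README excerpt does not state which service or port exposes these routes, so `baseUrl` is supplied by the caller, and the helper name `checkService` is made up for this example. It relies on the global `fetch` available in Node.js 18 and later unless another implementation is passed in.

```javascript
// Probes /ready and /health in order and reports whether each returned 200.
// `baseUrl` is an assumption (for example, the address of the service you
// started); `fetchImpl` defaults to the global fetch in Node.js 18+.
async function checkService(baseUrl, fetchImpl = fetch) {
  const status = {};
  for (const endpoint of ['/ready', '/health']) {
    try {
      const res = await fetchImpl(baseUrl + endpoint);
      status[endpoint] = res.status === 200;
    } catch (err) {
      status[endpoint] = false; // network error: treat as not OK
    }
  }
  return status;
}
```

A result of `{ '/ready': true, '/health': false }` would suggest the service accepts connections while its models are still loading, which matches the distinction the two endpoints draw.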

</details>

<details> <summary>📝 Transcript Service — AI summaries & insights delivered post-meeting</summary>

Hoovik's post-meeting pipeline — every meeting is automatically transcribed, per-segment emotion is classified, and a Groq LLM generates a structured summary with a discrepancy report that flags where what someone said didn't match how they felt.

AI Summary API

POST /api/v1/transcripts/:id/summary
Content-Type: application/json

{ "emotionData": {...}, "emotionNames": {...} }

Response:

{
  "summary": "...",
  "key_points": ["..."],
  "discrepancies": [
    {
      "speaker": "Alice",
      "segment": "That timeline works for me.",
      "nlp_emotion": "positive",
      "live_emotion": "stressed"
    }
  ],
  "insights": {
    "dominant_emotion": "neutral",
    "emotion_distribution": { "neutral": 60, "joy": 25, "anger": 15 },
    "speaker_stats": {
      "Alice": { "turns": 12, "dominant_emotion": "neutral", "word_count": 342 }
    },
    "top_topics": ["deadline", "budget", "Q3"],
    "speaking_pace_wpm": 148
  }
}

Note: After the meeting ends, the frontend polls for transcript availability using exponential backoff — delays of 5 s → 10 s → 20 s → 40 s (±20% jitter), then repeating at 40-second intervals up to a 10-minute wall clock cap. No fixed polling interval or fixed attempt count is used. Summary generation is rate-limited to 2 requests per 2 hours per transcript.
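
The polling schedule in the note can be sketched as follows. This is not the frontend's actual code: the names `nextDelayMs` and `pollForTranscript` are hypothetical, and two details are assumptions where the note is silent, namely that the ±20% jitter also applies to the repeated 40-second intervals and that the first check happens after the first delay rather than immediately.

```javascript
const BASE_DELAYS_MS = [5000, 10000, 20000, 40000];
const MAX_WAIT_MS = 10 * 60 * 1000; // 10-minute wall-clock cap
const JITTER = 0.2;                 // ±20%

// Delay before attempt number `attempt` (0-based). Attempts past the end of
// the schedule reuse the last (40 s) delay.
function nextDelayMs(attempt, random = Math.random) {
  const base = BASE_DELAYS_MS[Math.min(attempt, BASE_DELAYS_MS.length - 1)];
  const factor = 1 + (random() * 2 - 1) * JITTER;
  return Math.round(base * factor);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Calls `checkFn` after each delay until it returns a truthy value, or
// returns null once the next wait would exceed the wall-clock cap.
async function pollForTranscript(
  checkFn,
  { sleep = defaultSleep, now = Date.now, random = Math.random } = {}
) {
  const start = now();
  for (let attempt = 0; ; attempt++) {
    const delay = nextDelayMs(attempt, random);
    if (now() - start + delay > MAX_WAIT_MS) return null;
    await sleep(delay);
    const result = await checkFn();
    if (result) return result;
  }
}
```

Bounding the loop by elapsed time rather than by attempt count mirrors the note's statement that no fixed attempt count is used; `sleep`, `now`, and `random` are injectable only so the schedule can be tested without waiting.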

Pipeline entry point

POST /process_meeting
Content-Type: multipart/form-data

audio_files[]: <blob>        # one file per speaker
meeting_code: "ABC123"
speaker_map: {"alice": "Alice"}   # filename-base → display name
x-host-secret: <secret>
x-user-token: <jwt>          # optional

Returns HTTP 202 immediately. Processing happens in the background.
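
According to the highlights table, the background job finishes with an HTTP POST callback that is retried three times (5 s → 15 s → 30 s) on network errors or 5xx responses, while 4xx responses are not retried. The transcript service itself is a FastAPI application, so the JavaScript below only illustrates that retry policy and is not its implementation; `postWithRetry` and the `send` callback (any function that performs the POST and resolves to an object with a `status` field) are hypothetical names.

```javascript
const RETRY_DELAYS_MS = [5000, 15000, 30000];

// Sends the callback once, then retries up to three times on network errors
// or 5xx responses. Any response below 500 (success or a 4xx) is returned
// immediately without a retry.
async function postWithRetry(
  send,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let lastError;
  for (let attempt = 0; attempt <= RETRY_DELAYS_MS.length; attempt++) {
    if (attempt > 0) await sleep(RETRY_DELAYS_MS[attempt - 1]);
    try {
      const res = await send();
      if (res.status < 500) return res; // success or non-retryable 4xx
      lastError = new Error(`callback failed with HTTP ${res.status}`);
    } catch (err) {
      lastError = err; // network failure: retry
    }
  }
  throw lastError;
}
```

Treating 4xx as final reflects the usual reasoning that a client error will not fix itself on a retry, whereas network faults and 5xx responses are often transient.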

🎯 aiskill88 AI review · Grade A · 2026-06-01

An innovative distributed meeting-intelligence platform

📚 Practical guide (long-tail questions)

Who it is for