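AI Skill Hub's curated review rates AI Workstation as "Recommended". This agent workflow scores well on feature completeness, community activity, and ease of use, with an AI score of 7.5, and suits users with some technical background.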
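AI Workstation is a complete AI agent automation workflow solution. Visual node orchestration breaks complex multi-step tasks into clear automated flows that run fully unattended. It integrates with hundreds of external services and APIs, and suits data-processing pipelines, business automation, and AI-assisted decision systems.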
# Option 1: install with pip (recommended)
pip install guaardvark
# Option 2: install into a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install guaardvark
# Option 3: install from source (for the latest features)
git clone https://github.com/guaardvark/guaardvark
cd guaardvark
pip install -e .
# Verify the installation
python -c "import guaardvark; print('installation OK')"
# Command-line usage
guaardvark --help
# Basic usage
guaardvark input_file -o output_file
# Calling from Python code
import guaardvark
# Example
result = guaardvark.process("input")
print(result)
# guaardvark configuration file example (config.yml)
app:
  name: "guaardvark"
  debug: false
  log_level: "INFO"
# Specify the configuration file at runtime
guaardvark --config config.yml
# Or configure through environment variables
export GUAARDVARK_API_KEY="your-key"
export GUAARDVARK_OUTPUT_DIR="./output"
<p align="center"> <img src="docs/screenshots/og-image.jpg" alt="Guaardvark — Secure Offline AI Platform" width="640"> </p>
A full creative-professional AI workstation, all running locally:

Generation

- Video (Text-to-Video, Image-to-Video): Wan 2.2, CogVideoX 2B/5B, SVD-XT. No workflow graph required: paste a list of prompts, pick a model and resolution, hit go. The queue handles the rest while you start the next batch.
- Audio Studio: music generation (ACE-Step, full songs with vocals or instrumental), sound-effect lab (Stable Audio Open), neural voice (Chatterbox + Kokoro), and 6 Piper voice profiles out of the box.
- Voice Cloning: gated behind an explicit consent prompt before any clone is created or used.
- Image generation: Stable Diffusion via Diffusers with batch queue, face restoration, anatomy and detail controls.
- Image + Video Upscaling: 4K and 8K via HAT-L, RealESRGAN family, NMKD-Superscale, Foolhardy Remacri. Two-pass mode for maximum quality. Frame-by-frame video processing.
- Batch CSV Generator: generate unique web pages, post content, or structured data from a CSV using your indexed knowledge base as ground truth. Marketing copy, product pages, unique-content campaigns at scale.
- File Generation: code, text, docs, images, video, audio in one queue.

Editing

- Video Editor: Shotcut-lite timeline with three lanes (video / text / audio), drag-and-drop from the media library, real text overlay rendering via ffmpeg, visual trim sliders, keyboard shortcuts, one-step undo.
- Video Text Overlay: standalone tool for the simpler one-off case.

Agents & Automation

- Autonomous screen agents: agents see a real virtual desktop (Xvfb :99), move the mouse, click, type, navigate browsers, and verify their own actions.
- AgentBrain: three-tier neural routing with Reflex (<100ms), Instinct (1–3s), and Deliberation (5–30s).
- Agent Training System: visual hand-eye-coordination teaching. Bracket a session with Begin/End Lesson, walk the agent through a flow with thumbs-up pearls, and the system distills a structured replayable lesson with parameterized steps.
- Agent Memory + Learning: system-message persistent knowledge that survives reboots, recipe induction from successful tasks (Agent Workflow Memory pattern), vision-actionable knowledge with no cached pixel coordinates.
- Agent Swarms: up to 20 parallel coding agents, each in an isolated git worktree on its own branch. Dependency-ordered merging. Flight Mode (fully offline). Backends: Claude Code, Cline/OpenClaw via local Ollama.
- Agents · Agent Tools · Virtual Agent Screen: explorable surfaces for each capability, with a draggable VNC viewer that works on any page.
- Voice Chat: Whisper.cpp transcribes, the agent thinks, Piper speaks. Toggle with /voice.
- Outreach System: supervised AI for social-media engagement (Reddit, Discord, Twitter/X, Facebook) grounded in your indexed knowledge. Full detail below.
- Self-Improvement: detects test failures, dispatches an agent to read the offending code and fix it, verifies, broadcasts to other instances. Optional Anthropic-API guardian review.
- Auto Researcher: autonomous RAG-pipeline optimizer that experiments with parameters, keeps wins, reverts losses.

Workflow Surfaces

- File Manager: drag from your real desktop into the in-app File Manager. Color-code files, copy & paste, drag-and-drop reorganize. Folder / List / Media views. Right-click menus (copy, paste, delete, recursive index). Files attach to clients, projects, websites, notes, or code repos.
- Notes Manager · Media Manager · Project Management · Client Management · Websites Management: consistent grid+detail UI for the working surfaces a small business actually uses. Cross-linked: documents attach to projects attach to clients attach to websites.
- Dashboard: live status grid with model health, GPU usage, RAG state, agent activity, plugin states.
- Code Editor: Monaco-based IDE with right-click "explain", "fix", "generate" via the AI assistant.
- Code Analyzer · Code Repos: repo-level understanding and per-repo indexing.
- Task Scheduler: cron-style scheduling for any agent task or generation job.
- Rules & Prompts: import/export rules and prompts as a portable bundle.

Integration

- ComfyUI Backend: managed as a plugin, used as the execution layer for advanced video pipelines.
- WordPress Connectivity: push generated content directly into a WordPress site via a companion plugin. Functional today; ships with security disclaimers and a finishing-pass on the roadmap before the plugin moves out of beta.

Platform

- Plugin System: every heavy capability (ComfyUI, Vision Pipeline, Audio Foundry, Upscaling, Discord, Swarm) is a managed plugin with health monitoring, port-based orphan cleanup, and a System Resource Orchestrator that arbitrates VRAM between them so two big models don't fight for the GPU.
- CPU Offload for models that don't fit in VRAM.
- GPU + CPU Resource Monitor: live, always visible.
- Interconnector / Cluster: install Guaardvark on multiple local machines, master/client architecture with approval workflows, automatic load balancing across the fleet, hardware profile auto-detection.
- Model Management: download voice/video/image models from HuggingFace with progress tracking. Quick-switch between local Ollama models. Quick-switch embedding models grouped by parameter count.
- Backup & Restore: granular or full system backup, schema-migration-aware restore, cross-version compatible.
- Advanced Settings: debugging toggles, RAG knobs, cache controls, diagnostic tools, test runners, and self-improvement controls, all exposed in the UI, not hidden behind a "config files only" wall.
<p align="center"> <img src="docs/screenshots/guaardvark-demo.gif" alt="Guaardvark Demo" width="100%"> </p>
<p align="center"> <img src="docs/screenshots/swarm-demo.gif" alt="Agent Swarm — parallel Claude Code agents across isolated git worktrees" width="100%"> <br> <em>Agent Swarm — parse a plan, spawn parallel agents in isolated git worktrees, resolve the dependency DAG, merge back to main.</em> </p>
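The swarm caption mentions resolving a dependency DAG before merging back to main. The ordering step can be sketched with Python's standard `graphlib`. This is only an illustration under stated assumptions: the branch names and the `plan` mapping are hypothetical, and nothing here is taken from Guaardvark's actual orchestrator.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each branch maps to the set of branches it depends on.
plan = {
    "feature/models": set(),
    "feature/api": {"feature/models"},
    "feature/ui": {"feature/api"},
    "feature/docs": set(),
}


def merge_order(dependencies):
    """Return branches ordered so each one follows all of its dependencies."""
    # static_order() raises graphlib.CycleError if the plan contains a cycle.
    return list(TopologicalSorter(dependencies).static_order())


order = merge_order(plan)
print(order)
```

Branches with no dependencies on each other may appear in any relative order, which is what leaves room for running their agents in parallel.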
git checkout, inspects venv / requirements.txt / Alembic head / package.json and re-syncs only what differs between branches

schema_sync.py is the authoritative schema source; saves you from "I just switched branches and now nothing works"

| Dependency | Version | Notes |
|---|---|---|
| Python | 3.12+ | Backend |
| Node.js | 20+ | Frontend build |
| PostgreSQL | 14+ | Auto-installed |
| Redis | 5.0+ | Auto-installed |
| Ollama | latest | Local LLM inference |
| CUDA GPU | 8GB+ VRAM | 16GB recommended for video generation |
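Before a first run, it can help to check the local toolchain against the table above. The following is a minimal sketch, not part of Guaardvark: `parse_version` and `meets_minimum` are hypothetical helper names, and the PATH check only confirms that the `node` and `ollama` executables exist, not which versions they are.

```python
import re
import shutil
import sys

# Minimum versions copied from the dependency table.
MIN_PYTHON = (3, 12)
MIN_NODE = (20,)


def parse_version(text):
    """Extract the first dotted number, e.g. 'v20.11.1' -> (20, 11, 1)."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, required):
    """Compare version tuples; (20, 11, 1) satisfies (20,)."""
    return tuple(found[: len(required)]) >= tuple(required)


if __name__ == "__main__":
    print("Python OK:", meets_minimum(sys.version_info[:3], MIN_PYTHON))
    print("node on PATH:", shutil.which("node") is not None)
    print("ollama on PATH:", shutil.which("ollama") is not None)
```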
pip install guaardvark
The CLI connects to a running Guaardvark instance or launches a lightweight embedded server automatically.
---
git clone https://github.com/guaardvark/guaardvark.git
cd guaardvark
./start.sh
First run handles everything: Python venv, Node dependencies, PostgreSQL, Redis, Ollama, Whisper.cpp, database migrations, frontend build, and all services. Requires your system password once for PostgreSQL setup.
| Service | URL |
|---|---|
| Web UI | http://localhost:5173 |
| API | http://localhost:5000 |
| Health Check | http://localhost:5000/api/health |
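A script can poll the health check URL to confirm the services are up. The sketch below assumes the endpoint answers with HTTP 200 when healthy; the response body format is not described here, so it is ignored. `is_healthy` is a hypothetical helper name.

```python
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:5000/api/health"  # from the service table


def is_healthy(url=HEALTH_URL, timeout=2.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error statuses.
        return False


if __name__ == "__main__":
    print("healthy" if is_healthy() else "not reachable")
```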
./start.sh # Full startup with health checks
./start.sh --fast # Skip dependency checks
./start.sh --test # Health diagnostics
./start.sh --plugins # Start all enabled plugins
./stop.sh # Stop all services
| Feature | Minimum VRAM | Recommended VRAM |
|---|---|---|
| Chat + RAG | 4GB | 8GB |
| Image generation | 6GB | 12GB |
| Wan 2.2 video | 11GB | 16GB |
| CogVideoX-5B video | 16GB | 20GB |
| Upscaling | 0.5GB | 2–4GB |
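The table can be turned into a quick lookup for a given card. This is a rough sketch assuming the figures are GPU memory in GB; the `classify` helper is hypothetical, and for Upscaling the lower end of the 2–4GB recommended range is used.

```python
# (minimum_gb, recommended_gb) per feature, copied from the table above.
REQUIREMENTS = {
    "Chat + RAG": (4, 8),
    "Image generation": (6, 12),
    "Wan 2.2 video": (11, 16),
    "CogVideoX-5B video": (16, 20),
    "Upscaling": (0.5, 2),  # table lists 2-4GB recommended; lower bound used
}


def classify(vram_gb):
    """Map each feature to 'recommended', 'minimum', or 'insufficient'."""
    result = {}
    for feature, (minimum, recommended) in REQUIREMENTS.items():
        if vram_gb >= recommended:
            result[feature] = "recommended"
        elif vram_gb >= minimum:
            result[feature] = "minimum"
        else:
            result[feature] = "insufficient"
    return result


for feature, level in classify(12).items():
    print(f"{feature}: {level}")
```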
---
Screenshots: Dashboard, Code Editor, Media Library, Video Generation, Plugins, Swarm Plan Editor, Settings (RAG), Settings (Memory).
---
Film Crew: five specialized agents collaborate to turn a one-line idea into a finished video. Built on the Swarm Orchestrator, so every role runs in parallel where possible and merges back deterministically.
| Role | What It Does |
|---|---|
| **Screenwriter** | Generates the script + scene breakdown from a logline |
| **Casting** | Assigns characters to LoRAs (via the LoRA Trainer plugin) or stock characters |
| **Cinematographer** | Produces a shot list with camera moves, framing, and lens choices |
| **Storyboard** | Generates keyframe images for every shot via the image pipeline |
| **Editor** | Assembles the generated clips into a finished video via the Video Editor |
The LoRA Trainer plugin ships alongside — train character/environment/prop LoRAs from reference images on your local GPU (bf16, ~46 MB per LoRA) and route them automatically to the Casting agent.
State-of-the-art video generation running entirely on your GPU. No cloud APIs, no per-minute billing, no content restrictions.
Screenshots: Video Generation, Plugin System.
| Model | Type | Max Duration | Native Resolution | VRAM |
|---|---|---|---|---|
| **Wan 2.2 (14B MoE)** | Text-to-Video | 5s (81 frames @ 16fps) | 832x480 | 11GB |
| **CogVideoX-5B** | Text-to-Video | 6s (49 frames @ 8fps) | 720x480 | 16GB |
| **CogVideoX-2B** | Text-to-Video | 6s (49 frames @ 8fps) | 720x480 | 12GB |
| **CogVideoX-5B I2V** | Image-to-Video | 6s (49 frames @ 8fps) | 720x480 | 16GB |
| **SVD XT** | Text-to-Video | 3.5s (25 frames @ 7fps) | 512x512 | <8GB |
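The Max Duration column follows directly from frame count divided by frame rate. A small sketch of that arithmetic, using the frame and fps values from the table (the `duration_seconds` helper is just for illustration):

```python
# (frames, fps) per model, copied from the table above.
CLIPS = {
    "Wan 2.2": (81, 16),
    "CogVideoX-5B": (49, 8),
    "SVD XT": (25, 7),
}


def duration_seconds(frames, fps):
    """Clip length in seconds for a fixed frame count and frame rate."""
    return frames / fps


for model, (frames, fps) in CLIPS.items():
    print(f"{model}: {duration_seconds(frames, fps):.2f}s")
```

The table rounds these values to the nearest convenient figure, so small differences from the computed durations are expected.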
plugin.json is a static manifest (same bytes on every machine); live state (enabled, auto_start, config) lives in data/plugin_state.json (gitignored). Toggling from the UI writes only to runtime state, so the manifest never mutates.

Guaardvark is a secure offline AI platform: a full creative and professional AI workstation with every feature running locally.
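Guaardvark's features include video generation (text-to-video and image-to-video), an Audio Studio (music generation, full songs, and more), and others.
Its environment dependencies include Python 3.12+, Node.js 20+, PostgreSQL 14+, Redis 5.0+, and Ollama.
Guaardvark can be installed with pip: `pip install guaardvark`.
To run it from source, clone the repository, change into the `guaardvark` directory, and run `./start.sh`.
Guaardvark exposes many configuration options, including debugging toggles, RAG knobs, and cache controls.
Guaardvark provides a full set of workflows and modules, including Film Crew (a film production pipeline) and a video generation pipeline.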
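A high-quality AI workflow project with strong practical value.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.
Validate thoroughly in a sandbox or test environment before deploying to production, and carry out the necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. Commercial use, modification, and distribution are allowed, provided the copyright notice is retained.
AI Skill Hub review: AI Workstation's core functionality is complete and of good quality. For automation engineers and operations staff, it is a worthwhile addition to a personal toolkit. Try it in a non-production environment first, then roll it out gradually.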

| Field | Value |
|---|---|
| Original name | guaardvark |
| Original description | Open-source AI workflow: The self-hosted AI workstation. Autonomous screen agents, 3-tier neural routing, ... ⭐24 · Python |
| Topics | ai, ai-agents, flask, python |
| GitHub | https://github.com/guaardvark/guaardvark |
| License | MIT |
| Language | Python |
Listed: 2026-05-29 · Updated: 2026-05-30 · License: MIT · AI Skill Hub does not legally endorse the accuracy of third-party content.