AI Tools

Arkiv Media Asset Manager

Built with Python · Free and open source, locally deployed, full control over your data

English name: arkiv

⭐ 31 Stars 🍴 4 Forks 💻 Python 📄 MIT 🏷 AI score 7.8

Media management · AI search · Metadata · Video editing · Local-first

📚 In-depth analysis

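Arkiv Media Asset Manager is an open-source tool written in Python with 31 stars on GitHub, covering media management, AI search, metadata, and video editing. Because the code is fully open, you can audit it for security and fork or customize it to suit your own needs.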

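**Why use an open-source tool instead of commercial SaaS?**
For individual developers and privacy-conscious users, a locally deployed open-source tool keeps data on the local machine and outside any third-party provider's data policy. Open-source tools also typically carry no usage caps or monthly fees: install once and keep using it, so for high-frequency use the total cost of ownership (TCO) can be considerably lower than a subscription product.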

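**Installation and environment setup**
Arkiv Media Asset Manager requires a Python runtime (3.9 or later, per the README). Consider managing Python versions with pyenv to avoid polluting the global environment. New users should first create a virtual environment (python -m venv venv && source venv/bin/activate) and then install dependencies; if something goes wrong, the virtual environment can be deleted and recreated without affecting the rest of the system.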

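**Community and maintenance**
GitHub Issues and Discussions are the quickest way to get help. Before asking, search the closed issues, since many common problems already have answers there. When reporting a bug, include the output of pip list, the full error traceback, and a minimal reproducible example, which makes it much easier for the maintainers to respond. AI Skill Hub will keep tracking Arkiv Media Asset Manager releases and flag important feature changes.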

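📋 Tool overview

A local-first media asset management tool that combines AI semantic search with professional camera metadata extraction. It integrates with DaVinci Resolve and helps video editors, photographers, and content creators organize and retrieve large media libraries quickly.

Arkiv Media Asset Manager is an open-source tool written in Python that focuses on media management, AI search, and metadata. As a GitHub project its code is fully open to audit, and it can be deployed locally to keep data private. It suits both individual use and integration into a team workflow.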

GitHub Stars: ⭐ 31
Language: Python
Supported platforms: Windows / macOS / Linux
Maintenance status: Lightweight project, updated as needed
License: MIT
AI overall score: 7.8
Tool type: AI Tools
Forks: 4

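📖 Documentation

The content below was compiled automatically by AI Skill Hub from project information. For the complete original documentation, see "Original source" at the bottom.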

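📌 Key features

Free and open source, with local deployment and full control over your data
Open-source community on GitHub with ongoing updates
Documentation and usage examples aimed at newcomers
Custom configuration to fit different environments
Can be integrated into an existing stack as a base component or extended through further development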

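🎯 Main use cases

Running locally to keep data private and meet compliance requirements
Custom integration into existing systems to extend the stack
Serving as an open-source base component for commercial derivative work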

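The install commands below follow the source-install route described in the project's README; the official README remains the authoritative reference.

Install commands

# Option 1: install from source in a virtual environment
git clone https://github.com/vulture-s/arkiv.git
cd arkiv
python3 -m venv .venv
source .venv/bin/activate  # Windows: .\.venv\Scripts\activate
pip install -r requirements.txt

# Option 2: Docker (run from the cloned repository)
docker compose up -d

# Verify the installation
python health.py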

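📋 Installation steps

Visit the GitHub repository page
Install the dependencies as described in the README
Complete the initial configuration for your system
Follow the official examples or documentation to get started
If you run into problems, search GitHub Issues for answers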

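The usage examples below were compiled by AI Skill Hub from the commands documented in the README.

Common commands / code examples

# Ingest a media folder
python ingest.py --dir ./media

# Build the search index
python embed.py

# Start the server
python server.py

# Query the API (requires a Bearer token)
curl -H "Authorization: Bearer <token>" http://localhost:8501/api/media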

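The configuration example below uses variables and defaults documented in the README; adjust the values for your environment.

Configuration example

# arkiv configuration example (.env, copied from .env.example)
ARKIV_DB_PATH=./media.db
ARKIV_CHROMA_PATH=./chroma_db
ARKIV_OLLAMA_URL=http://localhost:11434
ARKIV_HOST=0.0.0.0
ARKIV_PORT=8501

# Lighter chat model for machines with less RAM/VRAM
ARKIV_CHAT_MODEL=qwen2.5:7b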

📑 README deep dive · documentation completeness 87/100

The content below was parsed directly from the GitHub README, preserving code blocks, tables, and list structure.

arkiv

Open-source AI metadata layer for DIT workflows — Resolve-native, CJK-first.

arkiv sits between your media drive and DaVinci Resolve: it ingests your footage, attaches AI-generated metadata (transcript, vision tags, atmosphere, energy, edit position), and surfaces clips via semantic search in any language — Chinese, Japanese, or English. The Resolve plugin lets you search, import with clip color, and drop frame markers without leaving the NLE.

Designed for solo DITs and small crews who own their data: local-first, self-hosted, MIT license, no cloud dependency.

---

Features

Semantic search — query in natural language (Chinese/English/Japanese)
Chat RAG over your video library — 5-intent assistant for compilation searches, refinement, similarity, analytics, and general questions with persisted conversation memory
AI transcription — Whisper large-v3-turbo + Silero VAD + LLM polish (Apple Silicon MLX / NVIDIA CUDA)
4-layer anti-hallucination guard — VAD silence filter → no_speech threshold → blank/repeat filter → LLM correction
Frame analysis — qwen3-vl:8b vision descriptions with brand/object recognition
2-phase pipeline — transcribe first, unload LLM, then vision (avoids VRAM conflict on 12GB GPUs)
Rating system — GOOD / NG / Review with notes + clip color in Resolve
Tag system — auto (AI) + manual tags with autocomplete
DaVinci Resolve UI — dark theme, 3-panel layout, filmstrip, waveform
Export — SRT, VTT, TXT, EDL (drop-frame TC), FCPXML 1.8 (FCPX + DaVinci compatible)
DaVinci Resolve metadata CSV — /api/export/metadata-csv endpoint exports clip metadata (Camera/Lens/ISO/Shutter/Aperture/GPS/CreateDate) ready for Resolve's File → Import Metadata from CSV. Plugin auto-prompts after import
ExifTool integration — auto-extracts 12 fields per clip (Make/Model/LensModel/GPS/ColorSpace/ISO/Shutter/Aperture/FocalLength/CreateDate). Sidecar-aware for Sony XAVC .XML, iPhone Keys group, Blackmagic Cam app per-vendor lens tags. Auto-detects exiftool binary on Windows (winget/scoop/chocolatey/Program Files)
EDL reel name — uses ExifTool ReelName with safe fallback to filename stem (8-char CMX3600 compat, control-char sanitized)
HEVC/ProRes browser proxy — auto-builds H.264 proxy on demand for browser playback (Phase 7.7g)
Tauri native app — desktop app with native file/folder dialogs (Windows panic hook surfaces Rust crashes to stderr)
DaVinci Resolve plugin — search, import with clip color, add frame markers
ASC MHL v2 hash manifests — mhl.py create / verify CLI emits real urn:ASC:MHL:v2.0 with xxh3 / md5 / sha1 / sha256 / c4, directory + structure root hashes, chained ascmhl_chain.xml. Interop-verified with ASC reference impl 1.2 — drop-in for Silverstack / MediaVerify / Hedge / YoYotta workflows
Multi-destination offload — offload.py --src <SD> --dst <A> --dst <B> does chunked parallel copy + per-file hash verify + 3× retry on mismatch + atomic rename + sidecar-aware (XAVC / ARRI / RED / iPhone Live Photo). Resumable JSON state file — kill mid-copy and pending files pick up exactly where they stopped. Emits per-dst MHL v2
Camera report CSV — camera_report.py writes 20-col DIT-spec CSV (Reel / TC / Camera / Lens / ISO / Shutter / Aperture / WB / FPS / Codec / ...) for Resolve's File → Import Metadata from CSV. Day-summary footer aggregates clip count + runtime by camera / by card
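
The multi-destination offload feature above describes a copy, hash-verify, retry, and atomic-rename cycle. The following is a minimal Python sketch of that general technique, not arkiv's actual offload.py: the function names, the `.part` temporary suffix, and the choice of sha256 (one of the hash types the README lists) are assumptions made for illustration.

```python
import hashlib
import os
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large clips never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(src: Path, dst_dir: Path, attempts: int = 3) -> Path:
    """Copy src into dst_dir, verify the hash, then rename into place."""
    expected = sha256_of(src)
    dst_dir.mkdir(parents=True, exist_ok=True)
    final = dst_dir / src.name
    partial = dst_dir / (src.name + ".part")  # hypothetical temp suffix
    for _ in range(attempts):
        shutil.copyfile(src, partial)
        if sha256_of(partial) == expected:
            # os.replace is atomic when both paths are on the same filesystem
            os.replace(partial, final)
            return final
        partial.unlink()  # discard the bad copy before retrying
    raise IOError(f"hash mismatch after {attempts} attempts: {src}")
```

Writing to a temporary name first means a destination never exposes a half-copied file under its final name, which is what makes an interrupted run safe to resume.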

Prerequisites

Dependency	macOS (brew)	Linux (apt)	Windows
Python 3.9+	`brew install python`	`sudo apt install python3 python3-venv`	[python.org](https://python.org)
FFmpeg 6.0+	`brew install ffmpeg`	`sudo apt install ffmpeg`	[ffmpeg.org](https://ffmpeg.org/download.html)
Ollama	`brew install ollama`	[ollama.com/download](https://ollama.com/download)	[ollama.com/download](https://ollama.com/download)

DaVinci Resolve Plugin extra (macOS): Resolve requires the official Python 3.10 Framework installer (.pkg) from python.org — Homebrew Python is not recognized. Install path: /Library/Frameworks/Python.framework/Versions/3.10/. Restart Resolve after install; Py3 should appear in Console and scripts load via Workspace > Scripts.

Windows (PowerShell) — UTF-8 required for CJK search

$env:PYTHONUTF8=1; uvicorn server:app --host 0.0.0.0 --port 8501

API search (requires server running)

Install — macOS (brew + pip)

brew install python ffmpeg ollama
git clone https://github.com/vulture-s/arkiv.git
cd arkiv
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
pip install mlx-whisper          # Apple Silicon (Metal GPU)
ollama pull bge-m3 && ollama pull qwen3-vl:8b && ollama pull qwen2.5:14b
python health.py

Install — Linux (pip)

```bash
sudo apt install python3 python3-venv ffmpeg
git clone https://github.com/vulture-s/arkiv.git
cd arkiv
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
pip install faster-whisper torch   # NVIDIA CUDA GPU
# pip install faster-whisper       # CPU fallback
ollama pull bge-m3 && ollama pull qwen3-vl:8b && ollama pull qwen2.5:14b
python health.py
```

Install — Windows (pip, PowerShell)

```powershell
# Install Python 3.9+, FFmpeg, and Ollama manually first, then:
git clone https://github.com/vulture-s/arkiv.git
cd arkiv
python -m venv .venv
.\.venv\Scripts\activate
pip install -r requirements.txt
pip install faster-whisper torch   # NVIDIA CUDA GPU
# pip install faster-whisper       # CPU fallback
ollama pull bge-m3; ollama pull qwen3-vl:8b; ollama pull qwen2.5:14b
$env:PYTHONUTF8=1; python health.py
```

Install — Docker (all platforms)

```bash
git clone https://github.com/vulture-s/arkiv.git
cd arkiv
docker compose up -d

# Step 2: build the search index
python embed.py

# Docker: run the smoke test inside the container
docker exec arkiv-arkiv-1 bash smoke-test.sh --platform docker
```

The test has two phases: Health Check (environment) and API Smoke Test (server endpoints).

Quick Start

Option A: Web UI — browse, search, rate, and tag in the browser

Option B: CLI only — ingest and search without opening a browser

Both options use the same database. You can mix and match — ingest via CLI, then browse in Web UI, or vice versa. Note: Do not run CLI and Web UI ingest at the same time. SQLite does not support concurrent writes — run one at a time.

```bash
# Ingest options
python ingest.py --dir ./media --limit 10      # process first 10 files only
python ingest.py --dir ./media --skip-vision   # skip AI frame descriptions
python ingest.py --dir ./media --refresh       # re-process already-indexed files

# Index options
python embed.py --rebuild                      # drop and rebuild from scratch
```

Configuration

Copy .env.example to .env and customize:

Variable	Default	Description
`ARKIV_DB_PATH`	`./media.db`	SQLite database path
`ARKIV_CHROMA_PATH`	`./chroma_db`	ChromaDB vector store
`ARKIV_THUMBNAILS_DIR`	`./thumbnails`	Thumbnail output dir
`ARKIV_OLLAMA_URL`	`http://localhost:11434`	Ollama API endpoint
`ARKIV_EMBED_MODEL`	`bge-m3`	Embedding model — do not change after indexing (see note below)
`ARKIV_VISION_MODEL`	`qwen3-vl:8b`	Vision model for frame descriptions
`ARKIV_CHAT_MODEL`	`qwen2.5:14b`	Chat model — answers and (by default) intent classification
`ARKIV_INTENT_MODEL`	(= `ARKIV_CHAT_MODEL`)	Optional faster model for intent classification only; must be installed
`ARKIV_WHISPER_MODEL`	`mlx-community/whisper-large-v3-turbo` (macOS) / `large-v3-turbo` (other)	Whisper model
`ARKIV_CUSTOM_VOCABULARY`	(empty)	Comma-separated hotwords (names/jargon) fed to Whisper's `initial_prompt`
`ARKIV_VOCABULARY_FILE`	(empty → `.arkiv/vocabulary.txt` if present)	Newline-delimited hotword file (one term/line, `#` comments); merged with the above
`ARKIV_EXIFTOOL_PATH`	(empty — auto-detect)	Path to exiftool binary (optional)
`ARKIV_FFMPEG_PATH`	(empty — auto-detect)	Path to ffmpeg binary (optional; set on headless Windows where only a WinGet alias shim is on PATH)
`ARKIV_FFPROBE_PATH`	(empty — auto-detect)	Path to ffprobe binary (optional; same as above)
`ARKIV_HOST`	`0.0.0.0`	Server bind address
`ARKIV_PORT`	`8501`	Server port

Embedding model is locked to your index. The vector store is built with one embedding model (bge-m3, 1024-dim). Changing ARKIV_EMBED_MODEL after you have indexed media makes new query vectors incompatible with stored ones, so search results degrade silently. To switch models, re-index from scratch.

Hardware floor for chat: qwen2.5:14b needs ~9 GB and runs alongside the embedding model. Plan for ~12–16 GB free RAM/VRAM on the Ollama host. On tighter machines, set ARKIV_CHAT_MODEL=qwen2.5:7b (~4.7 GB) for a lighter default.
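
To make the defaults in the table concrete, here is a small Python sketch of how a script could resolve a few of these variables from the environment. It is an illustration only: `ArkivSettings` and `load_settings` are hypothetical names, not part of arkiv, and only a subset of the variables is shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ArkivSettings:
    """Hypothetical container for a subset of the documented variables."""
    db_path: str
    chroma_path: str
    ollama_url: str
    embed_model: str
    chat_model: str
    intent_model: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> ArkivSettings:
    """Resolve settings, falling back to the defaults listed in the table."""
    chat_model = env.get("ARKIV_CHAT_MODEL", "qwen2.5:14b")
    return ArkivSettings(
        db_path=env.get("ARKIV_DB_PATH", "./media.db"),
        chroma_path=env.get("ARKIV_CHROMA_PATH", "./chroma_db"),
        ollama_url=env.get("ARKIV_OLLAMA_URL", "http://localhost:11434"),
        embed_model=env.get("ARKIV_EMBED_MODEL", "bge-m3"),
        chat_model=chat_model,
        # The intent model defaults to the chat model when unset or empty
        intent_model=env.get("ARKIV_INTENT_MODEL") or chat_model,
        port=int(env.get("ARKIV_PORT", "8501")),
    )
```

Passing the environment in as a mapping keeps the function easy to exercise with a plain dict, for example `load_settings({"ARKIV_CHAT_MODEL": "qwen2.5:7b"})`.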

API Authentication

All /api/* endpoints require a Bearer token with the correct scope. Scope-based tokens let you split a fleet by machine role: read-only review stations can use videos_read or media_read, ingest machines can use ingest_write, and admin machines can manage tokens.

First-time bootstrap:

export ARKIV_ADMIN_BOOTSTRAP_TOKEN=$(openssl rand -base64 32)
python server.py

On first startup, the server seeds a single admin token from that env var. Use it once to create per-machine tokens, then unset it and revoke the bootstrap token.

Create and manage tokens directly with the CLI:

python arkiv_token.py create --name "PC-dev" --scopes videos_read,videos_write --ip-allowlist 127.0.0.1/32,100.64.0.0/10 --expires-in 90
python arkiv_token.py list
python arkiv_token.py show <token-id>
python arkiv_token.py revoke <token-id>
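
The --ip-allowlist value above is a comma-separated list of CIDR blocks. As a rough sketch of how such a list can be evaluated, the snippet below uses Python's standard ipaddress module; `ip_allowed` is a hypothetical helper, and how arkiv itself performs the check is not stated in the README.

```python
import ipaddress


def ip_allowed(client_ip: str, allowlist: str) -> bool:
    """Return True if client_ip falls inside any comma-separated CIDR block."""
    address = ipaddress.ip_address(client_ip)
    networks = [
        ipaddress.ip_network(cidr.strip())
        for cidr in allowlist.split(",")
        if cidr.strip()
    ]
    return any(address in network for network in networks)


ALLOWLIST = "127.0.0.1/32,100.64.0.0/10"  # value from the example command
print(ip_allowed("127.0.0.1", ALLOWLIST))
print(ip_allowed("192.168.1.5", ALLOWLIST))
```

With the example allowlist, loopback and addresses in 100.64.0.0 through 100.127.255.255 pass, while an ordinary LAN address such as 192.168.1.5 does not.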

Use the token in requests:

curl -H "Authorization: Bearer <token>" http://localhost:8501/api/media

Available scopes: videos_read, videos_write, media_read, collections_read, collections_write, projects_read, projects_write, ingest_write, chat_read, chat_write, admin
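
The curl call above translates directly to Python's standard library. This is a minimal sketch: the helper names are hypothetical, and treating the response body as JSON is an assumption, since the README does not show the response format.

```python
import json
import urllib.request


def build_media_request(
    token: str, base_url: str = "http://localhost:8501"
) -> urllib.request.Request:
    """Build a GET request for /api/media carrying the Bearer token."""
    return urllib.request.Request(
        f"{base_url}/api/media",
        headers={"Authorization": f"Bearer {token}"},
    )


def fetch_media(token: str) -> object:
    """Send the request; assumes the server is running and replies with JSON."""
    with urllib.request.urlopen(build_media_request(token), timeout=10) as response:
        return json.load(response)
```

A token without the scope required by the endpoint would presumably be rejected, in which case `urlopen` raises `urllib.error.HTTPError`.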

Chat API — RAG over your video library

Ask natural-language questions about your archive. The classifier routes each prompt to one of five handlers:

Intent	Example	What it does
`compilation`	"Give me all sunset shots from May"	Semantic search → ranked scene list
`refinement`	"Only the indoor ones"	Filters the previous result, in-conversation
`similarity`	"Similar to scene 42"	Vector nearest-neighbours to a reference clip
`analytics`	"How many hours did I shoot this month?"	Aggregate query over the library
`general`	"What can you help me with?"	Plain LLM chat, no search

Conversation history (last 10 messages) is threaded into each follow-up, so refinement acts on what the previous turn returned.
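
The routing and history behaviour described above can be pictured as a dispatch table plus a sliding window. The sketch below is illustrative only: the handler signature, `dispatch`, and `recent_history` are hypothetical, and arkiv's real classifier and handlers are not shown in the README.

```python
from typing import Callable, Dict, List

HISTORY_LIMIT = 10  # the README threads the last 10 messages into follow-ups

# Hypothetical handler signature: (prompt, recent history) -> reply text
Handler = Callable[[str, List[dict]], str]


def recent_history(messages: List[dict], limit: int = HISTORY_LIMIT) -> List[dict]:
    """Keep only the most recent messages for the next turn."""
    return messages[-limit:] if limit > 0 else []


def dispatch(
    intent: str, prompt: str, history: List[dict], handlers: Dict[str, Handler]
) -> str:
    """Route a classified prompt to its handler, falling back to 'general'."""
    handler = handlers.get(intent, handlers["general"])
    return handler(prompt, recent_history(history))
```

Because a refinement handler receives the trimmed history, a follow-up like "Only the indoor ones" can act on what the previous turn returned.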

Model requirement: chat uses ARKIV_CHAT_MODEL (default qwen2.5:14b) for both intent classification and answers — a single ollama pull qwen2.5:14b covers it. Only set ARKIV_INTENT_MODEL to a smaller model (e.g. qwen2.5:7b-instruct) if that model is actually installed on the Ollama host. If the model is missing, /api/chat returns a clear "run ollama pull …" message instead of a 500.

Prerequisite — ingest + index first: chat queries your indexed library, not a standalone chatbot. Ingest media (Step 1) and build the index with python embed.py (Step 2) before chatting. compilation / refinement / similarity need the vector index; analytics needs ingested media only; general is the only intent that works on an empty library. On an empty/unindexed library chat does not error — it just returns "0 results".
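
The prerequisite rules above reduce to a small lookup. A client could use something like the following sketch to warn users before sending a prompt that can only return "0 results"; the names are hypothetical and this is not part of arkiv's API.

```python
# What each intent needs, as described in the paragraph above
REQUIREMENTS = {
    "compilation": "index",
    "refinement": "index",
    "similarity": "index",
    "analytics": "media",
    "general": "none",
}


def intent_ready(intent: str, has_media: bool, has_index: bool) -> bool:
    """Return True if the library state can serve this intent meaningfully."""
    need = REQUIREMENTS[intent]
    if need == "index":
        return has_index
    if need == "media":
        return has_media
    return True
```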

FAQ

Q: Which Whisper backend should I use?
- macOS with Apple Silicon: mlx-whisper (fastest, uses Metal GPU)
- NVIDIA GPU: faster-whisper + torch (CUDA acceleration)
- CPU only: faster-whisper (slower but works everywhere)
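
The same decision can be written as a short Python helper. This is a sketch under stated assumptions: `pick_whisper_backend` is a hypothetical function, and probing for `nvidia-smi` on PATH is only a rough heuristic for an NVIDIA GPU, not how arkiv detects hardware.

```python
import platform
import shutil
from typing import Optional


def pick_whisper_backend(
    system: Optional[str] = None,
    machine: Optional[str] = None,
    has_cuda: Optional[bool] = None,
) -> str:
    """Suggest which packages to install, following the FAQ answer above."""
    system = system or platform.system()
    machine = machine or platform.machine()
    if has_cuda is None:
        # Assumption: an nvidia-smi binary on PATH implies a usable NVIDIA GPU
        has_cuda = shutil.which("nvidia-smi") is not None
    if system == "Darwin" and machine == "arm64":
        return "mlx-whisper"
    if has_cuda:
        return "faster-whisper + torch"
    return "faster-whisper"
```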

Q: Do I need Ollama running? Yes, for semantic search (embedding) and optional frame descriptions. Run ollama serve before starting arkiv.

Q: How do I add media? Use the + button in the Media Pool sidebar, or run python ingest.py --dir /path/to/media from CLI.

Q: Can I use this without Docker? Yes. The native Python install is the primary workflow, and Docker is optional for deployment.

Q: What file formats are supported?
- Video: .mp4, .mov, .mkv, .avi, .webm, .m4v, .mts
- 360: .insv (Insta360), .360 (GoPro Max), indexed as raw fisheye
- Audio: .wav, .mp3, .m4a, .aac, .flac, .ogg

Camera metadata (make/model/lens/timecode) is read from embedded EXIF and Sony XAVC NRT sidecar XML, so FX30/FX-series footage keeps its identity.
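
For pre-sorting a card before ingest, the extension lists above can be turned into a simple classifier. The sketch below is a convenience illustration with hypothetical names; arkiv's own format detection may work differently.

```python
from pathlib import Path
from typing import Optional

# Extension sets copied from the FAQ answer above
VIDEO = {".mp4", ".mov", ".mkv", ".avi", ".webm", ".m4v", ".mts"}
SPHERICAL = {".insv", ".360"}
AUDIO = {".wav", ".mp3", ".m4a", ".aac", ".flac", ".ogg"}


def media_kind(path: str) -> Optional[str]:
    """Classify a file by extension; returns None for unsupported types."""
    suffix = Path(path).suffix.lower()
    if suffix in VIDEO:
        return "video"
    if suffix in SPHERICAL:
        return "360"
    if suffix in AUDIO:
        return "audio"
    return None
```

Lower-casing the suffix matters in practice because many cameras write upper-case extensions such as `.MOV` or `.MTS`.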