经 AI Skill Hub 精选评估,Vex代码搜索 获评「推荐使用」。这款MCP工具在功能完整性、社区活跃度和易用性方面表现出色,AI 评分 7.5 分,适合有一定技术背景的用户使用。
Vex代码搜索 是一款遵循 MCP(Model Context Protocol)标准协议的 AI 工具扩展。通过 MCP 协议,它可以让 Claude、Cursor 等主流 AI 客户端直接访问和操作外部工具、数据源和服务,实现 AI 能力的无缝扩展。无论是文件操作、数据库查询还是 API 调用,都可以通过自然语言在 AI 对话中直接触发,极大提升生产效率。
Vex代码搜索 是一款遵循 MCP(Model Context Protocol)标准协议的 AI 工具扩展。通过 MCP 协议,它可以让 Claude、Cursor 等主流 AI 客户端直接访问和操作外部工具、数据源和服务,实现 AI 能力的无缝扩展。无论是文件操作、数据库查询还是 API 调用,都可以通过自然语言在 AI 对话中直接触发,极大提升生产效率。
# 方式一:通过 Claude Code CLI 一键安装
claude skill install https://github.com/tenatarika/vex
# 方式二:手动配置 claude_desktop_config.json
{
"mcpServers": {
"vex----": {
"command": "npx",
"args": ["-y", "vex"]
}
}
}
# 配置文件位置
# macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
# Windows: %APPDATA%/Claude/claude_desktop_config.json
# 安装后在 Claude 对话中直接使用 # 示例: 用户: 请帮我用 Vex代码搜索 执行以下任务... Claude: [自动调用 Vex代码搜索 MCP 工具处理请求] # 查看可用工具列表 # 在 Claude 中输入:"列出所有可用的 MCP 工具"
// claude_desktop_config.json 配置示例
{
"mcpServers": {
"vex____": {
"command": "npx",
"args": ["-y", "vex"],
"env": {
// "API_KEY": "your-api-key-here"
}
}
}
}
// 保存后重启 Claude Desktop 生效
Fast hybrid structural + semantic code search. Vector + index.
Why Vex? · How It Compares · Installation · Quick Start · Commands · Configuration · How Search Works · Benchmarks · Supported Languages · Integration · Testing · Architecture
$ vex search "TelemetryProcessor" # 4ms — find symbol definitions
$ vex search "timeout retry" # BM25 finds rare body terms
$ vex show "TelemetryProcessor" # extract just the class body (not the whole file)
$ vex search "handle alert" --semantic # find by meaning, not just name
$ vex pattern 'fn $NAME($$$) -> Result' # AST pattern matching (like ast-grep)
$ vex usages "Config" # who references this symbol? (+ --strict on Rust/TS/Python/C#/C++)
$ vex implementations "BaseService" # who extends/implements this?
$ vex callers "process_event" # who calls this function? (~4ms; covers module-scope + Python/Java decorators)
$ vex similar "PaymentService" # semantically close symbols
$ vex duplicates --threshold 0.95 # near-duplicate pairs
$ vex check "Foo" "Bar" "Baz" # fast existence check
$ vex bundle --mode symbol --symbol Foo # NEW (v1.9): body + callers + callees + similar in 1 call
vex similar "PaymentService" --limit 5 --min-score 0.7 --explain
vex search "payment processing" --semantic
RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_index_reader -- -max_total_time=120 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_refs_fst -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_symbol_fst -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_bloom_load -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_pattern_parser -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_manifest_load -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_marker_load -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_tokenize_document -- -max_total_time=60 RUSTUP_TOOLCHAIN=nightly cargo fuzz run fuzz_hash_index_load -- -max_total_time=60 ```
Nine fuzz targets cover the reader's unsafe paths plus every text / sidecar parser that takes adversarial input:
| Target | What it fuzzes | Surface |
|---|---|---|
fuzz_index_reader | Arbitrary bytes as .vex file | header(), symbol(), vector(), read_string(), file_paths() |
fuzz_refs_fst | Arbitrary FST + posting bytes | RefReader::find(), find_by_prefix() |
fuzz_symbol_fst | Arbitrary FST + posting bytes | SymbolFstReader::find(), find_fuzzy(), search_with_fallback() |
fuzz_bloom_load (v1.12.0) | Arbitrary index.bloom sidecar | SymbolBloom::load, then may_contain probes |
fuzz_pattern_parser (v1.12.0) | Arbitrary UTF-8 as a pattern string | parse_composite_pattern (metavars, && / ||, quoted segments) |
fuzz_manifest_load (v1.12.0) | Arbitrary JSON as manifest.json | Manifest::load |
fuzz_marker_load (v1.13.0) | Arbitrary text as <onnx>.sha256.marker | verify_with_marker parser + decision tree |
fuzz_tokenize_document (v1.13.0) | Arbitrary UTF-8 as BM25 input | tokenize_document (post share-owning-String refactor) |
fuzz_hash_index_load (v1.14.1) | Arbitrary bytes as index.hashes sidecar | hash_index::load (VEXH magic, MAX_COUNT guard, truncation) |
Most recent system-wide audit (v1.14.1, 2026-06-05): 5,792,231 total iterations across all 9 targets in ~9 min wall-clock, zero crashes / panics / AddressSanitizer hits / leaks. Plus a focused 3,000,000- iter run on fuzz_hash_index_load alone — clean. Coverage saturated for the small binary-header parsers (bloom, marker, hash_index); JSON / grammar parsers still surfacing new features at saturation (fuzz_manifest_load reached cov:1355 ft:4191 / 1311 corpus, fuzz_pattern_parser cov:551 ft:3693 / 1210 corpus).
Fuzzing has found and fixed five real defects across the project life:
- v1.x: out-of-bounds read on crafted symbol_count, misaligned pointer dereference on odd symbols_offset, unchecked section offsets exceeding file size (binary reader hardening). - v1.12.0: SymbolBloom::load accepted a sidecar with n_bits = 0 + k_num = 0 whose consistency guard passed but later panicked inside bloomfilter::Bloom::check on hash % 0. Fix: reject degenerate sizes during load. - v1.12.0: SymbolBloom::load accepted k_num up to ~2.1B, which made every may_contain call loop for 110+ seconds (DoS, not a panic). Fix: cap k_num <= MAX_K_NUM = 64 at load time.
The v1.13.0 / v1.14.1 additions found no defects in fresh code — the review-driven MAX_COUNT guards on hash_index::save / load were added as defence-in-depth before the fuzzer ran (rust-reviewer + code-reviewer flagged the truncating as u32 cast on save), and the sustained 3M / 5.8M iteration runs confirmed they hold.
```bash
brew tap tenatarika/tap && brew install vex
cargo build --release -p vex-mcp
cargo install cargo-fuzz
```bash
vex usages "IndexReader"
References stored in an FST (Finite State Transducer) — zero-copy lookup from mmap with prefix search support.
$ rg -w "PreAggregatedConfig" . ./models.py:3602:class PreAggregatedConfig(models.Model): ./models.py:3610: pre_aggregated_config = PreAggregatedConfig.objects.get(...) ./serializers.py:48:from .models import PreAggregatedConfig ./tests.py:12: config = PreAggregatedConfig(...) ... (16 more lines)
vex bundle --mode symbol --symbol PaymentService # body + callers + callees + similar vex bundle --mode pr-impact --base origin/main # changed symbols + transitive callers + tests vex bundle --mode project --top-n 30 # top-N by reverse call-graph indegree
Create a .vex.toml in your project root to customize vex behavior:
vex init # generates .vex.toml with commented defaults
```toml
vex search "Foo" --format json ```
Pin a different default in .vex.toml via format = "text" if you want the verbose multi-line view at the terminal.
Every --format json subcommand wraps its payload in the Phase 13 envelope. Single shape, easy to detect via protocol_version:
{
"protocol_version": "v1",
"capabilities": { /* see `vex capabilities` */ },
"_meta": { "vex.dev/index_age_ms": 1200, "ttlMs": 30000, "cacheScope": "project" },
"results": [ /* the actual data, shape depends on the subcommand */ ]
}
Pre-v1.11 only search and bundle returned this envelope; the other ~14 subcommands (show, usages, pattern, grep, implementations, callers, callees, paths, reachable, check, similar, duplicates, diff, outline, index, update, status, eval) emitted bare arrays / objects. Migration: pre-1.11 jq '.[0].name' or data[0]['name'] now needs jq '.results[0].name' / data['results'][0]['name']. Detect the envelope via response.get('protocol_version') == 'v1' to support both shapes during a rollout window.
{ "mcpServers": { "vex": { "command": "/path/to/vex-mcp", "env": { "VEX_ROOT": "/path/to/your/project" } } } } ```
MCP Tools (23): - search — 3-way hybrid (structural + BM25 + semantic); accepts filter / include / exclude / kind / context_path / no_bm25 / --why / metadata filters / diff-scope (since / since_branched / changed_only) - find_symbol — exact name lookup - find_similar — semantic search by free-form description - similar — nearest neighbors of an existing symbol (explain adds Jaccard + diff); diff-scope - duplicates — near-duplicate symbol pairs (explain shows what differs); diff-scope - show — extract symbol body from source; Phase 13.3 truncation flags (signature_only / head / no_body / collapsed, mutually exclusive) - outline — file structure - usages — find all references to a symbol; filter / --strict / --why - grep — regex content search - pattern — AST pattern matching with metavar back-references; diff-scope; --why - implementations — find types extending a base class/trait/interface (incl. generics); diff-scope - callers / callees — direct callgraph navigation (fast path via persistent index); diff-scope - paths — enumerate caller chains between two functions - reachable — transitive callers of a target - diff — symbol-level diff between a git revision and the working tree - check — fast symbol existence check - bundle — unified multi-source bundle (mode: symbol | pr-impact | project), Phase 13 envelope - eval — ranking-evaluation harness (bench / min_ndcg), MCP defaults json: true so agents get a structured EvalReport - capabilities — machine-readable capability matrix (protocol_version, signals, bundle_modes, etc.) - index / update — build/rebuild index - status — index statistics
MCP ↔ CLI parity (v1.10): the schemas now mirror the CLI surface for every path-aware tool. Glob filters (include / exclude), substring filter, kind boost, context_path proximity hint, no_bm25, Phase 13.3 truncation, diff-scope, and no_stale_check are exposed everywhere the CLI accepts them — agents no longer need to drop to bash for "Rust files under crates/api/ since main"-style scoping.
The schemas follow a canonical vocabulary (query / symbol / symbols / path / pattern / filter / include / exclude); pre-v1.7 aliases (name, file, names, etc.) still work and emit _meta.deprecated_args: [...] in the JSON-RPC response. Malformed JSON-RPC input now returns the spec-compliant -32700 Parse error response (v1.9.2 fix) with a 512-codepoint echo of the offending line in the data field; broken-pipe / EOF on stdin cleanly shuts down the server instead of dropping in-flight tool calls. See docs/MCP-SCHEMA.md.
vex implementations "Iterator"
vex pattern 'interface $N || class $N' --lang typescript
The recommended way to integrate vex with Claude Code is via CLAUDE.md rules (see below). Vex runs as a CLI tool — Claude Code calls it directly via Bash, no MCP server needed.
Setup:
```bash
```bash
Add this to your project's CLAUDE.md to make Claude Code use vex instead of grep:
```markdown
cargo test # 1973 tests — unit, integration, property-based, adversarial
cargo clippy -- -D warnings # zero warnings policy
Test coverage includes: - Per-language grammar regression (NEW): tests/<lang>_query_test.rs for all 19 supported languages — catches ABI mismatches and AST node renames when a tree-sitter grammar crate is upgraded - Binary format: roundtrip, corrupted/truncated/wrong-version rejection, out-of-bounds access, string pool dedup, empty index - Adversarial format: 20 crafted index tests — overflow offsets, bad magic/version, alignment attacks, truncated records - Vectors: write/read roundtrip for 384-dim f32 embeddings - FST: refs FST roundtrip, prefix search, symbol FST exact/prefix/fuzzy search - Search: structural, fuzzy (Levenshtein), RRF fusion, reranking with kind/path/proximity boosts - Reranking stress: NaN/Infinity/zero scores, 10K results, edge context paths - Property-based (proptest): rerank preserves length, sorted output, no NaN/negative scores, fusion commutativity - Incremental update: unchanged reuse, deleted removal, file rename, symbol move between files, empty file - Concurrency: parallel index/update (lock serialization), concurrent readers, read during reindex - Multi-language: Rust, Python, Go, Kotlin, TypeScript, C++, cross-language same-name, wrong extension, 1K-symbol file, deep nesting, error recovery - Unicode: BOM, mixed CRLF, unicode identifiers, null bytes, empty/whitespace files - Path edges: spaces in paths, deep nesting (20 levels), symlinks, absolute vs relative, Windows backslashes - Callgraph: callers/callees for Rust, Python, Go, TypeScript, Java - Persistent call graph (v1.5): format v4 roundtrip, callers/callees FST lookup, dedup, same-name-across-files isolation, same-name-within-file disambiguation, incremental update preserves edges, fallback to live scan for v3 - Similar/duplicates (v1.5): self-exclusion, threshold filtering, canonical pair dedup, body-length filter, empty-index handling - Pluggable embedder (v1.5): registry lookup, mismatch detection (incl. back-compat for pre-9.1 manifests), config + CLI priority, writer variable vector_dim - BM25 channel (v1.5): writer/reader roundtrip, pipeline emission, IDF discrimination, short-doc preference, 3-way RRF with Hybrid labeling, MatchType tagging, unicode tokens - Staleness: git HEAD comparison, dirty file detection, mtime fallback
| **vex** | **ripgrep** | **ast-index** | **ast-grep** | **Serena** | |
|---|---|---|---|---|---|
| **What it searches** | Symbol definitions | All text | Symbol definitions | AST patterns | Symbols (via LSP) |
| **Requires indexing?** | Yes (20ms-600ms+) | No | Yes | No | No |
| **Search speed** | **~4ms** (pre-built FST) | 75-120ms (disk scan) | 22-60ms (SQLite) | ~30ms (scan) | LSP-dependent |
| **Semantic search** | HNSW + embeddings | -- | -- | -- | -- |
| **Pattern matching** | fn $NAME($$$) | regex only | -- | fn $NAME($$$) | regex only |
| **Index size** | **5 MB** / 20K syms | no index | 190 MB / 20K syms | no index | no index |
| **Token efficiency** | **6-88x** fewer than rg | baseline | ~3x fewer than rg | N/A | N/A |
| **Symbol body extraction** | vex show | -- | -- | -- | -- |
| **Languages** | 19 | any | 10+ | 10+ | 40+ (LSP) |
| **Refactoring** | -- | -- | -- | -- | rename, move, inline |
| **Runtime deps** | none | none | none | none | Python + LSP |
Note: vex search speed assumes a pre-built index. Ripgrep and ast-grep require no upfront indexing and work immediately on any directory. The tradeoff is amortized: if you search the same codebase many times (typical in agent workflows), the one-time indexing cost pays for itself.
Best for: fast symbol search in AI agent workflows where token efficiency matters. Not a replacement for LSP-based tools (no refactoring, no go-to-definition in dependencies).
| Query | vex | ast-index | rg -w | vex vs rg |
|---|---|---|---|---|
| Query A | **4.9 ms** | 9.5 ms | 54.2 ms | **11x** |
| Query B | **4.6 ms** | 9.5 ms | 8.9 ms | **1.9x** |
| Query C | **4.5 ms** | 9.2 ms | 8.6 ms | **1.9x** |
| Query D | **5.0 ms** | 12.1 ms | 9.3 ms | **1.9x** |
| Query | vex | ast-index | rg -w | vex vs rg | Results (def/text) |
|---|---|---|---|---|---|
| Symbol 1 | **6.0 ms** | 59.7 ms | 84.6 ms | **14x** | 1 / 4 |
| Symbol 2 | **3.7 ms** | 44.5 ms | 78.5 ms | **21x** | 2 / 5 |
| Symbol 3 | **3.9 ms** | 22.7 ms | 76.7 ms | **20x** | 1 / 20 |
| Symbol 4 | **3.8 ms** | 43.1 ms | 77.5 ms | **21x** | 1 / 2 |
| Symbol 5 | **3.6 ms** | 33.7 ms | 77.3 ms | **21x** | 1 / 22 |
| Symbol 6 | **3.8 ms** | 43.3 ms | 76.9 ms | **20x** | 1 / 8 |
| Symbol 7 | **4.0 ms** | 42.5 ms | 74.9 ms | **19x** | 1 / 6 |
| Symbol 8 | **3.7 ms** | 42.8 ms | 78.4 ms | **21x** | 1 / 2 |
Key takeaway: vex search is constant ~4 ms (FST O(query_len)), regardless of project size — but this assumes a pre-built index. The comparison with ripgrep is not apples-to-apples: rg scans raw text with no indexing, while vex looks up a pre-built index. The real advantage is amortized: vex returns only symbol definitions (precise, token-efficient), while rg returns all text occurrences (noisy, expensive in LLM contexts).
Semantic search embeds the query via ONNX (~55ms) then searches stored vectors. HNSW (usearch) replaces brute-force O(N) scan with O(log N) approximate nearest neighbor search:
| Symbols | Brute-force | HNSW | Speedup |
|---|---|---|---|
| 333 | ~3 ms | ~3 ms | 1x |
| 11K | ~8 ms | ~3 ms | **2.3x** |
| 20K | ~11 ms | ~3 ms | **4x** |
| 100K (projected) | ~55 ms | ~3 ms | **~18x** |
HNSW stays constant ~3ms regardless of index size. Brute-force grows linearly. Total semantic search latency is dominated by ONNX embedding (~55ms), so end-to-end speedup is modest for small codebases but critical at scale.
| Mode | Latency |
|---|---|
| Structural only | ~4 ms |
| Hybrid (structural + semantic) | ~58 ms (HNSW) / ~66 ms (brute-force) |
Vex 是一个 Rust 项目,用于分析和理解代码结构。它提供了多种功能,包括语义相似度分析、代码搜索和代码重构。
Vex 的主要功能包括语义相似度分析、代码搜索、代码重构和集成开发环境 (IDE) 支持。它可以帮助开发者快速找到代码中的问题和优化点。
Vex 需要 Rust 1.88 或更高版本来编译和运行。它还需要一个 MCP 服务器来提供语义相似度分析和代码搜索功能。
可以使用 Homebrew 安装 Vex:brew tap tenatarika/tap && brew install vex。也可以从源码编译 Vex:cargo build --release -p vex-mcp。
Vex 的使用方法包括使用命令行工具来分析和理解代码结构。例如,可以使用 vex usages 命令来找到一个符号的所有使用位置。
Vex 的配置文件是 .vex.toml,位于项目根目录下。它可以用来自定义 Vex 的行为和功能。
Vex 提供了多个 API,包括 find implementations、pattern matching 和 Claude Code 集成等。这些 API 可以帮助开发者快速找到代码中的问题和优化点。
Vex 的工作流包括代码分析、代码重构和集成开发环境 (IDE) 支持。它可以帮助开发者快速找到代码中的问题和优化点,并且可以提高开发效率和质量。
Vex 的常见问题包括如何安装 Vex、如何使用 Vex 等。还包括如何解决 Vex 的常见问题和错误。
AI Skill Hub 为第三方内容聚合平台,本页面信息基于公开数据整理,不对工具功能和质量作任何法律背书。
建议在沙箱或测试环境中充分验证后,再部署至生产环境,并做好必要的安全评估。
✅ MIT 协议 — 最宽松的开源协议之一,可自由商用、修改、分发,仅需保留版权声明。
AI Skill Hub 点评:Vex代码搜索 的核心功能完整,质量良好。对于Claude Desktop / Claude Code 用户来说,这是一个值得纳入个人工具库的选择。建议先在非生产环境试用,再逐步推广。
| 原始名称 | vex |
| 原始描述 | 开源MCP工具:Hybrid structural + semantic code search for LLMs — compact output, MCP server, 。⭐7 · Rust |
| Topics | mcpastclaudecode-searchrust |
| GitHub | https://github.com/tenatarika/vex |
| License | MIT |
| 语言 | Rust |
收录时间:2026-06-05 · 更新时间:2026-06-06 · License:MIT · AI Skill Hub 不对第三方内容的准确性作法律背书。
选择 Agent 类型,复制安装指令后粘贴到对应客户端