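Recommended by AI Skill Hub: AProver: AI Workflow is an agent workflow tool. Its AI composite score is 7.5, a solid result among comparable tools. If you are looking for a dependable agent workflow solution, it is worth a closer look.
AProver is an open-source AI workflow that combines LLM agents with bounded model checking (BMC) to automate code verification. It helps developers verify and improve AI-generated code.
AProver: AI Workflow is a complete automation workflow for AI agents. Visual node orchestration breaks complex multi-step tasks into clear automated flows that run fully unattended. It integrates with hundreds of external services and APIs, and suits data-processing pipelines, business automation, and AI-assisted decision systems.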
# Option 1: install with pip (recommended)
pip install aprover
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install aprover
# Option 3: install from source (for the latest features)
git clone https://github.com/agentic-prover/aprover
cd aprover
pip install -e .
# Verify the installation
python -c "import aprover; print('installed successfully')"
# Command-line usage
aprover --help
# Basic usage
aprover input_file -o output_file
# Calling from Python code
import aprover
# Example
result = aprover.process("input")
print(result)
# Example aprover configuration file (config.yml)
app:
  name: "aprover"
  debug: false
  log_level: "INFO"
# Specify the configuration file at run time
aprover --config config.yml
# Or configure through environment variables
export APROVER_API_KEY="your-key"
export APROVER_OUTPUT_DIR="./output"
<picture> <source media="(prefers-color-scheme: dark)" srcset="assets/logo-dark.svg"> <img alt="AProver" src="assets/logo.svg" height="80"> </picture>
AProver — Agentic Prover for AI-Generated Code — is a suite of LLM-driven formal verification agents. The first agent — BMC-Agent — is a prototype of agentic model checking: an architecture that pairs an LLM agent (for specification generation, counterexample classification, and spec refinement) with a sound bounded model checking backend. The agent handles semantic reasoning; the solver provides formal guarantees within the unwinding bound.
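The interplay between agent and solver described above can be pictured as a bounded refinement loop. The following is a minimal sketch under stated assumptions, not AProver's actual code: `refine_loop`, `Outcome`, the three callables, and the `"SPURIOUS"` / `"VERIFIED"` / `"UNKNOWN"` labels are hypothetical names chosen for illustration, and the iteration cap is counted here in solver runs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    verdict: str      # "VERIFIED" (within the bound), "REAL_BUG", or "UNKNOWN"
    iterations: int   # number of solver runs performed


def refine_loop(
    spec: str,
    check: Callable[[str], Optional[str]],  # solver: counterexample, or None if none found
    classify: Callable[[str, str], str],    # agent: "REAL_BUG" or "SPURIOUS" (assumed labels)
    refine: Callable[[str, str], str],      # agent: revised spec from (spec, counterexample)
    max_iters: int = 5,
) -> Outcome:
    """Run check -> classify -> refine until a verdict or the iteration cap."""
    for i in range(1, max_iters + 1):
        cex = check(spec)
        if cex is None:
            # The solver found no violation of this spec within its bound.
            return Outcome("VERIFIED", i)
        if classify(spec, cex) == "REAL_BUG":
            return Outcome("REAL_BUG", i)
        # Spurious counterexample: let the agent tighten the spec and retry.
        spec = refine(spec, cex)
    return Outcome("UNKNOWN", max_iters)
```

Passing the solver and agent steps in as callables keeps the loop itself free of any LLM or CBMC dependency, so it can be exercised with stubs.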
BMC-Agent supports two source languages with two solver backends, selected automatically by the source file's extension: C via CBMC, and Rust via Kani. The pipeline, classifier, refinement loop, and confidence tiers are shared; the parser and harness generator dispatch per language.
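Extension-based backend selection of the kind described above might look like the sketch below. The `BACKENDS` mapping and `select_backend` function are hypothetical; the text only states that `.c` sources go to CBMC and Rust sources go to Kani.

```python
from pathlib import Path

# Illustrative mapping only; the project's real dispatch table is not shown in the text.
BACKENDS = {".c": "cbmc", ".rs": "kani"}


def select_backend(source: str) -> str:
    """Return the solver backend name for a source file, based on its extension."""
    suffix = Path(source).suffix.lower()
    if suffix not in BACKENDS:
        raise ValueError(f"unsupported source extension: {suffix!r}")
    return BACKENDS[suffix]
```

Failing loudly on an unknown extension, rather than guessing a default backend, avoids silently verifying a file with the wrong tool.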
The design principle is agents propose, conventional tools dispose: every soundness-relevant decision the LLM produces passes through a conventional check (CBMC query, SMT soundness guard, or runtime confirmation) before affecting the verification verdict.
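One way to read this principle in code is as a gate that every LLM output must pass before it can influence the verdict. The sketch below is an assumption-laden illustration: `Proposal`, `dispose`, and the `kind` labels are hypothetical, and the check functions stand in for the CBMC query, SMT soundness guard, or runtime confirmation mentioned above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Proposal:
    kind: str     # e.g. "classification" or "refined_spec" (illustrative labels)
    payload: str  # what the LLM suggested


def dispose(proposal: Proposal, checks: dict[str, Callable[[str], bool]]) -> bool:
    """Accept a proposal only if the conventional check registered for its kind passes.

    A proposal whose kind has no registered check is rejected, so an
    unchecked LLM output can never affect the result.
    """
    check = checks.get(proposal.kind)
    return check is not None and check(proposal.payload)
```

Rejecting by default is the important design choice here: adding a new kind of LLM decision has no effect until a matching conventional check is registered for it.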
git clone https://github.com/agentic-prover/aprover
cd aprover
uv sync
For a zero-configuration experience, web/ contains a chat front-end that lets visitors run AProver by talking to it. The page streams pipeline progress (parse → spec → CBMC → classify → report) live as the agent works.
```bash
ANTHROPIC_API_KEY=sk-... uv run uvicorn web.server:app --port 7860
```

```bash
export ANTHROPIC_API_KEY=your_key_here
```
| File | Description |
|---|---|
| `simple_driver.c` | Ring-buffer device: off-by-one in `rb_write` |
| `sensor_hub.c` | CEGAR demo: spurious counterexample → refinement → real bug |
| `block_device.c` | Integer overflow in `blk_seek` |
| `memory_allocator.c` | Null dereference in `alloc_free` |
| `cross_file_demo/` | Cross-file `confirmed_system_entry`: null fn-pointer in `libmath.c` traced to `system_entry` in `main.c` |
| `vibeos/repo/kernel/` | Full VibeOS kernel: 13 confirmed realistic bugs (see table above) |
```bash
uv run bmc-agent verify-dir \
  --source-dir examples/cross_file_demo \
  --driver cross_file_demo \
  --output artifacts/cross_file_demo
```
Artifacts — generated specs, CBMC harnesses, raw solver output, counterexample classifications, and bug reports — are written under --output and can be inspected or diffed.
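Because artifacts are plain files, two runs can be compared with the standard library alone. The sketch below assumes nothing about AProver's artifact layout or file names; `changed_artifacts` is a hypothetical helper that reports which files exist in only one output directory and which differ byte for byte.

```python
import filecmp
from pathlib import Path


def changed_artifacts(old_dir: str, new_dir: str) -> dict[str, list[str]]:
    """Summarise differences between two artifact directories, recursively."""
    summary: dict[str, list[str]] = {"only_old": [], "only_new": [], "differ": []}

    def walk(cmp: filecmp.dircmp, prefix: Path) -> None:
        summary["only_old"] += [str(prefix / n) for n in sorted(cmp.left_only)]
        summary["only_new"] += [str(prefix / n) for n in sorted(cmp.right_only)]
        for name in sorted(cmp.common_files):
            # shallow=False forces a content comparison instead of trusting size/mtime.
            same = filecmp.cmp(Path(cmp.left) / name, Path(cmp.right) / name, shallow=False)
            if not same:
                summary["differ"].append(str(prefix / name))
        for name, sub in sorted(cmp.subdirs.items()):
            walk(sub, prefix / name)

    walk(filecmp.dircmp(old_dir, new_dir), Path("."))
    return summary
```

A call such as `changed_artifacts("artifacts/run_a", "artifacts/run_b")` (directory names are placeholders) returns paths relative to the two roots, which makes the result easy to print or feed to a text diff.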
All settings are available as environment variables or Config dataclass fields.
| Variable | Default | Purpose |
|---|---|---|
| `BMC_AGENT_LLM_MODEL` | `claude-sonnet-4-6` | LLM model |
| `BMC_AGENT_CBMC_PATH` | `cbmc` | CBMC binary path |
| `BMC_AGENT_CBMC_UNWIND` | `4` | Loop unwinding bound |
| `BMC_AGENT_CBMC_TIMEOUT` | `120` | Solver timeout per function (seconds) |
| `BMC_AGENT_MAX_REFINEMENT_ITERS` | `5` | Maximum CEGAR refinement iterations |
| `BMC_AGENT_ENABLE_DUAL_SPEC` | `true` | Generate each spec twice, flag disagreements |
| `BMC_AGENT_ENABLE_SPEC_QUALITY` | `false` | Phase 4 spec-quality checks (mutation, coverage) |
| `BMC_AGENT_SKIP_REFINEMENT` | `false` | FilteringOnly mode: classify but don't refine |
| `BMC_AGENT_ENABLE_DYNAMIC_VALIDATION` | `false` | Phase 3 S3: compile + run a GCC harness |
| `BMC_AGENT_DYNAMIC_VALIDATION_TIMEOUT` | `30` | GCC harness run timeout (seconds) |
| `BMC_AGENT_DYNAMIC_CC_PATH` | `gcc` | C compiler for dynamic harness |
| `BMC_AGENT_ENABLE_REALISM_CHECK` | `false` | Phase 3 S4: LLM realism audit on every `REAL_BUG` finding |
| `BMC_AGENT_ENABLE_REALISM_THINKING` | `false` | Use extended thinking in the realism checker (higher quality, slower) |
| `BMC_AGENT_CBMC_UNSIGNED_OVERFLOW_CHECK` | `false` | Pass `--unsigned-overflow-check` to CBMC; detects integer overflow bugs (e.g. `calloc` `nmemb*size` wrap, CWE-190) |
| `BMC_AGENT_CBMC_REAL_LIBC` | `false` | Skip Python-side preprocessing and let CBMC see the real libc headers; required for sources that include `stdio.h` / `stdlib.h` directly (OpenSSL, libxml2, llm.c, …) |
| `BMC_AGENT_STRICT_DSL` | `false` | Forbid natural-language clauses in `pre`/`post`; pushes prose into the JSON `reasoning` field. Required for parser-state-heavy code where the LLM otherwise defaults to prose |
| `BMC_AGENT_RAW_BYTES` | `false` | Treat single `const char *` parameters as raw N-byte buffers (no NUL constraint). Required for wire-format readers that may read past `strlen` |
| `BMC_AGENT_INFER_FIELD_VALIDITY` | `false` | For struct-pointer parameters, init primitive-pointer fields (`float *`, `int *`, `double *`, …) as "NULL OR malloc'd backing buffer" instead of leaving them nondet. Prevents the harness from exploring "non-NULL but invalid" states that crash `memset(field, …)` despite a correct `if (field != NULL)` guard in source. Target audience: ML / numerics codebases (llm.c, ggml) whose struct fields are typed `float *` and aren't NUL-terminated strings |
| `BMC_AGENT_INFER_ARRAY_PARAM_BOUNDS` | `false` | For top-level primitive-pointer parameters (`size_t *`, `int *`, `float *`, …), size the harness backing array from the maximum literal subscript in the function body (capped by `BMC_AGENT_INFER_ARRAY_PARAM_BOUNDS_MAX`, default 64). Prevents the harness from emitting a 1-element backing for functions that write a fixed-size parameter table (e.g. llm.c's `fill_in_parameter_sizes` writing `param_sizes[0..15]`) |
| `BMC_AGENT_SCALE_DOWN` | `false` | Scale-down mode for ML / numerics kernels. Bounds ML parametric-size value parameters (`B`, `T`, `C`, `NH`, `V`, `Vp`, `OC`, …) to `[0, BMC_AGENT_SCALE_DOWN_SIZE]` (default 4) via `__CPROVER_assume`, auto-enables `INFER_ARRAY_PARAM_BOUNDS`, and sizes top-level primitive-pointer backing buffers to `SCALE_DOWN_SIZE`³. Required for matmul/attention/layernorm kernels which would otherwise time out exploring arbitrarily-large `B*T*C` inner loops |
| `BMC_AGENT_SAFETY_ONLY` | `false` | Restrict the spec-generation prompt so postconditions express only memory safety, range bounds, and NaN/Inf-freedom, with no functional / algebraic correctness claims. Pairs naturally with `BMC_AGENT_SCALE_DOWN` for ML kernels: the result is a clean "memory-safe + no-NaN" verdict instead of a vacuous functional-spec attempt that times out |
| `BMC_AGENT_KANI_PATH` | `kani` | Kani binary path (Rust backend) |
| `BMC_AGENT_KANI_UNWIND` | `4` | Kani loop unwinding bound |
| `BMC_AGENT_KANI_TIMEOUT` | `120` | Kani solver timeout per harness (seconds) |
| `BMC_AGENT_KANI_SLICE_BOUND` | `4` | Bounded length used for `&[T]` slice / `Vec<T>` backing arrays in Kani harnesses |
| `BMC_AGENT_KANI_REAL_CRATE` | `false` | Run Kani via `cargo kani --tests --harness` inside the real crate root (multi-crate workspaces) instead of as a standalone `kani harness.rs` invocation |
| `BMC_AGENT_ENABLE_FEEDBACK_LOOP` | `false` | Enable the self-improvement loop: (a) developer code-changes, (b) function-spec invariant tightening, (c) project-wide invariant inference, with in-sweep iteration |
| `BMC_AGENT_ENABLE_FLAG_SELECTION` | `false` | Let the LLM select CBMC flags per function based on observed properties |
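
A dataclass whose defaults are read from the environment is one common way to offer both configuration routes at once. The sketch below covers four of the variables in the table and is not the project's `Config` class: `SketchConfig`, its field names, and the accepted boolean spellings are assumptions made for illustration.

```python
import os
from dataclasses import dataclass, field


def _env_bool(name: str, default: bool) -> bool:
    """Read a boolean environment variable (accepted spellings are an assumption)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def _env_int(name: str, default: int) -> int:
    """Read an integer environment variable, falling back to the default."""
    return int(os.environ.get(name, default))


@dataclass
class SketchConfig:
    # Defaults mirror the table; field names are hypothetical.
    llm_model: str = field(
        default_factory=lambda: os.environ.get("BMC_AGENT_LLM_MODEL", "claude-sonnet-4-6"))
    cbmc_unwind: int = field(
        default_factory=lambda: _env_int("BMC_AGENT_CBMC_UNWIND", 4))
    cbmc_timeout: int = field(
        default_factory=lambda: _env_int("BMC_AGENT_CBMC_TIMEOUT", 120))
    skip_refinement: bool = field(
        default_factory=lambda: _env_bool("BMC_AGENT_SKIP_REFINEMENT", False))
```

With this shape, `SketchConfig()` picks up whatever is exported in the shell, while `SketchConfig(cbmc_unwind=8)` overrides a single field explicitly in code.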
BMC_AGENT_SKIP_REFINEMENT=true is the FilteringOnly ablation: running the same input with and without this flag measures whether the refinement loop adds value beyond simple counterexample filtering.
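The ablation amounts to running the same command twice with one environment variable flipped. The sketch below only builds the two command and environment pairs; `ablation_runs` and the `full` / `filtering_only` output subdirectories are hypothetical, while the `bmc-agent verify` flags are the ones shown in this document.

```python
import os


def ablation_runs(source: str, driver: str, out_root: str) -> list[tuple[list[str], dict[str, str]]]:
    """Build (command, environment) pairs for the full run and the FilteringOnly run."""
    runs = []
    for label, skip in (("full", "false"), ("filtering_only", "true")):
        cmd = [
            "uv", "run", "bmc-agent", "verify",
            "--source", source,
            "--driver", driver,
            "--output", f"{out_root}/{label}",  # separate directory per run (assumed layout)
        ]
        # Copy the current environment and override only the ablation flag.
        env = {**os.environ, "BMC_AGENT_SKIP_REFINEMENT": skip}
        runs.append((cmd, env))
    return runs
```

Each pair can then be handed to `subprocess.run(cmd, env=env)`, and the two output directories compared afterwards to see what the refinement loop changed.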
uv run bmc-agent verify --source path/to/your_module.rs \
  --driver your_module \
  --output artifacts/
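AProver is a promising AI workflow that offers a tool for automated code verification and for improving AI-generated code. It still needs further development and testing.
The tool's license is listed as NOASSERTION, which means no specific license has been identified. For commercial use, read the license terms in the repository carefully and seek legal advice if needed.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.
Validate the tool thoroughly in a sandbox or test environment before deploying it to production, and carry out the necessary security assessment.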
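Overall, AProver: AI Workflow is a good-quality agent workflow that is reasonably competitive among comparable tools. AI Skill Hub will keep tracking its updates. Consider bookmarking it and adopting it when the timing suits your own use case.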
| Field | Value |
|---|---|
| Original name | aprover |
| Original description | Open-source AI workflow: AProver: Agentic Prover for AI-Generated Code - LLM agents + BMC for automated v… ⭐9 · Python |
| Topics | workflow, bounded-model-checking, cbmc, formal-verification, llm-agents, program-verification |
| GitHub | https://github.com/agentic-prover/aprover |
| License | NOASSERTION |
| Language | Python |
Listed: 2026-05-28 · Updated: 2026-05-30 · License: NOASSERTION