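AI Skill Hub strongly recommends it: LLM Security Assessment Framework is a high-quality AI tool. Its AI composite score is 8.0, a solid result among comparable tools. If you are looking for a dependable AI tool, it is worth a closer look.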
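LLM Security Assessment Framework is an open-source tool written in Python that focuses on AI security, adversarial attacks, and cybersecurity. As an open-source GitHub project, it has active community support and ongoing releases, its code is fully transparent and auditable, and it supports local deployment to protect data privacy. Whether used individually or integrated into an enterprise workflow, it offers a stable, reliable solution.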
# Option 1: install with pip (recommended)
pip install llm-security-assessment-framework
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install llm-security-assessment-framework
# Option 3: install from source (for the latest features)
git clone https://github.com/Coff0xc/LLM-Security-Assessment-Framework
cd LLM-Security-Assessment-Framework
pip install -e .
# Verify the installation
python -c "import llm_security_assessment_framework; print('Installation succeeded')"
# Command-line usage
llm-security-assessment-framework --help
# Basic usage
llm-security-assessment-framework input_file -o output_file
# Calling from Python code
import llm_security_assessment_framework
# Example
result = llm_security_assessment_framework.process("input")
print(result)
# llm-security-assessment-framework config file example (config.yml)
app:
  name: "llm-security-assessment-framework"
  debug: false
  log_level: "INFO"
# Specify the config file at runtime
llm-security-assessment-framework --config config.yml
# Or configure through environment variables
export LLM_SECURITY_ASSESSMENT_FRAMEWORK_API_KEY="your-key"
export LLM_SECURITY_ASSESSMENT_FRAMEWORK_OUTPUT_DIR="./output"
FORGEDAN is a report-oriented LLM security assessment framework based on the paper FORGEDAN: An Evolutionary Framework for Jailbreaking Aligned Large Language Models. The project now focuses on producing reproducible assessment deliverables: YAML-driven test suites, deterministic scanners and scorers, evidence matrices, risk registers, coverage summaries, schema-validated report packs, QA handoff receipts, and ZIP archives that can be verified after copying or sharing.
The framework still includes evolutionary jailbreak attacks, model adapters, WebScan utilities, a REST API, and a Vue dashboard. The primary project goal, however, is assessment report production and handoff confidence, not a commercial security platform.
Use this wording for the GitHub repository sidebar:
Report-first LLM security assessment framework for reproducible red-team suites, evidence packs, QA receipts, schemas, and archive verification.
Chinese version:
面向报告交付的 LLM 安全评估框架,用于生成可复现红队套件、证据包、QA 回执、Schema 合约和可校验归档。
Suggested topics:
llm-security, ai-red-team, prompt-injection, jailbreak, owasp-llm, mcp-security, agent-security, security-reporting, risk-register, audit-evidence, json-schema, pytest, python
---

| Category | Features |
|---|---|
| **Report Suites** | YAML suite definitions, inline or imported cases, replay caches, deterministic seeds, policy gates, preflight readiness checks |
| **Report Artifacts** | Markdown/HTML reports, executive summaries, evidence CSVs, case matrices, risk registers, coverage summaries, release notes, public bundle indexes |
| **Evidence Integrity** | JSON Schemas, artifact manifests, SHA256/size checks, cross-artifact consistency checks, redacted-publication leak checks |
| **Handoff QA** | QA receipt JSON/Markdown, acceptance criteria, reviewer decisions, owner/due-date tracking, strict handoff CI gates |
| **Assessment Coverage** | Prompt injection, jailbreak roleplay, system prompt leakage, secrets/PII exposure, Agent/MCP/tool policy risk, model artifact and serialization signals |
| **Baseline Engine** | FORGEDAN, AutoDAN, PAIR, GCG, Crescendo, TAP, model adapters, WebScan, CLI, REST API, Vue dashboard |

```bash
pip install -e ".[web]"
pip install -e ".[all]"
pip install -e ".[dev]"
cd frontend && npm install
```

```bash
cd frontend
npm run dev    # Dev server with hot reload
npm run build  # Production build → dist/
```

---

```bash
git clone https://github.com/Coff0xc/LLM-Security-Assessment-Framework.git
cd LLM-Security-Assessment-Framework
pip install -e .
```
models:
  mock:test-model:
    prompt_usd_per_1k_tokens: 0.01
    completion_usd_per_1k_tokens: 0.02
    source: example-provider-pricing-sheet
For repeatable report reruns, suites can write and replay model responses from
a local JSON cache. Cache keys are derived from the model name and prompt
SHA256, so raw prompt bodies are not stored in the cache file. Cached entries do
store model outputs and usage metadata, so treat the cache as restricted report
evidence. Paths are resolved relative to the suite YAML file. Pair the cache
with `random_seed` when you want repeated evolutionary prompts to replay
deterministically across CLI runs.
response_cache_file: .cache/smoke-response-cache.json
random_seed: 1337
Reports include a Response Cache section with hits, misses, stored entries, and
whether the cache file was updated during the run.
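
The key derivation described above can be sketched as follows. This is a minimal illustration, not the framework's implementation: the `cache_key` helper and the `model:digest` key layout are assumptions; the source only states that keys come from the model name and the prompt SHA256.

```python
import hashlib


def cache_key(model_name: str, prompt: str) -> str:
    """Build a cache key from the model name and the prompt's SHA256 digest.

    Hypothetical helper: the "model:digest" layout is an assumption.
    """
    prompt_digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return f"{model_name}:{prompt_digest}"


key = cache_key("mock:test-model", "example prompt")
print(key)
```

Because only the digest is stored, the same model and prompt always map to the same key while the raw prompt body never appears in the key itself.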
Suites can import MCP server/tool manifests and turn each tool description into
a deterministic report case. The importer recursively reads `tools` arrays from
JSON or YAML manifests, names cases as `mcp-tool-*`, and keeps the manifest
source visible in report scope and `suite-config.json`. Imported cases also
carry structured provenance metadata (`source_type`, `manifest_file`,
`tool_name`, server trust fields, annotation keys, annotation hashes, and
`description_sha256`) in `suite-cases.jsonl` and `suite-case-matrix.csv`.
Nested MCP `annotations` fields are included in the normalized tool text so
malicious metadata buried under schemas or metadata blocks is still reportable.
Each imported MCP case also carries a heuristic `server_trust_score`, and the
Markdown/HTML report includes an MCP Trust Summary with tier counts, highest
score, affected cases, server names, and the score model rationale used to
interpret each tier.
mcp_manifest_file: examples/mcp-server-manifest.json
mcp_manifest_case_category: mcp-manifest
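
A rough sketch of the recursive import described above, under stated assumptions: `collect_tools` and `to_case` are hypothetical helpers, and the `source_type` value is illustrative. Only the `mcp-tool-*` naming and the `description_sha256` field come from the text.

```python
import hashlib
import json


def collect_tools(node):
    """Recursively yield every dict found inside any `tools` array."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "tools" and isinstance(value, list):
                for tool in value:
                    if isinstance(tool, dict):
                        yield tool
            # Keep descending so nested servers or schemas are also searched.
            yield from collect_tools(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_tools(item)


def to_case(tool):
    """Turn one tool entry into a report case with provenance metadata."""
    description = tool.get("description", "")
    return {
        "id": f"mcp-tool-{tool.get('name', 'unnamed')}",
        "source_type": "mcp_manifest",  # assumed value, for illustration only
        "tool_name": tool.get("name"),
        "description_sha256": hashlib.sha256(description.encode("utf-8")).hexdigest(),
    }


manifest = json.loads(
    '{"server": {"name": "demo", "tools": [{"name": "read_file", "description": "Read a file"}]},'
    ' "tools": [{"name": "search", "description": "Search docs"}]}'
)
cases = [to_case(tool) for tool in collect_tools(manifest)]
print([case["id"] for case in cases])
```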
MCP manifests can also be gated by server trust tier. When
`allowed_mcp_trust_tiers` is configured, imported MCP cases with missing or
unapproved `server.trust.tier` metadata become policy violations while the run
still writes the full report pack for reviewer evidence.
policy:
  allowed_mcp_trust_tiers:
    - internal
    - approved
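
The gate can be pictured as a simple membership check. The sketch below assumes each imported case carries a nested `server.trust.tier` value; the `trust_tier_violations` function and the violation messages are hypothetical.

```python
def trust_tier_violations(cases, allowed_tiers):
    """Return (case_id, reason) pairs for missing or unapproved trust tiers."""
    violations = []
    for case in cases:
        tier = case.get("server", {}).get("trust", {}).get("tier")
        if tier is None:
            violations.append((case["id"], "missing trust tier"))
        elif tier not in allowed_tiers:
            violations.append((case["id"], f"unapproved trust tier: {tier}"))
    return violations


cases = [
    {"id": "mcp-tool-search", "server": {"trust": {"tier": "internal"}}},
    {"id": "mcp-tool-shell", "server": {"trust": {"tier": "community"}}},
    {"id": "mcp-tool-fetch", "server": {}},
]
print(trust_tier_violations(cases, allowed_tiers={"internal", "approved"}))
```

Violations are collected rather than raised, which mirrors the described behavior of still writing the full report pack for reviewers.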
MCP trust scores and rationale can be calibrated with a local JSON/YAML policy
file. The policy file is included in Source Inventory with SHA256 and byte size,
and the custom score model flows into case metadata plus the MCP Trust Summary.
mcp_trust_policy_file: mcp-trust-policy.yml
The screenshots below are generated from examples/ready-for-handoff-suite.yml and show the repository's current report-delivery workflow.
```bash
cp .env.example .env
```

```bash
forgedan suite preflight examples/ready-for-handoff-suite.yml --strict --output reports/preflight-ready
forgedan suite run examples/ready-for-handoff-suite.yml --output reports/suite-ready
forgedan suite validate-report reports/suite-ready/suite-result.json
forgedan suite verify-bundle reports/suite-ready/suite-manifest.json
forgedan suite qa-report reports/suite-ready/suite-manifest.json --output reports/suite-ready/qa --strict-handoff
forgedan suite archive reports/suite-ready/suite-manifest.json --output reports/suite-ready/handoff.zip
forgedan suite verify-archive reports/suite-ready/handoff.zip

forgedan run --quick -g "test prompt" -m mock:test

forgedan web                 # Backend at :5000
cd frontend && npm run dev   # Frontend at :5173 → open http://localhost:5173

python -c "
from forgedan import ForgeDAN_Engine, ForgeDanConfig
from forgedan.adapters import ModelAdapterFactory

adapter = ModelAdapterFactory.create_from_string('mock:test-model')
engine = ForgeDAN_Engine(ForgeDanConfig(max_iterations=3, population_size=3))
engine.set_target_llm(adapter.generate_sync)
result = engine.run('{goal}', 'test goal', 'target output')
print(f'Success: {result.success}, Fitness: {result.best_fitness:.4f}')
"
```
The first command sequence is the recommended smoke path for this repository: it runs the no-model preflight, generates a complete report pack, validates the machine-readable artifacts, checks manifest integrity, writes a QA receipt, and packages the deliverable into a ZIP that can be verified after handoff.
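
To illustrate what "verified after handoff" can mean in practice, here is a minimal sketch of a manifest-driven ZIP check. It is not the `forgedan suite verify-archive` implementation: the `manifest.json` layout, the `artifacts` key, and the file names are all assumptions made for the example.

```python
import hashlib
import io
import json
import zipfile


def build_archive(files, tamper=False):
    """Zip the files together with a manifest of their sizes and SHA256 digests."""
    entries = [
        {"path": name, "bytes": len(data), "sha256": hashlib.sha256(data).hexdigest()}
        for name, data in files.items()
    ]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.json", json.dumps({"artifacts": entries}))
        for name, data in files.items():
            # Optionally corrupt the payload after the manifest was computed.
            archive.writestr(name, data + b"!" if tamper else data)
    return buffer.getvalue()


def verify_archive(zip_bytes):
    """Return a list of problems; an empty list means every entry matched."""
    problems = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        manifest = json.loads(archive.read("manifest.json"))
        names = set(archive.namelist())
        for entry in manifest["artifacts"]:
            if entry["path"] not in names:
                problems.append(f"missing: {entry['path']}")
                continue
            data = archive.read(entry["path"])
            if len(data) != entry["bytes"]:
                problems.append(f"size mismatch: {entry['path']}")
            if hashlib.sha256(data).hexdigest() != entry["sha256"]:
                problems.append(f"sha256 mismatch: {entry['path']}")
    return problems


files = {"result.json": b'{"policy_passed": true}', "report.md": b"# Report\n"}
print(verify_archive(build_archive(files)))
print(verify_archive(build_archive(files, tamper=True)))
```

Checking both size and digest is redundant for integrity, but a size mismatch is cheaper to explain to a reviewer than a bare hash difference.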
---
forgedan run -g "goal" -m "provider:model" # Run attack
forgedan run --quick -g "goal" # Quick demo (3 iterations)
forgedan test -m "provider:model" # Test model connectivity
forgedan suite run examples/smoke-suite.yml # Run a reproducible YAML suite with prompt/response scans
forgedan suite run examples/smoke-suite.yml --run-id-dir # Archive under output/<run_id> to avoid overwrites
forgedan suite run examples/agent-tool-suite.yml # Generate an Agent/MCP/RAG report pack
forgedan suite run examples/tool-policy-suite.yml # Generate an expected policy-fail tool-policy pack
forgedan suite run examples/mcp-manifest-suite.yml # Import MCP tool metadata into report cases
forgedan suite run examples/mcp-trust-policy-suite.yml # Generate an expected policy-fail MCP trust pack
forgedan suite run examples/mcp-trust-calibrated-suite.yml # Apply a local MCP trust score policy file
forgedan suite run examples/model-artifact-suite.yml # Import local model artifacts into report evidence
forgedan suite run examples/model-serialization-suite.yml # Scan local model serialization files without loading them
forgedan suite run examples/coverage-policy-suite.yml # Enforce required coverage gates
forgedan suite run examples/duplicate-evidence-suite.yml # Demonstrate duplicate evidence grouping
forgedan suite run examples/report-metadata-suite.yml # Generate a report with formal assessment metadata
forgedan suite run examples/acceptance-criteria-suite.yml # Add report acceptance criteria/sign-off gates
forgedan suite run examples/ready-for-handoff-suite.yml # Generate a fully passing report handoff sample
forgedan suite run examples/review-decision-suite.yml # Document accepted risk / reviewer decisions
forgedan suite run examples/risk-register-owner-suite.yml # Pre-fill risk register owner/status/due date
forgedan suite run examples/cost-pricing-suite.yml # Estimate report usage cost from suite pricing inputs
forgedan suite run examples/custom-scorer-suite.yml # Run a suite-defined reusable contains scorer
forgedan suite run examples/cached-response-suite.yml # Replay repeat report runs from a local response cache
forgedan suite preflight examples/ready-for-handoff-suite.yml --output reports/preflight # Check report readiness before model execution
forgedan suite compare base.json curr.json # Compare two suite-result.json artifacts
forgedan suite taxonomy --json # Export report finding taxonomy
forgedan suite schemas --json # Export report artifact schema references
forgedan suite validate-report suite-result.json # Validate a report artifact contract
forgedan suite verify-bundle suite-manifest.json # Verify report pack checksums and schemas
forgedan suite archive suite-manifest.json --output handoff.zip # Zip a verified report or comparison pack
forgedan suite verify-archive handoff.zip # Verify archive checksums, schemas, and suite cross-artifact consistency
forgedan suite qa-report suite-manifest.json # Write JSON/Markdown QA handoff receipt
forgedan suite qa-report suite-manifest.json --strict-handoff # Fail when readiness is not passed
forgedan report --input logs/attacks/ # Generate report
forgedan web # Launch web dashboard
forgedan defense generate --input logs/ # Generate defense training data
forgedan info # Show framework info
forgedan distributed coordinator # Start distributed coordinator
Before spending time or provider budget on a report run, `forgedan suite preflight <suite.yml>` performs a no-model readiness audit. It checks that the suite has formal report metadata, handoff acceptance criteria, risk-register owner/due-date defaults, policy/coverage gates, deterministic replay controls, valid scorer names, source inventory provenance, MCP trust policy when MCP manifests are imported, and an explicit model-serialization scope note when heuristic artifact scanning is used. Add `--output <dir>` to write `suite-preflight.json` and `suite-preflight.md`; `suite run` also includes the same preflight artifacts in every generated report pack. The JSON artifact is covered by `schemas/suite-preflight.schema.json` and can be validated with `forgedan suite validate-report <dir>/suite-preflight.json`. The command exits non-zero only for failed checks by default; add `--strict` to also fail on `review_required` items, or `--json` to print the audit for CI/archive scripts.
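
The exit behavior described above can be summarized in a few lines. This is a sketch under assumptions: the `preflight_exit_code` function is hypothetical, and the exit value `1` is illustrative since the text only says "non-zero".

```python
def preflight_exit_code(check_statuses, strict=False):
    """Map preflight check statuses to a process exit code.

    Failed checks always fail the run; review_required items only fail
    it when strict mode is requested.
    """
    if "failed" in check_statuses:
        return 1
    if strict and "review_required" in check_statuses:
        return 1
    return 0


statuses = ["passed", "review_required", "passed"]
print(preflight_exit_code(statuses))
print(preflight_exit_code(statuses, strict=True))
```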
Suites can keep cases inline or load a reusable case file:
cases_file: examples/cases/prompt-injection-mini.jsonl
scorers:
- target_prefix
- refusal
- response_safety
Suites can also define lightweight reusable deterministic scorers. A contains scorer records whether the model response includes required reviewer-facing text and stores the scorer output in each case result, the case matrix, and score summary.
scorers:
  - refusal_phrase_present
scorer_definitions:
  - name: refusal_phrase_present
    type: contains
    text: cannot help
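
As an illustration of what a `contains` scorer does, the sketch below builds one from a definition like the one above. The `make_contains_scorer` factory and the result keys are hypothetical, and case-sensitive matching is an assumption.

```python
def make_contains_scorer(name, text):
    """Create a deterministic scorer that checks for required text."""

    def score(response):
        # Case-sensitive substring match; the real matching rules may differ.
        return {"scorer": name, "matched": text in response}

    return score


scorer = make_contains_scorer("refusal_phrase_present", "cannot help")
print(scorer("Sorry, I cannot help with that request."))
print(scorer("Sure, here is the answer."))
```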
When a suite imports `cases_file`, `mcp_manifest_file`, or `model_artifact_files`, reports include a Source Inventory section with each input path, SHA256, byte size, and generated case count. The same inventory is stored in `suite-config.json` for audit replay and handoff checks, and the report schemas require it in both `suite-config.json` and the embedded `suite-result.json` configuration snapshot. `validate-report` also checks that the report Source Inventory counts match its entries and that the embedded suite configuration snapshot matches the report section.
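
A Source Inventory entry of the kind described above could be computed roughly like this. The `inventory_entry` helper and its field names are assumptions; only the recorded facts (path, SHA256, byte size, generated case count) come from the text.

```python
import hashlib
import tempfile
from pathlib import Path


def inventory_entry(path, generated_cases):
    """Describe one imported input file for audit replay."""
    data = Path(path).read_bytes()
    return {
        "path": str(path),
        "sha256": hashlib.sha256(data).hexdigest(),
        "bytes": len(data),
        "generated_cases": generated_cases,
    }


# Demonstrate on a throwaway two-case JSONL file.
with tempfile.TemporaryDirectory() as tmp:
    case_file = Path(tmp) / "cases.jsonl"
    case_file.write_bytes(b'{"id": "case-1"}\n{"id": "case-2"}\n')
    entry = inventory_entry(case_file, generated_cases=2)

print(entry["bytes"], entry["generated_cases"], entry["sha256"])
```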
Suites can include formal report metadata for assessment handoff. These fields flow into suite-config.json, suite-result.json, and the Markdown/HTML report metadata section. The public redacted report replaces client, author, and reviewer names with stable placeholders.
report_metadata:
  assessment_id: LLM-REPORT-001
  report_title: LLM Security Assessment Report
  client: Example Corp
  authors:
    - Security Assessment Team
  reviewers:
    - Report QA Lead
  classification: Confidential
  assessment_start: "2026-05-01"
  assessment_end: "2026-05-31"
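
The stable-placeholder redaction mentioned above might look like the following sketch. The `redact_metadata` function and the placeholder format (`[CLIENT]`, `[AUTHOR-1]`, `[REVIEWER-1]`) are hypothetical; the actual placeholders used by the framework are not shown in the source.

```python
def redact_metadata(metadata):
    """Replace client, author, and reviewer names with stable placeholders."""
    redacted = dict(metadata)
    if "client" in redacted:
        redacted["client"] = "[CLIENT]"
    # Index-based placeholders stay the same across reruns of the same suite.
    redacted["authors"] = [
        f"[AUTHOR-{index}]" for index, _ in enumerate(metadata.get("authors", []), start=1)
    ]
    redacted["reviewers"] = [
        f"[REVIEWER-{index}]" for index, _ in enumerate(metadata.get("reviewers", []), start=1)
    ]
    return redacted


metadata = {
    "assessment_id": "LLM-REPORT-001",
    "client": "Example Corp",
    "authors": ["Security Assessment Team"],
    "reviewers": ["Report QA Lead"],
}
print(redact_metadata(metadata))
```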
Suites can also define report acceptance criteria for QA and sign-off workflows. Each item is carried into `suite-config.json`, `suite-result.json`, Markdown/HTML reports, and the QA receipt. A `failed` item blocks `ready_for_handoff`, while `review_required` keeps the acceptance section visible without marking the bundle as fully accepted. Acceptance criteria whose IDs match handoff checklist items, such as `residual-risk-owner-signoff`, `raw-artifact-handling`, and `limitations-reviewed`, can also turn those QA receipt items from `review_required` into `passed` when the criterion is marked `passed` or `accepted_risk`.
acceptance_criteria:
  - id: evidence-reviewed
    title: Evidence matrix reviewed
    status: passed
    owner: QA Lead
    evidence: suite-evidence.csv
    notes: Evidence rows sampled against the Markdown report.
  - id: residual-risk-owner-signoff
    title: Residual risk owner sign-off complete
    status: review_required
    owner: Risk Owner
    evidence: suite-risk-register.json
    notes: Awaiting final residual risk owner approval.
  - id: raw-artifact-handling
    title: Raw artifact handling reviewed
    status: passed
    owner: QA Lead
    evidence: Raw prompts and responses restricted to authorized reviewers.
  - id: limitations-reviewed
    title: Limitations reviewed
    status: passed
    owner: QA Lead
    evidence: Report limitations match the scoped assessment.
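
The way individual criteria roll up into an overall acceptance state can be sketched as below. The `acceptance_state` function is a hypothetical simplification; it treats `passed` and `accepted_risk` as accepted, lets `review_required` hold back full acceptance, and lets any `failed` item block handoff.

```python
def acceptance_state(criteria):
    """Summarize acceptance criteria into one overall state."""
    statuses = [item["status"] for item in criteria]
    if "failed" in statuses:
        return "failed"  # blocks ready_for_handoff
    if "review_required" in statuses:
        return "review_required"  # visible, but not fully accepted
    return "passed"


criteria = [
    {"id": "evidence-reviewed", "status": "passed"},
    {"id": "residual-risk-owner-signoff", "status": "review_required"},
    {"id": "raw-artifact-handling", "status": "passed"},
    {"id": "limitations-reviewed", "status": "passed"},
]
print(acceptance_state(criteria))
```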
Policy failures can be paired with reviewer decisions without changing policy_passed. This gives the report pack a decision log for accepted risk, approvals, required mitigations, or rejected exceptions, and the QA receipt records whether policy exceptions have documented decisions.
review_decisions:
  - id: accept-demo-residual-risk
    title: Accept demo residual risk for report pack
    status: accepted_risk
    owner: QA Lead
    related_policy_violations:
      - max_risk_score
    related_cases:
      - injection-case
    evidence: Assessment owner accepted this residual risk for a controlled report demo.
    notes: Re-review before external publication.
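
One way to check whether every policy exception has a documented decision is sketched here. The `undocumented_violations` helper is hypothetical and the `required_coverage` violation name is illustrative; note that the check reads the decisions without altering any pass/fail flag.

```python
def undocumented_violations(policy_violations, review_decisions):
    """Return policy violations that no reviewer decision refers to."""
    documented = {
        violation
        for decision in review_decisions
        for violation in decision.get("related_policy_violations", [])
    }
    return [violation for violation in policy_violations if violation not in documented]


decisions = [
    {
        "id": "accept-demo-residual-risk",
        "status": "accepted_risk",
        "related_policy_violations": ["max_risk_score"],
    }
]
print(undocumented_violations(["max_risk_score", "required_coverage"], decisions))
```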
Risk registers can be pre-filled with suite-level remediation tracking defaults. These values flow into suite-risk-register.json and suite-risk-register.csv for each generated finding, so the report pack can be handed to the owner without manually editing blank tracking columns first.
risk_register_defaults:
  owner: AppSec Team
  status: open
  due_date: "2026-06-30"
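
Applying suite-level defaults to each finding amounts to a dictionary merge. In this sketch the `apply_register_defaults` helper is hypothetical, and letting finding-specific values override the defaults is an assumption.

```python
def apply_register_defaults(findings, defaults):
    """Fill tracking columns on each finding from suite-level defaults."""
    # Values already present on a finding win over the defaults.
    return [{**defaults, **finding} for finding in findings]


defaults = {"owner": "AppSec Team", "status": "open", "due_date": "2026-06-30"}
findings = [
    {"id": "finding-1", "title": "Prompt injection accepted"},
    {"id": "finding-2", "title": "Secret exposure", "owner": "Platform Team"},
]
for row in apply_register_defaults(findings, defaults):
    print(row)
```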
Suites can include externally maintained model pricing inputs for reproducible usage cost estimates. ForgeDAN does not fetch live prices; the report records the source string you provide and computes estimated_cost_usd from observed prompt and completion tokens. You can keep rates inline, or point the suite at a local JSON/YAML pricing catalog. Catalog files are included in Source Inventory with SHA256 and size metadata so report reviewers can audit the price source used for cost calculations.
usage_pricing:
  prompt_usd_per_1k_tokens: 0.01
  completion_usd_per_1k_tokens: 0.02
  source: pricing-sheet-v1

usage_pricing_file: usage-pricing-catalog.yml
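
The cost estimate is plain per-1k-token arithmetic. The sketch below assumes the pricing keys shown above; the `estimated_cost_usd` function name mirrors the report field but is otherwise hypothetical, as are the token counts.

```python
def estimated_cost_usd(prompt_tokens, completion_tokens, pricing):
    """Estimate usage cost from observed tokens and per-1k-token rates."""
    prompt_cost = prompt_tokens / 1000 * pricing["prompt_usd_per_1k_tokens"]
    completion_cost = completion_tokens / 1000 * pricing["completion_usd_per_1k_tokens"]
    return prompt_cost + completion_cost


pricing = {
    "prompt_usd_per_1k_tokens": 0.01,
    "completion_usd_per_1k_tokens": 0.02,
    "source": "pricing-sheet-v1",
}
cost = estimated_cost_usd(prompt_tokens=1500, completion_tokens=500, pricing=pricing)
print(round(cost, 6), pricing["source"])
```

With these rates, 1,500 prompt tokens and 500 completion tokens come to roughly 0.015 + 0.010 USD.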
| Method | Type | Description | Paper |
|---|---|---|---|
| **FORGEDAN** | Evolutionary | Multi-level character/word/sentence mutation combined with semantic fitness and a dual-judge mechanism | [arXiv:2511.13548](https://arxiv.org/abs/2511.13548) |
| **AutoDAN** | Evolutionary | Hierarchical genetic algorithm for stealthy jailbreak prompts | [ICLR 2024](https://arxiv.org/abs/2310.04451) |
| **PAIR** | LLM-iterative | Black-box jailbreaking through attacker-target LLM iteration | [NeurIPS 2024](https://arxiv.org/abs/2310.08419) |
| **GCG** | Gradient-free | Adversarial suffix generation based on greedy coordinate search | [ICML 2023](https://arxiv.org/abs/2307.15043) |
| **Crescendo** | Multi-turn | Multi-turn attack that escalates gradually from low-risk content to high-risk requests | [USENIX Security 2025](https://arxiv.org/abs/2404.01833) |
| **TAP** | Tree search | Tree-of-thought attack search with pruning and multi-LLM collaboration | [NeurIPS 2024](https://arxiv.org/abs/2312.02119) |

1. Run `forgedan suite preflight` to catch missing metadata, weak handoff criteria, unresolved scorer names, missing provenance, and incomplete deterministic replay settings before spending provider budget.
2. Run `forgedan suite run`; the run writes raw and redacted machine-readable artifacts, Markdown/HTML reports, CSV matrices, coverage, risk register, release notes, and a manifest.
3. Run `forgedan suite validate-report` and `forgedan suite verify-bundle`; these checks bind schemas, hashes, summary counts, redacted artifacts, Markdown/HTML sidecars, and cross-artifact identities back to the source result.
4. Run `forgedan suite qa-report --strict-handoff`; the receipt records checklist status, blockers, acceptance criteria, source inventory, schema checks, and reviewer-facing evidence.
5. Run `forgedan suite archive` and `forgedan suite verify-archive`; generated release notes and the full report bundle index carry these handoff commands, and `verify-bundle` checks that they stay present. The same archive flow supports normal suite report packs and historical comparison packs.

| Provider | Models | Config example |
|---|---|---|
| OpenAI | GPT-3.5, GPT-4, GPT-4o | openai:gpt-4 |
| Anthropic | Claude 3 Opus/Sonnet/Haiku | anthropic:claude-3-opus |
| Google | Gemini Pro, Gemini Vision | gemini:gemini-pro |
| DeepSeek | DeepSeek-Chat, DeepSeek-Coder | deepseek:deepseek-chat |
| Zhipu / 智谱 | GLM-4, GLM-3 | zhipu:glm-4 |
| Qwen / 通义千问 | Qwen-Max, Qwen-Plus | qwen:qwen-max |
| Moonshot / 月之暗面 | Kimi | moonshot:moonshot-v1-8k |
| Yi / 零一万物 | Yi-Large, Yi-Medium | yi:yi-large |
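
The config strings in the table follow a `provider:model` shape. A minimal parser for that shape is sketched below; `parse_model_spec` is a hypothetical helper and is not the framework's `ModelAdapterFactory` logic.

```python
def parse_model_spec(spec):
    """Split a "provider:model" string at the first colon."""
    provider, separator, model = spec.partition(":")
    if not separator or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    return provider, model


for spec in ["openai:gpt-4", "moonshot:moonshot-v1-8k", "mock:test-model"]:
    print(parse_model_spec(spec))
```

Splitting at the first colon only keeps any later colons inside the model name intact.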

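A high-quality LLM security assessment framework that provides multiple attack methods.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute a legal endorsement of the tool's functionality or quality.
Validate the tool thoroughly in a sandbox or test environment before deploying it to production, and carry out the necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. It allows free commercial use, modification, and distribution, and only requires that the copyright notice be retained.
Overall, LLM Security Assessment Framework is a high-quality AI tool that is reasonably competitive among comparable tools. AI Skill Hub will keep tracking its updates; consider bookmarking it and adopting it when it fits your own use case.

| Field | Value |
|---|---|
| Original name | LLM-Security-Assessment-Framework |
| Original description | Open-source AI tool: FORGEDAN - An Evolutionary Framework for LLM Security Assessment \| 6 Attack Meth. ⭐22 · Python |
| Topics | AI security, adversarial attacks, cybersecurity, evolutionary algorithms, Python |
| GitHub | https://github.com/Coff0xc/LLM-Security-Assessment-Framework |
| License | MIT |
| Language | Python |

Listed: 2026-06-03 · Updated: 2026-06-03 · License: MIT · AI Skill Hub makes no legal endorsement of the accuracy of third-party content.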