Return token usage directly from the execute_llm call (e.g., as part of the response tuple) rather than relying on shared mutable state on the gateway. Alternatively, use contextvars or make the gateway return usage per-call.
Consider using a lightweight dependency injection container or a dataclass-based AppContext that is passed through, rather than module-level global state. Alternatively, wrap the CLI app in a class that accepts the factory in __init__. This would make the dependency explicit and testable.
Either rename the dimension to 'dependency' in the Literal type and update all references, or rename the agent role to 'maintainability'. The current hybrid creates confusion. If 'maintainability' is the intended dimension, update DIMENSION_LABELS and documentation to clarify that the 'dependency' agent covers the 'maintainability' dimension.
Extract _prioritize_source_files and _read_key_source_files into a use case (e.g., use_cases/prepare_codebase.py) or an adapter. Extract _build_sarif into a dedicated adapter (e.g., adapters/sarif_adapter.py). The composition root should only orchestrate, not implement business rules.
Consider sanitizing or escaping closing XML tags (e.g., </analyzed_code>) within user-controlled content before embedding it in prompts. Additionally, consider using Anthropic's native tool-use or structured input features rather than string interpolation for data boundaries.
CritiqueAgent validated findings using extended thinking. 7 of 61 findings were individually confirmed.
AI-assisted screening based on finding text. Not a substitute for professional penetration testing.
AI-estimated composite score. Consult qualified advisors for investment decisions.
Keyword-based mapping to AICPA Common Criteria (CC1–CC9). Not a formal SOC 2 audit.
Keyword-based mapping to PCI DSS 4.0 Requirement 6 (Secure Systems & Software). Not a formal PCI assessment.
Keyword-based mapping to NIST Cybersecurity Framework 2.0 (6 Functions). Not a formal NIST assessment.
Measures finding distribution across files. For contributor-based bus factor analysis, use git blame tools.
Based on $175/hr senior engineer rate and ~4 hours for equivalent manual review. Actual costs vary.