Your codebase scores B+ (86/100) — strong performance with documentation gaps
Top Strengths
Performance: A (92)
Security: A (91)
Maintainability: B+ (86)
Key Concerns
Documentation: C+ (76)
Architecture: B (81)
Quality: B+ (86)
Severity Distribution
medium (16)
low (20)
info (14)
50 findings · 6 agents · 248s · ~88h tech debt
What to Fix First
01 · medium · Massive duplication between APIResponse and LegacyAPIResponse · src/anthropic/_legacy_response.py:188
Extract a shared `_parse_to_python(response, cast_to, *, is_stream, stream_cls, client, options)` free function and call it from both LegacyAPIResponse._parse and BaseAPIResponse._parse. Keep the public surfaces (sync vs async parse(), property-vs-method) divergent but eliminate the duplicated type-dispatch ladder.
02 · medium · Sync/async request loop duplicated nearly verbatim in BaseClient subclasses · src/anthropic/_base_client.py:877
Factor the retry/error-handling pipeline into a generic helper that takes `send`/`sleep`/`close_response` callables (one sync set, one async set). Alternatively, push the loop body into pure functions that operate on already-resolved `httpx.Response` and let each subclass handle only the I/O verbs. This would reduce maintenance risk and ensure parity.
03 · medium · SSE event-name dispatch is a hand-maintained string ladder duplicated sync/async · src/anthropic/_streaming.py:96
Define a module-level `_KNOWN_EVENT_TYPES: frozenset[str]` (or a small dispatch table) shared by both Stream and AsyncStream. The dispatch loop becomes `if sse.event in _KNOWN_EVENT_TYPES:` and the two stream implementations stay structurally identical. Better still, drive event handling from a generated registry tied to the OpenAPI spec.
04 · medium · Public package __init__ imports from lib/* unconditionally, coupling generated client to hand-written extensions · src/anthropic/__init__.py:89
Use the existing `_resources_proxy`-style lazy attribute pattern for optional provider clients, or guard imports with `try/except ImportError` and only re-export when the optional extras are installed. This preserves the layering: generated core has zero runtime dependency on hand-written lib/, and provider variants pay-for-what-you-use.
05 · medium · BaseClient leaks subclass concerns: hard-coded model token table and Anthropic-specific timeout calc · src/anthropic/_base_client.py:570
Move `_calculate_nonstreaming_timeout` and the model-token table into the messages resource (resources/messages/messages.py) where the streaming/non-streaming policy lives. The base client should expose only generic `Timeout` helpers; product policy belongs at the resource layer.
Industry median: 68 · Well above industry median for Python projects
B grades represent well-maintained codebases with room for improvement
✓ Resource layer cleanly separated and dependency direction is correct: SyncAPIResource / AsyncAPIResource (in _resource.py) are minimal, depend only on SyncAPIClient / AsyncAPIClient, and bind the HTTP verbs as bound methods. Resources never import sibling resources; cross-cutting concerns (retry, auth, serialization) live in the base client. Provider variants (lib/bedrock, lib/vertex, lib/aws) extend the same SyncAPIClient/AsyncAPIClient hierarchy rather than re-implementing it. The dependency direction is consistent: __init__ → _client → _base_client → _response/_streaming/_models, with no upward leaks. This is mature SDK architecture.
✓ Exception handler swallows all errors during async client cleanup: AsyncHttpxClientWrapper.__del__ uses a broad `except Exception: pass` to catch any error during shutdown, which is acceptable for destructors but hides resource-cleanup failures (connection-pool leaks, file descriptor exhaustion). In long-running services this could mask connection exhaustion. CWE-755.
✓ Pydantic-version conflict groups configured properly: `[tool.uv].conflicts` correctly declares mutual exclusion between `pydantic-v1`/`pydantic-v2` groups and between `pydantic-v1` and the `mcp` extra. Combined with the locked `index = [{ url = "https://pypi.org/simple", default = true }]` this gives reproducible installs from public PyPI regardless of contributor uv config, which is strong supply-chain hygiene.
CritiqueAgent validated findings using extended thinking. 4 of 50 findings were individually confirmed.
Architecture (9)
medium · Massive duplication between APIResponse and LegacyAPIResponse · ~6.0h
src/anthropic/_legacy_response.py:188-320
_response.py and _legacy_response.py contain near-identical parsing logic in BaseAPIResponse._parse and LegacyAPIResponse._parse (handling of TypeAlias unwrap, Annotated unwrap, JSONLDecoder, stream_cls, NoneType, str/int/float/bool, httpx.Response, BaseModel coercion, content-type parsing). Both code paths must be maintained in lockstep; bugs fixed in one are easily missed in the other. While dual existence is justified for backward compatibility, the implementation could share a private `_parse_impl` helper or compose via inheritance to remove ~200 lines of duplicated branching.
Recommendation: Extract a shared `_parse_to_python(response, cast_to, *, is_stream, stream_cls, client, options)` free function and call it from both LegacyAPIResponse._parse and BaseAPIResponse._parse. Keep the public surfaces (sync vs async parse(), property-vs-method) divergent but eliminate the duplicated type-dispatch ladder.
medium · Sync/async request loop duplicated nearly verbatim in BaseClient subclasses · ~12.0h
src/anthropic/_base_client.py:877-1004
SyncAPIClient.request and AsyncAPIClient.request implement the same retry/backoff/timeout/HTTP-status-error pipeline (lines ~877-1004 vs ~1182-1320 in _base_client.py) with only `await`/`anyio.sleep` vs `time.sleep` differing. This is hundreds of lines of duplicated control flow — any change to retry semantics, logging, or error mapping must be made twice and stay consistent. Same applies to `_process_response`, `_sleep_for_retry`, and the post/patch/put/delete shells.
Recommendation: Factor the retry/error-handling pipeline into a generic helper that takes `send`/`sleep`/`close_response` callables (one sync set, one async set). Alternatively, push the loop body into pure functions that operate on already-resolved `httpx.Response` and let each subclass handle only the I/O verbs. This would reduce maintenance risk and ensure parity.
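The second option in this recommendation can be sketched roughly as follows: the retry decision lives in a pure function, and each client keeps only its own I/O. Everything here is illustrative. `next_retry_delay`, `request_sync`, the retryable status set, and the backoff constants are hypothetical names and values, not the SDK's.

```python
from typing import Callable, Optional, Tuple

# Hypothetical policy values for illustration; not taken from the SDK.
RETRYABLE_STATUS = frozenset({408, 409, 429})
INITIAL_DELAY = 0.5
MAX_DELAY = 8.0


def next_retry_delay(status_code: int, attempt: int, max_retries: int) -> Optional[float]:
    """Pure decision: seconds to wait before the next attempt, or None to stop."""
    if attempt >= max_retries:
        return None
    if status_code not in RETRYABLE_STATUS and status_code < 500:
        return None
    # Exponential backoff capped at MAX_DELAY (jitter omitted for clarity).
    return min(INITIAL_DELAY * (2 ** attempt), MAX_DELAY)


def request_sync(
    send: Callable[[], Tuple[int, object]],
    sleep: Callable[[float], None],
    max_retries: int = 2,
) -> object:
    """Sync driver: `send` returns (status_code, body); `sleep` blocks."""
    attempt = 0
    while True:
        status, body = send()
        if status < 400:
            return body
        delay = next_retry_delay(status, attempt, max_retries)
        if delay is None:
            raise RuntimeError(f"request failed with status {status}")
        sleep(delay)
        attempt += 1
```

An async driver would have the same shape with `await send()` and `await sleep(delay)`, so both clients would share `next_retry_delay` and any change to retry semantics would land in one place.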
medium · SSE event-name dispatch is a hand-maintained string ladder duplicated sync/async · ~1.5h
src/anthropic/_streaming.py:96-135
Stream.__stream__ and AsyncStream.__stream__ both contain an enormous `or`-chain enumerating ~25 SSE event names (`message_start`, `agent.tool_use`, `session.deleted`, etc.). The list is duplicated verbatim in two places. Adding a new server-side event requires editing two locations; forgetting one silently drops events from the async client. This is a classic shotgun-surgery anti-pattern that scales poorly as the agents/sessions API grows.
Recommendation: Define a module-level `_KNOWN_EVENT_TYPES: frozenset[str]` (or a small dispatch table) shared by both Stream and AsyncStream. The dispatch loop becomes `if sse.event in _KNOWN_EVENT_TYPES:` and the two stream implementations stay structurally identical. Better still, drive event handling from a generated registry tied to the OpenAPI spec.
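A minimal sketch of the shared-set approach, assuming a simplified stand-in for the SDK's SSE type. `ServerSentEvent` and `iter_known_events` are hypothetical here, and the set lists only event names quoted in this report; the real list is longer.

```python
from typing import FrozenSet, Iterable, Iterator, NamedTuple


class ServerSentEvent(NamedTuple):
    # Stand-in for the SDK's SSE type; only the fields used in this sketch.
    event: str
    data: str


# Only names quoted in this report; the actual set would be longer.
_KNOWN_EVENT_TYPES: FrozenSet[str] = frozenset(
    {"message_start", "content_block_delta", "agent.tool_use", "session.deleted"}
)


def iter_known_events(events: Iterable[ServerSentEvent]) -> Iterator[ServerSentEvent]:
    """Yield only events whose name is in the shared set."""
    for sse in events:
        if sse.event in _KNOWN_EVENT_TYPES:
            yield sse
```

Because the sync and async loops would both test membership against the same constant, adding an event name becomes a one-line change that cannot drift between the two implementations.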
medium · Public package __init__ imports from lib/* unconditionally, coupling generated client to hand-written extensions · ~3.0h
src/anthropic/__init__.py:89-94
anthropic/__init__.py imports from lib.aws, lib.tools, lib.vertex, lib.bedrock, lib.foundry, lib.streaming at top level (lines ~89-94). This means every `import anthropic` eagerly loads optional integrations, and a syntax error or import cost in any provider module breaks the core SDK. It also blurs the architectural boundary between the Stainless-generated core (resources/, _client.py, _base_client.py) and hand-written `lib/` extensions. Optional dependency extras (vertex, bedrock, aws, mcp) declared in pyproject.toml suggest these should be lazy.
Recommendation: Use the existing `_resources_proxy`-style lazy attribute pattern for optional provider clients, or guard imports with `try/except ImportError` and only re-export when the optional extras are installed. This preserves the layering: generated core has zero runtime dependency on hand-written lib/, and provider variants pay-for-what-you-use.
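To illustrate the lazy-attribute idea in general terms, the sketch below defers an import until the first attribute access. `LazyModule` is a hypothetical helper, not the SDK's `_resources_proxy`, and the standard-library `json` module stands in for an optional provider module.

```python
import importlib
from types import ModuleType
from typing import Any, Optional


class LazyModule:
    """Hypothetical proxy that imports its target on first attribute access."""

    def __init__(self, module_name: str) -> None:
        self._module_name = module_name
        self._module: Optional[ModuleType] = None

    @property
    def loaded(self) -> bool:
        return self._module is not None

    def __getattr__(self, name: str) -> Any:
        # Only reached for attributes not found on the proxy itself.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, name)


# `json` stands in for an optional integration such as a provider client.
lazy_json = LazyModule("json")
```

With this shape, importing the package costs nothing for integrations that are never touched, and an `ImportError` from a missing extra surfaces only at the point of use.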
medium · BaseClient leaks subclass concerns: hard-coded model token table and Anthropic-specific timeout calc · ~2.0h
src/anthropic/_base_client.py:570-583
BaseClient._calculate_nonstreaming_timeout (lines ~570-583) hard-codes Anthropic-specific business logic (`60 * 60`, `128_000`, the streaming-required error message linking to anthropic-sdk-python README) into the generic base client. Likewise, _constants.py defines `MODEL_NONSTREAMING_TOKENS` with concrete model IDs. The base client is supposed to be a transport-layer abstraction shared by Anthropic, Bedrock, Vertex, AWS, and Foundry — embedding model-specific knowledge here violates the layering and makes the core harder to reuse for non-Messages endpoints.
Recommendation: Move `_calculate_nonstreaming_timeout` and the model-token table into the messages resource (resources/messages/messages.py) where the streaming/non-streaming policy lives. The base client should expose only generic `Timeout` helpers; product policy belongs at the resource layer.
low · _DefaultHttpxClient and _DefaultAsyncHttpxClient duplicate full transport setup · ~2.0h
src/anthropic/_base_client.py:1063-1107
_DefaultHttpxClient.__init__ (lines ~1063-1107) and _DefaultAsyncHttpxClient.__init__ (lines ~1418-1462) are byte-for-byte identical except for `HTTPTransport` vs `AsyncHTTPTransport`. Socket option detection, proxy map construction, and mounts logic are all duplicated. A bug in one (e.g., a missed TCP_KEEP* constant) would not be reflected in the other.
Recommendation: Extract a `_build_default_httpx_kwargs(transport_cls)` helper that returns a kwargs dict and pass `HTTPTransport` or `AsyncHTTPTransport` as the transport class. Both default clients then become trivial subclasses.
low · AsyncHttpxClientWrapper.__del__ schedules aclose on event loop, risk of unawaited coroutine · ~3.0h · ✓ validated
src/anthropic/_base_client.py:1497-1505
AsyncHttpxClientWrapper.__del__ calls `asyncio.get_running_loop().create_task(self.aclose())`. If __del__ runs when no loop is running (process shutdown, finalization in another thread, or a non-asyncio runtime like trio/anyio), `get_running_loop()` raises RuntimeError, which is silently swallowed by the bare `except Exception`. The cleanup then never happens and the connection leaks. The TODO comment acknowledges this, but the design choice — relying on __del__ for resource cleanup of an async resource — is architecturally fragile.
Recommendation: Document the explicit `await client.close()` / `async with` requirement more prominently and consider emitting a ResourceWarning when the wrapper is GC'd while still open. For trio/anyio compatibility, detect the active sniffio backend and use `anyio.from_thread` or skip the auto-close. Long-term, remove the __del__ hook and rely solely on context-manager semantics.
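The ResourceWarning suggestion could look something like the sketch below. `TrackedAsyncClient` is a hypothetical stand-in for the wrapper, and the warning text is an assumption; the point is that `__del__` only reports the leak and does not try to schedule cleanup on an event loop.

```python
import warnings


class TrackedAsyncClient:
    """Hypothetical stand-in for an async HTTP client wrapper."""

    def __init__(self) -> None:
        self.is_closed = False

    async def aclose(self) -> None:
        self.is_closed = True

    def __del__(self) -> None:
        # Do not schedule aclose() here; only tell the developer about the leak.
        if not getattr(self, "is_closed", True):
            warnings.warn(
                f"unclosed client {type(self).__name__}; "
                "use 'async with' or await aclose()",
                ResourceWarning,
                stacklevel=2,
            )
```

Python ignores `ResourceWarning` by default, so this stays quiet in production unless warnings are enabled (for example with `-W default` or under most test runners), which is usually where leaks should be caught.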
low · _make_status_error mapping duplicated between Anthropic and AsyncAnthropic · ~2.0h
src/anthropic/_client.py:205-238
Anthropic._make_status_error (lines ~205-238) and AsyncAnthropic._make_status_error (lines ~376-409) contain identical status-code-to-exception ladders. Same for the auth header methods (_api_key_auth, _bearer_auth), _validate_headers, default_headers, and copy(). Roughly 150 lines per client class are mirror images, with the only differences being `httpx.Client` vs `httpx.AsyncClient` and `Stream` vs `AsyncStream`. New error codes (e.g., the recently added 413 RequestTooLargeError, 529 OverloadedError) must be added in both places.
Recommendation: Define a module-level `_STATUS_ERROR_MAP: dict[int, type[APIStatusError]]` and a shared `_resolve_status_error(status_code, err_msg, response, body)` helper. Both Anthropic and AsyncAnthropic delegate to it. Same approach for auth/header logic — consider a mixin class `_AnthropicAuthMixin` carrying the shared bits.
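A rough sketch of the table-driven mapping follows. The exception classes are simplified stand-ins, the helper takes fewer arguments than the signature proposed above, and apart from 413 and 529 (named in this finding) the status codes and the fallback rule are assumptions for illustration.

```python
from typing import Dict, Type


class APIStatusError(Exception):
    """Stand-in for the SDK base class; the real one carries more context."""

    def __init__(self, message: str, *, status_code: int) -> None:
        super().__init__(message)
        self.status_code = status_code


class BadRequestError(APIStatusError): ...
class RequestTooLargeError(APIStatusError): ...
class RateLimitError(APIStatusError): ...
class OverloadedError(APIStatusError): ...
class InternalServerError(APIStatusError): ...


# 413 and 529 come from the finding; 400 and 429 are assumed for illustration.
_STATUS_ERROR_MAP: Dict[int, Type[APIStatusError]] = {
    400: BadRequestError,
    413: RequestTooLargeError,
    429: RateLimitError,
    529: OverloadedError,
}


def resolve_status_error(status_code: int, err_msg: str) -> APIStatusError:
    """Shared lookup both client classes could delegate to."""
    cls = _STATUS_ERROR_MAP.get(status_code)
    if cls is None:
        # Assumed fallback: unmapped 5xx codes become InternalServerError.
        cls = InternalServerError if status_code >= 500 else APIStatusError
    return cls(err_msg, status_code=status_code)
```

Adding a new error code then means one new dictionary entry, visible to both the sync and async clients.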
low · Implicit circular import between _streaming.py and lib/streaming · ~1.0h
src/anthropic/_streaming.py:24-41
_streaming.py defines metaclasses _SyncStreamMeta / _AsyncStreamMeta that import from `.lib.streaming` inside __instancecheck__ (lines ~30 and ~159). This works because the import is deferred to method-call time, but it establishes a hidden cycle: core (_streaming) ← lib.streaming ← core. The metaclass also exists solely to emit a deprecation warning for an old isinstance() contract, meaning every isinstance(x, Stream) check in user code triggers a lib.streaming import. This is a subtle layering violation where the generated core depends on a hand-written extension.
Recommendation: Plan the deprecation warning's removal in the next major version (the warning text already says so) and delete the metaclass at that point. Until then, document the cycle explicitly with a comment, and ensure no eager imports of lib.streaming occur from _streaming module-level code.
estimated effort: ~33h
Security (9)
low · Potentially sensitive request/response headers logged at DEBUG level · ~2.0h
src/anthropic/_base_client.py:968-977
The base client logs full request URLs and complete response headers at DEBUG level. Response headers can include sensitive items (e.g., set-cookie, anthropic-organization-id, request-id correlations) and request URLs may contain query-string-encoded secrets if a caller uses custom_query for auth. While DEBUG is opt-in, downstream applications often enable DEBUG globally and inadvertently leak sensitive data into log aggregation systems. CWE-532: Insertion of Sensitive Information into Log File. OWASP A09:2021.
Recommendation: Redact known sensitive header names (Authorization, X-Api-Key, Cookie, Set-Cookie, Proxy-Authorization) before logging headers. Consider also redacting the URL query string or only logging the path. Document that DEBUG logging may include sensitive data so operators can configure log filters appropriately.
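As a minimal sketch of the redaction step, the helpers below mask the header names listed in the recommendation and drop the query string from a URL before logging. `redact_headers` and `strip_query` are hypothetical names, and the `<redacted>` placeholder is an arbitrary choice.

```python
from typing import Dict, Mapping
from urllib.parse import urlsplit, urlunsplit

# Header names from the recommendation, lower-cased for comparison.
SENSITIVE_HEADERS = frozenset(
    {"authorization", "x-api-key", "cookie", "set-cookie", "proxy-authorization"}
)


def redact_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of the headers that is safe to log."""
    return {
        name: ("<redacted>" if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }


def strip_query(url: str) -> str:
    """Keep scheme, host, and path; drop the query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Calling these at the logging call site, rather than mutating the request, keeps the wire behavior unchanged while the log output loses the sensitive values.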
low · Request options dump at DEBUG may include request bodies/JSON payloads · ~2.0h
src/anthropic/_base_client.py:333-346
_build_request logs the full FinalRequestOptions via model_dump at DEBUG level, including json_data (the user prompt and any embedded data). For an LLM SDK this can include PII, source code, secrets pasted into prompts, etc. CWE-532. While exclude={'content'} is set on Pydantic v1, regular json_data is still serialized. Operators enabling DEBUG may unintentionally persist sensitive prompt content.
Recommendation: Document this behavior prominently in SECURITY.md/README and consider adding a redaction hook or a flag to suppress json_data in debug output. At minimum, log only field names/sizes by default and require a separate verbose flag to dump full payloads.
low · follow_redirects=True by default may leak credentials on cross-origin redirects · ~3.0h · ✓ validated
src/anthropic/_base_client.py:685-690
Both _DefaultHttpxClient and _DefaultAsyncHttpxClient set follow_redirects=True by default. Combined with auth headers attached via default_headers (X-Api-Key / Authorization Bearer), if the configured base_url (or a user-supplied base_url) returns a 30x redirect to a different host, httpx will replay the same Authorization/X-Api-Key headers on the redirect target, exposing the credential to a third-party host. CWE-200: Exposure of Sensitive Information; CWE-601: Open Redirect. httpx does not strip Authorization on cross-origin redirects in older versions, and credentials in the X-Api-Key custom header are never stripped automatically.
Recommendation: Default follow_redirects to False for authenticated requests, or strip Authorization/X-Api-Key headers before following a redirect to a different host. At minimum, document the risk and recommend explicit base_url validation for users overriding ANTHROPIC_BASE_URL.
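The header-stripping alternative can be sketched as an origin comparison, independent of any HTTP library. `headers_for_redirect` is a hypothetical helper; how it would be hooked into the client's redirect handling is left open here.

```python
from typing import Dict, Mapping, Optional, Tuple
from urllib.parse import urlsplit

CREDENTIAL_HEADERS = frozenset({"authorization", "x-api-key"})
_DEFAULT_PORTS = {"http": 80, "https": 443}


def _origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Normalize a URL to (scheme, host, port) for same-origin comparison."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or _DEFAULT_PORTS.get(scheme)
    return (scheme, (parts.hostname or "").lower(), port)


def headers_for_redirect(
    headers: Mapping[str, str], original_url: str, redirect_url: str
) -> Dict[str, str]:
    """Drop credential headers when the redirect leaves the original origin."""
    if _origin(original_url) == _origin(redirect_url):
        return dict(headers)
    return {k: v for k, v in headers.items() if k.lower() not in CREDENTIAL_HEADERS}
```

Treating a scheme downgrade (https to http on the same host) as a different origin is deliberate in this sketch, since that case would expose the key in cleartext.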
low · ANTHROPIC_BASE_URL environment variable accepted without validation · ~2.0h · ✓ validated
src/anthropic/_client.py:91-95
The Anthropic / AsyncAnthropic clients read ANTHROPIC_BASE_URL from the environment with no scheme/host validation. An attacker who can influence the process environment (e.g., compromised CI, malicious .env, supply-chain) can redirect all API traffic — including the X-Api-Key header — to an attacker-controlled HTTP/HTTPS endpoint. CWE-15: External Control of System or Configuration Setting; CWE-918 (SSRF-adjacent). While environment-variable trust is a typical SDK pattern, the SDK does not warn when base_url points to a non-anthropic.com host, nor does it enforce HTTPS.
Recommendation: Emit a warning (or require an explicit opt-in) when base_url is overridden to a non-anthropic.com host or to a non-HTTPS scheme. At minimum, document the risk in SECURITY.md and recommend that production systems set base_url explicitly in code rather than via env.
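One possible shape for the warning, shown as a sketch. `find_base_url_problems` and `warn_on_suspicious_base_url` are hypothetical helpers; the `anthropic.com` suffix check follows the recommendation, and the message wording is an assumption.

```python
import warnings
from typing import List
from urllib.parse import urlsplit

EXPECTED_HOST_SUFFIX = "anthropic.com"  # host named in the recommendation


def find_base_url_problems(base_url: str) -> List[str]:
    """Pure check: return human-readable problems, empty if none."""
    problems: List[str] = []
    parts = urlsplit(base_url)
    host = (parts.hostname or "").lower()
    if parts.scheme.lower() != "https":
        problems.append(f"base_url uses scheme {parts.scheme!r}, not https")
    if not (host == EXPECTED_HOST_SUFFIX or host.endswith("." + EXPECTED_HOST_SUFFIX)):
        problems.append(f"base_url host {host!r} is not under {EXPECTED_HOST_SUFFIX}")
    return problems


def warn_on_suspicious_base_url(base_url: str) -> None:
    """Emit one UserWarning per problem; never blocks the request."""
    for problem in find_base_url_problems(base_url):
        warnings.warn(problem, UserWarning, stacklevel=2)
```

Matching on `host == suffix or host.endswith("." + suffix)` avoids accepting look-alike hosts such as `evil-anthropic.com`. Users who intentionally point at a proxy or gateway would see the warning once and could filter it.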
info · trust_env defaults to True via httpx, enabling environment-controlled proxying · ~1.5h
src/anthropic/_base_client.py:691-715
_DefaultHttpxClient/_DefaultAsyncHttpxClient construct transports with proxy mounts derived from environment variables (HTTP_PROXY/HTTPS_PROXY/NO_PROXY) via get_environment_proxies(). If an attacker controls these env vars on a host, they can route Anthropic API traffic through a MITM proxy. TLS verification still applies (verify=True default), so successful interception requires a trusted CA, but the behavior is worth flagging. CWE-441: Unintended Proxy or Intermediary. This matches httpx's standard trust_env behavior and is documented; not a defect, but customers in hardened environments should know to set trust_env=False.
Recommendation: Document in SECURITY.md that the SDK respects HTTP(S)_PROXY/NO_PROXY environment variables by default, and recommend passing a custom http_client with trust_env=False for high-assurance deployments. Consider adding an explicit `respect_env_proxies` constructor flag.
info · API key stored as plain attribute and emitted via headers without protection · ~1.5h
src/anthropic/_client.py:81-89
self.api_key and self.auth_token are stored as plain str attributes on the client and merged into auth_headers on each request. There is no use of SecretStr/zeroization, and the value will appear in any repr/pickle/serialization of the client object. While this is conventional for HTTP SDKs, exception tracebacks or accidental logging of the client instance can leak the key. CWE-312: Cleartext Storage of Sensitive Information.
Recommendation: Consider wrapping api_key/auth_token with a SecretStr-like helper that overrides __repr__/__str__ to redact the value, and document that users should never log the client instance. Add a __repr__ to Anthropic/AsyncAnthropic that omits credentials.
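A SecretStr-like helper could be as small as the sketch below. `Secret` and `reveal` are hypothetical names; this only protects against accidental display through `repr`/`str`/f-strings and does nothing about memory contents.

```python
class Secret:
    """Hypothetical wrapper that hides its value from repr() and str()."""

    __slots__ = ("_value",)

    def __init__(self, value: str) -> None:
        self._value = value

    def reveal(self) -> str:
        """Explicit accessor, used only when building auth headers."""
        return self._value

    def __repr__(self) -> str:
        return "Secret('***')"

    def __str__(self) -> str:
        return "***"


api_key = Secret("sk-example")  # placeholder value, not a real key
```

Requiring an explicit `reveal()` call makes every place that handles the raw key easy to find in review.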
info · Idempotency key derived from non-cryptographic uuid4 is acceptable but worth noting
src/anthropic/_base_client.py:660-661
_idempotency_key uses uuid.uuid4() which is sufficient for collision avoidance and not security-sensitive here (idempotency keys are not auth tokens). No vulnerability — flagged only because the codebase uses random.random() for retry jitter (also non-security). Verified that random/uuid usage is not used for any authentication or token-generation purpose.
Recommendation: No action required. Continue to ensure no security-sensitive token/nonce is generated from `random` — only from `secrets`/`uuid4`.
info · Missing TLS minimum version / certificate pinning configuration · ~2.0h
src/anthropic/_base_client.py:666-720
The SDK relies entirely on httpx defaults for TLS configuration: it accepts whatever the system trust store and OpenSSL version negotiate. There is no enforcement of TLS 1.2+ and no support for certificate pinning to api.anthropic.com. For most users this is acceptable (modern Python defaults to TLS 1.2+), but high-assurance deployments cannot pin the Anthropic CA without subclassing the transport. CWE-295/CWE-757.
Recommendation: Document how to pass a custom `ssl.SSLContext` (for example through the `verify` argument of an httpx client) via the `http_client` parameter for users who need TLS hardening or pinning. Consider exposing a `verify` / `ssl_context` constructor parameter directly on the Anthropic client.
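For the documentation this recommendation asks for, a hardened context can be built with the standard library's `ssl.SSLContext`. `hardened_ssl_context` is a hypothetical helper, and the wiring shown in the trailing comments is an assumption that is not executed here because it needs httpx and the SDK installed.

```python
import ssl


def hardened_ssl_context() -> ssl.SSLContext:
    """Default trust store and hostname checks, with a TLS 1.2 floor."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


# Assumed wiring, not executed in this sketch:
#   http_client = httpx.Client(verify=hardened_ssl_context())
#   client = anthropic.Anthropic(http_client=http_client)
```

Pinning to a specific CA would replace `create_default_context()` with a context loaded from a chosen CA file. Note that supplying a custom client this way may drop other defaults the SDK's own client applies.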
info · Unable to verify AWS SigV4, Vertex/Google OAuth, MCP credential vault, and _logs setup
SECURITY.md
The PLAN identifies aws/_auth.py, aws/_credentials.py, bedrock/_auth.py, vertex/_auth.py, _extras/_google_auth.py, lib/tools/mcp.py, and _utils/_logs.py as in-scope, but their source was not included in the analyzed bundle. I cannot validate signing correctness, credential-cache permissions, OAuth token refresh, MCP secret storage, or whether _setup_logging adds redaction filters. This is a coverage gap, not a confirmed vulnerability.
Recommendation: Provide the omitted files for a follow-up security review focused on: (1) SigV4 canonical request construction and clock-skew handling, (2) Vertex/Google credential refresh and token caching scope, (3) MCP OAuth token storage on disk (file mode 0600, no world-readable cache), (4) whether _setup_logging installs any sensitive-data filter, and (5) handling of AWS_SESSION_TOKEN/temporary credentials.
estimated effort: ~14h
Quality (5)
medium · Significant code duplication between sync/async client implementations · ~4.0h
src/anthropic/_base_client.py
The Anthropic and AsyncAnthropic classes in _client.py duplicate ~150 lines of nearly identical code (constructor, copy(), _make_status_error, _validate_headers, _api_key_auth, _bearer_auth, default_headers). Similar duplication exists between SyncAPIClient and AsyncAPIClient in _base_client.py for request(), _process_response, and the entire _DefaultHttpxClient/_DefaultAsyncHttpxClient classes (~50 lines duplicated for socket option setup).
Recommendation: Extract shared socket option / transport setup into a helper function. Acknowledge that some duplication is required between sync/async due to await semantics, but the transport/socket configuration block is purely synchronous and can be shared. Note: this is generated code from Stainless, so changes should be made upstream.
medium · Long stream event matching uses repetitive or-chain instead of set/lookup · ~1.0h
src/anthropic/_streaming.py
Both Stream.__stream__ and AsyncStream.__stream__ in _streaming.py contain a large if-statement chaining 30+ string equality checks via `or` operators. This is duplicated between sync and async implementations and is hard to maintain — adding a new event type requires editing two places. Cyclomatic complexity is unnecessarily high.
Recommendation: Define a module-level frozenset of known event names (e.g., `_KNOWN_EVENTS = frozenset({...})`) and replace the chained or-comparisons with `if sse.event in _KNOWN_EVENTS:`. This reduces complexity, dedupes the list across sync/async, and makes maintenance trivial.
medium · construct_type function has high cyclomatic complexity · ~4.0h
src/anthropic/_models.py
construct_type() in _models.py is ~80 lines with ~15 branches handling unions, dicts, lists, basemodels, datetime, date, float coercion, type aliases, annotated types, and discriminated unions. The function mixes multiple responsibilities and makes the control flow hard to follow and test.
Recommendation: Split into smaller helpers: _construct_union, _construct_dict, _construct_list, _construct_model, _construct_scalar. Use a small dispatch table keyed on origin where possible. Target complexity <= 10 per function.
low · ruff target-version mismatch with project requires-python · ~0.2h
pyproject.toml
pyproject.toml sets `requires-python = '>= 3.9'` but `[tool.ruff] target-version = 'py38'`. This causes ruff to lint against an older Python than is actually supported, missing Python 3.9 modernization opportunities (e.g., built-in generics such as `list[int]`, dict union operators) and producing inconsistent style guidance vs pyright (`pythonVersion = '3.9'`).
Recommendation: Set `target-version = 'py39'` in [tool.ruff] to align with `requires-python` and pyright's `pythonVersion`. This unlocks py39-specific lint rules and modernization fixes.
info · Test files referenced by plan were not provided for review
tests/conftest.py
The plan lists tests/conftest.py, tests/test_client.py, tests/test_streaming.py, tests/test_transform.py, tests/lib/streaming/test_messages.py, tests/lib/tools/test_runners.py, tests/api_resources/test_messages.py, tests/api_resources/beta/test_agents.py as focus areas, but the actual contents of these files are not present in the analyzed_code. As a result, we cannot directly assess test coverage gaps, snapshot/inline-snapshot usage, retry-test thoroughness, or fixture quality.
Recommendation: Re-run analysis with the test file contents included so coverage of error paths, streaming events, and retry behavior can be verified. The presence of `inline-snapshot`, `respx`, `pytest-asyncio`, `time-machine`, and `http-snapshot` in dev deps is a strong positive signal, but actual usage cannot be confirmed.
`Anthropic.__init__` and `AsyncAnthropic.__init__` have a brief docstring describing env-var inference but do not document each parameter (api_key, auth_token, base_url, timeout, max_retries, default_headers, default_query, http_client), their accepted types/sentinels (NotGiven, Omit), default values, or the runtime exceptions raised (e.g., the TypeError raised by `_validate_headers` when neither api_key nor auth_token is set). For an SDK whose primary entry point is this constructor, this is a meaningful gap.
Recommendation: Expand the `__init__` docstrings on both `Anthropic` and `AsyncAnthropic` to follow Google/NumPy-style docstrings: list each Args entry with type and default, document the `_strict_response_validation` semantics, describe the auth resolution order, and list Raises (TypeError on unresolved auth, ValueError on invalid config). Cross-link to README sections for environment variables and proxies.
`Anthropic.copy` and `AsyncAnthropic.copy` (aliased as `with_options`) accept a number of subtle parameters — `default_headers` vs `set_default_headers`, `default_query` vs `set_default_query`, and `_extra_kwargs` — whose mutual-exclusion semantics and merge behavior are non-obvious. The docstring is one line and does not describe these distinctions or the `with_options` alias.
Recommendation: Document each parameter in the `copy` docstring, explicitly explaining: (a) `default_headers` merges with existing headers while `set_default_headers` replaces them, (b) the same for query params, (c) that the two are mutually exclusive (raises ValueError), and (d) the public alias `with_options`. Provide a usage example such as `client.with_options(timeout=10).messages.create(...)`.
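The merge-versus-replace distinction described above can be shown with a small stand-in. `resolve_headers` is a hypothetical function that models the documented semantics, not the SDK's `copy` implementation.

```python
from typing import Dict, Mapping, Optional


def resolve_headers(
    current: Mapping[str, str],
    *,
    default_headers: Optional[Mapping[str, str]] = None,
    set_default_headers: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Model of the described semantics: merge, replace, or reject both."""
    if default_headers is not None and set_default_headers is not None:
        raise ValueError(
            "`default_headers` and `set_default_headers` are mutually exclusive"
        )
    if set_default_headers is not None:
        return dict(set_default_headers)  # replace everything
    if default_headers is not None:
        return {**current, **default_headers}  # merge, new values win
    return dict(current)
```

For example, starting from `{"a": "1", "b": "2"}`, passing `default_headers={"b": "3"}` yields `{"a": "1", "b": "3"}`, while `set_default_headers={"c": "4"}` yields only `{"c": "4"}`. A docstring for `copy` could state the same three cases in prose.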
medium · Exception class hierarchy lacks docstrings on most concrete error types · ~1.5h
src/anthropic/_exceptions.py:79-130
In `_exceptions.py`, only `APIStatusError`, `APIError.body`, and `APITimeoutError` (via its message) carry any documentation. The twelve concrete status-code subclasses (`BadRequestError`, `AuthenticationError`, `PermissionDeniedError`, `NotFoundError`, `ConflictError`, `RequestTooLargeError`, `UnprocessableEntityError`, `RateLimitError`, `ServiceUnavailableError`, `OverloadedError`, `DeadlineExceededError`, `InternalServerError`) have no docstrings. Users catching these exceptions need to know which conditions trigger each and what attributes (`request_id`, `status_code`, `type`, `body`) are available.
Recommendation: Add a one-paragraph docstring to each concrete exception class describing the API condition that produces it and reminding users of the available attributes. Also add a module-level docstring summarizing the hierarchy (AnthropicError -> APIError -> APIStatusError -> specific subclasses; APIError -> APIConnectionError -> APITimeoutError) and link to the API errors documentation.
medium · `MODEL_NONSTREAMING_TOKENS` constant is undocumented · ~0.5h
src/anthropic/_constants.py:21-31
`_constants.py` defines `MODEL_NONSTREAMING_TOKENS`, a model->token threshold map used by `_calculate_nonstreaming_timeout` to decide when streaming is required. There is no docstring explaining what the dictionary represents, when entries should be added, or how end users should interpret it. Because this directly affects when a `ValueError` is raised on `messages.create`, it is user-visible behavior that should be documented.
Recommendation: Add a comment/docstring above `MODEL_NONSTREAMING_TOKENS` explaining: the keys are model identifiers (including Bedrock/Vertex variants), the values are the maximum `max_tokens` allowed for non-streaming requests, and exceeding the value forces clients to stream. Cross-reference this from README/api.md long-requests guidance.
medium · Stream event-name list is duplicated and undocumented in `Stream`/`AsyncStream` · ~1.5h
src/anthropic/_streaming.py:95-138
Both `Stream.__stream__` and `AsyncStream.__stream__` contain a long, identical hard-coded list of supported SSE event names (message_start, content_block_delta, agent.tool_use, session.status_running, span.model_request_start, etc.). There is no module-level documentation listing the event types the SDK understands, no link to the public events reference, and the duplication makes drift between sync/async paths likely. Users writing custom stream handlers have no enumerated list of events they can rely on.
Recommendation: Extract the supported event-name set to a module-level constant (e.g., `SUPPORTED_STREAM_EVENTS = frozenset({...})`) with a docstring explaining each category (messages, agents, sessions, spans). Reference this constant from both `Stream` and `AsyncStream`, and document the full list in helpers.md.
medium · Module-level docstrings missing from key public modules · ~1.5h
src/anthropic/_streaming.py:1-5
Several user-facing modules in `src/anthropic/` open straight to imports with no module docstring: `_client.py`, `_exceptions.py`, `_response.py`, `_legacy_response.py`, `_streaming.py`, `_types.py`, and `_models.py`. Tools such as Sphinx, IDE hover, and `help(anthropic._streaming)` show nothing. Given that `_response.py` (`APIResponse`/`AsyncAPIResponse`) and `_streaming.py` (`Stream`/`AsyncStream`) define types users frequently import or annotate against, module docstrings would noticeably improve discoverability.
Recommendation: Add a 2-5 line module docstring to each of `_client.py`, `_exceptions.py`, `_response.py`, `_legacy_response.py`, `_streaming.py`, `_models.py`, `_types.py`, and `_base_client.py` summarizing the module's purpose and listing the most important public symbols. Note that `_legacy_response.py` is slated for replacement so users should prefer `_response.py`.
low · `DeadlineExceededError` and `ServiceUnavailableError` are defined but not exported · ~1.0h
src/anthropic/_exceptions.py:11-22
`_exceptions.py` defines `ServiceUnavailableError` (503) and `DeadlineExceededError` (504), but the module's `__all__` does not list them and `anthropic/__init__.py` does not re-export them. Users cannot import them as `from anthropic import DeadlineExceededError`, and the documentation has no obvious place to describe these errors. Additionally, `_make_status_error` in `_client.py` does not map 503/504 status codes to these specific classes, so they are effectively unreachable through the normal request path — a documentation/behavior mismatch.
Recommendation: Either remove the unused classes, or (a) add them to `__all__` and re-export from `anthropic/__init__.py`, (b) wire them up in `Anthropic._make_status_error`/`AsyncAnthropic._make_status_error` for status codes 503 and 504, and (c) document them in the README/api.md error-handling section.
low · `HttpxBinaryResponseContent` lacks class docstring and exposes deprecated methods · ~0.5h
src/anthropic/_legacy_response.py:339-360
`HttpxBinaryResponseContent` in `_legacy_response.py` has no class-level docstring describing that it wraps an httpx.Response for binary downloads, what its lifecycle is (closed automatically? caller must close?), and that `stream_to_file` / `astream_to_file` are deprecated due to a bug. The `@deprecated` decorators include short messages, but a class-level note would help discoverability.
Recommendation: Add a class docstring to `HttpxBinaryResponseContent` explaining its purpose, the recommended `with_streaming_response` pattern, and the deprecation status of `stream_to_file`/`astream_to_file`. Also document `write_to_file` semantics (eagerly buffers entire response).
low`PageInfo` and pagination classes lack usage examples~1.0h▶
src/anthropic/_base_client.py:113-270
`PageInfo`, `BasePage`, `BaseSyncPage`, `BaseAsyncPage`, and `AsyncPaginator` have brief class-level docstrings, but none demonstrates how a consumer should iterate paginated endpoints (e.g., `for item in client.models.list(): ...` or `async for item in client.models.list(): ...`). Pagination is one of the most common stumbling blocks in SDK consumption.
Recommendation: Add usage examples to `BaseSyncPage.iter_pages` and `BaseAsyncPage.__aiter__` showing the canonical sync `for x in client.foo.list(): ...` and async `async for x in client.foo.list(): ...` patterns. Cross-reference these from api.md or a dedicated Pagination section in README.md.
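The kind of docstring example being asked for can be illustrated with a self-contained stand-in. `FakePage` below is hypothetical and is not the SDK's page class; it only mimics the iteration surface (page-by-page via `iter_pages`, item-by-item via `for` / `async for`) so the docstring pattern can be shown and exercised without network access.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Iterator, List, Optional


@dataclass
class FakePage:
    """Hypothetical stand-in for a page of results (not the SDK's class)."""

    data: List[str]
    next_page: Optional["FakePage"] = None

    def iter_pages(self) -> Iterator["FakePage"]:
        """Yield this page, then each following page.

        Example:
            for page in first_page.iter_pages():
                print(len(page.data))
        """
        page: Optional[FakePage] = self
        while page is not None:
            yield page
            page = page.next_page

    def __iter__(self) -> Iterator[str]:
        """Yield items across all pages.

        Example:
            for item in first_page:
                print(item)
        """
        for page in self.iter_pages():
            yield from page.data

    async def __aiter__(self) -> AsyncIterator[str]:
        """Async variant of item iteration.

        Example:
            async for item in first_page:
                print(item)
        """
        for item in self:
            await asyncio.sleep(0)  # stand-in for awaiting the next page fetch
            yield item


async def collect(page: FakePage) -> List[str]:
    """Gather every item via `async for`."""
    return [item async for item in page]


first_page = FakePage(["a", "b"], FakePage(["c"]))
sync_items = list(first_page)
async_items = asyncio.run(collect(first_page))
```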
low`make_request_options`, `BaseClient`, `_DefaultHttpxClient` are public-by-name but underdocumented~1.5h▶
src/anthropic/_base_client.py:700-760
`make_request_options` has only a one-line docstring; `BaseClient`, `_DefaultHttpxClient`, and `_DefaultAsyncHttpxClient` have no class docstrings explaining their role (the default client classes back the public aliases `DefaultHttpxClient`/`DefaultAsyncHttpxClient`/`DefaultAioHttpClient`). The public aliases at the bottom of the file have one-line docstrings, but the underlying class behavior (TCP keepalive socket options, environment-proxy handling, and the `aiohttp` extra requirement) is undocumented for users who subclass them.
Recommendation: Add class docstrings to `_DefaultHttpxClient` and `_DefaultAsyncHttpxClient` describing the defaults applied (timeout, limits, follow_redirects=True, TCP keepalive socket options, environment proxy handling) so users overriding `http_client` know what they will lose. Expand the `make_request_options` docstring with a note on `NotGiven` filtering and an example.
low`SyncAPIResource` and `AsyncAPIResource` have no docstrings~0.5h▶
src/anthropic/_resource.py:11-41
`_resource.py` defines `SyncAPIResource` and `AsyncAPIResource` (the base for every generated resource). Neither class nor its `_sleep` method has any docstring explaining the contract for resource subclasses or how downstream code is expected to consume them. While these are largely internal, they are exposed via the `with_raw_response` / `with_streaming_response` patterns and inherited by every resource module.
Recommendation: Add docstrings to `SyncAPIResource` and `AsyncAPIResource` describing their role as the base class for generated resources, the proxied client methods (`_get`, `_post`, etc.), and the `_sleep` helper used by polling helpers. Note that they are internal implementation details (not intended for direct subclassing by SDK users).
low`build`, `construct_type`, `set_pydantic_config`, `transform_schema` exposed without user-facing docs~1.0h▶
src/anthropic/_models.py:405-440
`anthropic/__init__.py` re-exports `transform_schema` (from `lib._parse._transform`). Inside `_models.py`, `build()`, `construct_type()`, `construct_type_unchecked()`, and `set_pydantic_config()` are public-by-naming and used in user examples. Their docstrings exist but are short and lack guidance on when to prefer each (e.g., `build` for type-safe construction vs `construct_type` for loose coercion vs Pydantic's own `model_validate`). For an SDK that heavily relies on these helpers in its tools/agents helpers, this matters.
Recommendation: Expand docstrings on `build`, `construct_type`, `construct_type_unchecked`, and `set_pydantic_config` with a 'When to use this' section comparing them to standard Pydantic alternatives. Document `transform_schema` (re-exported as `anthropic.transform_schema`) at the module level in `lib/_parse/_transform.py` since it appears in the public API surface.
infoREADME, api.md, helpers.md, tools.md, CHANGELOG.md, CONTRIBUTING.md, foundry.md content not provided for analysis~0.5h▶
README.md
The PLAN listed README.md, CONTRIBUTING.md, api.md, helpers.md, tools.md, CHANGELOG.md, and src/anthropic/lib/foundry.md as the documentation files of focus, but only their filenames appear in the file listing — none of the actual file contents were included in the SOURCE CODE section. As a result, completeness and accuracy of those documents (the primary documentation surface for SDK consumers) cannot be directly assessed from the provided material. Findings below are based solely on the Python source modules that were provided.
Recommendation: When evaluating documentation, supply the contents of the named markdown files (README.md, api.md, helpers.md, tools.md, CHANGELOG.md, CONTRIBUTING.md, foundry.md) so coverage of resources (messages, completions, models, beta, files, batches, agents), platforms (Bedrock, Vertex, AWS, Foundry), and helpers (streaming, tools, MCP) can be verified.
infoExamples directory referenced but example file contents not provided~2.0h▶
examples/agents.py
The PLAN names `examples/agents.py`, `examples/messages_stream.py`, and `examples/tools_runner.py` as in-scope, but only the filenames appear in the listing — their bodies are not in the SOURCE CODE block. Coverage of Bedrock/Vertex/Azure/MCP/agents examples therefore cannot be confirmed against `pyproject.toml`'s `mcp`, `vertex`, `aws`, `bedrock`, and `aiohttp` extras. Notably, `pyproject.toml`'s `pyright.exclude` references `examples/mcp_tool_runner.py`, and `mypy.exclude` references `examples/mcp_server_weather.py`, `examples/tools_with_mcp.py`, `examples/memory/basic.py` — implying those files exist and should be linked from README/examples docs.
Recommendation: Provide the example file contents for review. Then ensure the README has an `Examples` section linking to each example (agents, messages_stream, tools_runner, mcp_server_weather, mcp_tool_runner, tools_with_mcp, memory/basic) with a one-line description and the optional extras required (e.g., `pip install anthropic[mcp]`).
estimated effort: ~16h
Maintainability (9)
mediumWide pydantic range spans two major versions (v1 and v2)~8.0h▶
pyproject.toml:13-13
pydantic is constrained to `pydantic>=1.9.0, <3` in pyproject.toml. The range is unusually wide: it admits both pydantic v1 and v2 (the `<3` cap already excludes v3). While the code includes a PYDANTIC_V1 compatibility shim in `_compat.py` and `_models.py`, supporting both major versions doubles the test surface and introduces brittle conditional code paths. Active development of pydantic v1 has stopped upstream, so keeping v1 support increases long-term maintenance burden and leaves downstream users on a major version that receives at most critical fixes.
Recommendation: Plan a deprecation of Pydantic v1 support and tighten the constraint to `pydantic>=2.0,<3`. If v1 must remain supported short-term, document the EOL status and add a DeprecationWarning when PYDANTIC_V1 is detected.
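The short-term option (warn when v1 is detected) might be sketched as follows. The function name `warn_if_pydantic_v1` and the message wording are assumptions; the function takes the version string as a parameter so it can be exercised without pydantic installed.

```python
import warnings


def warn_if_pydantic_v1(pydantic_version: str) -> bool:
    """Emit a DeprecationWarning if the given version string is a 1.x release.

    Returns True when the warning was emitted.
    """
    is_v1 = pydantic_version.split(".", 1)[0] == "1"
    if is_v1:
        warnings.warn(
            "Support for pydantic v1 is deprecated and will be removed in a "
            "future release; please upgrade to pydantic>=2.",
            DeprecationWarning,
            stacklevel=2,
        )
    return is_v1


# Intended call site (assumption): once, at import time, e.g.
#     import pydantic
#     warn_if_pydantic_v1(pydantic.VERSION)
```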
medium`aiohttp` extra has no version bound~1.0h▶
pyproject.toml:46-46
The `aiohttp` optional extra declares `["aiohttp", "httpx_aiohttp>=0.1.9"]`. `aiohttp` itself is unbounded, which is risky because aiohttp has a recurring history of CVEs (e.g. CVE-2024-23334 directory traversal in <3.9.2, CVE-2024-30251 infinite loop in <3.9.4, CVE-2024-52303/52304 in <3.10.11). Without a lower bound, users can install vulnerable old versions; without an upper bound, a major release could break `httpx_aiohttp` integration.
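Recommendation: Constrain to a known-good range, e.g. `"aiohttp>=3.9.4,<4"`, which excludes the directory-traversal CVE-2024-23334 (fixed in 3.9.2) and the infinite-loop CVE-2024-30251 (fixed in 3.9.4), and document the rationale. Note that this floor still admits versions affected by CVE-2024-52303/52304; raise it to `>=3.10.11` if those should be excluded as well.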
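To make the effect of a version floor concrete, here is a small sketch that compares dotted release numbers. The helper names are hypothetical, and the parsing is deliberately simplistic (it ignores pre-release and post-release suffixes); for real use, the `packaging` library's `Version` class is the more robust choice.

```python
import re
from typing import Tuple


def _release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading dotted-integer part of a version, e.g. '3.9.4' -> (3, 9, 4)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_floor(version: str, floor: str = "3.9.4") -> bool:
    """Return True if `version` is at or above `floor` (release segment only)."""
    return _release_tuple(version) >= _release_tuple(floor)
```

Under the suggested `>=3.9.4` floor, `meets_floor("3.9.1")` is false (affected by CVE-2024-23334) while `meets_floor("3.10.11")` is true.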
lowUnpinned `sniffio` dependency~0.5h▶
pyproject.toml:16-16
`sniffio` is listed without any version specifier in the core dependencies. Every other core dependency uses an explicit lower and upper bound (e.g. `httpx>=0.25.0, <1`). An unbounded direct dependency can resolve to a future incompatible major version of sniffio, breaking anyio integration without warning.
Recommendation: Pin to a sane range, e.g. `"sniffio>=1.1,<2"`, matching the convention used for the other dependencies.
lowDuplicate definitions for `aws` and `bedrock` extras encourage drift~1.0h▶
pyproject.toml:48-49
`aws` and `bedrock` extras declare identical dependency lists (`boto3 >= 1.28.57`, `botocore >= 1.31.57`). Defining the same set in two places is a maintenance hazard — future bumps will inevitably get out of sync, and consumers won't know which extra is canonical.
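Recommendation: Make one extra an alias of the other, e.g. define `aws` and have `bedrock` refer to it through a self-referential extra (`bedrock = ["anthropic[aws]"]`), or document that `bedrock` is deprecated in favor of `aws`. PEP 735 dependency groups are not a fit here, because they are separate from extras and are not exposed to package consumers. At minimum, add a comment that the two lists must remain identical.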
The project declares `requires-python = ">= 3.9"` and classifiers cover 3.9–3.14, but `[tool.ruff] target-version = "py38"` tells ruff to assume 3.8 as the minimum version. As a result, ruff withholds modernization rules that require 3.9+ (for example, rewriting `typing.List[...]` to the builtin `list[...]` generic) and may keep suggesting 3.8-compatible forms. It is a minor configuration drift, but it weakens code-quality enforcement.
Recommendation: Bump to `target-version = "py39"` to match the supported floor. Keep `py38` only if 3.8 is still intentionally supported for some downstream consumer, and document that reason next to the setting.
lowDev tools pinned to exact versions without auto-update mechanism visible~1.0h▶
pyproject.toml:41-56
Dev dependencies pin `pyright==1.1.399`, `mypy==1.17`, and `http-snapshot[httpx]==0.1.8` exactly. Exact pinning of dev tooling is reasonable for reproducibility, but without a Dependabot/Renovate configuration in the provided files it's easy for these to fall behind security fixes (mypy/pyright themselves rarely have CVEs, but their transitive dependencies can).
Recommendation: Add a Dependabot or Renovate config that targets `pyproject.toml` and `uv.lock` weekly, scoped to dev/build dependencies, so exact pins get refreshed automatically.
infoPyright `pythonVersion = "3.9"` good, but mypy lacks explicit `python_version`~0.5h▶
pyproject.toml:105-110
Pyright is configured for Python 3.9, which correctly matches the supported floor. The `[tool.mypy]` section, however, does not set `python_version`, so mypy defaults to the version of the interpreter running it. Constructs that only work on newer Python versions can therefore pass local checks and still break for users on 3.9.
Recommendation: Add `python_version = "3.9"` under `[tool.mypy]` to mirror pyright and ensure 3.9 compatibility is enforced in CI regardless of the interpreter mypy runs on.
infoOptional `httpx_aiohttp` import lacks version compatibility check~1.0h▶
src/anthropic/_base_client.py
In `_base_client.py` the code does `import httpx_aiohttp` and then subclasses `httpx_aiohttp.HttpxAiohttpClient`. Because the `aiohttp` extra requires `httpx_aiohttp>=0.1.9` (a 0.x package, no upper bound), a future API rename of `HttpxAiohttpClient` would raise an `AttributeError` at class-definition time and break import of the whole SDK for users who have an incompatible `httpx_aiohttp` installed without the extra.
Recommendation: Add an upper bound on `httpx_aiohttp` (e.g. `httpx_aiohttp>=0.1.9,<0.2`) and/or guard the subclass construction with a `getattr`/try-except so a mismatched version downgrades to a clear runtime error rather than an import-time crash.
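The guard could be sketched as below. `load_optional_attr` and `GuardedAioHttpClient` are hypothetical names introduced for illustration; the only facts taken from the finding are the module name `httpx_aiohttp` and the class name `HttpxAiohttpClient`. The idea is that a missing package or a renamed class degrades to a clear error at construction time instead of failing when the SDK is imported.

```python
import importlib
from typing import Any, Optional


def load_optional_attr(module_name: str, attr: str) -> Optional[Any]:
    """Return `module_name.attr`, or None if the module or attribute is missing."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# Resolved once at import time; None means "extra not installed or incompatible".
_AIOHTTP_BASE = load_optional_attr("httpx_aiohttp", "HttpxAiohttpClient")

if _AIOHTTP_BASE is not None:

    class GuardedAioHttpClient(_AIOHTTP_BASE):  # type: ignore[misc, valid-type]
        """Thin subclass used when a compatible httpx_aiohttp is available."""

else:

    class GuardedAioHttpClient:  # type: ignore[no-redef]
        """Placeholder that fails with a clear message when instantiated."""

        def __init__(self, *args: Any, **kwargs: Any) -> None:
            raise RuntimeError(
                "A compatible `httpx_aiohttp` is required for the aiohttp client; "
                "install the `aiohttp` extra or check the installed version."
            )
```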
infoOptional `mcp` extra correctly gated for Python >= 3.10▶
pyproject.toml:50-50
Positive finding: `mcp = ["mcp>=1.0; python_version >= '3.10'"]` correctly uses an environment marker so the extra is a no-op on Python 3.9 (which `mcp` does not support). Pyright is also configured to exclude `examples/mcp_tool_runner.py` since lint runs on 3.9. This is the right pattern and avoids hard-failing pip installs on 3.9 users requesting `[mcp]`.
Recommendation: No action needed. Consider documenting in README that `[mcp]` requires Python 3.10+.
JSONLDecoder.__decode__ accumulates lines via repeated `buf += line` concatenation on `bytes`. For a long JSON line split across many chunks, each `+=` allocates a new bytes object and copies everything buffered so far, so the total copying grows as O(n²) in the line length. It also handles one `splitlines` result at a time and re-checks `endswith` on each. With multi-megabyte JSON lines this becomes a real bottleneck.
Recommendation: Use a `bytearray` for the buffer (in-place `+=` costs amortized time proportional to the bytes appended, not to the buffer size) or accumulate chunks in a list and `b"".join(...)` once a full line is detected. Also consider `iter_lines`-style logic that scans for line boundaries in the buffer rather than calling `splitlines` per chunk.
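A minimal sketch of the `bytearray` approach follows. `iter_jsonl` is a hypothetical free function, not the SDK's `JSONLDecoder`, and it makes one simplifying assumption: lines are separated by `\n` only (a trailing `\r` is stripped), whereas `splitlines` recognizes more separators.

```python
import json
from typing import Any, Iterable, Iterator


def iter_jsonl(chunks: Iterable[bytes]) -> Iterator[Any]:
    """Decode newline-delimited JSON from an iterable of byte chunks."""
    buf = bytearray()
    for chunk in chunks:
        # Bytes already in the buffer are known to contain no newline,
        # so only the newly appended chunk needs to be searched.
        search_from = len(buf)
        buf += chunk  # in-place append, no full copy of the buffer
        line_start = 0
        while (end := buf.find(b"\n", search_from)) != -1:
            line = bytes(buf[line_start:end]).strip()
            if line:
                yield json.loads(line)
            line_start = search_from = end + 1
        if line_start:
            del buf[:line_start]  # drop the lines that were consumed
    tail = bytes(buf).strip()
    if tail:
        # Final line without a trailing newline.
        yield json.loads(tail)


decoded = list(iter_jsonl([b'{"a": 1}\n{"b"', b': 2}\r\n', b'\n{"c": 3}']))
```

Tracking `search_from` matters as much as the buffer type: without it, every chunk would rescan the whole partial line and reintroduce quadratic work on the search side.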
estimated effort: ~1h
Cross-Cutting Insights
Patterns identified by CritiqueAgent that span multiple dimensions
arch-003, qual-001, and doc-009 all describe the same SSE event-list duplication — a single refactor (extract module-level frozenset + document in helpers.md) resolves architecture, quality, and documentation findings simultaneously.
arch-001, arch-007, and qual-000 collectively describe the sync/async duplication tax. Since the SDK is Stainless-generated, these are upstream-template concerns and should be batched into a single ADR/upstream issue rather than addressed in this repo.
qual-005 and dep-007 are duplicate findings about the ruff target-version/requires-python mismatch. Should be deduplicated — single trivial fix.
sec-000, sec-001, and sec-005 share a redaction theme (logs, debug dumps, repr). A single SecretStr-style helper plus a redacting log filter would close all three.
doc-004 is partially a code-correctness issue: the unreachable 503/504 mapping in _make_status_error makes the documented error classes inaccessible — fixing the mapping resolves both the doc and a latent quality bug.
arch-005 (model-specific logic in BaseClient) and doc-007 (undocumented MODEL_NONSTREAMING_TOKENS) both stem from product policy living at the wrong layer; relocating to resources/messages addresses both.
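The redaction insight above (sec-000, sec-001, sec-005) could be approached with something like the following sketch, built on the standard-library `logging.Filter`. The `Secret` wrapper and `RedactingFilter` are hypothetical names; nothing here is taken from the SDK's source. The filter masks exact occurrences of registered secret values, which is simple but only catches values it has been told about.

```python
import logging
from typing import Iterable


class Secret:
    """Wraps a sensitive string so repr() and str() never reveal it."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        """Return the underlying value; call only where it is really needed."""
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"

    __str__ = __repr__


class RedactingFilter(logging.Filter):
    """Logging filter that masks registered secret values in log messages."""

    def __init__(self, secrets: Iterable[Secret]) -> None:
        super().__init__()
        # Skip empty values: replacing "" would corrupt every message.
        self._values = [s.get_secret_value() for s in secrets if s.get_secret_value()]

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # applies %-style args first
        for value in self._values:
            message = message.replace(value, "<redacted>")
        record.msg = message
        record.args = None  # message is already fully formatted
        return True  # never drop the record, only rewrite it
```

Attaching it is one line, e.g. `logging.getLogger("anthropic").addFilter(RedactingFilter([api_key]))`, where the logger name is an assumption about how the SDK names its logger.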
Technical Debt Summary
0estimated hours to remediate
cost to remediate: ~$15,356 at $150/hr avg dev rate
AI-assisted screening based on finding text. Not a substitute for professional penetration testing.
Investment Readiness Score
⚠Automated heuristic — not a substitute for formal due diligence or financial advisory. Score is derived from code-quality signals, not business fundamentals.
0/ 100
needs work
Moderate technical risk. Several areas need attention before fundraising.
Component Breakdown
Overall Score: 86 (weight 25%)
Security Posture: 91 (weight 20%)
Issue Concentration: 48 (weight 10%)
Dependency Health: 0 (weight 10%)
Code Complexity: 50 (weight 10%)
License Compliance: 95 (weight 10%)
SOC 2 Readiness: 46 (weight 10%)
Critical Findings: 100 (weight 5%)
AI-estimated composite score. Consult qualified advisors for investment decisions.
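For readers who want to sanity-check the composite, the sketch below combines the breakdown as a weighted average. It reads each entry as a component score followed by a percentage weight (for example, Overall Score 86 at 25%), and it assumes the composite is a plain weighted mean; the report does not state its actual formula, so this is an illustration rather than the tool's method.

```python
from typing import Dict, Tuple

# (score, weight in percent) per component, as read from the breakdown above.
COMPONENTS: Dict[str, Tuple[int, int]] = {
    "Overall Score": (86, 25),
    "Security Posture": (91, 20),
    "Issue Concentration": (48, 10),
    "Dependency Health": (0, 10),
    "Code Complexity": (50, 10),
    "License Compliance": (95, 10),
    "SOC 2 Readiness": (46, 10),
    "Critical Findings": (100, 5),
}


def weighted_composite(components: Dict[str, Tuple[int, int]]) -> float:
    """Return the weight-normalized mean of the component scores."""
    total_weight = sum(weight for _, weight in components.values())
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(score * weight for score, weight in components.values())
    return weighted_sum / total_weight


composite = weighted_composite(COMPONENTS)
```

The weights sum to 100, and under this assumption the composite comes to about 68.6 out of 100.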
SOC 2 Readiness
⚠Automated heuristic — not a substitute for formal SOC 2 assessment. Findings are mapped by keyword analysis, not control evidence evaluation.