# AbstractCore

> Unified Python interface for cloud + local LLM providers with streaming, tool calling, structured output, media handling (images/audio/video + documents), embeddings, and an optional OpenAI-compatible HTTP server. Default install stays lightweight; features are enabled via extras.

Last reviewed against source: 2026-06-07. Package version in source: 2.13.35. Supported Python: 3.9+; GitHub CI tests 3.9, 3.10, 3.11, 3.12, and 3.13.

This is the curated, high-signal index for agents. For a self-contained handbook with the important details inlined, read `llms-full.txt`.

Important notes:
- Default tool execution is **pass-through**: AbstractCore returns tool calls; your host/runtime executes them.
- AbstractCore is **offline-first** for local models: it will not silently download model weights (download/pull explicitly).
- Durable memory bloc caching uses one exact text/file bloc -> one provider/model-native cache
  artifact. The shared contract covers MLX, HuggingFace Transformers, and supported HuggingFace
  GGUF exact-renderer paths through Python helpers and `/acore/blocs/kv/*`, including safe
  list/delete/prune operations with live-binding checks.

Ecosystem note:
- AbstractCore is part of **AbstractFramework** (umbrella: https://github.com/lpalbou/AbstractFramework).
- In the ecosystem, **AbstractRuntime** is the recommended runtime for executing `response.tool_calls` durably (policy, retries, persistence): https://github.com/lpalbou/abstractruntime

Quick start:
- Install: `pip install abstractcore`
- Hosted SDK bundle: `pip install "abstractcore[remote]"` (OpenAI + Anthropic SDKs; OpenRouter/Portkey/local `/v1` providers use core HTTP dependencies)
- First steps: `docs/getting-started.md` -> `docs/prerequisites.md`
- Run the gateway server: `pip install "abstractcore[server]"` then `abstractcore serve`
- Run the single-model endpoint: `pip install "abstractcore[server]"` then `abstractcore-endpoint --help` (see `docs/endpoint.md`)
- Repo/dev checks: `pip install -e ".[dev,test]"` ; `pytest -q` ; `black .` ; `ruff check .`

## Read First

- [README](README.md): what AbstractCore is + install matrix
- [Docs index](docs/README.md): recommended reading paths
- [Getting Started](docs/getting-started.md): `create_llm(...)`, `generate(...)`, streaming, tools, structured output, media
- [Prerequisites](docs/prerequisites.md): provider setup (keys/base URLs) + local hardware notes
- [FAQ](docs/faq.md): common issues + setup gotchas

## Examples

- [Examples index](examples/README.md): organized by topic + a guided learning path
- [Learning path](examples/learning_path/README.md): 6 scripts in order (`01_...` → `06_...`)
- [Prompt caching](examples/prompt_caching/README.md): `CachedSession` + attachments + REPL demo
- [Media](examples/media/README.md): inspect pipeline + image/video/audio demos
- [Tools](examples/tools/README.md): tool calling + host execution (ToolRegistry vs manual dispatcher)

## Core API + Concepts

- [API (Python)](docs/api.md): public API map and common patterns
- [API Reference](docs/api-reference.md): full function/class listing
- [Generation parameters](docs/generation-parameters.md): sampling defaults, caller overrides, max tokens, and thinking semantics across providers
- [Prompt caching](docs/prompt-caching.md): `CachedSession`, cache keys, file attachments, and durable memory bloc bindings
- [HuggingFace model compatibility](docs/huggingface-model-compatibility.md): Transformers/GGUF loading rules, quantized checkpoint caveats, and trusted proof targets
- [Memory blocs](docs/memory-blocs.md): text/file -> bloc -> provider/model durable cache artifacts with optional request-time binding
- [Tool calling](docs/tool-calling.md): `@tool`, passthrough vs execution, built-in tools
- [Tool syntax rewriting](docs/tool-syntax-rewriting.md): preserve tool-call markup (`tool_call_tags`, server `agent_format`)
- [Structured output](docs/structured-output.md): `response_model=...`, native vs prompted, validation/retry behavior
- [Media handling](docs/media-handling-system.md): `media=[...]`, audio/video policies, vision fallback and plugins
- [Centralized config](docs/centralized-config.md): `abstractcore --config`, `abstractcore config defaults|set-default|clear-default`, capability route defaults (`input.*`, `output.*`, `embedding.*`, `rerank.*`), explicit `input.voice`/`input.video` fallback routing, provider keys, HTTP server auth/hardening, logging, vision/audio/video strategies

## Providers (IDs + setup)

- [Provider registry](abstractcore/providers/registry.py): provider IDs, defaults, install extras
- [Provider setup guide](docs/prerequisites.md): env vars + examples for OpenAI/Anthropic/Ollama/LMStudio/MLX/HuggingFace/vLLM/OpenRouter/Portkey/OpenAI-compatible
- Install profile vocabulary: `abstractcore[apple]` is the hardware alias for the MLX local LLM stack, `abstractcore[gpu]` is the hardware alias for the vLLM local LLM stack, capability extras such as `voice`/`audio`/`vision`/`music` stay remote-light, and `abstractcore[all-apple]` / `abstractcore[all-gpu]` are larger aggregate profiles with local plugin engines where supported.
- Optional media plugin floors in this source: `abstractvoice>=0.10.17`, `abstractvision>=0.3.22`, and `abstractmusic>=0.1.13`. The AbstractVision plugin supports direct `llm.vision.t2i/i2i/upscale_image/t2v/i2v` calls, exact MLX-Gen model ids, source/reference image edits, SeedVR2 image upscaling with canonical `AbstractFramework/seedvr2-{3b,7b}-{8bit,4bit}` packages, typed Wan A14B `guidance_2` video controls, and image/video output routing through `generate(..., output=...)`; top-level progress callbacks are forwarded to generated image/video output specs.

## Server (OpenAI-compatible `/v1`)

- [Server docs](docs/server.md): run with `abstractcore serve` + env vars + server auth token `ABSTRACTCORE_AUTH_TOKEN` + Swagger `Authorize` validation on `/docs` when server auth is enabled + `/v1/models?capability_route=...` model discovery filters for precise routes such as `input.image,output.text` and `embedding.text` + media catalogs including `/v1/audio/music`, `/v1/images/upscale`, `/v1/videos/generations`, `/v1/videos/edits`, async `/v1/vision/jobs/images/*` and `/v1/vision/jobs/videos/*` progress polling, and repeatable multipart `reference_images` for image edits + runtime/prompt-cache/memory-bloc control planes (`/acore/models/load`, `/acore/prompt_cache/*`, `/acore/blocs/*`) + shared text-inference controls on both `/v1/chat/completions` and `/v1/responses` (`base_url`, `agent_format`, `thinking`, `prompt_cache_key`, `prompt_cache_binding`, `unload_after`). `/v1/vision/jobs/images/upscale` accepts multipart `provider=mlx-gen`, canonical SeedVR2 q8/q4 `model`, `image`, `scale` or `resolution`, optional source-weight `quantize`, and exposes MLX-Gen denoise-step progress at `progress.last_event`. `/v1/responses` returns OpenAI Responses `object:"response"` payloads (and Responses SSE events when `stream=true`) for `input` requests; legacy `messages` requests return Chat Completions payloads. `/v1/responses` also accepts Responses-style `tools`/`tool_choice` and normalizes `web_search*` tools into function tools for host-side execution. Request body/query `api_key` is disabled; use server-side provider config, `X-AbstractCore-Provider-API-Key`, or `Authorization` only when server auth is not configured.
- [Server implementation](abstractcore/server/app.py): FastAPI gateway source
- [Endpoint docs](docs/endpoint.md): single-model OpenAI-compatible endpoint (`abstractcore-endpoint`)
- [Endpoint implementation](abstractcore/endpoint/app.py): endpoint server source

## Contributing

- [Contributing](CONTRIBUTING.md): formatting/lint/test + release checklist
- [Changelog](CHANGELOG.md): version history + upgrade notes
- [pyproject.toml](pyproject.toml): extras + console scripts

## Optional

- [Architecture](docs/architecture.md): core/provider/session/tool/media/capabilities design
- [Practical recipes](docs/examples.md): copy/paste code snippets (provider-agnostic)
- [Troubleshooting](docs/troubleshooting.md): common errors and fixes
- [Embeddings](docs/embeddings.md): EmbeddingManager for local and remote text embeddings; endpoint-backed providers can use embedding-only `/v1/embeddings` routes
- [Capabilities](docs/capabilities.md): `llm.voice` / `llm.audio` / `llm.vision` / `llm.music` plugin model, shared discovery contract, and boundary between `/v1/models` route metadata and generated-media plugin catalogs
- [MCP](docs/mcp.md): MCP tool servers (HTTP/stdio)
- [llms-full](llms-full.txt): self-contained agent handbook (this repo)
