Pull a model
Browse a catalog of open models — Llama, Mistral, Gemma, Qwen, DeepSeek — and download with a single command. Models live on your disk, not on someone else's server.
ollama · run open models locally
Pull an open model, run it offline, and chat with it from your terminal or your app. No accounts, no API keys, no tokens billed by the thousand — your machine is the server.
what ollama does
Browse a catalog of open models — Llama, Mistral, Gemma, Qwen, DeepSeek — and download with a single command. Models live on your disk, not on someone else's server.
Chat from the terminal, or call the OpenAI-compatible local API from any language. Streaming, tool calls, and structured output work the same as the hosted runtimes your app already targets.
Drop Ollama behind the agent framework, IDE plugin, or chat client you already use — LangChain, OpenWebUI, Continue, your homegrown CLI. The runtime stays out of the way.
stay close to the work
Roughly one email a month — release notes for new models, changelog highlights, and the occasional benchmark. No marketing, no upsells, unsubscribe in one click.