
Know which free LLM tiers are usable right now

Free LLM tiers drift through the day — keys expire, providers go down, and daily caps fill up. freellmpool ships capacity tools that tell you, locally, which of your free providers are usable right now, which are near their quota, and which keys to add next. freellmpool is a free, open-source tool that pools the free tiers of 18 LLM providers behind one OpenAI-compatible endpoint; these commands keep that pool healthy.

pip install freellmpool
freellmpool capacity status --target 5

1. capacity status — a local snapshot

capacity status reads your provider catalog, your environment (which keys are set), and your per-day usage counters, then labels every provider with a status. It never calls a provider, so the labels are computed instantly:

$ freellmpool capacity status --target 5
LLM capacity: 4/5 healthy providers
Action recommended: add 1 provider(s).

  healthy     groq          Groq             used=12/1000  models=9   key=GROQ_API_KEY
  low_quota   github        GitHub Models    used=130/150  models=35  key=GITHUB_TOKEN
  healthy     cerebras      Cerebras         used=0/14400  models=2   key=CEREBRAS_API_KEY
  ...
Status       Meaning
healthy      Configured and usable from local state.
low_quota    Usage is above 80% of the daily request hint.
exhausted    Usage reached the daily request hint.
invalid_key  Your local key inventory says the key has expired.
missing      The provider exists in the catalog but isn't configured.

--target N flags when you have fewer than N healthy providers; --all also lists missing providers and external-only catalog candidates you could add. By default, capacity status refreshes an advisory external catalog over the network (a read-only metadata fetch); pass --no-catalog-sync to keep it fully local.
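The status rules and the --target check described above can be sketched in a few lines of Python. This is an illustration only, not freellmpool's implementation: the function names, the order in which the checks run, and the choice to count only rows labeled healthy toward the target are all assumptions.

```python
from typing import Optional

LOW_QUOTA_PERCENT = 80  # "above 80% of the daily request hint", per the table


def classify(used: int, daily_hint: Optional[int], *, configured: bool = True,
             key_expired: bool = False) -> str:
    """Label one provider from local state only (no network calls)."""
    # The order of these checks is an assumption, not documented behavior.
    if not configured:
        return "missing"
    if key_expired:
        return "invalid_key"
    if daily_hint:  # skip the quota checks when no daily hint is known
        if used >= daily_hint:
            return "exhausted"
        # Integer math avoids floating-point surprises at the boundary.
        if used * 100 > LOW_QUOTA_PERCENT * daily_hint:
            return "low_quota"
    return "healthy"


def shortfall(statuses: list[str], target: int) -> int:
    """How many more healthy providers are needed to reach the target."""
    healthy = sum(1 for s in statuses if s == "healthy")
    return max(0, target - healthy)


# Usage counters copied from the three rows shown in the sample output.
rows = {"groq": (12, 1000), "github": (130, 150), "cerebras": (0, 14400)}
labels = {name: classify(used, hint) for name, (used, hint) in rows.items()}
print(labels)
print("add", shortfall(list(labels.values()), target=5), "provider(s)")
```

With only the three rows shown in the sample, github lands in low_quota because 130 is more than 80% of 150, which matches the sample output.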

2. providers health — does it actually respond?

capacity status reads local state; providers health goes one step further and sends a tiny real request to each configured provider, so you can tell a missing key from a rate-limited or down provider:

$ freellmpool providers health
  provider/model                status        latency  note
  groq/llama-3.3-70b-versatile  ok             237 ms  2 tok
  mistral/mistral-small-latest  ok             539 ms  2 tok
  cerebras/gpt-oss-120b         rate_limited        -  HTTP 429

  2/3 providers ok

Use -p groq,cerebras to test a subset, --timeout to bound each call, and -m <model> to pin a model.
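If you want to gate a script on these results, one option is to parse the table. The sketch below works on the sample output shown above, pasted in as a string. It assumes the column layout stays as in that sample, which is not a documented guarantee, and parse_health is a hypothetical helper, not part of freellmpool.

```python
SAMPLE = """\
  provider/model                status        latency  note
  groq/llama-3.3-70b-versatile  ok             237 ms  2 tok
  mistral/mistral-small-latest  ok             539 ms  2 tok
  cerebras/gpt-oss-120b         rate_limited        -  HTTP 429

  2/3 providers ok
"""


def parse_health(text: str) -> dict[str, str]:
    """Map each 'provider/model' row to the value in its status column."""
    results: dict[str, str] = {}
    seen_header = False
    for line in text.splitlines():
        fields = line.split()
        if not seen_header:
            # Skip everything up to and including the header row.
            seen_header = fields[:2] == ["provider/model", "status"]
            continue
        if not fields:
            break  # a blank line ends the table; the summary follows it
        if len(fields) >= 2:
            results[fields[0]] = fields[1]
    return results


statuses = parse_health(SAMPLE)
usable = [name for name, status in statuses.items() if status == "ok"]
print(usable)
```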

3. keys — a checklist and an interactive add

To reach a target number of healthy providers, ask for a checklist of which keys to create:

$ freellmpool keys checklist --target 5
Manual key checklist to reach 5 healthy providers:
  - cerebras: create a key manually, then set CEREBRAS_API_KEY

keys add then walks you through it — it writes the key to your config.toml and records metadata (name, dates, notes) in an optional inventory at ~/.config/freellmpool/keys.toml. The inventory is metadata only; raw secrets stay in your config or environment.

freellmpool keys add groq                       # configure a known provider
freellmpool keys add Hyperbolic                 # match & import from the external catalog
freellmpool keys add MyProvider --base-url https://api.example.com/v1 --yes

If the name isn't a local provider, keys add checks the synced external catalog (mnfst/awesome-free-llm-apis), using a small fuzzy search to match typos and model names, and can import the suggestion. Alternatively, it builds a minimal OpenAI-compatible provider and autodiscovers its models from the GET /models endpoint.
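To make the two mechanisms concrete, here is a minimal sketch of each: typo-tolerant name matching and reading model ids from a /models response. It uses Python's standard difflib as a stand-in for whatever fuzzy search freellmpool actually uses, and it assumes the response follows the OpenAI list-models shape (a "data" array of objects with an "id"). The function names, the 0.8 cutoff, and the catalog entries are illustrative.

```python
import difflib
from typing import Optional


def suggest_provider(name: str, catalog_names: list[str]) -> Optional[str]:
    """Return the closest catalog name to a possibly misspelled input, if any."""
    by_lower = {c.lower(): c for c in catalog_names}
    hits = difflib.get_close_matches(name.lower(), list(by_lower), n=1, cutoff=0.8)
    return by_lower[hits[0]] if hits else None


def model_ids(models_response: dict) -> list[str]:
    """Pull model ids out of an OpenAI-style GET /models JSON body."""
    return [m["id"] for m in models_response.get("data", []) if "id" in m]


catalog = ["Hyperbolic", "Groq", "Cerebras"]  # illustrative names only
print(suggest_provider("hyperbolc", catalog))
print(suggest_provider("zzz", catalog))

# A hand-written body in the OpenAI list-models shape; the ids are placeholders.
body = {"object": "list", "data": [{"id": "model-a", "object": "model"},
                                     {"id": "model-b", "object": "model"}]}
print(model_ids(body))
```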

The external catalog is advisory only. Your local providers.toml stays the source of truth for routing; freellmpool doesn't send traffic to a discovered provider until you've imported it and set its key. Imported endpoints are validated (https, no junk) before they're written.
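The guide does not spell out what "https, no junk" covers, so the following is one conservative interpretation, not freellmpool's actual validator. It requires an https scheme and a hostname, and it rejects embedded whitespace, inline credentials, query strings, and fragments. Each of those rules beyond "https" is an assumption.

```python
from urllib.parse import urlsplit


def is_acceptable_base_url(url: str) -> bool:
    """Conservative check for an endpoint before saving it to config."""
    if not url or any(ch.isspace() for ch in url):
        return False  # reject empty input and embedded whitespace outright
    try:
        parts = urlsplit(url)
        host = parts.hostname
        has_credentials = parts.username is not None or parts.password is not None
    except ValueError:
        return False  # e.g. a malformed IPv6 literal
    if parts.scheme != "https" or not host:
        return False
    if has_credentials or parts.query or parts.fragment:
        return False  # assumption: a base URL is just scheme, host, and path
    return True


for candidate in ("https://api.example.com/v1",
                  "http://api.example.com/v1",
                  "https://user:pw@api.example.com/v1"):
    print(candidate, is_acceptable_base_url(candidate))
```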

4. The dashboard

When the proxy is running (freellmpool proxy), open http://127.0.0.1:8080/dashboard. Alongside request counts, cache hits, and estimated savings, it shows a healthy-provider count and a per-provider capacity table — the same signal as capacity status, in the browser.

FAQ

Which free LLM provider should I use right now?

Run freellmpool capacity status: it sorts your configured providers by health and remaining capacity, so the top healthy rows are the ones to use. The pool also picks automatically per request and fails over when one is rate-limited.

Does capacity status make network calls?

The provider statuses are computed from local state only. By default it also refreshes the advisory external catalog over the network; pass --no-catalog-sync to skip that and stay fully offline. Live provider probing is a separate command, providers health.

Does it store my API keys?

The key inventory holds metadata only (provider, env-var name, optional dates and notes), never raw secrets. keys add writes the actual key to config.toml (chmod 600); you can also just keep keys in environment variables.
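For readers unfamiliar with chmod 600: it means only the file's owner can read or write it. The sketch below shows one way to write a file with that mode from Python on a POSIX system. The helper name, the temporary path, and the file contents are placeholders; this is not freellmpool's code or its config schema.

```python
import os
import stat
import tempfile


def write_private(path: str, data: bytes) -> None:
    """Create or overwrite a file that only its owner can read and write."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        # The mode passed to os.open applies only to new files, so also
        # tighten permissions on a file that already existed.
        os.fchmod(fh.fileno(), 0o600)
        fh.write(data)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.toml")
    write_private(path, b"# placeholder contents, not a real config\n")
    mode = stat.S_IMODE(os.stat(path).st_mode)
    print(oct(mode))
```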

Part of freellmpool (MIT, free, open source). Full reference: docs/CAPACITY.md. Updated 2026-06-06.