AgentsKit · WebLLM

An LLM running entirely in your browser via WebGPU. The first load downloads the model (~1–4 GB, depending on quantization) and compiles WASM. Subsequent loads are served from the browser cache.
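
To give a feel for why the download size depends on quantization, here is a rough back-of-the-envelope sketch. It assumes weight size is approximately parameter count × bits per weight / 8 and ignores tokenizer files, metadata, and any layers kept at higher precision. The function name `estimateDownloadGB` and the model sizes are hypothetical, chosen only for illustration; they are not part of WebLLM or AgentsKit.

```typescript
// Rough estimate of model weight size for a given quantization.
// Assumption: size ≈ parameters × bits per weight / 8 (weights only).
function estimateDownloadGB(paramsBillions: number, bitsPerWeight: number): number {
  const bytes = paramsBillions * 1e9 * (bitsPerWeight / 8);
  return bytes / 1e9; // decimal gigabytes
}

// Hypothetical model sizes, for illustration only.
const examples = [
  { label: "3B params, 4-bit", gb: estimateDownloadGB(3, 4) },
  { label: "8B params, 4-bit", gb: estimateDownloadGB(8, 4) },
  { label: "2B params, 16-bit", gb: estimateDownloadGB(2, 16) },
];

for (const e of examples) {
  console.log(`${e.label}: ~${e.gb.toFixed(1)} GB`);
}
```

Under these assumptions, a small model at 4-bit precision lands near the low end of the stated range, while a larger model, or the same model at higher precision, approaches the upper end.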