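AI Skill Hub strongly recommends Browser LLM Inference (wllama), an open-source AI tool. It has 1.1k GitHub stars and an AI composite score of 8.0, a solid showing among comparable tools. If you are looking for a dependable way to run LLM inference in the browser, it is worth a closer look.
Browser LLM Inference is an open-source tool written in TypeScript and tagged with the topics llama, wasm, and webassembly. As an open-source GitHub project, it has an active community and ongoing releases, its code is fully open to audit, and it can be deployed locally to keep data private. It suits both personal use and integration into enterprise workflows.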
Install wllama as a project dependency with `npm install @wllama/wllama`. To work from source instead, run `git clone --recurse-submodules https://github.com/ngxson/wllama.git`, change into the `wllama` directory, and run `npm install`.
wllama is used as a library from browser JavaScript rather than as a command-line tool: import the `Wllama` class, construct it with the paths to the wasm files, load a GGUF model, and request a completion. The basic example later on this page shows the full sequence.
Configuration is likewise done in code. Per-model options such as `n_threads` are passed in the load config when the model is loaded.

WebAssembly binding for llama.cpp
For changelog, please visit releases page
> [!IMPORTANT]
> 🔥🔥 V3 is out, with WebGPU, multimodal and tool calling support. Read the V3 release guide.
> For compatibility issues, please refer to @wllama/wllama-compat.

split and cat)

Limitations:
- To enable multi-thread, you must add Cross-Origin-Embedder-Policy and Cross-Origin-Opener-Policy headers. See this discussion for more details.
- Max file size is 2GB, due to size restriction of ArrayBuffer. If your model is bigger than 2GB, please follow the Split model section below.
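The two limitations above can be expressed as small checks. The sketch below is illustrative and not part of wllama: `crossOriginIsolationHeaders` and `exceedsSingleFileLimit` are hypothetical helper names, the header values (`require-corp`, `same-origin`) are the ones browsers commonly require before enabling `SharedArrayBuffer`, and "2GB" is assumed here to mean 2 GiB.

```javascript
// Hypothetical helpers for illustration; not part of the wllama API.

// Assumption: the "2GB" limit is read as 2 GiB (2 * 1024^3 bytes).
const MAX_SINGLE_FILE_BYTES = 2 * 1024 ** 3;

// Returns a copy of `headers` with the two cross-origin isolation headers added.
function crossOriginIsolationHeaders(headers = {}) {
  return {
    ...headers,
    'Cross-Origin-Embedder-Policy': 'require-corp',
    'Cross-Origin-Opener-Policy': 'same-origin',
  };
}

// True when a model file of `sizeBytes` is over the assumed single-file limit
// and would therefore need to be split before loading.
function exceedsSingleFileLimit(sizeBytes) {
  return sizeBytes > MAX_SINGLE_FILE_BYTES;
}
```

A static file server would merge these headers into every response it sends. In the page itself, `globalThis.crossOriginIsolated` reports whether the browser considers the page isolated.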
npm i
This repository already comes with a pre-built binary compiled from the llama.cpp source code. However, in some cases you may want to compile it yourself:
- You don't trust the pre-built one.
- You want to try out the latest, bleeding-edge changes from the upstream llama.cpp source code.
You can use the commands below to compile it yourself:
```shell
npm run build:wasm
npm run build
```
For complete code, see examples/basic/index.html
import { Wllama } from './esm/index.js';

(async () => {
  const CONFIG_PATHS = {
    default: './esm/wasm/wllama.wasm',
  };
  // Automatically switch between single-thread and multi-thread version based on browser support
  // If you want to enforce single-thread, add { "n_threads": 1 } to LoadModelConfig
  const wllama = new Wllama(CONFIG_PATHS);

  // Define a function for tracking the model download progress
  const progressCallback = ({ loaded, total }) => {
    // Calculate the progress as a percentage
    const progressPercentage = Math.round((loaded / total) * 100);
    // Log the progress in a user-friendly format
    console.log(`Downloading... ${progressPercentage}%`);
  };

  // Load GGUF from Hugging Face hub
  // (alternatively, you can use loadModelFromUrl if the model is not from HF hub)
  await wllama.loadModelFromHF(
    { repo: 'ggml-org/models', file: 'tinyllamas/stories260K.gguf' },
    { progressCallback }
  );

  const response = await wllama.createChatCompletion({
    messages: [{ role: 'user', content: elemInput.value }],
    max_tokens: 50,
    temperature: 0.5,
    top_k: 40,
    top_p: 0.9,
  });
  console.log(response.choices[0].message.content);
})();
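The progress callback above divides by `total`, which would produce `NaN` or `Infinity` if the total size were ever zero or missing. A more defensive variant might look like the sketch below. `formatProgress` and `safeProgressCallback` are hypothetical names, and whether wllama can actually report an unknown `total` is an assumption made only for illustration.

```javascript
// Hypothetical helper; not part of the wllama API.
// Builds a progress message, falling back to a byte count when the
// total size is unknown (assumed here to show up as 0 or a non-number).
function formatProgress({ loaded, total }) {
  if (!Number.isFinite(total) || total <= 0) {
    return `Downloading... ${loaded} bytes`;
  }
  // Clamp to 100 in case `loaded` overshoots `total`.
  const percentage = Math.min(100, Math.round((loaded / total) * 100));
  return `Downloading... ${percentage}%`;
}

// Same callback shape as in the example above: receives { loaded, total }.
const safeProgressCallback = (progress) => console.log(formatProgress(progress));
```

`safeProgressCallback` could be passed in place of `progressCallback` in the load call, since it takes the same `{ loaded, total }` argument.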
Alternatively, you can use the *.wasm files from CDN:
import WasmFromCDN from '@wllama/wllama/esm/wasm-from-cdn.js';
const wllama = new Wllama(WasmFromCDN);
// NOTE: this is not recommended, only use when you can't embed wasm files in your project
Demo:
- Basic usage with completions and embeddings: https://github.ngxson.com/wllama/examples/basic/ (source code)
- Embedding and cosine distance: https://github.ngxson.com/wllama/examples/embeddings/ (source code)
- Multimodal (vision) completion: https://github.ngxson.com/wllama/examples/multimodal/ (source code)
- Tool calling: https://github.ngxson.com/wllama/examples/tools/ (source code)
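The embeddings demo compares texts by cosine distance. As a rough sketch of that computation on two plain numeric arrays (the function names are illustrative, and how the vectors are obtained from wllama is deliberately left out):

```javascript
// Illustrative helpers; not taken from the wllama source.

// Cosine similarity of two equal-length numeric vectors, in [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance: 0 for vectors pointing the same way, 2 for opposite ones.
function cosineDistance(a, b) {
  return 1 - cosineSimilarity(a, b);
}
```

Two texts with similar meaning should produce embedding vectors with a small cosine distance, and unrelated texts a larger one.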
git clone --recurse-submodules https://github.com/ngxson/wllama.git
cd wllama
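High-performance in-browser LLM inference tool
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and is not a legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment, and completing the necessary security assessment, before deploying it to production.
✅ MIT License: one of the most permissive open-source licenses. It allows free commercial use, modification, and distribution, provided the copyright notice is retained.
Overall, Browser LLM Inference is a high-quality AI tool that is competitive among comparable tools. AI Skill Hub will keep tracking its updates; consider bookmarking it and adopting it when it fits your own use case.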
| Field | Value |
| --- | --- |
| Original name | wllama |
| Original description | Open-source AI tool: WebAssembly binding for llama.cpp - Enabling on-browser LLM inference. ⭐1.1k · TypeScript |
| Topics | llama, wasm, webassembly, typescript |
| GitHub | https://github.com/ngxson/wllama |
| License | MIT |
| Language | TypeScript |
Listed: 2026-05-25 · Updated: 2026-05-25 · License: MIT · AI Skill Hub does not legally endorse the accuracy of third-party content.