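AI Skill Hub recommendation: Squish is a good-quality AI tool with an overall AI score of 7.5, a solid result among comparable tools. If you are looking for a dependable AI tool, it is worth a closer look.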
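Squish is an open-source tool written in Python, focused on apple-silicon, inference-engine, int4, and related topics. As an open-source project on GitHub, it has community support and ongoing releases, its code is fully open to audit, and it can be deployed locally to keep data private. It is suitable both for personal use and for integration into enterprise workflows.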
# Option 1: install with pip (recommended)
pip install squish
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install squish
# Option 3: install from source (for the latest features)
git clone https://github.com/konjoai/squish
cd squish
pip install -e .
# Verify the installation
python -c "import squish; print('Installation successful')"
# Command-line usage
squish --help
# Basic usage
squish input_file -o output_file
# Calling from Python code
import squish
# Example
result = squish.process("input")
print(result)
# Example squish configuration file (config.yml)
app:
  name: "squish"
  debug: false
  log_level: "INFO"
# Specify the configuration file at run time
squish --config config.yml
# Or configure through environment variables
export SQUISH_API_KEY="your-key"
export SQUISH_OUTPUT_DIR="./output"
Local LLM inference for Apple Silicon. Faster end-to-end response on long contexts, less RAM, INT3 support.
<img src="assets/squish-logo-1.png" height="320" alt="Squish Logo"/>
---
Prerequisite (macOS/Homebrew): Xcode Command Line Tools are required. Install them with `xcode-select --install`. If Homebrew reports "Command Line Tools are too outdated", update from System Settings -> General -> Software Update, or reinstall the Command Line Tools.
```bash
security find-certificate -a -p -c "TWM" /Library/Keychains/System.keychain > ~/TWM-all.pem
security find-certificate -c "TWM-Root" -p > ~/TWM-root.pem
```

4x faster quantization: install the Rust extension.

```bash
cd squish_quant_rs && python3 -m maturin build --release && pip install .
```
Requirements: macOS 13+, Apple Silicon (M1–M5), Python 3.10+.
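
The requirements above can be checked before installing. The following is a minimal sketch using only the standard library; `meets_requirements` is a hypothetical helper and not part of Squish, and it assumes that Apple Silicon is reported as `arm64` by `platform.machine()` (a Python running under Rosetta reports `x86_64` instead).

```python
import platform
import sys


def meets_requirements(system, machine, mac_version, py_version):
    """Return a list of unmet requirements (an empty list means all are met).

    Hypothetical helper, not part of Squish. Thresholds come from the
    requirements line: macOS 13+, Apple Silicon, Python 3.10+.
    """
    problems = []
    if system != "Darwin":
        problems.append("macOS is required")
    elif machine != "arm64":
        problems.append("Apple Silicon (arm64) is required")
    # mac_version looks like "14.5" or "13.6.1"; only the major part matters.
    major = int(mac_version.split(".")[0]) if mac_version else 0
    if system == "Darwin" and major < 13:
        problems.append("macOS 13 or newer is required")
    if tuple(py_version[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    return problems


if __name__ == "__main__":
    issues = meets_requirements(
        platform.system(),
        platform.machine(),
        platform.mac_ver()[0],  # empty string on non-macOS systems
        sys.version_info,
    )
    print("OK" if not issues else "\n".join(issues))
```

Keeping the check as a pure function of its inputs makes it easy to exercise on any machine, while the `__main__` block feeds it the values of the current interpreter.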
---
squish run qwen2.5-7b-int4 \
  --block-kv-cache ~/.cache/squish/blocks \
  --prompt-kv-cache ~/.cache/squish/pkv \
  --port 8080
Query it through the OpenAI-compatible API:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-int4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Or point any OpenAI / Ollama client at it:
export OPENAI_BASE_URL=http://localhost:8080/v1
export OPENAI_API_KEY=squish
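
For callers that do not use an OpenAI SDK, the same request can be made from Python with the standard library alone. This is a rough sketch, not an official client: `build_chat_request` and `chat` are hypothetical names, the fallback values mirror the examples in this section, and the `Authorization` header is included only because OpenAI-style clients send one (whether the server checks it is not stated here).

```python
import json
import os
import urllib.request


def build_chat_request(prompt, model="qwen2.5-7b-int4"):
    """Build (but do not send) a chat-completions request.

    Reads OPENAI_BASE_URL and OPENAI_API_KEY from the environment and
    falls back to the values used in the examples above.
    """
    base_url = os.environ.get("OPENAI_BASE_URL", "http://localhost:8080/v1")
    api_key = os.environ.get("OPENAI_API_KEY", "squish")
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def chat(prompt):
    """Send the request; requires a server already listening at the base URL."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        reply = json.load(resp)
    # Assumes the usual OpenAI response shape: choices[0].message.content.
    return reply["choices"][0]["message"]["content"]
```

With a server running, `print(chat("Hello"))` would print the model's reply; `build_chat_request` can be inspected without any server at all.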
| Flag | Purpose |
|---|---|
| `--block-kv-cache <DIR>` | Block-paged KV cache for shifting-prefix workloads (agents, multi-turn). Persists across daemon restarts via .safetensors blocks. |
| `--prompt-kv-cache <DIR>` | Exact-prompt KV cache. Single-digit-millisecond TTFT on verbatim repeats. |
| `--block-kv-size N` | Block size in tokens (default 64). |
| `--draft-model <MODEL>` | Speculative-decode draft model (opt-in and kept off by default: net-negative on M3 INT4 with the draft models tested; see the [v5.2 diagnosis](results/benchmarks_v5_2/SPEC_DECODE_DIAGNOSIS.md) for current status). |
| `--draft-depth N` | Speculative decode depth K. |
| `--no-spec`, `--no-cache` | Disable flags, intended for benchmark controls. |
| `squish daemon install` / `uninstall` | macOS LaunchAgent integration. |
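
To make the block size concrete: a block-paged cache can typically reuse only whole blocks, so the reusable part of a shared prefix is rounded down to a multiple of the block size. The sketch below illustrates that arithmetic under this assumption. It is not Squish's implementation, and `reusable_prefix_tokens` is a hypothetical name.

```python
def reusable_prefix_tokens(cached_tokens, new_tokens, block_size=64):
    """Count leading tokens of new_tokens that whole cached blocks could cover.

    Assumption (not confirmed by the Squish docs): only complete blocks of
    block_size tokens are reused, so the shared prefix is rounded down.
    """
    shared = 0
    for old, new in zip(cached_tokens, new_tokens):
        if old != new:
            break
        shared += 1
    return (shared // block_size) * block_size


# A 200-token prompt cached earlier, then the same prompt plus 50 new tokens.
earlier = list(range(200))
later = earlier + list(range(1000, 1050))
print(reusable_prefix_tokens(earlier, later))
```

Here the two prompts share 200 tokens, of which 192 (three blocks of 64) fall on block boundaries. Under this assumption, a smaller block size wastes less of the shared prefix at the cost of more blocks to manage.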
Picking the right cache for your workload:
- Exact-prompt repeats (cached scripts, fixed templates, automated jobs): `--prompt-kv-cache` alone. ~9 ms TTFT on a cache hit.
- Shifting-prefix workloads (agents, multi-turn conversations): `--block-kv-cache` alone, or combined config.
- General use without knowing the workload: combined config (both caches enabled). Best end-to-end completion time across prompt sizes.
The combined config currently doesn't inherit PKV's fast-hit TTFT due to a lookup ordering issue documented in results/benchmarks_v5_1_1/DIAGNOSIS.md; reordering is tracked as a v5.2 follow-up.
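
The guidance above can be captured as a small lookup that assembles the `squish run` arguments for a given workload. This is only a sketch: `squish_run_args` and the workload labels are made up for illustration, while the flags and cache paths are the ones shown in this section. It builds and prints the command without running it.

```python
import os
import shlex

CACHE_ROOT = "~/.cache/squish"  # directory layout taken from the run example


def squish_run_args(model, workload="general", port=8080):
    """Assemble a `squish run` argument list for a workload type.

    Hypothetical helper: the workload names are invented for this sketch;
    the flags and paths are the documented ones.
    """
    root = os.path.expanduser(CACHE_ROOT)
    block = ["--block-kv-cache", os.path.join(root, "blocks")]
    prompt = ["--prompt-kv-cache", os.path.join(root, "pkv")]
    caches = {
        "exact-repeat": prompt,      # fixed templates, automated jobs
        "shifting-prefix": block,    # agents, multi-turn conversations
        "general": block + prompt,   # combined config
    }
    if workload not in caches:
        raise ValueError(f"unknown workload: {workload!r}")
    return ["squish", "run", model, *caches[workload], "--port", str(port)]


print(shlex.join(squish_run_args("qwen2.5-7b-int4", "exact-repeat")))
```

Returning a list rather than a string keeps the result safe to pass to `subprocess.run` without shell quoting concerns.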
---
export OLLAMA_HOST=http://localhost:8080
Install the macOS LaunchAgent so the daemon starts at login:
```bash
squish daemon install
```
The SquishBar menu-bar app (apps/macos/SquishBar/) ships alongside the daemon — model picker, load progress, and a global hotkey for the chat panel. Build it from Xcode or grab the signed .app from the GitHub release page.
---
High-performance local LLM server
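This tool's license is listed as NOASSERTION, which means no specific license was identified from the repository metadata. For commercial use, read the license terms carefully and seek legal advice where necessary.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
Validate the tool thoroughly in a sandbox or test environment before deploying it to production, and carry out the necessary security assessment.
📄 NOASSERTION: consult the original license terms for specific usage restrictions.
Overall, Squish is a good-quality AI tool that is reasonably competitive among comparable tools. AI Skill Hub will continue to track its updates. Consider bookmarking it and adopting it when it suits your own use case.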
| Field | Value |
|---|---|
| Original name | squish |
| Original description | Open-source AI tool: 🤖🗜️⚡️ Local LLM server for Apple Silicon. 5.4× faster end-to-end on long conte. ⭐7 · Python |
| Topics | apple-silicon, inference-engine, int4, kv-cache, llama-cpp-alternative |
| GitHub | https://github.com/konjoai/squish |
| License | NOASSERTION |
| Language | Python |
Listed: 2026-06-03 · Updated: 2026-06-08 · License: NOASSERTION · AI Skill Hub does not legally endorse the accuracy of third-party content.