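gpustack Agent Workflow is one of AI Skill Hub's featured AI tools for this issue. It has 5.0k GitHub stars and an overall score of 8.2, which indicates high overall quality. We strongly recommend adding it to your AI toolkit to improve your productivity.
gpustack Agent Workflow is an open-source tool written in Python that focuses on GPU management, distributed inference, and inference engines. As an open-source GitHub project, it has active community support and continuous releases, its code is fully transparent and auditable, and it supports local deployment to protect data privacy. It can be used on its own or integrated into enterprise workflows.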
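# Option 1: install with pip (recommended)
pip install gpustack
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install gpustack
# Option 3: install from source (to get the latest features)
git clone https://github.com/gpustack/gpustack
cd gpustack
pip install -e .
# Verify the installation
python -c "import gpustack; print('Installation successful')"
# Command-line usage
gpustack --help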
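# Basic usage (generic template form; GPUStack runs as a server, so check `gpustack --help` for the actual subcommands)
gpustack input_file -o output_file
# Calling from Python code (placeholder from a generic template; the interface documented below is the web UI and the HTTP API)
import gpustack
# Example
result = gpustack.process("input")
print(result)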
# gpustack configuration file example (config.yml)
app:
  name: "gpustack"
  debug: false
  log_level: "INFO"
# Specify the configuration file at runtime
gpustack --config config.yml
# Or configure through environment variables
export GPUSTACK_API_KEY="your-key"
export GPUSTACK_OUTPUT_DIR="./output"
<br>
<p align="center"> <img alt="GPUStack" src="https://raw.githubusercontent.com/gpustack/gpustack/main/docs/assets/gpustack-logo.png" width="300px"/> </p> <br>
<p align="center"> <a href="https://docs.gpustack.ai" target="_blank"> <img alt="Documentation" src="https://img.shields.io/badge/Docs-GPUStack-blue?logo=readthedocs&logoColor=white"></a> <a href="./LICENSE" target="_blank"> <img alt="License" src="https://img.shields.io/github/license/gpustack/gpustack?logo=github&logoColor=white&label=License&color=blue"></a> <a href="https://discord.gg/VXYJzuaqwD" target="_blank"> <img alt="Discord" src="https://img.shields.io/badge/Discord-GPUStack-blue?logo=discord&logoColor=white"></a> <a href="https://twitter.com/intent/follow?screen_name=gpustack_ai" target="_blank"> <img alt="Follow on X(Twitter)" src="https://img.shields.io/twitter/follow/gpustack_ai?logo=X"></a> </p> <br>
<br>
GPUStack is an open-source GPU cluster manager for AI model serving and GPU instance provisioning. It configures and orchestrates inference engines (vLLM, SGLang, TensorRT-LLM, or your own) and lets you launch SSH-accessible GPU instances on demand. Its core features include:

- Multi-Cluster GPU Management. Manages GPU clusters across multiple environments, including on-premises servers, Kubernetes clusters, and cloud providers.
- Pluggable Inference Engines. Automatically configures high-performance inference engines such as vLLM, SGLang, and TensorRT-LLM. You can also add custom inference engines as needed.
- Day 0 Model Support. GPUStack's pluggable engine architecture enables you to deploy new models on the day they are released.
- Performance-Optimized Configurations. Offers pre-tuned modes for low latency or high throughput. GPUStack supports extended KV cache systems like LMCache and HiCache to reduce time to first token (TTFT). It also includes built-in support for speculative decoding methods such as EAGLE3, MTP, and N-grams.
- GPU Instances. Launches SSH-accessible GPU instances on demand for development, fine-tuning, and interactive workloads.
- Enterprise-Grade Operations. Offers support for automated failure recovery, load balancing, monitoring, authentication, and access control.
Run the following command to install and start the GPUStack server using Docker:
sudo docker run -d --name gpustack \
--restart unless-stopped \
-p 80:80 \
--volume gpustack-data:/var/lib/gpustack \
gpustack/gpustack
<details> <summary>Alternative: Use Quay Container Registry Mirror</summary>
If you cannot pull images from Docker Hub or the download is very slow, you can use our Quay.io mirror by pointing your registry to quay.io:
sudo docker run -d --name gpustack \
--restart unless-stopped \
-p 80:80 \
--volume gpustack-data:/var/lib/gpustack \
quay.io/gpustack/gpustack \
--system-default-container-registry quay.io
</details>
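The two `docker run` variants above differ only in the image name and the trailing registry flag. For scripted setups, the following is a minimal Python sketch that assembles either command as an argument list. The function name `build_docker_run_command` and the `host_port` parameter are hypothetical helpers introduced here for illustration; the flags themselves are copied from the commands above.

```python
import shlex

DOCKER_HUB_IMAGE = "gpustack/gpustack"
QUAY_IMAGE = "quay.io/gpustack/gpustack"


def build_docker_run_command(use_quay_mirror: bool = False, host_port: int = 80) -> list[str]:
    """Return the argv for starting the GPUStack server container.

    host_port is a hypothetical convenience: it changes only the host side
    of the -p mapping; the container side stays at 80 as in the README.
    """
    cmd = [
        "sudo", "docker", "run", "-d",
        "--name", "gpustack",
        "--restart", "unless-stopped",
        "-p", f"{host_port}:80",
        "--volume", "gpustack-data:/var/lib/gpustack",
    ]
    if use_quay_mirror:
        # Arguments placed after the image name are passed to the container's entrypoint.
        cmd += [QUAY_IMAGE, "--system-default-container-registry", "quay.io"]
    else:
        cmd.append(DOCKER_HUB_IMAGE)
    return cmd


if __name__ == "__main__":
    # Print both variants as copy-pasteable shell commands.
    print(shlex.join(build_docker_run_command()))
    print(shlex.join(build_docker_run_command(use_quay_mirror=True)))
```

Building the command as a list rather than a single string avoids shell-quoting mistakes if it is later passed to `subprocess.run`.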
Check the GPUStack startup logs:
sudo docker logs -f gpustack
After GPUStack starts, run the following command to get the default admin password:
sudo docker exec gpustack cat /var/lib/gpustack/initial_admin_password
Open your browser and navigate to http://your_host_ip to access the GPUStack UI. Use the default username admin and the password you retrieved above to log in.
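If you automate the first login, the password lookup above can be wrapped in a small helper. This is a sketch, assuming the container is named `gpustack` as in the `docker run` command; `get_initial_admin_password` and its injectable `run` parameter are hypothetical names chosen for illustration, not part of GPUStack.

```python
import subprocess
from typing import Callable, Sequence

# Same command as shown above, split into an argument list.
PASSWORD_CMD = [
    "sudo", "docker", "exec", "gpustack",
    "cat", "/var/lib/gpustack/initial_admin_password",
]


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def get_initial_admin_password(run: Callable[[Sequence[str]], str] = _run) -> str:
    """Return the initial admin password with surrounding whitespace removed."""
    return run(PASSWORD_CMD).strip()
```

The `run` parameter exists so the helper can be exercised without Docker, for example by passing a function that returns a fixed string.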
Open the Catalog page in the GPUStack UI.
Select the Qwen3 0.6B model from the list of available models.
Click the Save button to deploy the model.
When the deployment status shows Running, the model has been deployed successfully.
Open Playground - Chat in the navigation menu and check that the model qwen3-0.6b is selected in the top-right Model dropdown. You can now chat with the model in the UI playground.
Go to the API Keys page, then click the New API Key button.
Enter a Name and click the Save button.
Use the generated API key to call the chat completions endpoint:
```bash
export GPUSTACK_API_KEY=your_api_key
curl http://your_gpustack_server_url/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $GPUSTACK_API_KEY" \
  -d '{
    "model": "qwen3-0.6b",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Tell me a joke."
      }
    ],
    "stream": true
  }'
```
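The same request can be issued from Python using only the standard library. The sketch below mirrors the curl call (same path, headers, and JSON body). It assumes the streamed response follows the OpenAI-style server-sent-events format, with `data: {...}` lines terminated by `data: [DONE]`; that format is an assumption here, not something this page confirms. The function names are hypothetical.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def build_chat_request(base_url: str, api_key: str, prompt: str,
                       model: str = "qwen3-0.6b") -> urllib.request.Request:
    """Build the POST request equivalent to the curl example."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "stream": True,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text deltas from SSE lines (assumed OpenAI-style chunk format)."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


def stream_chat(base_url: str, api_key: str, prompt: str) -> Iterator[str]:
    """Send the request and yield the reply piece by piece (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(base_url, api_key, prompt)) as resp:
        yield from iter_stream_text(resp)
```

For example, `for piece in stream_chat("http://your_gpustack_server_url", api_key, "Tell me a joke."): print(piece, end="")` would print the reply as it arrives. Separating request construction and stream parsing from the network call keeps both parts checkable without a server.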
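gpustack is a professional solution for managing inference on GPU clusters, with high code quality, active maintenance, and a sound architecture. It suits enterprises and research institutions that need to deploy inference services at scale, and it stands out among comparable tools.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and completing any necessary security assessment.
✅ Apache 2.0: a permissive open-source license that allows commercial use, requires retaining the copyright notice and NOTICE file, and includes a patent grant.
Overall, gpustack Agent Workflow performs solidly among AI tools and is of excellent quality. If you already have a clear use case, you can start using it right away; if you are still evaluating, compare it with similar tools before deciding.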
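| Field | Value |
| --- | --- |
| Original name | gpustack |
| Original description | Open-source AI workflow: A GPU cluster manager that configures and orchestrates inference engines like vL. ⭐5.0k · Python |
| Topics | GPU management, distributed inference, inference engines, cluster orchestration, CUDA |
| GitHub | https://github.com/gpustack/gpustack |
| License | Apache-2.0 |
| Language | Python |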
Listed: 2026-05-18 · Updated: 2026-05-19 · License: Apache-2.0