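After a curated evaluation by AI Skill Hub, AI Workflow (hud-python) is rated "Strongly Recommended". This agent workflow stands out for feature completeness, community activity, and ease of use. It has an AI score of 8.0 and suits users with some technical background.
RL environments + evals for AI agents. Define once, train anything.
AI Workflow is a complete automation workflow solution for AI agents. Through visual node orchestration, it breaks complex multi-step tasks into clear automated flows that run fully unattended. It integrates with hundreds of external services and APIs, and suits building data-processing pipelines, business automation, and AI-assisted decision systems.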
```bash
# Option 1: install with pip (recommended)
pip install hud-python
# Option 2: install in a virtual environment (recommended for production)
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate
pip install hud-python
# Option 3: install from source (to get the latest features)
git clone https://github.com/hud-evals/hud-python
cd hud-python
pip install -e .
```
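After any of the installation options above, the result can also be checked from Python. The following is a minimal sketch using only the standard library. It assumes the distribution name is `hud-python` (as in the `pip install` commands) and the import name is `hud` (as in the README's `from hud import Environment`); `installed_version` and `can_import` are hypothetical helper names, not part of the package.

```python
from __future__ import annotations

from importlib import metadata, util


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def can_import(module_name: str) -> bool:
    """Report whether a top-level module can be located without importing it."""
    return util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumed names: distribution "hud-python", import package "hud".
    print("hud-python version:", installed_version("hud-python"))
    print("hud importable:", can_import("hud"))
```

A `None` version or a `False` import check suggests the package was installed into a different interpreter or virtual environment than the one running the script.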
```bash
# Verify the installation
python -c "import hud; print('Installation successful')"
# Command-line usage
hud --help
# Basic usage
hud eval tasks.py claude --group 3
```
```python
# Calling from Python code
from hud import Environment
# Example
env = Environment(name="letter-count")
```
```yaml
# hud-python configuration file example (config.yml)
app:
  name: "hud-python"
  debug: false
  log_level: "INFO"
```
```bash
# Specify the configuration file at run time
hud-python --config config.yml
# Or configure through environment variables
export HUD_API_KEY="your-key"
export HUD_PYTHON_OUTPUT_DIR="./output"
```
HUD is a platform for building RL environments for AI agents, across coding, browser, computer-use, and robotics. Define an environment, write tasks, and run them as evals and training across any model, at any scale.
To learn more, see the documentation and environment reference.
A capability is a connection the environment exposes; a harness attaches its own tools to it. The same environment serves a one-shot Q&A or a full computer-use rollout, depending on which capabilities the harness opens.
| Protocol | What it exposes |
|---|---|
| **ssh** | Shell + files in a sandboxed workspace (`env.workspace(root)`) |
| **mcp** | Tools over the Model Context Protocol |
| **cdp** | Browser control over the Chrome DevTools Protocol |
| **rfb** | Full computer-use over VNC: screen + keyboard/mouse |
| **robot** *(beta)* | Schema-driven robot observation/action loop over WebSocket |
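One way to picture the capability model in the table above is as a lookup from protocol to what it exposes, with the run shape decided by which protocols a harness opens. The sketch below is illustrative only and does not use the HUD SDK: `Protocol`, `EXPOSES`, and `HarnessPlan` are hypothetical names invented for this example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Protocol(str, Enum):
    """Protocol names taken from the table; the class itself is hypothetical."""
    SSH = "ssh"
    MCP = "mcp"
    CDP = "cdp"
    RFB = "rfb"
    ROBOT = "robot"  # listed as beta in the table


# What each protocol exposes, paraphrased from the table.
EXPOSES = {
    Protocol.SSH: "shell + files in a sandboxed workspace",
    Protocol.MCP: "tools over the Model Context Protocol",
    Protocol.CDP: "browser control over the Chrome DevTools Protocol",
    Protocol.RFB: "full computer-use over VNC",
    Protocol.ROBOT: "robot observation/action loop over WebSocket",
}


@dataclass(frozen=True)
class HarnessPlan:
    """Hypothetical record of which capabilities a harness chooses to open."""
    opened: tuple[Protocol, ...] = ()

    def describe(self) -> str:
        # With nothing opened, the run is just a prompt and a grader.
        if not self.opened:
            return "one-shot Q&A (no capabilities opened)"
        return "; ".join(f"{p.value}: {EXPOSES[p]}" for p in self.opened)


print(HarnessPlan().describe())
print(HarnessPlan((Protocol.CDP, Protocol.RFB)).describe())
```

The point of the sketch is that the environment definition stays fixed while the harness decides how much of it to use.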
Ships natively: Claude, OpenAI (Responses), OpenAI-compatible endpoints, and Gemini via create_agent("claude-sonnet-4-5") (or gpt-…, gemini-…). The harness wires capability-backed tools for the model you choose at run time.
Bring your own: a harness attaches to a capability and defines a tool spec — wrap browser-use on cdp, a VLA policy on robot, or your own agent on ssh / mcp. No protocol work required.
```bash
uv tool install hud-python --python 3.12
```

Then scaffold your first environment:

```bash
hud init my-env
```

From the platform UI you can run batches, compare models on the same taskset, and inspect every trace.

A template is an async generator registered with `@env.template()`: yield a prompt, receive the agent's answer, yield a reward. Calling the template mints a runnable Task; one function spans a whole dataset of variants. The simplest needs no capabilities, just a prompt and a grader:

```python
from hud import Environment

env = Environment(name="letter-count")

@env.template()
async def count_letter(word: str = "strawberry", letter: str = "r"):
    answer = yield f"How many '{letter}'s are in '{word}'? Reply with just the number."
    yield 1.0 if answer and str(word.count(letter)) in answer else 0.0

tasks = [count_letter(word=w) for w in ("strawberry", "raspberry", "blueberry")]
```

Run it immediately against any model:

```bash
hud eval tasks.py claude --group 3
```

Each graded evaluation is a trace (the SDK's live handle is a Run). With `HUD_API_KEY` set, every rollout is recorded on hud.ai. Tasks that need a shell, browser, GUI, or robot declare capabilities (below); everything else (variants, grading, batching) stays identical.
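The template protocol described above (yield a prompt, receive the agent's answer, yield a reward) can be illustrated with a plain async generator stepped by hand. This is a sketch that deliberately avoids the HUD SDK: `drive` and `always_three` are hypothetical helpers, and the real runner may step templates differently.

```python
import asyncio


async def count_letter(word: str = "strawberry", letter: str = "r"):
    # First yield hands out the prompt; the value sent back in is the answer.
    answer = yield f"How many '{letter}'s are in '{word}'? Reply with just the number."
    # Second yield hands out the reward.
    yield 1.0 if answer and str(word.count(letter)) in answer else 0.0


async def drive(template_gen, answer_fn):
    """Hypothetical driver: fetch the prompt, send an answer, collect the reward."""
    prompt = await template_gen.__anext__()
    reward = await template_gen.asend(answer_fn(prompt))
    await template_gen.aclose()
    return prompt, reward


def always_three(prompt: str) -> str:
    """Stand-in for a model: always answers "3"."""
    return "3"


async def main():
    words = ("strawberry", "raspberry", "blueberry")
    results = [await drive(count_letter(word=w), always_three) for w in words]
    return [reward for _, reward in results]


rewards = asyncio.run(main())
print(rewards)  # [1.0, 1.0, 0.0]
```

The stand-in agent is right for "strawberry" and "raspberry" (three r's each) and wrong for "blueberry" (two), which is why the last reward is 0.0. Note that the grader uses a substring check, so an answer such as "13" would also pass for a count of 3.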

A built image is the end product for your tasks: one build packs every task from a single definition. The recommended path is hud deploy, which builds and registers your environment on HUD in one step; then sync a taskset and run remotely:
```bash
hud deploy
hud sync tasks my-taskset
hud eval my-taskset --remote
```
For local iteration, the same protocol works against a container on your laptop:
```bash
docker build -f Dockerfile.hud -t my-env .
docker run -d --name run1 -p 8765:8765 my-env
hud task start fix_bug --url tcp://127.0.0.1:8765
hud task grade fix_bug --url tcp://127.0.0.1:8765 --answer "..."
docker rm -f run1
```
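In the local loop above, the container may need a moment after `docker run` before port 8765 accepts connections. The following is a minimal standard-library sketch for waiting on that port before starting a task. `wait_for_port` is a hypothetical helper, not part of the `hud` CLI or SDK, and it only checks TCP reachability, not that the environment is fully initialized.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.2) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port is not accepting yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, calling `wait_for_port("127.0.0.1", 8765)` between the `docker run` and `hud task start` steps returns `True` once the mapped port is reachable and `False` if it never opens within the timeout.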
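hud-python is an open-source AI workflow that provides RL environments and evaluation tools for defining and training AI agents. Its design is simple and easy to use, and it suits training a wide range of AI models.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute a legal endorsement of the tool's functionality or quality.
Validate thoroughly in a sandbox or test environment before deploying to production, and carry out the necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. Commercial use, modification, and distribution are all permitted, provided the copyright notice is retained.
AI Skill Hub review: AI Workflow has complete core functionality and excellent quality. For automation engineers and operations staff, it is worth adding to a personal toolkit. Try it in a non-production environment first, then roll it out gradually.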
| Field | Value |
|---|---|
| Original name | hud-python |
| Original description | Open-source AI workflow: RL environments + evals for AI agents. Define once, train anything. ⭐271 · Python |
| Topics | workflow, agents, evals, python |
| GitHub | https://github.com/hud-evals/hud-python |
| License | MIT |
| Language | Python |
Listed: 2026-07-02 · Updated: 2026-07-02 · License: MIT · AI Skill Hub does not legally endorse the accuracy of third-party content.