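After a curated evaluation by AI Skill Hub, the PestPHP AI evaluation plugin is rated "Recommended". This Agent workflow scores well on feature completeness, community activity, and ease of use, with an AI score of 7.5, and suits users with some technical background.
A Laravel AI SDK evaluation plugin built on PestPHP that uses an LLM as the judge and supports semantic evaluation.
The PestPHP AI evaluation plugin is a complete AI Agent automation workflow solution. Through visual node orchestration, it breaks complex multi-step tasks into clear automated flows that run fully unattended. It supports integration with hundreds of external services and APIs, and suits data processing pipelines, business automation, and AI-assisted decision systems.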

    # Clone the repository
    git clone https://github.com/shipfastlabs/pest-plugin-evals
    cd pest-plugin-evals
    # Read the installation instructions
    cat README.md
    # Once the dependencies described in the README are installed, the plugin is ready to use

    # Show help
    pest-plugin-evals --help
    # Basic run
    pest-plugin-evals [options] <input>
    # For detailed usage, see the documentation
    # https://github.com/shipfastlabs/pest-plugin-evals

    # pest-plugin-evals configuration
    # Show the configuration options
    pest-plugin-evals --config-example > config.yml
    # Common configuration keys
    # output_dir: ./output
    # log_level: info
    # workers: 4
    # Environment variable (overrides the config file)
    export PEST_PLUGIN_EVALS_CONFIG="/path/to/config.yml"

<p align="center"> <img src="docs/og.png" height="300" alt="Pest Plugin" /> <p align="center"> <a href="https://github.com/shipfastlabs/pest-plugin-evals/actions"><img alt="GitHub Workflow Status (master)" src="https://github.com/shipfastlabs/pest-plugin-evals/actions/workflows/tests.yml/badge.svg"></a> <a href="https://packagist.org/packages/shipfastlabs/pest-plugin-evals"><img alt="Total Downloads" src="https://img.shields.io/packagist/dt/shipfastlabs/pest-plugin-evals"></a> <a href="https://packagist.org/packages/shipfastlabs/pest-plugin-evals"><img alt="Latest Version" src="https://img.shields.io/packagist/v/shipfastlabs/pest-plugin-evals"></a> <a href="https://packagist.org/packages/shipfastlabs/pest-plugin-evals"><img alt="License" src="https://img.shields.io/packagist/l/shipfastlabs/pest-plugin-evals"></a> </p> </p>
------

    it('evaluates a pre-configured agent', function () {
        $agent = new RefundAgent($user);

        expectAgent($agent, 'Can I return a damaged laptop?')
            ->toContain('refund')
            ->toPassJudge('Response explains the refund policy clearly');
    });

You can also use Laravel's ::make() method:

    it('evaluates agent created with make()', function () {
        expectAgent(RefundAgent::make(user: $user), 'Can I return a damaged laptop?')
            ->toContain('refund');
    });

composer require shipfastlabs/pest-plugin-evals --dev
Publish the config (optional):
php artisan vendor:publish --tag=eval-config

    use function ShipFastLabs\PestEval\expectAgent;

    it('answers refund questions accurately', function () {
        expectAgent(RefundAgent::class, 'Can I return a damaged laptop?')
            ->toContain('refund')
            ->toContain('return')
            ->toPassJudge('Response explains the refund policy clearly')
            ->toBeRelevant(0.8);
    });

Run your evals:
pest --eval
Eval tests are excluded from normal test runs automatically. Place your eval tests in tests/Evals/ — when you run pest without --eval, the plugin excludes that directory so evals never pollute your regular test suite.
pest --eval targets the tests/Evals directory. If it does not exist, it falls back to --group=eval.
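The directory-or-group fallback described above can be pictured with a small plain-PHP sketch. The function name `resolveEvalArguments` is hypothetical and this is not the plugin's own code; it only mirrors the stated rule (use `tests/Evals` when it exists, otherwise `--group=eval`).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the plugin): pick the arguments an eval
 * run would target, following the rule described in the text.
 *
 * @return list<string>
 */
function resolveEvalArguments(string $projectRoot): array
{
    $evalDir = $projectRoot . '/tests/Evals';

    clearstatcache(); // avoid stale results if the directory changed during this process

    // Prefer the dedicated eval directory when it exists.
    if (is_dir($evalDir)) {
        return [$evalDir];
    }

    // Otherwise fall back to selecting tests by group.
    return ['--group=eval'];
}

// Usage: build a throwaway project layout and resolve both cases.
$root = sys_get_temp_dir() . '/eval-demo-' . uniqid();
mkdir($root . '/tests/Evals', 0777, true);
$withDir = resolveEvalArguments($root);

rmdir($root . '/tests/Evals');
$withoutDir = resolveEvalArguments($root);

// Clean up the temporary directories.
rmdir($root . '/tests');
rmdir($root);

echo implode(' ', $withDir), PHP_EOL;
echo implode(' ', $withoutDir), PHP_EOL; // prints "--group=eval"
```

In practice this means tagging tests with the `eval` group is only needed when the project does not keep its evals under `tests/Evals`.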

    // config/eval.php
    return [
        'ai' => [
            'scoring' => [
                'provider' => env('EVAL_SCORING_PROVIDER', 'openai'),
                'model' => env('EVAL_SCORING_MODEL', 'gpt-4.1-mini'),
            ],
            'embedding' => [
                'provider' => env('EVAL_EMBEDDING_PROVIDER', 'openai'),
                'model' => env('EVAL_EMBEDDING_MODEL', 'text-embedding-3-small'),
            ],
        ],
    ];

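Each key in the config above falls back to its default when the matching environment variable is unset. The sketch below shows that resolution outside Laravel; `envOr` is a hypothetical stand-in for Laravel's `env()` helper, and `example-model` is a placeholder value, not a real model name.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for Laravel's env() helper so the sketch runs
// without the framework: return the variable's value, or the default.
function envOr(string $key, string $default): string
{
    $value = getenv($key);

    return $value === false ? $default : $value;
}

putenv('EVAL_SCORING_PROVIDER');            // make sure the variable is unset
putenv('EVAL_SCORING_MODEL=example-model'); // placeholder override

$scoring = [
    'provider' => envOr('EVAL_SCORING_PROVIDER', 'openai'),
    'model' => envOr('EVAL_SCORING_MODEL', 'gpt-4.1-mini'),
];

print_r($scoring); // provider stays "openai", model becomes "example-model"
```

Overriding through the environment lets a CI job use a different judge model without editing the published config file.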

    it('eval pipeline works with faked responses', function () {
        expectAgent(
            RefundAgent::class,
            'What is your return policy?',
            fake: ['Our return policy allows returns within 30 days.'],
        )->toContain('30 days')
            ->toMatch('/\d+ days/');
    });

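Because the response is faked, the two matchers in the test above are fully deterministic. A rough plain-PHP equivalent of what they check is sketched here, assuming `toContain` is a case-sensitive substring check and `toMatch` a regular-expression match; the plugin's actual matching rules may differ.

```php
<?php

declare(strict_types=1);

// The faked response from the example above; no agent or LLM call is involved.
$fakedResponse = 'Our return policy allows returns within 30 days.';

// Assumed equivalent of ->toContain('30 days'): a substring check.
$containsPhrase = str_contains($fakedResponse, '30 days');

// Assumed equivalent of ->toMatch('/\d+ days/'): a regular-expression match.
$matchesPattern = preg_match('/\d+ days/', $fakedResponse) === 1;

var_dump($containsPhrase); // bool(true)
var_dump($matchesPattern); // bool(true)
```

Faked runs like this are useful for checking the eval pipeline itself without spending API calls.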

    it('answers factually', function () {
        expectAgent(CapitalCityAgent::class, 'What is the capital of Japan?')
            ->toBeFactual(expected: 'Tokyo');
    });

| Expectation | Description | Scorer used |
|---|---|---|
| `->toBeRelevant(0.7)` | Checks if response is on-topic | Relevance |
| `->toBeSafe(0.7)` | Evaluates for harmful content | Safety |
| `->toBeFactual(0.7, expected: '...')` | Fact-checks against reference | Factuality |
| `->toPassJudge('criteria', 0.7)` | Custom LLM evaluation | LlmJudge |
| `->toBeSimilar('ref', 0.7)` | Embedding cosine similarity | SemanticSimilarity |
| `->toHaveToolCalls([...])` | Validates tool calls/arguments | ToolCallMatch |
| `->toFollowTrajectory([...])` | Validates tool call sequence | AgentTrajectory |
| `->toPassScorer($scorer, 0.7)` | Use any custom Scorer instance | Any |
All thresholds default to 0.7 and represent the minimum score (0.0-1.0) required to pass.
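To make the threshold rule concrete, here is a minimal sketch of cosine similarity plus a minimum-score check. The functions `cosineSimilarity` and `passesThreshold` are hypothetical, the vectors are toy values rather than real embeddings, and the inclusive `>=` comparison is an assumption drawn from the phrase "minimum score"; none of this is the plugin's own scorer code.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity of two non-empty, equal-length numeric vectors.
 *
 * @param list<float|int> $a
 * @param list<float|int> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    $a = array_values($a);
    $b = array_values($b);

    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    // A zero vector has no direction; treat it as having no similarity.
    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0;
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

// Assumed pass rule: the score must reach the threshold (inclusive).
function passesThreshold(float $score, float $threshold = 0.7): bool
{
    return $score >= $threshold;
}

$score = cosineSimilarity([1.0, 0.0], [1.0, 1.0]);

echo round($score, 4), PHP_EOL;     // 0.7071
var_dump(passesThreshold($score));  // bool(true)
var_dump(passesThreshold(0.0));     // bool(false)
```

Raising the threshold, as in `->toBeRelevant(0.8)`, simply makes the same comparison stricter.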

    expectAgent(
        string|Closure|Agent $agent,  // Agent class name, closure, or instance
        string $prompt,               // The input prompt
        array $fake = [],             // Fake responses (bypasses agent execution)
        array $attachments = [],      // Files to pass to the agent (Document, Image)
    ): Expectation

    // Chain ->repeat(N) for multiple runs:
    ->repeat(5) // Run agent 5 times, all assertions checked on every output

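The "all assertions checked on every output" behaviour of `->repeat(N)` can be sketched in plain PHP as follows. The helper `allRunsPass` and the canned responses are hypothetical; the sketch only illustrates that a single failing run fails the whole evaluation, and says nothing about how the plugin reports such failures.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: call a producer $times times and require every
 * check to hold on every output.
 *
 * @param callable(): string             $produce
 * @param list<callable(string): bool>   $checks
 */
function allRunsPass(callable $produce, array $checks, int $times): bool
{
    for ($run = 0; $run < $times; $run++) {
        $output = $produce();

        foreach ($checks as $check) {
            if (! $check($output)) {
                return false; // one failing run fails the whole evaluation
            }
        }
    }

    return true;
}

// Canned outputs standing in for a non-deterministic agent.
$responses = ['Refunds are available.', 'You can get a refund.', 'Please contact support.'];
$index = 0;
$produce = function () use ($responses, &$index): string {
    return $responses[$index++ % count($responses)];
};

// Case-insensitive check that the output mentions refunds.
$mentionsRefund = fn (string $output): bool => stripos($output, 'refund') !== false;

$twoRuns = allRunsPass($produce, [$mentionsRefund], 2);

$index = 0; // restart the canned sequence
$threeRuns = allRunsPass($produce, [$mentionsRefund], 3);

var_dump($twoRuns);   // bool(true)
var_dump($threeRuns); // bool(false): the third output never mentions refunds
```

Repeating runs this way surfaces flaky agent behaviour that a single passing run would hide.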
A PestPHP plugin for evaluating Laravel AI SDK agents. Build evals with LLM-as-judge, semantic similarity, and deterministic matchers — all with a native Pest expect() API.
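The plugin is a PestPHP-based evaluation tool for the Laravel AI SDK. It supports semantic evaluation and LLM-based evaluation, and is suited to assessing the performance of AI agents.
AI Skill Hub is a third-party content aggregation platform. The information on this page is compiled from public data and does not constitute any legal endorsement of the tool's functionality or quality.
We recommend validating the tool thoroughly in a sandbox or test environment before deploying it to production, and carrying out any necessary security assessment.
✅ MIT License: one of the most permissive open-source licenses. It allows free commercial use, modification, and distribution, provided the copyright notice is retained.
AI Skill Hub review: the core functionality of the PestPHP AI evaluation plugin is complete and the quality is good. For automation engineers and operations staff, it is worth adding to a personal toolbox. Try it in a non-production environment first, then roll it out gradually.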
| Original name | pest-plugin-evals |
| Original description | Open-source AI workflow: A PestPHP plugin for evaluating Laravel AI SDK agents with LLM-as-judge, semanti. ⭐11 · PHP |
| Topics | workflow, ai-sdk, evals, laravel, pest, test, php |
| GitHub | https://github.com/shipfastlabs/pest-plugin-evals |
| License | MIT |
| Language | PHP |
Listed: 2026-05-24 · Updated: 2026-05-30 · License: MIT · AI Skill Hub does not legally vouch for the accuracy of third-party content.