OrchestKit Eval Plan
v7.6.0
89 skills + 31 agents
$0 on Max 20x
Presets
Recommended: skip baseline, Sonnet, 8 days
Aggressive: all at once, 3 days, tight on limits
Conservative: with baseline, reps=3, 12 days
Skills Only: skip agents, 5 days
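As a rough illustration, the four presets could be written down as data. This is only a sketch: the `Preset` class, its field names, and the `overrides` helper are hypothetical and not part of OrchestKit. A field is left as `None` wherever the preset description does not state a value.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Preset:
    """Hypothetical container; None means the preset text does not say."""
    est_days: int
    skip_baseline: Optional[bool] = None
    include_agents: Optional[bool] = None
    model: Optional[str] = None
    grading_reps: Optional[int] = None


# Only values that appear in the preset descriptions are filled in.
PRESETS = {
    "recommended": Preset(est_days=8, skip_baseline=True, model="sonnet"),
    "aggressive": Preset(est_days=3),
    "conservative": Preset(est_days=12, skip_baseline=False, grading_reps=3),
    "skills_only": Preset(est_days=5, include_agents=False),
}


def overrides(name: str) -> dict:
    """Return only the settings a preset explicitly states (not the day estimate)."""
    fields = asdict(PRESETS[name])
    return {k: v for k, v in fields.items() if v is not None and k != "est_days"}
```

For example, `overrides("conservative")` yields the baseline and grading-reps settings and nothing else, so any setting the preset leaves unstated keeps its form value.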
Generation
Model: sonnet
Options: Sonnet (faster, higher rate limit), Opus (slower, more capable), Haiku (fastest, cheapest)
Max turns (skills): 10
Max turns (agents): 15
Timeout per call (sec): 180
Coverage
Skip baseline (saves ~50%)
Include agents (31)
Include background skills (67)
Re-run NEUTRAL with baseline
Grading
Grading reps: 1
Cases per bg skill: 3
Cases per agent: 5
Rate Limits
Msgs per 5h window: 900
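To get a feel for how these settings interact with the rate limit, a worst-case message count can be estimated. The sketch below rests on assumptions the plan does not state: every turn costs one message, every case uses its full turn budget, running the baseline doubles generation traffic (consistent with "saves ~50%"), grading calls are not counted, and only the 67 background skills and 31 agents are included. The function names are hypothetical, and the result is an upper bound on generation traffic, not a reproduction of the day estimates in the presets.

```python
import math


def worst_case_messages(
    n_bg_skills: int = 67,
    cases_per_bg_skill: int = 3,
    max_turns_skill: int = 10,
    n_agents: int = 31,
    cases_per_agent: int = 5,
    max_turns_agent: int = 15,
    skip_baseline: bool = True,
    include_agents: bool = True,
) -> int:
    """Upper bound on generation messages, assuming one message per turn."""
    skill_msgs = n_bg_skills * cases_per_bg_skill * max_turns_skill
    agent_msgs = n_agents * cases_per_agent * max_turns_agent if include_agents else 0
    total = skill_msgs + agent_msgs
    # Assumption: a baseline run repeats every case once without the skill/agent.
    return total if skip_baseline else total * 2


def windows_needed(messages: int, msgs_per_window: int = 900) -> int:
    """Number of 5-hour windows needed to send `messages` at the given limit."""
    return math.ceil(messages / msgs_per_window)


if __name__ == "__main__":
    msgs = worst_case_messages()
    print(msgs, windows_needed(msgs))
```

With the form values shown here and the baseline skipped, this gives 4335 messages across 5 windows. Including the baseline doubles that to 8670 messages across 10 windows, and dropping agents leaves 2010 messages across 3 windows.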