Generated from tools/hero_benchmark.data.json. fak — 50-turn × 5-agent fleet serving time: 19.0 minutes (lower is better).
Log scale (minutes) so all three bars are visible — the gap is the point. Both arms run fully live on the identical Q8 forward pass, bit-identical tokens — baseline = the tuned warm per-agent KV cache (the SGLang / vLLM floor).