Many agents · many turns · long context · and a different-sized tool result every turn. The honest baseline is a warm per-agent KV cache — each agent already reuses its own KV across turns (what vLLM / SGLang / provider prompt-caching give you). fak's win is the step that cache can't take: cross-agent prefix sharing — one copy of the shared setup serves the whole fleet. Pick an agentic shape and watch the gap.
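The gap between the two arms can be sketched as simple token accounting. This is a minimal illustration, not fak's implementation: the `Fleet` type, its field names, and the sample numbers are hypothetical, and it assumes that prefill tokens are the cost being compared and that every agent sees the same sequence of tool-result sizes.

```go
package main

import "fmt"

// Fleet describes a hypothetical workload. Every field and value here is an
// illustrative assumption, not something taken from fak.
type Fleet struct {
	Agents       int   // number of agents that start from the same setup
	SharedPrefix int   // tokens in the shared setup (system prompt, tool schemas)
	TurnTokens   []int // new tokens each agent appends per turn (sizes vary)
}

// perAgentNew returns the tokens a single agent adds across all of its turns.
func (f Fleet) perAgentNew() int {
	total := 0
	for _, t := range f.TurnTokens {
		total += t
	}
	return total
}

// WarmPerAgent counts prefill tokens when each agent keeps its own KV cache:
// every agent prefills the shared prefix once, then only its new tokens.
func (f Fleet) WarmPerAgent() int {
	return f.Agents * (f.SharedPrefix + f.perAgentNew())
}

// CrossAgentShared counts prefill tokens when the shared prefix is prefilled
// once for the whole fleet and each agent adds only its own new tokens.
func (f Fleet) CrossAgentShared() int {
	return f.SharedPrefix + f.Agents*f.perAgentNew()
}

func main() {
	f := Fleet{Agents: 8, SharedPrefix: 4000, TurnTokens: []int{120, 900, 60, 2400}}
	fmt.Println("warm per-agent:    ", f.WarmPerAgent())
	fmt.Println("cross-agent shared:", f.CrossAgentShared())
}
```

Under these assumptions the difference is always `(Agents - 1) * SharedPrefix`, so the saving grows with fleet size and with the length of the shared setup, and is independent of how large the per-turn tool results are.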
① Pick an agentic shape
exact · timing-free · model-independent
② How each agent's context assembles
tool results drawn to scale
The headline comparison is fak against the tuned warm-cache baseline: a persistent per-agent KV cache, which is what vLLM / SGLang prefix caching or provider prompt caching give you. Both arms run fully live here. The warm-cache arm has no quadratic re-prefill, so it never needs projecting.
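To see why a warm cache removes the quadratic term, compare it with an arm that reuses no KV at all. The sketch below is a hedged illustration for a single agent: the function names and sample sizes are hypothetical, and it assumes the uncached arm re-prefills the entire context on every turn.

```go
package main

import "fmt"

// RePrefillNoCache counts prefill tokens for one agent with no KV reuse:
// each turn re-prefills the whole context accumulated so far, so the total
// grows roughly with the square of the number of turns.
func RePrefillNoCache(prefix int, turns []int) int {
	ctx, total := prefix, 0
	for _, t := range turns {
		ctx += t     // context after this turn's new tokens arrive
		total += ctx // the whole context is prefilled again
	}
	return total
}

// PrefillWarmCache counts prefill tokens for one agent with a persistent KV
// cache: the prefix is prefilled once, then only each turn's new tokens.
func PrefillWarmCache(prefix int, turns []int) int {
	total := prefix
	for _, t := range turns {
		total += t
	}
	return total
}

func main() {
	turns := []int{500, 500, 500, 500} // hypothetical tool-result sizes
	fmt.Println("no cache:  ", RePrefillNoCache(1000, turns))
	fmt.Println("warm cache:", PrefillWarmCache(1000, turns))
}
```

With these sample numbers the uncached arm prefills 9000 tokens against 3000 for the warm cache, and the ratio widens as turns are added.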
fak in-kernel engine · pure-Go Q8 forward pass · tokens are real model output. No network, no API — all on this box.