| Baseline | Total runs | Pass rate | Last verdict | Last run at |
|---|---|---|---|---|
| Loading… | | | | |

| Timestamp | Baseline | Verdict | Judge model | Agent version | Details |
|---|---|---|---|---|---|
| Loading… | | | | | |
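
As a rough illustration of how the summary columns (total runs, pass rate, last verdict, last run at) relate to the individual run rows, the sketch below aggregates a list of run records per baseline. It is not the console's actual implementation: `summarize_runs` and `parse_ts` are hypothetical helpers, and the sketch assumes the pass rate is computed from each record's `passed` field and that the "last" run is the one with the latest `timestamp`.

```python
from collections import defaultdict
from datetime import datetime


def parse_ts(value: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so normalise it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_runs(runs):
    """Group run records by baseline and derive the summary columns.

    Assumption: each record carries "baseline", "timestamp", "verdict"
    and "passed", as in the request body documented on this page.
    """
    grouped = defaultdict(list)
    for run in runs:
        grouped[run["baseline"]].append(run)

    summary = {}
    for baseline, items in grouped.items():
        # Order chronologically so the last element is the latest run.
        ordered = sorted(items, key=lambda r: parse_ts(r["timestamp"]))
        latest = ordered[-1]
        passed = sum(1 for r in ordered if r["passed"])
        summary[baseline] = {
            "total_runs": len(ordered),
            "pass_rate": passed / len(ordered),
            "last_verdict": latest["verdict"],
            "last_run_at": latest["timestamp"],
        }
    return summary
```

Calling `summarize_runs` with the records for one baseline returns a dictionary keyed by baseline name, one entry per summary row.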

Wire your CI pipeline to POST a JSON body to `/api/admin/evals/runs` after each LLM-as-judge run:

    POST /api/admin/evals/runs
    {
      "id": "ci-2026-05-15-1734",
      "baseline": "intent-support",
      "timestamp": "2026-05-15T17:34:00Z",
      "agentVersion": "atmosphere-4.0.46",
      "prompt": "...the judge prompt that was sent...",
      "judgeResponse": "{\"verdict\": true}",
      "verdict": true,
      "scores": {"relevance": 0.9, "groundedness": 0.95},
      "judgeModel": "gpt-4o-mini",
      "passed": true,
      "notes": "promoted to main"
    }

The endpoint requires `atmosphere.admin.http-write-enabled=true` plus an authenticated principal whose `ControlAuthorizer` grants `evals.write`. Every accepted submission is recorded in the control audit log so an operator can reconstruct who submitted what.
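
A CI step that submits a run might look like the following minimal sketch, which uses only the Python standard library. Several details are assumptions rather than documented behaviour: the base URL is whatever host serves the admin API, the bearer-token `Authorization` header stands in for whichever authentication scheme your deployment uses to establish the principal, and `build_eval_run_request` and `submit_eval_run` are hypothetical helper names.

```python
import json
import urllib.request


def build_eval_run_request(base_url: str, run: dict, token: str) -> urllib.request.Request:
    """Build the POST request for /api/admin/evals/runs without sending it."""
    body = json.dumps(run).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/admin/evals/runs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the deployment authenticates with a bearer token.
            # Replace this with the scheme your installation actually uses.
            "Authorization": f"Bearer {token}",
        },
    )


def submit_eval_run(base_url: str, run: dict, token: str) -> int:
    """Send the run record and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses, so a
    rejected submission surfaces as an exception and fails the CI step.
    """
    request = build_eval_run_request(base_url, run, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

In a pipeline, the `run` dictionary would carry the same fields as the example body above, and the base URL and token would typically come from CI secrets rather than being hard-coded.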