One row per run.
| Run | Command | LLM | Doc / Code | Duration | Status |
|---|---|---|---|---|---|
| #{{ r.run_id }} | {{ r.command or "—" }} | {{ r.llm_provider_model or r.llm_model or "—" }} | {% set doc = r.doc_profile or "" %} {% set code = r.code_profile or "" %} {% if doc and code %}{{ doc }} / {{ code }}{% elif doc or code %}{{ doc or code }}{% else %}—{% endif %} | {% if r.duration_sec %}{{ "%.2f"|format(r.duration_sec) }}s{% else %}—{% endif %} | {{ r.status_label }} |
Per-run roll-ups (model time, tokens, cost, confidence band split, approval rate). The best value in each row is highlighted.
| Metric | {% for rid in run_ids %}#{{ rid }} | {% endfor %}
|---|{% for rid in run_ids %}---|{% endfor %}
| {{ arow.label }} | {% for rid in run_ids %} {% set cell = arow.cells[rid] %}{{ cell.display }} | {% endfor %}
Each cell shows the description chosen for that run. In each row, the cell with the highest logprob is highlighted as the winner.
| Asset | {% for rid in run_ids %}#{{ rid }} | {% endfor %}
|---|{% for rid in run_ids %}---|{% endfor %}
| {{ prow.label }} | {% for rid in run_ids %}{% set cell = prow.cells[rid] %}{% if cell %}{{ cell.description }}{% else %}—{% endif %} | {% endfor %}
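{# Maintainer note: a minimal Python sketch of how the per-row winner described above could be picked before rendering. The `logprob` key and the `winner_run` helper are assumptions made for illustration; the template itself only reads `cell.description`.

```python
from typing import Dict, Optional


def winner_run(cells: Dict[int, Optional[dict]]) -> Optional[int]:
    """Return the run id whose cell has the highest logprob, or None.

    `cells` maps run id to a cell dict, or to None when the run produced
    no description for this asset. The "logprob" key is hypothetical.
    """
    scored = {
        rid: cell["logprob"]
        for rid, cell in cells.items()
        if cell is not None and cell.get("logprob") is not None
    }
    if not scored:
        return None
    # Logprobs are at most 0, so the highest value is the one closest to 0.
    return max(scored, key=scored.get)


row = {
    1: {"description": "Customer identifier", "logprob": -12.4},
    2: None,  # this run has no description for the asset
    3: {"description": "Unique customer id", "logprob": -8.1},
}
print(winner_run(row))  # prints 3
```

Runs without a cell are skipped, so a row in which no run has a scored cell gets no winner highlight.
#}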
No aggregate metrics or per-column overlap to display for the selected runs.
{% endif %}
{# ── Quality metrics card (Tier 0/1/2 academic metrics) ────────── #}
{% if quality_per_run %}
Tier {{ quality_tier }} academic text-quality analysis.{% if quality_references_summary %} References: {{ quality_references_summary }}.{% endif %}
| Run | Diversity (TTR) | Schema grounding | chrF | ROUGE-L | BERTScore | Embed. agree. | Judge win-rate |
|---|---|---|---|---|---|---|---|
| #{{ row.run_id }} | {% if row.type_token_ratio is not none %}{{ "%.0f"|format(row.type_token_ratio * 100) }}%{% else %}—{% endif %} | {% if row.schema_grounding is not none %}{{ "%.0f"|format(row.schema_grounding * 100) }}%{% else %}—{% endif %} | {% if row.chrf is not none %}{{ "%.0f"|format(row.chrf * 100) }}%{% else %}—{% endif %} | {% if row.rouge_l is not none %}{{ "%.0f"|format(row.rouge_l * 100) }}%{% else %}—{% endif %} | {% if row.bertscore is not none %}{{ "%.0f"|format(row.bertscore * 100) }}%{% else %}—{% endif %} | {% if row.embedding_agreement is not none %}{{ "%.0f"|format(row.embedding_agreement * 100) }}%{% else %}—{% endif %} | {% if row.judge_win_rate is not none %} {{ "%.0f"|format(row.judge_win_rate * 100) }}% ({{ row.judge_wins }}/{{ row.judge_pairings }}) {% else %}—{% endif %} |
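{# Maintainer note: a rough Python sketch of the Diversity (TTR) column and of the percent-or-placeholder formatting used in every cell of the row above. Type-token ratio is the number of distinct tokens divided by the total number of tokens. The lowercased word-character tokenizer and both helper names are assumptions for illustration, not necessarily what the pipeline does.

```python
import re
from typing import Optional


def type_token_ratio(text: str) -> Optional[float]:
    """Distinct tokens divided by total tokens; None for empty input."""
    tokens = re.findall(r"\w+", text.lower())  # assumed tokenizer
    if not tokens:
        return None
    return len(set(tokens)) / len(tokens)


def as_percent(value: Optional[float]) -> str:
    """Mirror the template: whole-number percent, or a dash when missing."""
    if value is None:
        return "—"
    return "%.0f%%" % (value * 100)


ttr = type_token_ratio("the cat sat on the mat")  # 5 distinct / 6 total
print(as_percent(ttr))   # prints 83%
print(as_percent(None))  # prints —
```

Returning None for empty text keeps a missing score distinct from a real 0%, which matches the `is not none` checks in the row.
#}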
Bibliographic references for the metrics above.