{% extends "base.html" %} {% block content %}

Escalation Drift

G-019 Layer C surface for the escalation-drift scanner (T-1555). The escalation-drift-daily cron job scans completed tasks once a day for symptom-fix candidates, using three heuristics: H1 (bug-class task without a ## RCA section), H2 (same learning ID repeated across 3+ tasks within 30 days), and H3 (bug-class task with neither a ## RCA section nor a learning capture).
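
The three heuristics can be read as simple predicates over completed tasks. The following is a minimal Python sketch of that reading, not the scanner's actual code: the Task record, its field names, and the choice to measure the 30-day window back from the scan date are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Task:
    # Hypothetical record; the real scanner's task model is not shown in this page.
    tid: str
    completed: date
    bug_class: bool
    has_rca: bool
    learning_ids: list = field(default_factory=list)


def flag_h1(task):
    """H1: bug-class task with no RCA section."""
    return task.bug_class and not task.has_rca


def flag_h3(task):
    """H3: H1 and, in addition, no learning was captured."""
    return flag_h1(task) and not task.learning_ids


def h2_repeats(tasks, today, window_days=30, min_tasks=3):
    """H2: learning IDs seen in at least min_tasks distinct tasks in the window.

    Assumption: the window is the last window_days days before today.
    """
    cutoff = today - timedelta(days=window_days)
    seen = defaultdict(set)
    for task in tasks:
        if task.completed >= cutoff:
            for learning_id in task.learning_ids:
                seen[learning_id].add(task.tid)
    return dict(
        (learning_id, len(tids))
        for learning_id, tids in seen.items()
        if len(tids) >= min_tasks
    )
```

Under this reading every H3 task is also an H1 task, so the H3 count can never exceed the H1 count.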

{% if source_path %}

Source: {{ source_path }}

{% endif %}
{% if not data %}

No drift data yet

Run fw cron run escalation-drift-daily, or wait for the daily 05:23 UTC cron run, to populate .context/working/escalation-drift-LATEST.yaml.

{% else %} {# ───────────────── Corpus totals ───────────────── #}

Corpus

Generated: {{ data.generated or 'unknown' }}
Total tasks scanned: {{ data.corpus_total or 0 }}   Bug-class: {{ data.bug_class_total or 0 }} {% if data.bug_class_pct is defined %}({{ data.bug_class_pct }}%){% endif %}

{# ───────────────── Heuristics ───────────────── #}

Heuristics

Heuristic What it catches Flagged % of bug-class
H1 Bug-class task with no ## RCA section {{ data.h1_flagged or 0 }} {{ data.h1_pct_of_bug_class or 0 }}%
H2 Repeat learning ID across 3+ tasks in 30 days {{ data.h2_repeat_patterns or 0 }}
H3 Bug-class task with no ## RCA section and no learning capture {{ data.h3_flagged or 0 }} {{ data.h3_pct_of_bug_class or 0 }}%
Recent (30d) Any heuristic firing in the last 30 days {{ data.recent_30d_flagged or 0 }}
{# ───────────────── H2: top repeating patterns ───────────────── #} {% if data.h2_top %}

H2 — Top Repeating Learnings

Same learning surfaced in N tasks within 30 days. High counts often mean the underlying gap was never structurally fixed.

Learning Task count
{% for row in data.h2_top[:15] %}
{{ row.learning }} {{ row.task_count }}
{% endfor %}
{% endif %} {# ───────────────── Recent flagged sample (with v0.5 triage column) ───────────────── #} {% if data.recent_sample %}

Recent Flagged Tasks (sample)

The Triage column shows the v0.5 LLM verdict per candidate when available. real_symptom_fix = LLM agrees with the heuristic; false_positive = LLM disagrees; defer = LLM lacked context. Empty cell = candidate not yet triaged.
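
The lookup behind the Triage and Confidence columns can be sketched in Python as follows. This is an illustrative sketch, not the view code: it assumes task IDs have the form prefix-number-slug (for example the hypothetical T-1555-fix-login), that v05_by_task maps the short ID to a dict with verdict and confidence keys, and the helper names are invented here.

```python
EMPTY = "\u2014"  # the dash shown for candidates that have not been triaged


def short_tid(tid):
    """Keep the first two hyphen-separated parts, e.g. the prefix and number."""
    return "-".join(tid.split("-")[:2])


def triage_cells(entry_tid, v05_by_task):
    """Return the (verdict, confidence) cell texts for one flagged task."""
    verdict = v05_by_task.get(short_tid(entry_tid))
    if not verdict:
        return (EMPTY, EMPTY)
    return (verdict["verdict"], "%.2f" % verdict["confidence"])
```

One difference from the template expression: short_tid does not fail on an ID without a hyphen, whereas indexing the second part of the split would.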

Task Title Triage (v0.5) Confidence
{% for entry in data.recent_sample %} {% set tid = entry.tid.split('-')[0] ~ '-' ~ entry.tid.split('-')[1] %} {% set v = v05_by_task.get(tid) %}
{{ tid }} {{ entry.name }} {% if v %}{{ v.verdict }}{% else %}—{% endif %} {% if v %}{{ '%.2f' % v.confidence }}{% else %}—{% endif %}
{% endfor %}
{% endif %} {# ───────────────── v0.5 LLM augmentation panel (T-1727) ───────────────── #} {% if v05 %}

v0.5 LLM Augmentation

Per-candidate triage by ollama-local {{ v05.model or 'hermes3' }} via fw resolver dispatch (workflow escalation-triage). Idempotency window: {{ v05.idempotency_days or 7 }} days.

{% if v05_source %}

Source: {{ v05_source }}

{% endif %} {% set by_verdict = {} %} {% for c in v05.candidates or [] %} {% set _ = by_verdict.update({c.verdict: by_verdict.get(c.verdict, 0) + 1}) %} {% endfor %}
Verdict Count Meaning
real_symptom_fix {{ by_verdict.get('real_symptom_fix', 0) }} LLM confirms heuristic — true symptom fix without RCA
false_positive {{ by_verdict.get('false_positive', 0) }} LLM disagrees — task is not a symptom fix (refactor / typo / out-of-line RCA)
defer {{ by_verdict.get('defer', 0) }} LLM lacked context to decide
ERROR / PARSE-FAIL {{ by_verdict.get('ERROR', 0) + by_verdict.get('PARSE-FAIL', 0) }} Dispatch failed or LLM output not parseable

Generated: {{ v05.generated or 'unknown' }} · Dispatched: {{ v05.dispatched or 0 }} · Skipped (idempotent): {{ v05.skipped_idempotent or 0 }} · Errors: {{ v05.errors or 0 }}
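
The verdict counts in the table above could also be computed before rendering rather than inside the template. A minimal Python sketch, assuming each candidate is a dict with a verdict key; the function name and the returned key names are hypothetical:

```python
from collections import Counter


def tally_verdicts(candidates):
    """Count candidates per verdict, merging the two failure verdicts."""
    counts = Counter(c.get("verdict") for c in candidates or [])
    return dict(
        real_symptom_fix=counts["real_symptom_fix"],
        false_positive=counts["false_positive"],
        defer=counts["defer"],
        # Dispatch errors and unparseable LLM output share one table row.
        error_or_parse_fail=counts["ERROR"] + counts["PARSE-FAIL"],
    )
```

A Counter returns 0 for verdicts that never occur, which matches the default of 0 used by the template's by_verdict.get calls.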

{% endif %} {% if data.report_md %}

Full Report

Human-readable analysis: {{ data.report_md }}

{% endif %} {% endif %}
{% endblock %}