{% extends "base.html" %}
{% block title %}Run {{ run.run_id[:12] }} - MoralStack{% endblock %}
{% block content %}
| Component | Model |
|---|---|
| Baseline | {{ benchmark_report.models_config.baseline }} |
| Judge | {{ benchmark_report.models_config.judge }} |
| MoralStack policy | {{ benchmark_report.models_config.moralstack.policy }} |
| MoralStack policy (rewrite) | {{ benchmark_report.models_config.moralstack.policy_rewrite | default(benchmark_report.models_config.moralstack.policy, true) }} |
| MoralStack risk | parallel mini-estimators |
| MoralStack risk · intent | {{ benchmark_report.models_config.moralstack.risk_intent }} |
| MoralStack risk · signals | {{ benchmark_report.models_config.moralstack.risk_signals }} |
| MoralStack risk · operational | {{ benchmark_report.models_config.moralstack.risk_operational }} |
| MoralStack risk | {{ benchmark_report.models_config.moralstack.risk }} |
| MoralStack critic | {{ benchmark_report.models_config.moralstack.critic }} |
| MoralStack simulator | {{ benchmark_report.models_config.moralstack.simulator }} |
| MoralStack hindsight | {{ benchmark_report.models_config.moralstack.hindsight }} |
| MoralStack perspectives | {{ benchmark_report.models_config.moralstack.perspectives }} |

| Metric | Baseline | MoralStack | Difference |
|---|---|---|---|
| Wins | {{ benchmark_report.get('baseline_wins', 0) }} | {{ benchmark_report.get('moralstack_wins', 0) }} | {{ '%+d' | format(benchmark_report.get('moralstack_wins', 0) - benchmark_report.get('baseline_wins', 0)) }} |
| Ties | {{ benchmark_report.get('ties', 0) }} | | |
| Average Score | {{ "%.2f" | format(benchmark_report.get('baseline_avg_score', 0)) }}/10 | {{ "%.2f" | format(benchmark_report.get('moralstack_avg_score', 0)) }}/10 | {{ "%+.2f" | format((benchmark_report.get('moralstack_avg_score', 0) - benchmark_report.get('baseline_avg_score', 0))) }} |
| Average Time | {{ "%.0f" | format(benchmark_report.get('avg_baseline_time_ms', 0)) }} ms | {{ "%.0f" | format(benchmark_report.get('avg_moralstack_time_ms', 0)) }} ms | {{ "%+.0f" | format((benchmark_report.get('avg_moralstack_time_ms', 0) - benchmark_report.get('avg_baseline_time_ms', 0))) }} ms |

| System | FP (refusal when not expected) | FN (no refusal when REFUSE expected) | Leakage count |
|---|---|---|---|
| Baseline | {{ benchmark_report.get('baseline_fp', 0) }} | {{ benchmark_report.get('baseline_fn', 0) }} | {{ benchmark_report.get('baseline_leakage_count', 0) }} |
| MoralStack | {{ benchmark_report.get('moralstack_fp', 0) }} | {{ benchmark_report.get('moralstack_fn', 0) }} | {{ benchmark_report.get('moralstack_leakage_count', 0) }} |

Over-Governance Rate: {{ "%.2f" | format((benchmark_report.get('over_governance_rate', 0) * 100)) }}%

Full report, as exported in Markdown format. Use Export Benchmark to download it.

{{ benchmark_summary_md }}

| Q# | Request ID | Prompt | Domain | Decision | Actions |
|---|---|---|---|---|---|
| {{ q.question_id }} | {% if q.moralstack_request_id %} {{ q.moralstack_request_id[:12] }} {% else %}—{% endif %} | {{ (q.question_text or '')[:80] }}{% if (q.question_text or '') | length > 80 %}…{% endif %} {% if q.error %} — {{ q.error[:80] }}{% if q.error | length > 80 %}…{% endif %} {% endif %} | {{ (q.moralstack_overlay or q.domain_overlay) or '—' }} | Expected: {{ q.expected_action or '—' }}; Final: {{ q.moralstack_final_action or '—' }} | {% if q.moralstack_request_id %} View Export {% else %} — {% endif %} |
No questions in benchmark report.
Multi-turn governance conversations recorded in this run (Step 13 observability).

| Conversation ID | Turns | Max risk | Last posture | Cached turns | Last activity | Actions |
|---|---|---|---|---|---|---|
| {{ c.conversation_id[:24] }}{% if c.conversation_id | length > 24 %}…{% endif %} | {{ c.turn_count }} | {% if c_max_risk is not none %}{{ "%.3f" | format(c_max_risk) }}{% else %}—{% endif %} | {% if c.get('last_posture') %}{{ c.get('last_posture') }}{% else %}—{% endif %} | {{ c.get('cached_turn_count') or 0 }} | {{ c.get('last_created_at') | fmtdate }} | View Export |

| Request ID | Prompt | Domain | Created | Actions |
|---|---|---|---|---|
| {{ req.request_id[:12] }} | {{ (req.prompt or '')[:80] }}{% if (req.prompt or '') | length > 80 %}…{% endif %} | {{ req.domain or '—' }} | {{ req.created_at | fmtdate }} | View Export |
No requests in this run.