dashboard-concurrency-guard ran 5/5 PASS, verdict = SHIP.

dashboard-concurrency-guard: SHIP

- Trial 1/5: PASS (1)
- Trial 2/5: PASS (1)
- Trial 3/5: PASS (1)
- Trial 4/5: PASS (1)
- Trial 5/5: PASS (1)
- Verdict: SHIP

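This eval covers the Layer 6 optimistic concurrency gate added in PR-D. In all 5 trials the agent correctly: (1) identified that the candidate's actual status had already changed to dismissed; (2) concluded that Reviewer B's stale request should be rejected; (3) identified HTTP 409; (4) confirmed that an absent expected_current_status degrades gracefully.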
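
The gate behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in learning-dashboard.js: the function name, the `pending` status, and the shape of the result object are assumptions; only `expected_current_status`, the `dismissed` status, and the HTTP 409 outcome come from the audit.

```javascript
// Hypothetical sketch of an optimistic concurrency check.
// Only expected_current_status, 'dismissed', and HTTP 409 come from the audit.
function checkExpectedStatus(candidate, request) {
  // Absent expected_current_status: skip the check (graceful degradation).
  if (request.expected_current_status === undefined) {
    return { ok: true };
  }
  if (request.expected_current_status !== candidate.status) {
    // The reviewer acted on a stale view, so reject with HTTP 409 Conflict.
    return { ok: false, httpStatus: 409, actual_status: candidate.status };
  }
  return { ok: true };
}

// Reviewer A already dismissed the candidate; Reviewer B still sees 'pending'.
const guardedCandidate = { id: 'c1', status: 'dismissed' };
const staleResult = checkExpectedStatus(guardedCandidate, {
  action: 'approve',
  expected_current_status: 'pending',
});
// A client that omits the field is not blocked.
const legacyResult = checkExpectedStatus(guardedCandidate, { action: 'approve' });
```

Returning the actual status alongside the 409 lets a stale client refresh its view before retrying; whether the real handler does this is not stated in the audit.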

| Status | Check | Evidence |
|---|---|---|
| PASS | shouldObserve() wired to ARCFORGE_OBSERVE_SKIP_PATHS | hooks/observe/main.js:61-73 |
| PASS | NEW: ARCFORGE_OBSERVE_EXPLICIT_SKIP=1 guard (PR-B) | main.js:80 |
| PASS | NEW: ARCFORGE_OBSERVE_SELF_ANALYSIS=1 guard (PR-B) | main.js:81 |
| PASS | Eval-trial path regex | main.js:57-58, 67-68 |
| PASS | Disabled-by-default | learning.js:75-76 |
| PASS | Fail-closed (no fs.write) | main.js:75-90 |
| PASS | Spec codifies env var naming (PR-A) | layer-0-*.md:39-48, 73 |
| FIRST-SLICE-ACCEPT | Daemon spawns its own claude without exporting ARCFORGE_OBSERVE_SELF_ANALYSIS=1 (mitigated because the plugin is disabled in the dev repo) | observer-daemon.sh:245 |
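
As a rough illustration of how the guards in this table might compose, here is a hedged sketch. The three environment variable names come from the audit; the comma-separated format of ARCFORGE_OBSERVE_SKIP_PATHS, the prefix match, and the function signature are assumptions, and the real shouldObserve() in main.js may differ.

```javascript
// Hypothetical sketch of a fail-closed observe guard.
// Assumption: ARCFORGE_OBSERVE_SKIP_PATHS is a comma-separated list of path prefixes.
function shouldObserveSketch(env, cwd) {
  try {
    if (env.ARCFORGE_OBSERVE_EXPLICIT_SKIP === '1') return false;
    if (env.ARCFORGE_OBSERVE_SELF_ANALYSIS === '1') return false;
    const skipPaths = (env.ARCFORGE_OBSERVE_SKIP_PATHS || '')
      .split(',')
      .map((entry) => entry.trim())
      .filter(Boolean);
    // Naive prefix match, kept simple for illustration.
    if (skipPaths.some((prefix) => cwd.startsWith(prefix))) return false;
    return true;
  } catch (err) {
    // Fail closed: any unexpected error means "do not observe".
    return false;
  }
}
```

For example, `shouldObserveSketch({ ARCFORGE_OBSERVE_SELF_ANALYSIS: '1' }, '/repo')` returns `false`, which is the behavior the daemon would get if it exported that variable before spawning claude.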

| Status | Check | Evidence |
|---|---|---|
| PASS | Skeleton required fields (including schema_version + source added in PR #46) | main.js:408-420 |
| PASS | Raw tool_input is not persisted | main.js:431-440 |
| PASS | Skill args are not persisted | main.js:425-426 |
| PASS | PostToolUse records only outcome + output_bytes | main.js:447-453 |
| PASS | Per-tool collection contract (all 10 tools) | main.js:157-294 |
| FIRST-SLICE-ACCEPT | observation.skill is set on both pre and post (spec describes only tool_start) | carried from 2026-05-21 |

| Status | Check | Evidence |
|---|---|---|
| PASS | EVIDENCE_STATUS frozen constant + 4 values | sanitize-observation.js:174-179 |
| PASS | omitted_no_input vs omitted_safety semantics | classifyOmission |
| PASS | Decision 5 keyword + value form coverage | sanitize-observation.js:30-94 |
| PASS | Per-tool persistence contract | main.js:198-291 |
| PASS | Field name operation_kind | all correct |
| PASS | summarizeToolInput read-time only | learning-observation-view.js |
| PASS | Fail-closed: raw fallback uses OMITTED_NO_INPUT | main.js:435-440 |
| FIRST-SLICE-ACCEPT | WebSearch writes the query into the url field (spec says "sanitized URL/domain"); minor naming inconsistency | main.js:270-291 |

| Status | Check | Evidence |
|---|---|---|
| FIXED | Reflect + Recall are now read (PR-C): MAX_REFLECTS=10, MAX_RECALLS=10 | batch-assembler.js:35-36, 469-475 |
| FIXED | Diaries return DiaryEvidenceItem[] (PR-C) | :177-215 |
| FIXED | source_windows.{diaries,reflects,recalls} written to the manifest (PR-C) | :584-595 |
| PASS | readRecentEvidence(kind, ...) three-in-one helper (PR-C simplify) | :146-219 |
| PASS | New readers: ~/.arcforge/reflections/ + ~/.arcforge/recalls/ | :57-63 |
| PASS | One-way (does not read Layers 5-8) | no related imports |
| PASS | Manifest persistence path correct + atomic | :555, 617 |
| PASS | Safety raw_*_included: false | :522-531 |
| FIXED | evidence_status_by_id persisted (PR-B); used by Layer 5 for omitted_upstream | :610-612 |

| Status | Check | Evidence |
|---|---|---|
| PASS | Daemon does not read observations.jsonl contents directly | observer-daemon.sh:97-105 |
| PASS | Bash daemon + Node CLI division of labor | :158, 296 |
| PASS | body_source: "llm_curator" enforced at three layers | prompt + ingestor + validator |
| PASS | First-slice allowed_artifact_types: ["instinct"] | observer-prompt.md:18, 70, 102 |
| FIXED | sanitizer_module lives in metadata, not in prompt text (PR-A corrected the audit wording) | proposal-ingestor.js:175 |
| FIXED | observer-prompt now cites all 4 evidence_type values (PR-C) | observer-prompt.md:34-40, 83-86 |
| PASS | Failure modes do not create queue state | proposal-ingestor.js:330-372 |
| DRIFT (new) | CuratorRunManifest not written on daemon-side transport/timeout failure | see Drift #1 ↓ |

| Status | Check | Evidence |
|---|---|---|
| FIXED | Full safety metadata (PR-B): 3 versions + 3 scans + 6 raw flags | proposal-ingestor.js:172-185 + validator schema.js:407-477 |
| FIXED | evidence_ref_omitted_upstream emit (PR-B) | proposal-ingestor.js:437-456 |
| FIXED | Canonical dedupe_basis + superseded transition (PR-C) | :188-212, :478-489 |
| FIXED | rule_version namespace separation (PR-C): sanitizer 'v1' vs evidence-quality 'v1-project_obs_count' | schema.js:535 + sanitize-observation.js:185 |
| PASS | NEW: INSERTION_STATUSES + isLegalInsertionStatus() guard (PR-C simplify) | lifecycle.js:51-58 |
| PASS | NEW: EVIDENCE_QUALITY_RULE_VERSION + VALIDATOR_VERSION constants | schema.js:18, 535 |
| PASS | Body source 4-value enum | schema.js:26 |
| PASS | promoted_from_* rejected on scope | schema.js:214-228 |
| PASS | Action × Status matrix 8×7 (including the deactivate column) | lifecycle.js:79-88 |
| PASS | applyTransition throws for promote/evolve | lifecycle.js:128-134 |
| PASS | Evidence-quality v1 formula | schema.js:74-86, 366-376 |
| PASS | Atomic queue + rejections + lock | queue-writer.js:51-117 |
| PASS | Replay tolerates a corrupted trailing line | queue-writer.js:270-277 |
| DRIFT (new) | 7 rejection codes defined but never emitted (dead codes) | see Drift #2 ↓ |
| DRIFT (new) | batch_hash round-trip cross-check missing | see Drift #3 ↓ |
| DRIFT (new) | source_manifest_missing throws instead of writing a rejection | see Drift #4 ↓ |
| FIRST-SLICE-ACCEPT | llm_assessment attributes are marked optional in the spec; currently discarded, not propagated | proposal-ingestor.js |
| FIRST-SLICE-ACCEPT | evidence_quality_metadata.basis populates only project_obs_count; the spec marks the other fields "may be populated" | proposal-ingestor.js:163-167 |
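
The "replay tolerates a corrupted trailing line" row can be illustrated with a small JSONL reader. This is a sketch under assumptions, not the code in queue-writer.js: the function name is hypothetical, and treating corruption anywhere other than the final line as a hard error is an assumed policy.

```javascript
// Hypothetical sketch: replay a JSONL log, tolerating only a torn final line
// (for example, a crash in the middle of an append).
function replayJsonlSketch(text) {
  const lines = text.split('\n').filter((line) => line.trim() !== '');
  const records = [];
  for (let i = 0; i < lines.length; i++) {
    try {
      records.push(JSON.parse(lines[i]));
    } catch (err) {
      // A corrupted last line is dropped; corruption earlier in the file
      // is treated as a real error (assumed policy).
      if (i === lines.length - 1) break;
      throw new Error(`corrupted line ${i + 1}: ${err.message}`);
    }
  }
  return records;
}

const replayed = replayJsonlSketch('{"id":1}\n{"id":2}\n{"id":');
```

Here `replayed` holds the two intact records and the truncated third line is ignored.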

| Status | Check | Evidence |
|---|---|---|
| FIXED | Wire model adds evidence_quality_chip + relationships (PR-D) | learning-dashboard.js:109-112, 125, 145 |
| FIXED | evidence_counts {total, by_type} (PR-D) | :90-98, 139 |
| FIXED | risk_note_count + uncertainty_note_count (PR-D) | :140-143 |
| FIXED | expected_current_status optimistic concurrency → HTTP 409 (PR-D) | :370-374 + http.js:144-146 |
| FIXED | safety_ack required for activate/deactivate (PR-D) | :382-393 |
| FIXED | actor default value + reason written to audit (PR-D) | :317, 345 |
| FIXED | Detail view adds 4 blocks: evidence_summaries / llm_assessment / materialization / activation (PR-D) | :175-205 |
| FIXED | All detail blocks go through sanitizers (PR-D simplify): assessment + provenance sanitizers | :215-246 |
| FIXED | HTML XSS-safe (PR-D simplify): 0 innerHTML, uses textContent/createElement | learning-dashboard.html:97-285 |
| FIXED | File size split (PR-D simplify): dashboard.js 559 lines + dashboard-http.js 197 lines (< 700 hard limit) | file sizes |
| PASS | Privacy invariant: 4 adversarial fixtures (API key / Bearer / JWT / private key) all redacted | tests:253-313 |
| PASS | project_id stripped from the wire model | :80-88 |
| PASS | Server-side Action × Status matrix enforced | :377-379 + lifecycle.js:79-88 |
| PASS | actions.jsonl audit log (records both accept and reject) | :270-278 |
| PASS | Layer 6 does not call the LLM, write skills, or write CLAUDE.md | no related imports |
| PASS | Promote creates a new global candidate; source status unchanged | :397-423 |
| PASS | Token-gated POST (24-byte random token) | http.js:66-70, 115-117 |
| SHIP | Eval: dashboard-concurrency-guard 5/5 PASS | run during this audit |
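
For the token-gated POST row, a minimal sketch of the idea follows. The 24-byte random token comes from the audit; the hex encoding, the function name, and the constant-time comparison via `crypto.timingSafeEqual` are assumptions about how such a gate could be built, not a description of http.js.

```javascript
const { randomBytes, timingSafeEqual } = require('node:crypto');

// 24 random bytes, hex-encoded (the encoding is an assumption): 48 characters.
const dashboardToken = randomBytes(24).toString('hex');

// Hypothetical check: compare the presented token in constant time.
function isAuthorizedPost(presented, expected) {
  if (typeof presented !== 'string') return false;
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

A POST handler would call `isAuthorizedPost(tokenFromRequest, dashboardToken)` before touching any queue state and reject the request otherwise.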

| Status | Check | Evidence |
|---|---|---|
| CODIFIED | render_policy.include_evidence_summaries: false written into the spec (PR-A) | materialize.js:84 + spec layer-7 L128-135 |
| PASS | Accepts only approved + deactivated | materialize.js:327-330 |
| PASS | Writes only inactive drafts | materialize.js:56-66 |
| PASS | Path containment prevents escape | :399-403 |
| PASS | MaterializationRecord fully written before reporting to Layer 5 | :464-468 |
| PASS | Atomic + lock-protected | :380, 406, 464, 478 |
| PASS | Idempotent on (candidate_hash, render_policy_version) | :260-288, 361-371 |
| PASS | Body secret scan (strict equality) | :355-358 |
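
The path-containment row can be sketched with Node's `path` module. This is one common way to implement such a check, offered as an assumption about the technique rather than a copy of materialize.js; the function name and the example root are hypothetical.

```javascript
const path = require('node:path');

// Hypothetical sketch: resolve a target under a root and refuse anything
// that would land outside it (via "..", an absolute path, or another drive).
function resolveInsideRoot(root, target) {
  const resolvedRoot = path.resolve(root);
  const resolved = path.resolve(resolvedRoot, target);
  const rel = path.relative(resolvedRoot, resolved);
  const escapes =
    rel === '' ||                       // the root itself is not a valid file target
    rel === '..' ||
    rel.startsWith('..' + path.sep) ||  // climbs out of the root
    path.isAbsolute(rel);               // e.g. a different drive on Windows
  if (escapes) {
    throw new Error(`path escapes root: ${target}`);
  }
  return resolved;
}

const draftPath = resolveInsideRoot('/tmp/drafts', 'instincts/example.md');
```

Checking the relative path rather than a string prefix avoids accepting sibling directories such as `/tmp/drafts-evil`.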

| Status | Check | Evidence |
|---|---|---|
| FIXED | deactivate() validates reviewer_ack (PR-B shared helper) | activate.js:197-203, 266, 481 |
| FIXED | active_path_summary does not leak project_id (PR-B): sha256[:12] | :211-213, 372, 527 |
| CODIFIED | Redaction rule written into the spec (PR-A) | layer-8 L251-269 |
| CODIFIED | Hash-verify ordering equivalence written into the spec (PR-A) | layer-8 L271-273 |
| PASS | Accepts only materialized + deactivated | :255-258 |
| PASS | Materialization record hash check | :290-298 |
| PASS | Activate path requires reviewer_ack | :266-269 |
| PASS | Per-target-kind overwrite policy | :86-93, 343-358 |
| PASS | claude_md_addition is not auto-applied (double block) | :271-280 |
| PASS | Allowed roots allowlist + path-resolve containment | :303-326 |
| PASS | Atomic write + lock + backup on supersede | :331-359, 362, 434 |
| PASS | ActivationRecord fully written before reporting to Layer 5 | :419-424, :562-568 |
| PASS | SessionStart does not auto-load instincts | inject-context.js:271-273 + e2e test |
| PASS | Failure does not create an activated event | fail() returns before the transition |
| PASS | Hash computed once and reused (PR-B simplify) | :366, 372 |
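
The "sha256[:12]" redaction noted for active_path_summary can be illustrated as below. The audit only states that a 12-character SHA-256 prefix is used so that project_id does not leak; the assumption here is that the project_id appears as a literal path segment and is swapped for its truncated digest. Function names and the example path are hypothetical.

```javascript
const { createHash } = require('node:crypto');

// First 12 hex characters of a SHA-256 digest ("sha256[:12]").
function shortDigest(value) {
  return createHash('sha256').update(value, 'utf8').digest('hex').slice(0, 12);
}

// Assumption: project_id appears verbatim in the path and must not leak.
function summarizeActivePath(activePath, projectId) {
  return activePath.split(projectId).join(shortDigest(projectId));
}

const pathSummary = summarizeActivePath('/projects/abc/instincts/x.md', 'abc');
// pathSummary === '/projects/ba7816bf8f01/instincts/x.md'
```

The digest is stable, so two records for the same project can still be correlated without exposing the identifier itself.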
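
Drift #1: CuratorRunManifest not written on daemon-side failure [DRIFT]

Spec requirement (layer-4 acceptance #10): a CuratorRunManifest must be written for "every attempted run", including failure paths.

Current implementation: persistRunManifest is called only inside ingestProposal (proposal-ingestor.js:93-100). When the claude CLI fails on the daemon side (transport_error / timeout / watchdog kill), the daemon returns immediately without writing a manifest.

Impact: an audit cannot tell that "the daemon attempted N runs, M of which failed on transport", so failure statistics are missing. This is pre-existing, not introduced by PR-A/B/C.

Fix: on the abort path, have the daemon invoke ingest-proposal --record-failure transport_error|timeout (or add a sibling record-run-failure command) so that every attempt produces a manifest.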
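
Drift #2: 7 rejection codes defined but never emitted (dead codes) [DRIFT]

Spec requirement: REJECTION_CODES lists 18 legal reasons, and the validator should use all of them.

Current implementation: the following 7 codes are defined in the frozen constant, but a full-text search of the codebase finds no emit site for any of them:

- artifact_type_not_allowed: a bad artifact_type falls back to the generic schema_invalid
- scope_not_allowed: a bad scope.kind falls back to schema_invalid
- evidence_type_mismatch: should reject when the evidence_type a proposal cites differs from the batch item's type
- too_few_evidence_refs: the prompt advertises 2-5 refs but the validator never checks the minimum
- too_many_evidence_refs: likewise, the maximum is never checked
- source_manifest_missing: see Drift #4
- source_hash_mismatch: see Drift #3

Impact: every bad proposal is classified as schema_invalid, losing root-cause information. Ref count and type-against-batch are not checked at all, so non-compliant proposals may be let through.

Fix: (a) have the validator use dedicated codes for artifact_type / scope.kind; (b) enforce min/max evidence-ref counts (pin 2-5 as constants); (c) when proposal-ingestor receives an evidence_ref, cross-check it against the batch item's evidence_type.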
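
Fixes (b) and (c) could look roughly like the sketch below. The three rejection codes and the 2-5 range come from the audit; the constant names, the function name, the ref shape (`id`, `evidence_type`), and the example type values are assumptions made for illustration.

```javascript
// Pin the advertised 2-5 range as constants (names are hypothetical).
const MIN_EVIDENCE_REFS = 2;
const MAX_EVIDENCE_REFS = 5;

// batchItems: Map of evidence id -> { evidence_type } taken from the batch manifest.
function validateEvidenceRefs(refs, batchItems) {
  if (refs.length < MIN_EVIDENCE_REFS) {
    return { ok: false, reason: 'too_few_evidence_refs' };
  }
  if (refs.length > MAX_EVIDENCE_REFS) {
    return { ok: false, reason: 'too_many_evidence_refs' };
  }
  for (const ref of refs) {
    const item = batchItems.get(ref.id);
    // Unknown ids are assumed to be handled by a separate check.
    if (item && item.evidence_type !== ref.evidence_type) {
      return { ok: false, reason: 'evidence_type_mismatch', ref_id: ref.id };
    }
  }
  return { ok: true };
}

const sampleBatch = new Map([
  ['e1', { evidence_type: 'observation' }],
  ['e2', { evidence_type: 'diary' }],
]);
const mismatch = validateEvidenceRefs(
  [{ id: 'e1', evidence_type: 'observation' }, { id: 'e2', evidence_type: 'observation' }],
  sampleBatch,
);
```

In this example `mismatch.reason` is `evidence_type_mismatch`, because the proposal cites `e2` as an observation while the batch lists it as a diary.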
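
Drift #3: batch_hash round-trip cross-check missing [DRIFT]

Spec requirement: the LLM response must return source.batch_hash, and the validator must compare it with the manifest's batch_hash. A stale or wrong response should be rejected.

Current implementation: the batch manifest is loaded by batch_id, but proposal.source.batch_hash === batchManifest.batch_hash is never checked.

Impact: if the LLM returns an outdated batch_hash (for example, the batch was regenerated while the client was retrying), the response is silently accepted and produces a candidate tied to the wrong batch.

Fix: in proposal-ingestor.js, compare payload.source.batch_hash with the loaded manifest and reject with source_hash_mismatch when they differ.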
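
A minimal sketch of that comparison, assuming the payload and manifest shapes implied above. The function name and the decision to treat a missing batch_hash as a mismatch are assumptions; only the field names and the source_hash_mismatch code come from the audit.

```javascript
// Hypothetical sketch of the batch_hash round-trip check.
function checkBatchHash(payload, batchManifest) {
  const claimed = payload && payload.source ? payload.source.batch_hash : undefined;
  // A missing hash is treated the same as a wrong one (assumed policy).
  if (claimed !== batchManifest.batch_hash) {
    return {
      ok: false,
      reason: 'source_hash_mismatch',
      claimed,
      expected: batchManifest.batch_hash,
    };
  }
  return { ok: true };
}

// The batch was regenerated after the LLM saw the old one.
const hashCheck = checkBatchHash(
  { source: { batch_id: 'b1', batch_hash: 'old-hash' } },
  { batch_id: 'b1', batch_hash: 'new-hash' },
);
```

Including both the claimed and expected values in the rejection keeps the root cause visible in the audit trail.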
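
Drift #4: source_manifest_missing throws instead of rejecting [DRIFT]

Spec requirement (layer-5 L244): a missing batch manifest should produce a written rejection (source_manifest_missing).

Current implementation: the code throws an Error, the caller (the daemon) treats it as a fatal failure, and no rejection record is written.

Impact: this compounds Drift #1. The daemon sees the throw and returns, so not even a run manifest is written, and the failure disappears from the audit trail at both levels.

Fix: write a rejection record (reason: source_manifest_missing) and then return normally, so the daemon can continue recording as usual.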
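
The shape of that fix might resemble the sketch below. Everything except the source_manifest_missing code is hypothetical: the function name, the injected `loadManifest` and `writeRejection` callbacks, and the return values are stand-ins chosen so the example runs without the real queue writer.

```javascript
// Hypothetical sketch: record a rejection instead of throwing when the
// batch manifest cannot be found.
function ingestWithManifestCheck(proposal, loadManifest, writeRejection) {
  const batchId = proposal.source.batch_id;
  const manifest = loadManifest(batchId);
  if (!manifest) {
    writeRejection({ reason: 'source_manifest_missing', batch_id: batchId });
    // Return normally so the caller can still write its run manifest.
    return { status: 'rejected', reason: 'source_manifest_missing' };
  }
  return { status: 'accepted', manifest };
}

// In-memory stand-ins for the manifest store and the rejection log.
const rejectionLog = [];
const missingResult = ingestWithManifestCheck(
  { source: { batch_id: 'missing-batch' } },
  () => null,
  (record) => rejectionLog.push(record),
);
```

After this call `rejectionLog` holds one record and `missingResult.status` is `rejected`, so the failure stays visible without aborting the daemon's run.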
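
Overall PASS rate: 78/82 = 95%.

dashboard-concurrency-guard eval 5/5 PASS: behavioral verification of the PR-D Layer 6 concurrency gate found no P0 bug. Drift #1-4 are all "audit completeness" issues. They are unlikely to cause a real production incident, but they lose information during root cause analysis.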

The only scenario eval so far is dashboard-concurrency-guard. The other features added by PR-B/C/D (deactivate ack, omitted_upstream, dedupe superseded, safety_ack) are currently guarded only by unit tests, with no scenario eval. The table below assesses the candidate scenarios.

| Candidate scenario | PR | Value | Recommendation |
|---|---|---|---|
| dashboard-safety-ack-required | PR-D | High: the safety gate is user-facing behavior | Add |
| deactivate-reviewer-ack-required | PR-B Blocker #3 | High: prevents silent destructive actions | Add |
| evidence-ref-omitted-upstream | PR-B Blocker #2 | Medium: Layer 3 currently never puts non-present items into a batch, so this is a defensive field | Can defer |
| dedupe-superseded-transition | PR-C | Medium: lifecycle correctness | Can defer (already guarded by unit tests) |
| typed-evidence-cite-chain | PR-C | Low: a capability rather than a gate | Skip |

Recommendation: add the 2 high-value scenarios first (safety_ack + deactivate_ack) and keep the rest guarded by unit tests. If you want all of them, I can write all 5 scenarios in one pass.