# Ralph Progress Log
## Codebase Patterns
- VFS success paths should call `clear_last_error()`; use `set_last_error(...)` only for real failure paths and keep diagnostics in structured tracing fields.
- SQLite bound TEXT values may contain embedded NUL bytes, but SQL statement strings are rejected at `CString::new(sql)` before SQLite prepare.
- VFS `xDelete` on the main database path should fail loudly with `SQLITE_IOERR_DELETE`; it cannot delete persisted depot state by clearing local memory.
- SQLite VFS production opens fetch page 1 and seed `page_size` from the header via `fetch_initial_main_page` -> `seed_main_page`; test builds use the direct-storage snapshot path instead.
- Depot burst-mode cold lag should derive from workflow `CMP/root.cold_watermark_txid`; `branch_manifest_cold_drained_txid_key` is legacy and should be ignored by production paths.
- PITR interval expiry is owned by the compaction reclaimer's `expired_pitr_interval_rows` lane; use `udb::compare_and_clear`, not helper-level raw clears.
- Legacy database-scoped depot key helpers (`meta_head_key`, `delta_prefix`, `pidx_delta_prefix`, `shard_prefix`) are v1 pegboard compatibility only; do not use them for current Depot invariant scans.
- Workflow test matrix macros that wrap caller blocks in async futures should use `async move $body` so per-run locals are created inside each generated future.
- SQLite VFS `xSync` durability depends on depot's `sqlite_commit` reply waiting for the FDB transaction commit.
- SQLite VFS page-cache fills should use the internally synchronized `moka::Cache` handle directly; do not hold a VFS state write guard only to call `page_cache.insert(...)`.
- Path-shimmed Rust unit tests can keep private module access while moving bulky fixtures into sibling `tests/inline/*_support.rs` modules re-exported from the shim.
- Depot compaction workflows use one module per workflow under `src/workflows/`; shared contracts, companion-loop helpers, and test hooks live under `src/compaction/`, with `depot::workflows::compaction::*` kept as the compatibility facade.
- VFS crash-recovery tests can inject dirty pages through `direct_vfs_ctx`, force unacked close-time commits with `DirectTransportHooks::fail_next_commit`, and reopen against the same `DirectStorage` to assert only committed state is durable.
- VFS commit-concurrency tests can pause direct commit transport with `DirectTransportHooks::pause_next_commit`, issue `io_read` through the xRead callback, then release the commit to assert snapshot behavior deterministically.
- Depot capped ancestor reads cannot trust latest PIDX alone after the parent branch advances; scan capped DELTA history for requested pages before falling through to SHARD/cold.
- Fork/reopen isolation tests should use fresh `Db` handles for the fork before and after parent writes so assertions prove durable branch state, not a warmed per-handle cache.
- Direct VFS harnesses should lazy-initialize RocksDB-backed `DirectStorage` with `tokio::sync::OnceCell::get_or_init`; a `OnceLock` get-then-set race can open the same RocksDB path twice and trip the per-path `LOCK`.
- SQLite VFS process-global registration should live in a dedicated Drop guard so panics after `sqlite3_vfs_register` still call `sqlite3_vfs_unregister`.
- `NativeDatabase::Drop` dirty-page flushes should use a short `tokio::time::timeout`; if the commit future times out, log and return without calling `sqlite3_close_v2`.
- Depot shard-cache fill workers should use cloned `async-channel` receivers for multi-consumer dispatch; do not wrap one `mpsc::Receiver` in a mutex.
- Shard-cache fill idle waiters must create and enable `Notify::notified()` before loading `outstanding`; `notify_waiters()` does not store permits.
- Db branch read cache state belongs in one `CacheSnapshot`; publish branch id, ancestry, access bucket, and PIDX index under one write lock.
- Use `tokio::sync::Barrier` for concurrent test start guns; `Notify::notify_waiters()` can lose a start signal before waiters arm.
- Compaction cold-shard reads must distinguish missing cold objects from sparse pages and re-read the live `CMP/cold_shard` ref under `Serializable` before returning fetched bytes.
- Debug-only Depot takeover reconcile launched from `Db::new*` must be scheduled in the background; do not join it on the constructor's Tokio worker.
- Timing-sensitive WebSocket driver tests should use test-issued permit messages and actor acks to advance exact work counts; avoid pairing real-clock delays with approximate count-range assertions.
- TS actor connections expose `connection.ready`; await it instead of polling `isConnected` or retrying startup actions.
- Driver tests that wait for a target actor to sleep should poll a separate observer actor for lifecycle events; target actions and normal target connections can keep resetting or blocking sleep.
- Depot `get_pages` sparse semantics are split: in-range pages with no source return a zero page, but corrupted source blobs must return an explicit error.
- Restore-point creation must re-read the resolved `COMMITS/{txid}` row inside the pin-write transaction; abort with `RestoreTargetExpired` instead of writing a Ready pin for reclaimed history.
- Restore database rollback and undo restore-point pinning must commit in one UDB transaction; never swap DBPTR before the undo pin is durable.
- Future DB pins can rely on older cold-backed SHARD coverage; tests should evict the FDB SHARD, clear hot PIDX for the page, and assert reads still resolve through `CMP/cold_shard`.
- Gasoline workflow-row waiters can subscribe to `BumpSubSubject::WorkflowCreated { tag }`; workflow creation publishes the bump after the UDB commit for each created tag value.
- Depot inspect decode and pagination logic lives in `depot::inspect`; `api-peer` should only mount thin internal `/depot/inspect/...` handlers and must not expose this surface through public SDKs.

Started: Fri May  1 03:05:26 PM PDT 2026
---
## 2026-05-01 15:11:12 PDT - US-001
- Replaced the `commit_atomic_write` success-path `set_last_error("post-commit atomic write succeeded: ...")` call with `clear_last_error()`.
- Preserved the requested DB size diagnostic as a structured `tracing::debug!` field instead of encoding it in `last_error`.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: pre-fix `cargo test -p rivetkit-sqlite commit_atomic_write_clears_last_error_on_success` failed with the success string in `take_last_kv_error()`; after the fix, that test passed and `cargo check -p rivetkit-sqlite` passed.
- **Learnings for future iterations:**
  - VFS success paths should call `clear_last_error()`; reserve `set_last_error(...)` for real failure paths.
  - Size and timing diagnostics in VFS commit paths belong in structured tracing fields, not SQLite last-error state.
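
A minimal sketch of the success-path rule above, using only std. `VfsCtx`, its `last_error` field, and the `transport_ok` flag are hypothetical stand-ins for illustration, not the real `rivetkit-sqlite` types; in the real code the size diagnostic would go to a `tracing::debug!` field rather than into `last_error`.

```rust
/// Hypothetical stand-in for the VFS context; the real type lives in `vfs.rs`.
#[derive(Default)]
struct VfsCtx {
    last_error: Option<String>,
}

impl VfsCtx {
    fn set_last_error(&mut self, msg: impl Into<String>) {
        self.last_error = Some(msg.into());
    }

    fn clear_last_error(&mut self) {
        self.last_error = None;
    }

    /// Illustrative commit: `transport_ok` models whether the commit was acked.
    fn commit_atomic_write(&mut self, transport_ok: bool) -> Result<(), ()> {
        if transport_ok {
            // Success: drop any stale error instead of recording a "succeeded" string.
            self.clear_last_error();
            Ok(())
        } else {
            // Real failure: this is the only place last_error is written.
            self.set_last_error("commit failed");
            Err(())
        }
    }
}
```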
---
## 2026-05-01 15:13:51 PDT - US-002
- Implemented the loud-failure contract for main database `xDelete`: `vfs_delete` now sets `last_error` and returns `SQLITE_IOERR_DELETE` when `path == ctx.actor_id`.
- Tightened the regression test to assert the chosen `SQLITE_IOERR_DELETE` contract and the explicit last-error message.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: pre-fix `cargo test -p rivetkit-sqlite vfs_delete_main_db_resets_in_memory_state` failed because main-db xDelete returned `SQLITE_OK` with `db_size_pages=2`; after the fix, that test passed and `cargo check -p rivetkit-sqlite` passed.
- **Learnings for future iterations:**
  - Do not make main DB `xDelete` pretend success by only mutating local VFS state; it cannot delete persisted depot state.
  - Aux-file deletes remain separate from main DB deletes and still use `delete_aux_file`.
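
A simplified sketch of the loud-failure contract, assuming a plain Rust function in place of the real FFI callback. The `Ctx` struct, the `aux_files` list, and the error message are hypothetical; the numeric result codes follow `sqlite3.h`, where `SQLITE_IOERR_DELETE` is `SQLITE_IOERR | (10 << 8)`.

```rust
// Result codes as defined in sqlite3.h.
const SQLITE_OK: i32 = 0;
const SQLITE_IOERR: i32 = 10;
const SQLITE_IOERR_DELETE: i32 = SQLITE_IOERR | (10 << 8);

/// Hypothetical, simplified context; the real `vfs_delete` is an FFI callback.
struct Ctx {
    actor_id: String,
    last_error: Option<String>,
    aux_files: Vec<String>,
}

fn vfs_delete(ctx: &mut Ctx, path: &str) -> i32 {
    if path == ctx.actor_id {
        // The main database lives in persisted depot state; local state cannot delete it.
        ctx.last_error = Some(format!("refusing to delete main database {path}"));
        return SQLITE_IOERR_DELETE;
    }
    // Aux files (journals and similar) stay on their own delete path.
    ctx.aux_files.retain(|p| p.as_str() != path);
    SQLITE_OK
}
```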
---
## 2026-05-01 15:15:20 PDT - US-003
- Renamed the depot restore-point metric Rust identifier from `SQLITE_RESTORE_POINT_COUNT_PER_NAMESPACE` to `SQLITE_RESTORE_POINT_COUNT_PER_BUCKET`.
- Kept the Prometheus metric name `sqlite_restore_point_count_per_bucket` unchanged.
- Files changed: `engine/packages/depot/src/conveyer/metrics.rs`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `rg -n "SQLITE_RESTORE_POINT_COUNT_PER_NAMESPACE|SQLITE_RESTORE_POINT_COUNT_PER_BUCKET" /home/nathan/r2/engine/packages/depot/src /home/nathan/r2/scripts/ralph/prd.json` and `cargo check -p depot` passed.
- **Learnings for future iterations:**
  - Depot metric Rust identifiers should match renamed Prometheus concepts when the metric string has already moved to the new terminology.
---
## 2026-05-01 15:19:42 PDT - US-004
- Removed the legacy `branch_manifest_cold_drained_txid_key` production reads from depot burst-mode and dirty-marker cleanup paths.
- Changed burst-mode lag to use workflow `CMP/root.cold_watermark_txid`, reusing the already-read compaction root during commit.
- Updated depot burst/quota tests to seed `branch_compaction_root_key` and keep legacy key writes only as an ignored-key regression.
- Files changed: `engine/packages/depot/CLAUDE.md` (via `AGENTS.md` symlink), `engine/packages/depot/src/burst_mode.rs`, `engine/packages/depot/src/conveyer/commit/apply.rs`, `engine/packages/depot/src/conveyer/commit/dirty.rs`, `engine/packages/depot/src/conveyer/commit/helpers.rs`, `engine/packages/depot/tests/burst_mode.rs`, `engine/packages/depot/tests/conveyer_commit.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `rg -n "branch_manifest_cold_drained_txid_key|set.*cold_drained|atomic.*cold_drained" /home/nathan/r2/engine/packages/depot --glob '!target/**'`, `cargo check -p depot`, and `cargo test -p depot` passed.
- **Learnings for future iterations:**
  - Depot burst-mode cold lag comes from workflow `CMP/root.cold_watermark_txid`; do not reintroduce the legacy cold-drained manifest key in production paths.
  - Tests that need cold-lag recovery should seed `branch_compaction_root_key`, not `branch_manifest_cold_drained_txid_key`.
---
## 2026-05-01 15:23:53 PDT - US-005
- Deleted the dead `delete_expired_pitr_interval_coverage` helper from depot PITR interval utilities.
- Removed the helper-only regression test/import; PITR expiry is covered by the live compaction reclaimer tests that clear `expired_pitr_interval_rows` with `udb::compare_and_clear`.
- Files changed: `engine/packages/depot/src/conveyer/pitr_interval.rs`, `engine/packages/depot/tests/conveyer_pitr_interval.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `rg -n "delete_expired_pitr_interval_coverage" /home/nathan/r2/engine/packages/depot/src /home/nathan/r2/engine/packages/depot/tests --glob '!target/**'` now returns no matches; `cargo test -p depot pitr_interval`, `cargo check -p depot`, and `cargo test -p depot` passed.
- **Learnings for future iterations:**
  - PITR interval expiry should flow through the compaction reclaimer, where stale rows are cleared with `udb::compare_and_clear`.
  - Keep `pitr_interval` helper tests focused on read/write/scan selection; deletion semantics belong in reclaimer workflow coverage.
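
The real `udb::compare_and_clear` signature is not shown in this log. The sketch below only models the semantics the name suggests, on an in-memory map: a row is cleared only if its current value still equals the value the reclaimer read, so a concurrent rewrite of the row is left alone. The function shape and key names are assumptions.

```rust
use std::collections::BTreeMap;

/// In-memory model only; this is not the `udb` API.
/// Returns true if the row was cleared.
fn compare_and_clear(
    rows: &mut BTreeMap<Vec<u8>, Vec<u8>>,
    key: &[u8],
    expected: &[u8],
) -> bool {
    if rows.get(key).map(|v| v.as_slice()) == Some(expected) {
        rows.remove(key);
        true
    } else {
        // Missing row or changed value: leave the store untouched.
        false
    }
}
```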
---
## 2026-05-01 15:29:34 PDT - US-006
- Removed the debug takeover scan of legacy database-scoped `META`/`DELTA`/`PIDX`/`SHARD` keys and left `reconcile_inner` as an explicit no-op until a branch-scoped invariant check exists.
- Kept the legacy key helpers for v1 pegboard actor compatibility and documented that contract in `conveyer/keys.rs` plus depot `AGENTS.md`.
- Updated `tests/takeover.rs` to prove malformed legacy database-scoped rows are ignored.
- Carried forward the pre-existing depot tier-matrix test changes already present in the worktree; the workflow matrix macro needed `async move $body` for the full depot suite to compile.
- Files changed: `engine/packages/depot/AGENTS.md` (via `CLAUDE.md` symlink), `engine/packages/depot/src/conveyer/keys.rs`, `engine/packages/depot/src/takeover.rs`, `engine/packages/depot/tests/takeover.rs`, `engine/packages/depot/tests/conveyer_branch.rs`, `engine/packages/depot/tests/conveyer_restore_point.rs`, `engine/packages/depot/tests/workflow_compaction_skeletons.rs`, `.agent/notes/depot-tier-test-matrix-results.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `rg -n "depot_keys::(meta_head_key|delta_prefix|pidx_delta_prefix|shard_prefix)|\\b(meta_head_key|delta_prefix|pidx_delta_prefix|shard_prefix)\\(" engine/packages --glob '!target/**'`, `cargo test -p depot --test takeover legacy_database_scoped_rows_are_ignored`, `cargo test -p depot`, and `cargo check -p depot -p pegboard` passed.
- **Learnings for future iterations:**
  - Legacy database-scoped depot key helpers are still load-bearing for v1 pegboard actor cleanup and migration fallback.
  - Current Depot invariant checks must target branch-scoped storage; scanning legacy database-scoped keys only checks compatibility debris.
  - Use `async move $body` in the workflow matrix macro so per-run locals inside the macro body are created inside each generated future.
---
## 2026-05-01 15:33:34 PDT - US-007
- Documented the SQLite VFS `io_sync` durability contract directly above the callback.
- Added the same reusable invariant to the repo-level `AGENTS.md` via the `CLAUDE.md` symlink.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `CLAUDE.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo check -p rivetkit-sqlite` passed with existing Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - SQLite VFS `xSync` returns after `ctx.flush_dirty_pages()` resolves, so byte durability depends on depot's `sqlite_commit` reply waiting for the FDB transaction commit.
  - If pegboard-envoy ever pre-acks before the FDB tx commit, the VFS `xSync` durability contract is broken.
---
## 2026-05-01 15:35:33 PDT - US-008
- Added `query_sql_with_embedded_nul_is_rejected` to pin that SQL strings with embedded NUL bytes are rejected before SQLite prepare.
- The new test inspects the `anyhow::Error` chain and confirms the `CString::new(sql)` NUL-byte error is surfaced.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/tests/query_text_nul.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p rivetkit-sqlite query_sql_with_embedded_nul_is_rejected`, `cargo check -p rivetkit-sqlite`, and `cargo test -p rivetkit-sqlite` passed with existing `vfs.rs` Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - Keep SQL-string NUL rejection separate from bound TEXT NUL support: `BindParam::Text("a\0b")` round-trips, but SQL text containing `\0` is rejected at `CString::new(sql)`.
  - For query helper error tests, inspect `anyhow::Error::chain()` so wrapper changes do not hide the underlying CString failure.
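
The rejection point can be reproduced with std alone. This sketch covers only the `CString::new(sql)` conversion step; the helper name `sql_to_cstring` is hypothetical, and statement preparation and the `anyhow` wrapping are out of scope.

```rust
use std::ffi::{CString, NulError};

/// Mirrors the conversion step only: an interior NUL fails before SQLite sees the SQL.
fn sql_to_cstring(sql: &str) -> Result<CString, NulError> {
    CString::new(sql)
}
```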
---
## 2026-05-01 15:42:01 PDT - US-009
- Re-verified S7 and closed it as `wontfix` instead of changing code.
- Grep showed `VfsContext::new` calls `fetch_initial_main_page`, and the fetched page flows into `VfsState::seed_main_page`; `cargo check -p rivetkit-sqlite` passed without dead-code warnings for either helper.
- Files changed: `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `rg -n "seed_main_page|fetch_initial_main_page" /home/nathan/r2/rivetkit-rust/packages/rivetkit-sqlite /home/nathan/r2/.agent/notes/sqlite-review-issues.md /home/nathan/r2/scripts/ralph --glob '!target/**'`; `cargo check -p rivetkit-sqlite`.
- **Learnings for future iterations:**
  - `fetch_initial_main_page` and `seed_main_page` are live on production opens and refresh `page_size` from page 1's SQLite header.
  - The test-only direct transport path seeds from `snapshot_pages`, so it bypasses the production initial-page fetch under `#[cfg(test)]`.
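
For reference, a sketch of reading `page_size` from page 1, based on the published SQLite file format: the header starts with the 16-byte magic string, and the page size is a big-endian `u16` at offset 16, where the value 1 means 65536. `page_size_from_header` is a hypothetical helper, not the body of `seed_main_page`.

```rust
/// Parse the page size from a SQLite database header (the first 100 bytes of page 1).
fn page_size_from_header(page1: &[u8]) -> Option<u32> {
    const MAGIC: &[u8; 16] = b"SQLite format 3\0";
    if page1.len() < 100 || !page1.starts_with(MAGIC) {
        return None;
    }
    // Big-endian u16 at byte offset 16.
    let raw = u16::from_be_bytes([page1[16], page1[17]]);
    match raw {
        1 => Some(65_536),
        n if n >= 512 && n.is_power_of_two() => Some(u32::from(n)),
        _ => None,
    }
}
```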
---
## 2026-05-01 15:43:41 PDT - US-010
- Removed the unnecessary post-fetch `state.write()` guard in `resolve_pages`.
- Cloned the internally synchronized `moka::Cache` handle under a short read and used that handle for fetched-page inserts.
- Cleaned up the shifted `match` and cache-fill loop indentation.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo check -p rivetkit-sqlite`; `cargo test -p rivetkit-sqlite`.
- **Learnings for future iterations:**
  - `moka::Cache` is internally synchronized; VFS code should not hold a broad state write guard only to insert into `page_cache`.
  - For cache-fill paths, clone the cache handle under a short read when the loop does not need other mutable VFS state.
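
The locking shape can be sketched with std types. `PageCache` here is a stand-in for `moka::Cache` (cheap to clone, synchronized internally), and `VfsState` and `fill_cache` are hypothetical names; the point is only that the state guard is held for the clone and not for the insert loop.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

/// Stand-in for `moka::Cache`: clones share one internally synchronized map.
#[derive(Clone, Default)]
struct PageCache {
    inner: Arc<Mutex<HashMap<u32, Vec<u8>>>>,
}

impl PageCache {
    fn insert(&self, pgno: u32, bytes: Vec<u8>) {
        self.inner.lock().unwrap().insert(pgno, bytes);
    }

    fn get(&self, pgno: u32) -> Option<Vec<u8>> {
        self.inner.lock().unwrap().get(&pgno).cloned()
    }
}

/// Hypothetical slice of VFS state.
#[derive(Default)]
struct VfsState {
    page_cache: PageCache,
}

fn fill_cache(state: &RwLock<VfsState>, fetched: Vec<(u32, Vec<u8>)>) {
    // The read guard is a temporary and is dropped at the end of this statement.
    let cache = state.read().unwrap().page_cache.clone();
    for (pgno, bytes) in fetched {
        // No VFS state guard is held while inserting.
        cache.insert(pgno, bytes);
    }
}
```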
---
## 2026-05-01 15:48:55 PDT - US-011
- Moved the VFS direct-storage and mock-protocol test fixtures from `src/vfs.rs` into `tests/inline/vfs_support.rs`.
- Re-exported those support items from the existing path-shimmed `tests/inline/vfs.rs` module and updated test-only transport references in `src/vfs.rs` to use `tests::...`.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs_support.rs`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo check -p rivetkit-sqlite`; `cargo test -p rivetkit-sqlite`.
- **Learnings for future iterations:**
  - A path-shimmed Rust test module can own a sibling support module and `pub(super) use` its fixtures back to the production module for `#[cfg(test)]` transport wiring.
  - Fixture fields that tests mutate directly, such as `MockProtocol::get_pages_response`, need crate-visible access after being moved into a child support module.
---
## 2026-05-01 15:54:29 PDT - US-012
- Restructured Depot compaction workflows so the concrete workflow entry modules are `src/workflows/db_manager.rs`, `db_hot_compacter.rs`, `db_cold_compacter.rs`, and `db_reclaimer.rs`.
- Moved shared compaction contracts, helper logic, companion-loop orchestration, and debug test hooks into `src/compaction/`.
- Preserved the existing public `depot::workflows::compaction::*` facade while deleting the old `src/workflows/compaction.rs` umbrella and `src/workflows/compaction/` directory.
- Files changed: `engine/packages/depot/src/workflows/mod.rs`, `engine/packages/depot/src/workflows/{db_manager,db_hot_compacter,db_cold_compacter,db_reclaimer}.rs`, `engine/packages/depot/src/compaction/{mod,types,shared,companion,test_hooks}.rs`, `engine/packages/depot/src/lib.rs`, `engine/packages/depot/CLAUDE.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot`; `cargo check -p depot`.
- **Learnings for future iterations:**
  - The actual compaction workflow names are `DbManagerWorkflow`, `DbHotCompacterWorkflow`, `DbColdCompacterWorkflow`, and `DbReclaimerWorkflow`; `companion` is shared loop logic.
  - Public callers should keep importing workflow contracts from `depot::workflows::compaction::*`; internal workflow files can use `crate::compaction::*` and `crate::compaction::shared::*`.
  - Moving modules out from an old parent may require widening formerly sibling-only `pub(super)` helpers to `pub(crate)`.
---
## 2026-05-01 15:58:02 PDT - US-013
- Added `direct_engine_crash_with_dirty_buffer_recovers_last_commit` to cover the VFS crash window where dirty pages are buffered locally but the commit never acks.
- The test seeds a durable row, injects an `empty_db_page()` into the VFS dirty buffer, forces the close-time commit to fail via `DirectTransportHooks::fail_next_commit`, and verifies reopen sees the last successful commit.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p rivetkit-sqlite direct_engine_crash_with_dirty_buffer_recovers_last_commit`, `cargo check -p rivetkit-sqlite`, and `cargo test -p rivetkit-sqlite` passed with existing Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - Use `DirectTransportHooks::fail_next_commit` to simulate a commit request that never becomes durable in the direct VFS test harness.
  - Reopening against the same `DirectStorage` is the clean way to verify crash-recovery state, because the new VFS context seeds from the committed storage snapshot instead of the dead handle's local dirty buffer.
---
## 2026-05-01 16:02:03 PDT - US-014
- Added `concurrent_reader_during_commit_atomic_observes_consistent_snapshot` to cover a reader issuing xRead while a writer is held mid-`commit_atomic_write`.
- Extended `DirectTransportHooks` with `pause_next_commit`, a deterministic commit gate used only by the direct VFS test transport.
- Updated the S9 issue note to remove concurrent-reader coverage from the remaining missing list; PITR-restore-then-write and fork-and-immediately-reopen remain open.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs_support.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p rivetkit-sqlite concurrent_reader_during_commit_atomic_observes_consistent_snapshot`, `cargo check -p rivetkit-sqlite`, and `cargo test -p rivetkit-sqlite` passed with existing Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - Use `DirectTransportHooks::pause_next_commit` when a VFS test needs to pause after the commit request enters direct transport but before the commit ack returns.
  - For raw xRead coverage, construct a `VfsFile` with the target `VfsContext` and call `io_read` directly from a scoped reader thread.
---
## 2026-05-01 16:09:20 PDT - US-015
- Added Depot integration coverage for PITR-restore-then-write via `write_after_pitr_restore_lands_on_restored_branch`.
- Fixed capped ancestor reads so `get_pages` scans historical DELTA rows up to the ancestor cap when latest PIDX points past the fork/restore point.
- Files changed: `engine/packages/depot/src/conveyer/read.rs`, `engine/packages/depot/src/conveyer/read/plan.rs`, `engine/packages/depot/tests/conveyer_restore_point.rs`, `engine/packages/depot/CLAUDE.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot write_after_pitr_restore_lands_on_restored_branch`, `cargo check -p depot`, and `cargo test -p depot`.
- **Learnings for future iterations:**
  - Latest PIDX is a current-owner map, so capped ancestor reads must scan capped DELTA history when parent commits after the cap have moved PIDX forward.
  - Depot returns zero-filled bytes for missing in-range pages; only pages above `db_size_pages` return `None`.
  - A fresh `Db` handle is useful after restore/write tests to prove durable branch state instead of a local cache result.
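
A toy model of the capped read, assuming DELTA history can be viewed as a map from txid to the pages written in that commit. The types and `read_capped` are hypothetical; the real read plan works on depot key ranges and falls through to SHARD/cold when no delta matches.

```rust
use std::collections::{BTreeMap, HashMap};

type Txid = u64;
type Pgno = u32;

/// Newest version of `pgno` written at or before `cap`, if any delta holds it.
/// A latest-owner index alone would return the post-cap writer instead.
fn read_capped<'a>(
    deltas: &'a BTreeMap<Txid, HashMap<Pgno, Vec<u8>>>,
    pgno: Pgno,
    cap: Txid,
) -> Option<&'a [u8]> {
    deltas
        .range(..=cap) // ignore commits after the fork/restore point
        .rev() // newest first
        .find_map(|(_, pages)| pages.get(&pgno).map(|b| b.as_slice()))
}
```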
---
## 2026-05-01 16:12:16 PDT - US-016
- Added Depot integration coverage for fork-and-immediately-reopen via `fork_database_immediate_reopen_isolated_from_parent_later_writes`.
- The test forks at a known source commit, immediately opens a fresh fork `Db`, verifies pre-fork pages plus sparse-page/EOF behavior, advances the parent, and verifies another fresh fork `Db` still excludes the parent's later writes.
- Files changed: `engine/packages/depot/tests/fork_database.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot fork_database_immediate_reopen_isolated_from_parent_later_writes`, `cargo check -p depot`, and `cargo test -p depot`.
- **Learnings for future iterations:**
  - Fork isolation tests should reopen with a fresh `Db` handle after parent writes to prove branch metadata and storage reads are correct without relying on a warmed cache.
  - Depot returns zero-filled bytes for missing pages within `db_size_pages`, while pages above the fork head's `db_size_pages` return `None`.
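
The sparse-page and EOF rule above, as a small in-memory sketch. `get_page`, the `stored` map, and the fixed `PAGE_SIZE` are assumptions for illustration; page numbers are treated as 1-based, and corrupted-blob handling is omitted.

```rust
use std::collections::HashMap;

const PAGE_SIZE: usize = 4096; // assumed for the sketch

/// `None` past EOF, stored bytes if present, otherwise a zero-filled page.
fn get_page(
    stored: &HashMap<u32, Vec<u8>>,
    db_size_pages: u32,
    pgno: u32,
) -> Option<Vec<u8>> {
    if pgno > db_size_pages {
        return None;
    }
    Some(
        stored
            .get(&pgno)
            .cloned()
            .unwrap_or_else(|| vec![0u8; PAGE_SIZE]),
    )
}
```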
---
## 2026-05-01 16:15:30 PDT - US-017
- Added `direct_engine_open_engine_is_concurrency_safe` to reproduce concurrent `DirectEngineHarness::open_engine()` calls racing on the same RocksDB tempdir.
- Replaced the harness's non-atomic `std::sync::OnceLock` get-then-set storage initialization and 50 x 10ms retry loop with `tokio::sync::OnceCell::get_or_init`.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: pre-fix `cargo test -p rivetkit-sqlite direct_engine_open_engine_is_concurrency_safe -- --nocapture` failed with RocksDB `LOCK` errors; after the fix, that test passed, `cargo test -p rivetkit-sqlite` passed, and `cargo check -p rivetkit-sqlite` passed with existing Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - The flake was not caused by shared tempdirs across tests; it came from concurrent lazy initialization inside one harness instance.
  - RocksDB allows only one open handle per path, so test helpers that cache a RocksDB-backed engine must make initialization atomic instead of retrying failed opens.
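
The atomic-initialization idea can be shown with std's `OnceLock::get_or_init`, used here as a synchronous stand-in for `tokio::sync::OnceCell::get_or_init` (which takes an async initializer). `Harness`, `opens`, and the string "handle" are hypothetical; the counter shows that racing callers trigger exactly one open.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, OnceLock};
use std::thread;

/// Hypothetical harness: `opens` counts how many times the "storage" was opened.
#[derive(Default)]
struct Harness {
    storage: OnceLock<String>,
    opens: AtomicUsize,
}

impl Harness {
    fn open_engine(&self) -> &str {
        // Atomic init: concurrent callers wait for the single winner
        // instead of each attempting their own open.
        self.storage.get_or_init(|| {
            self.opens.fetch_add(1, Ordering::SeqCst);
            "storage-handle".to_string()
        })
    }
}

/// Race `threads` callers on one harness and report how many opens happened.
fn open_concurrently(threads: usize) -> usize {
    let harness = Arc::new(Harness::default());
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let h = Arc::clone(&harness);
            thread::spawn(move || h.open_engine().len())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    harness.opens.load(Ordering::SeqCst)
}
```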
---
## 2026-05-01 16:18:51 PDT - US-018
- Added `vfs_registration_is_removed_after_registration_panic` to prove a uniquely named SQLite VFS is gone from `sqlite3_vfs_find(...)` after a panic unwind.
- Split SQLite VFS registration into `SqliteVfsRegistration`, a Drop guard that owns `sqlite3_vfs_register`/`sqlite3_vfs_unregister`; `SqliteVfs` now keeps the VFS name and boxed context alive while that guard unregisters.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `CLAUDE.md` via `AGENTS.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p rivetkit-sqlite vfs_registration_is_removed_after_registration_panic`, `cargo check -p rivetkit-sqlite`, and `cargo test -p rivetkit-sqlite` passed with existing Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - Keep SQLite VFS process-global registration ownership in a dedicated guard; code between `sqlite3_vfs_register` and final `SqliteVfs` construction can panic.
  - Store the VFS context in a `Box<VfsContext>` and pass its stable heap pointer to SQLite instead of manually pairing `Box::into_raw` with `Drop for SqliteVfs`.
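
A std-only sketch of the guard pattern, with a shared `HashSet` standing in for SQLite's process-global VFS list. `Registry`, `Registration`, and `registered_after_panic` are hypothetical names; the real guard wraps `sqlite3_vfs_register` and `sqlite3_vfs_unregister`.

```rust
use std::collections::HashSet;
use std::panic::{self, AssertUnwindSafe};
use std::sync::{Arc, Mutex};

/// Stand-in for SQLite's process-global VFS list.
type Registry = Arc<Mutex<HashSet<String>>>;

/// Registers on construction, unregisters on drop (including during unwind).
struct Registration {
    registry: Registry,
    name: String,
}

impl Registration {
    fn register(registry: &Registry, name: &str) -> Self {
        registry.lock().unwrap().insert(name.to_string());
        Self { registry: Arc::clone(registry), name: name.to_string() }
    }
}

impl Drop for Registration {
    fn drop(&mut self) {
        // Tolerate a poisoned lock so cleanup still runs while unwinding.
        let mut names = self.registry.lock().unwrap_or_else(|e| e.into_inner());
        names.remove(&self.name);
    }
}

/// Returns true if `name` is still registered after a panic that follows registration.
fn registered_after_panic(registry: &Registry, name: &str) -> bool {
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let _guard = Registration::register(registry, name);
        panic!("setup failed after registration");
    }));
    assert!(result.is_err());
    registry.lock().unwrap().contains(name)
}
```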
---
## 2026-05-01 16:25:05 PDT - US-019
- Added `native_database_drop_times_out_pending_commit` to prove `NativeDatabase::Drop` returns when the commit future never resolves.
- Added a pending-commit mode to `MockProtocol`, then bounded drop-time dirty-page flushes with `tokio::time::timeout`; on timeout, Drop logs `tracing::error!`, nulls the DB pointer, and returns without calling `sqlite3_close_v2`.
- Files changed: `rivetkit-rust/packages/rivetkit-sqlite/src/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs.rs`, `rivetkit-rust/packages/rivetkit-sqlite/tests/inline/vfs_support.rs`, `CLAUDE.md` via `AGENTS.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p rivetkit-sqlite native_database_drop_times_out_pending_commit` failed before the close-path adjustment by timing out at 2s; after the fix, it passed. `cargo check -p rivetkit-sqlite` and `cargo test -p rivetkit-sqlite` passed with existing Rust 2024 unsafe-operation warnings.
- **Learnings for future iterations:**
  - Build `tokio::time::timeout` inside the `Handle::block_on(async { ... })` context; constructing it outside the runtime panics.
  - Use a multi-thread Tokio runtime for tests that drop a `NativeDatabase` from a separate `std::thread`; otherwise, current-thread timers may not advance there.
  - If a drop-time commit times out, `sqlite3_close_v2` can re-enter VFS close/flush work, so return immediately after logging.
---
## 2026-05-01 16:29:09 PDT - US-020
- Replaced the shard-cache fill queue's shared `Arc<Mutex<mpsc::Receiver<ShardCacheFillJob>>>` with `async-channel`.
- Each cache-fill worker now owns a cloned receiver and awaits jobs directly, removing the serialized receiver lock from the dispatch path.
- Files changed: `Cargo.lock`, `engine/packages/depot/Cargo.toml`, `engine/packages/depot/src/conveyer/read/cache_fill.rs`, `engine/packages/depot/AGENTS.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot`; `cargo check -p depot`.
- **Learnings for future iterations:**
  - Use `async-channel::bounded` when Depot needs bounded multi-consumer worker dispatch.
  - Keep shard-cache fill backpressure and duplicate coalescing in `ShardCacheFillQueue::enqueue`; only the worker receive side needs multi-consumer cloning.
---
## 2026-05-01 16:33:36 PDT - US-021
- Replaced debug-only `takeover::reconcile_blocking` with `takeover::reconcile_nonblocking`, scheduling reconcile on the current Tokio runtime or a detached background thread instead of joining from `Db::new_inner`.
- Added `db_new_does_not_wait_for_takeover_reconcile`, which pauses the debug reconcile with a `Notify` gate, verifies another runtime task makes progress, and asserts the constructor returns before reconcile is released.
- Files changed: `engine/packages/depot/src/takeover.rs`, `engine/packages/depot/src/conveyer/db.rs`, `engine/packages/depot/tests/takeover.rs`, `engine/packages/depot/AGENTS.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot db_new_does_not_wait_for_takeover_reconcile`, `cargo check -p depot`, `cargo test -p depot --test list_databases parent_tombstone_visibility_is_capped_across_deep_bucket_chain -- --nocapture`, and a clean rerun of `cargo test -p depot`.
- **Learnings for future iterations:**
  - Debug-only constructor checks still run on hot-path constructors, so schedule long or I/O-backed reconcile work in the background.
  - Use `oneshot` for test completion signals that may fire before the waiter arms; `Notify::notify_waiters()` does not store a permit.
  - A `Notify` gate is a deterministic slow-operation fixture when a test needs to hold async work pending without adding a real-clock sleep.
---
## 2026-05-01 16:39:06 PDT - US-022
- Replaced the active DB write fixture's `setTimeout(ACTIVE_DB_WRITE_DELAY_MS)` loop pacing with a WebSocket-driven Promise gate.
- Unskipped and rewrote the active-db-writes sleep test to issue exactly 3 `continue-write` permits, wait for actor `write` acks, trigger sleep, then assert exactly 3 persisted write rows after wake.
- Files changed: `rivetkit-typescript/packages/rivetkit/fixtures/driver-test-suite/sleep-db.ts`, `rivetkit-typescript/packages/rivetkit/tests/driver/actor-sleep-db.test.ts`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `pnpm vitest run tests/driver/actor-sleep-db.test.ts -t active-db-writes`; 10x loop of the same filter; `pnpm run check-types`; `pnpm biome check fixtures/driver-test-suite/sleep-db.ts tests/driver/actor-sleep-db.test.ts`; `pnpm run check:test-skips`; `pnpm run check:wait-for-comments`. `pnpm run check` is not defined in the package, and full `pnpm run lint` is blocked by pre-existing unrelated Biome diagnostics.
- **Learnings for future iterations:**
  - WebSocket driver tests can make in-flight handler progress deterministic by having the actor await test-issued permit messages and ack each completed unit.
  - Keep exact-count shutdown tests free of real-clock pacing; trigger sleep only after the expected actor ack has arrived.
---
## 2026-05-01 16:42:45 PDT - US-023
- Added `ActorConnRaw.ready`, a per-connection Promise that resolves when the centralized connection status transition first reaches `connected`.
- Added `ready resolves when connection first opens` in `tests/driver/actor-conn.test.ts` without `vi.waitFor`; the test asserts the Promise is initially pending, resolves when `isConnected` becomes true, and remains the same Promise on later reads.
- Files changed: `rivetkit-typescript/packages/rivetkit/src/client/actor-conn.ts`, `rivetkit-typescript/packages/rivetkit/tests/driver/actor-conn.test.ts`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `pnpm vitest run tests/driver/actor-conn.test.ts -t "ready resolves when connection first opens"`; `pnpm run check-types`; `pnpm biome check src/client/actor-conn.ts tests/driver/actor-conn.test.ts`; `pnpm run check:wait-for-comments`. `pnpm run check` is not defined for `rivetkit-typescript/packages/rivetkit`.
- **Learnings for future iterations:**
  - Await `connection.ready` when tests need the client WebSocket init round trip to complete.
  - Resolve one-shot connection readiness from `#setConnStatus("connected")` so it stays aligned with `isConnected`.
---
## 2026-05-01 16:47:42 PDT - US-024
- Replaced startup-race `vi.waitFor` wrappers with `await connection.ready` plus a single direct DB action/read in the four requested driver test files.
- Removed the obsolete startup/stopping polling comments and unused ready-timeout constants/imports.
- Files changed: `rivetkit-typescript/packages/rivetkit/tests/driver/actor-db-stress.test.ts`, `rivetkit-typescript/packages/rivetkit/tests/driver/actor-db.test.ts`, `rivetkit-typescript/packages/rivetkit/tests/driver/actor-db-pragma-migration.test.ts`, `rivetkit-typescript/packages/rivetkit/tests/driver/actor-db-raw.test.ts`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `pnpm run check-types`; `pnpm run check:wait-for-comments`; `pnpm vitest run tests/driver/actor-db-stress.test.ts tests/driver/actor-db.test.ts tests/driver/actor-db-pragma-migration.test.ts tests/driver/actor-db-raw.test.ts` passed. `pnpm run check` is not defined in `rivetkit-typescript/packages/rivetkit`; focused `pnpm biome check` is still blocked by pre-existing formatter/non-null diagnostics in touched files.
- **Learnings for future iterations:**
  - For first-action startup races, await the actor connection's `ready` promise before issuing the action instead of retrying the action itself.
  - Reacquire keyed actor handles before verification when that is the existing test pattern, but still gate the single read on `ready`.
---
## 2026-05-01 16:54:53 PDT - US-025
- Replaced the target-actor `getCount()` polling loop in `actor-db.test.ts` with polling against a separate `lifecycleObserver` actor.
- Added opt-in lifecycle event recording to `dbActorRaw` so the sleep test can wait for `sleep` events without sending actions to the target actor; the test asserts the recorded sleep timestamp stays within the original sleep observer window.
- Replaced the hard-crash target-action retry loop with a single wake/read after `hardCrashActor(...)`.
- Files changed: `rivetkit-typescript/packages/rivetkit/fixtures/driver-test-suite/actor-db-raw.ts`, `rivetkit-typescript/packages/rivetkit/tests/driver/actor-db.test.ts`, `rivetkit-typescript/CLAUDE.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `pnpm vitest run tests/driver/actor-db.test.ts -t "persists across sleep"`; `pnpm vitest run tests/driver/actor-db.test.ts -t "hard crash"`; `pnpm run check-types`; `pnpm run check:wait-for-comments`; `pnpm biome check fixtures/driver-test-suite/actor-db-raw.ts tests/driver/actor-db.test.ts`. `pnpm run check` is not defined in `rivetkit-typescript/packages/rivetkit`.
- **Learnings for future iterations:**
  - Polling a separate observer actor does not touch the target actor's HTTP request counter, so it can observe sleep without resetting the target sleep timer.
  - Holding a normal target actor connection open can prevent this DB actor from sleeping; do not use a target connection as the sleep observer for this fixture.
---
## 2026-05-01 16:58:38 PDT - US-026
- Fixed the shard-cache fill `wait_idle_for_test` lost-wakeup race by pre-arming and enabling `idle_notify.notified()` before loading `outstanding`.
- Added `shard_cache_fill_wait_idle_prearms_before_rechecking_outstanding`, which forces work to drain and call `notify_waiters()` after the waiter sees nonzero outstanding but before it awaits.
- Files changed: `engine/packages/depot/src/conveyer/read/cache_fill.rs`, `engine/packages/depot/src/conveyer/db.rs`, `engine/packages/depot/tests/conveyer_read.rs`, `engine/packages/depot/CLAUDE.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot shard_cache_fill_wait_idle_prearms_before_rechecking_outstanding`; `cargo test -p depot`.
- **Learnings for future iterations:**
  - `Notify::notify_waiters()` does not store permits, so counter waiters must create and enable `notified()` before the counter load and then re-check.
  - A one-shot debug hook after a nonzero counter load can deterministically reproduce arm-after-check races without timing sleeps.
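
The tokio fix needs the `tokio` crate, so the sketch below shows the same invariant with the standard library only. `Condvar::notify_all`, like `Notify::notify_waiters()`, does not store a wakeup, so the counter check and the wait must not be separable. Here the mutex plays the role of pre-arming `notified()`. `Outstanding`, `finish_one`, `wait_idle`, and `drain_in_background` are hypothetical names, not Depot's types.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical stand-in for an outstanding-work tracker (not Depot's type).
pub struct Outstanding {
    count: Mutex<usize>,
    idle: Condvar,
}

impl Outstanding {
    pub fn new(count: usize) -> Self {
        Self { count: Mutex::new(count), idle: Condvar::new() }
    }

    /// Finish one unit of work; wake all waiters once the counter hits zero.
    pub fn finish_one(&self) {
        let mut count = self.count.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.idle.notify_all();
        }
    }

    /// Block until the counter is zero. The check and the wait happen under
    /// the same lock, so a notification cannot slip in between them.
    pub fn wait_idle(&self) {
        let mut count = self.count.lock().unwrap();
        while *count != 0 {
            count = self.idle.wait(count).unwrap();
        }
    }

    pub fn current(&self) -> usize {
        *self.count.lock().unwrap()
    }
}

/// Spawn `units` workers, wait for idle, and return the final counter value.
pub fn drain_in_background(units: usize) -> usize {
    let tracker = Arc::new(Outstanding::new(units));
    let workers: Vec<_> = (0..units)
        .map(|_| {
            let tracker = Arc::clone(&tracker);
            thread::spawn(move || tracker.finish_one())
        })
        .collect();
    tracker.wait_idle();
    for worker in workers {
        worker.join().unwrap();
    }
    tracker.current()
}
```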
---
## 2026-05-01 17:05:30 PDT - US-027
- Collapsed Depot's branch read cache from separate `branch_id`, `ancestors`, `last_access_bucket`, and PIDX cache fields into one `CacheSnapshot`.
- Updated `get_pages` and `commit` to read one snapshot for transaction hints and publish one replacement snapshot under a single write lock when branch state changes.
- Added `branch_cache_snapshot_is_atomic_across_dbptr_move`, which warms the old branch cache, rolls DBPTR to a new branch, starts concurrent reads with a `Barrier`, and observes that the cache never exposes the new branch id with the old ancestry root.
- Files changed: `engine/packages/depot/src/conveyer/db.rs`, `engine/packages/depot/src/conveyer/read.rs`, `engine/packages/depot/src/conveyer/commit/apply.rs`, `engine/packages/depot/tests/conveyer_read.rs`, `engine/packages/depot/CLAUDE.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot branch_cache_snapshot_is_atomic_across_dbptr_move -- --nocapture`; `cargo check -p depot`; `cargo test -p depot`.
- **Learnings for future iterations:**
  - Branch id, ancestry, access bucket, and PIDX cache are one logical read-cache snapshot; update them together so DBPTR moves cannot publish mixed cache state.
  - Use `tokio::sync::Barrier` when a test needs multiple spawned tasks to start together. `Notify::notify_waiters()` is not a safe start gun unless waiters are already armed.
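
A minimal std-only sketch of the single-snapshot idea, assuming the cache is published as one `Arc` behind one lock. The field types, the `BranchCache` wrapper, and the consistency rule in `is_consistent` are illustrative assumptions, and `std::sync::Barrier` stands in for `tokio::sync::Barrier` as the start gun.

```rust
use std::sync::{Arc, Barrier, RwLock};
use std::thread;

/// Field names follow the log entry; the types are assumptions.
#[derive(Debug)]
pub struct CacheSnapshot {
    pub branch_id: u64,
    pub ancestors: Vec<u64>,
    pub last_access_bucket: u64,
}

impl CacheSnapshot {
    /// Hypothetical rule for this sketch only: branch N has ancestry root N * 10.
    pub fn is_consistent(&self) -> bool {
        self.ancestors.first() == Some(&(self.branch_id * 10))
    }
}

/// Hypothetical wrapper: readers clone one Arc, writers swap one Arc.
pub struct BranchCache {
    current: RwLock<Arc<CacheSnapshot>>,
}

impl BranchCache {
    pub fn new(initial: CacheSnapshot) -> Self {
        Self { current: RwLock::new(Arc::new(initial)) }
    }

    pub fn load(&self) -> Arc<CacheSnapshot> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace every field at once under a single write lock.
    pub fn publish(&self, next: CacheSnapshot) {
        *self.current.write().unwrap() = Arc::new(next);
    }
}

/// Start `readers` threads together, move to a new branch while they read,
/// and return how many mixed (inconsistent) snapshots they observed.
pub fn count_mixed_snapshots(readers: usize, reads_per_reader: usize) -> usize {
    let cache = Arc::new(BranchCache::new(CacheSnapshot {
        branch_id: 1,
        ancestors: vec![10],
        last_access_bucket: 0,
    }));
    let start = Arc::new(Barrier::new(readers + 1));
    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let cache = Arc::clone(&cache);
            let start = Arc::clone(&start);
            thread::spawn(move || {
                start.wait();
                (0..reads_per_reader)
                    .filter(|_| !cache.load().is_consistent())
                    .count()
            })
        })
        .collect();
    start.wait();
    cache.publish(CacheSnapshot {
        branch_id: 2,
        ancestors: vec![20],
        last_access_bucket: 1,
    });
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```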
---
## 2026-05-01 17:12:05 PDT - US-028
- Added `cold_ref_retired_during_cold_object_fetch_errors_instead_of_zero_fill`, which pauses a filesystem cold-tier object read, concurrently retires the compaction cold-shard ref and deletes the cold object, then asserts `get_pages` returns `ShardCoverageMissing` instead of a zero-filled page.
- Updated cold read-through to distinguish missing cold objects from sparse pages; compaction cold-shard object misses now error, while successful object fetches re-read the live `CMP/cold_shard` ref under `Serializable` before returning bytes or scheduling shard-cache fill.
- Files changed: `engine/packages/depot/src/conveyer/read/cold.rs`, `engine/packages/depot/tests/conveyer_read.rs`, `engine/packages/depot/AGENTS.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot cold_ref_retired_during_cold_object_fetch_errors_instead_of_zero_fill -- --nocapture`; `cargo check -p depot`; `cargo test -p depot`.
- **Learnings for future iterations:**
  - A compaction cold-shard candidate is a real source edge: if the object disappears, return `ShardCoverageMissing` instead of falling through to sparse-page zero-fill.
  - `PageMissing` remains distinct from `ObjectMissing` so sparse pages inside an otherwise valid cold object can still use normal zero-fill semantics.
  - Cold read-through tests can deterministically pause `ColdTier::get_object` with a small wrapper and mutate UDB/cold storage before releasing the read.
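
The distinction between the two miss kinds can be sketched as a small mapping. `ColdLookup`, `ReadError`, and `resolve_cold_page` are hypothetical names for illustration; only `ShardCoverageMissing`, `PageMissing`, and `ObjectMissing` come from the entry above, and the real read path also re-checks the live ref, which this sketch omits.

```rust
/// Hypothetical outcome of looking up one page in a cold-shard object.
#[derive(Debug, PartialEq)]
pub enum ColdLookup {
    /// The object exists and contains the page.
    Page(Vec<u8>),
    /// The object exists but has no bytes for this page (sparse page).
    PageMissing,
    /// The referenced object itself is gone.
    ObjectMissing,
}

#[derive(Debug, PartialEq)]
pub enum ReadError {
    ShardCoverageMissing,
}

/// Map a cold lookup to page bytes. Only a sparse page inside a valid object
/// is zero-filled; a missing object is surfaced as an error.
pub fn resolve_cold_page(lookup: ColdLookup, page_size: usize) -> Result<Vec<u8>, ReadError> {
    match lookup {
        ColdLookup::Page(bytes) => Ok(bytes),
        ColdLookup::PageMissing => Ok(vec![0u8; page_size]),
        ColdLookup::ObjectMissing => Err(ReadError::ShardCoverageMissing),
    }
}
```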
---
## 2026-05-01 17:15:47 PDT - US-029
- Added Depot read-path regression coverage for SQLite sparse-page semantics.
- `get_pages_zero_fills_sparse_page_without_any_source` pins that an in-range page with no source returns a zero page.
- `get_pages_errors_for_corrupted_delta_source` pins that a corrupted delta source returns a decode error instead of zero-filling.
- Files changed: `engine/packages/depot/tests/conveyer_read.rs`, `engine/packages/depot/AGENTS.md` (via `CLAUDE.md` symlink), `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot get_pages_ --test conveyer_read`; `cargo test -p depot`.
- **Learnings for future iterations:**
  - Depot `get_pages` returns zero-filled bytes for legitimate sparse in-range pages only when no source exists.
  - Corrupted source blobs should bubble a read/decode error instead of being treated as sparse pages.
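
A rough sketch of the rule the two tests pin, assuming a source blob is either absent or present. `read_page` and `PageReadError` are hypothetical, and the length check is only a stand-in for Depot's real delta decoding.

```rust
#[derive(Debug, PartialEq)]
pub enum PageReadError {
    /// Stand-in for a real decode failure.
    Decode { expected: usize, actual: usize },
}

/// Return page bytes for an in-range page.
/// - No source at all: the page is legitimately sparse, so zero-fill.
/// - A source that decodes: return its bytes.
/// - A source that fails to decode: bubble the error, never zero-fill.
pub fn read_page(source: Option<&[u8]>, page_size: usize) -> Result<Vec<u8>, PageReadError> {
    match source {
        None => Ok(vec![0u8; page_size]),
        Some(blob) if blob.len() == page_size => Ok(blob.to_vec()),
        Some(blob) => Err(PageReadError::Decode {
            expected: page_size,
            actual: blob.len(),
        }),
    }
}
```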
---
## 2026-05-01 17:21:10 PDT - US-030
- Added `create_restore_point_revalidates_target_commit_after_resolve_race`, which pauses restore-point creation after tx A resolves a timestamp target, clears the resolved `COMMITS`, `VTX`, and `PITR_INTERVAL` rows to model reclaim winning the gap, then resumes tx B and asserts `RestoreTargetExpired`.
- Revalidated the resolved target commit row under `Serializable` inside `create_restore_point_for_resolved` before writing any Ready restore-point record, DB pin, pin count, or branch pin.
- Added a debug-only restore-point pause hook and recorded the invariant in depot `CLAUDE.md` plus `.agent/notes/sqlite-review-issues.md`.
- Files changed: `engine/packages/depot/src/conveyer/restore_point.rs`, `engine/packages/depot/src/conveyer/restore_point/pinned.rs`, `engine/packages/depot/src/conveyer/restore_point/test_hooks.rs`, `engine/packages/depot/tests/conveyer_restore_point.rs`, `engine/packages/depot/CLAUDE.md`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot create_restore_point_revalidates_target_commit_after_resolve_race -- --nocapture`, `cargo check -p depot`, and `cargo test -p depot`.
- **Learnings for future iterations:**
  - A Ready restore-point record is only valid if the target `COMMITS/{txid}` row still exists in the same transaction that writes the pin.
  - Debug-only pause hooks are useful for Depot race regressions when the production race is between two UDB transactions.
---
## 2026-05-01 17:33:35 PDT - US-031
- Folded `restore_database` rollback and undo restore-point creation into a single UDB transaction.
- Split reusable transaction-level helpers out of branch rollback and restore-point pin creation so the public restore flow can compose them atomically.
- Added `restore_database_rollback_and_undo_pin_are_atomic`, which injects a failure after rollback work but before undo pinning and asserts DBPTR stays unchanged with no undo restore point.
- Made restore-point debug hooks one-shot and gave hooked tests unique database ids so parallel restore-point tests cannot steal hooks or park after a release.
- Files changed: `engine/packages/depot/AGENTS.md`, `engine/packages/depot/src/conveyer/branch.rs`, `engine/packages/depot/src/conveyer/branch/lifecycle.rs`, `engine/packages/depot/src/conveyer/restore_point/pinned.rs`, `engine/packages/depot/src/conveyer/restore_point/restore.rs`, `engine/packages/depot/src/conveyer/restore_point/test_hooks.rs`, `engine/packages/depot/tests/conveyer_restore_point.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot restore_database_rollback_and_undo_pin_are_atomic -- --nocapture`; `cargo test -p depot --test conveyer_restore_point -- --nocapture`; `cargo check -p depot`; `cargo test -p depot`.
- **Learnings for future iterations:**
  - Restore rollback and undo restore-point pinning are one durability boundary; keep the DBPTR swap and undo pin write in the same UDB transaction.
  - Debug-only restore-point hooks should be one-shot and scoped to unique test database ids because restore-point tests run in parallel.
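
The single durability boundary can be illustrated with an in-memory model in which a transaction stages changes on a copy and commits only on success. `State`, `transact`, `InjectedFailure`, and the `fail_before_undo_pin` flag are hypothetical and do not reflect the UDB API.

```rust
/// Hypothetical in-memory model of the state touched by a restore.
#[derive(Clone, Debug, PartialEq)]
pub struct State {
    pub dbptr: u64,
    pub undo_pin: Option<u64>,
}

#[derive(Debug, PartialEq)]
pub struct InjectedFailure;

/// Stand-in for one transaction: run `body` on a staged copy and publish the
/// copy only if `body` succeeds.
pub fn transact<F>(state: &mut State, body: F) -> Result<(), InjectedFailure>
where
    F: FnOnce(&mut State) -> Result<(), InjectedFailure>,
{
    let mut staged = state.clone();
    body(&mut staged)?;
    *state = staged;
    Ok(())
}

/// Roll the database pointer to `target_branch` and pin the previous branch
/// for undo, both inside the same transaction.
pub fn restore_database(
    state: &mut State,
    target_branch: u64,
    fail_before_undo_pin: bool,
) -> Result<(), InjectedFailure> {
    transact(state, |tx| {
        let previous = tx.dbptr;
        tx.dbptr = target_branch; // rollback work
        if fail_before_undo_pin {
            return Err(InjectedFailure); // test-only failure injection
        }
        tx.undo_pin = Some(previous); // undo restore point
        Ok(())
    })
}
```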
---
## 2026-05-01 17:37:47 PDT - US-032
- Added `reclaimer_eviction_preserves_future_pin_reads_via_cold_ref` to cover shard-cache eviction when a restore-point pin is newer than the evicted SHARD.
- The test creates a real DB, pins txid 2, seeds a matching cold-backed SHARD at txid 1, forces reclaim to evict the FDB SHARD row, clears hot PIDX for page 1, and asserts the read returns the nonzero page through `CMP/cold_shard`.
- Files changed: `engine/packages/depot/tests/workflow_compaction_skeletons.rs`, `engine/packages/depot/AGENTS.md` (via `CLAUDE.md` symlink), `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot reclaimer_eviction_preserves_future_pin_reads_via_cold_ref -- --nocapture`, `cargo check -p depot`, and `cargo test -p depot` passed.
- **Learnings for future iterations:**
  - A DB pin at txid 2 does not exact-match-retain SHARD txid 1; safety comes from the reclaimer requiring matching `CMP/cold_shard` coverage before FDB SHARD eviction.
  - To prove the post-eviction read uses cold coverage, clear the hot PIDX for the requested page and assert a nonzero page value that only the cold-backed SHARD can supply.
---
## 2026-05-01 17:43:55 PDT - US-033
- Added `BumpSubSubject::WorkflowCreated { tag }` and safe tag-to-subject encoding in Gasoline.
- Published workflow-created bumps after top-level workflow and sub-workflow creation commits; unique dispatches that return an existing workflow do not emit created bumps.
- Added `workflow_created_bump_fires_for_workflow_tag`, which subscribes before dispatching a tagged workflow and asserts the bump arrives without polling.
- Fixed the existing `join_signal!` doctest snippet so the required full Gasoline package test passes.
- Files changed: `engine/packages/gasoline/src/db/mod.rs`, `engine/packages/gasoline/src/db/kv/mod.rs`, `engine/packages/gasoline/src/db/kv/subjects.rs`, `engine/packages/gasoline/src/signal.rs`, `engine/packages/gasoline/tests/db_bump.rs`, `.agent/notes/sqlite-review-issues.md`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p gasoline workflow_created_bump_fires_for_workflow_tag`; `cargo test -p gasoline`; `cargo check -p gasoline`.
- **Learnings for future iterations:**
  - Use `BumpSubSubject::WorkflowCreated { tag }` to wait for tagged workflow rows instead of polling `find_workflow`.
  - Workflow-created bumps are emitted after the creation transaction commits, so waiters should still re-read the row after receiving the notification.
---
## 2026-05-01 17:48:05 PDT - US-034
- Replaced the workflow-row wait in `workflow_compaction_skeletons.rs` with a `BumpSubSubject::WorkflowCreated { tag }` subscription.
- `wait_for_workflow` now subscribes before its first `find_workflow` check and waits on workflow-created bumps instead of the 25ms `wait_until` polling loop.
- Left `wait_until` in place for signal debug rows and UDB observation rows that still lack notification hooks.
- Files changed: `engine/packages/depot/tests/workflow_compaction_skeletons.rs`, `engine/packages/depot/AGENTS.md` (via `CLAUDE.md` symlink), `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo test -p depot --test workflow_compaction_skeletons manager_spawns_companions_and_records_deltas_available -- --nocapture`; `cargo test -p depot`; `cargo check -p depot`.
- **Learnings for future iterations:**
  - Workflow row waits should subscribe to `BumpSubSubject::WorkflowCreated { tag }` before the first lookup so creation between checks cannot be missed.
  - The generic `wait_until` helper is still appropriate for Depot test observations without a Gasoline or UDB notification source.
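
The subscribe-then-lookup ordering can be sketched with a std-only model. `Registry` and its methods are hypothetical stand-ins for Gasoline's workflow rows and bump subjects, and the `mpsc` channel (which buffers sends) is an assumption of the sketch rather than a statement about how Gasoline delivers bumps.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical registry of workflow rows plus per-tag created notifications.
#[derive(Default)]
pub struct Registry {
    rows: Mutex<HashMap<String, u64>>,
    subscribers: Mutex<Vec<(String, Sender<()>)>>,
}

impl Registry {
    pub fn subscribe(&self, tag: &str) -> Receiver<()> {
        let (tx, rx) = channel();
        self.subscribers.lock().unwrap().push((tag.to_string(), tx));
        rx
    }

    pub fn find_workflow(&self, tag: &str) -> Option<u64> {
        self.rows.lock().unwrap().get(tag).copied()
    }

    /// Insert the row first ("commit"), then notify subscribers for the tag.
    pub fn create_workflow(&self, tag: &str, id: u64) {
        self.rows.lock().unwrap().insert(tag.to_string(), id);
        for (subscribed_tag, tx) in self.subscribers.lock().unwrap().iter() {
            if subscribed_tag == tag {
                let _ = tx.send(());
            }
        }
    }
}

/// Subscribe before the first lookup, then re-read the row after every bump.
pub fn wait_for_workflow(registry: &Registry, tag: &str) -> u64 {
    let bumps = registry.subscribe(tag);
    loop {
        if let Some(id) = registry.find_workflow(tag) {
            return id;
        }
        bumps.recv().expect("registry dropped while waiting");
    }
}

/// Create a workflow from another thread while the caller waits for it.
pub fn create_then_wait_demo() -> u64 {
    let registry = Arc::new(Registry::default());
    let creator = {
        let registry = Arc::clone(&registry);
        thread::spawn(move || registry.create_workflow("compaction", 7))
    };
    let id = wait_for_workflow(&registry, "compaction");
    creator.join().unwrap();
    id
}
```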
---
## 2026-05-01 18:03:33 PDT - US-035
- Added internal api-peer Depot inspect routes under `/depot/inspect/...` without touching api-public or generated public SDKs.
- Implemented `depot::inspect` for bounded summary/catalog/bucket/database/branch JSON, paginated branch row-family scans, raw key lookup, raw scans, and best-effort key/value decode with unpadded base64url keys/cursors.
- Added focused coverage for route registration, limit cap enforcement, cursor pagination, raw scan cursors, and decoded `pidx` rows.
- Files changed: `.agent/specs/depot-inspect-api.md`, `Cargo.lock`, `engine/packages/api-peer/Cargo.toml`, `engine/packages/api-peer/src/{depot_inspect,lib,router}.rs`, `engine/packages/api-peer/tests/depot_inspect.rs`, `engine/packages/depot/{Cargo.toml,CLAUDE.md}`, `engine/packages/depot/src/{inspect,lib}.rs`, `engine/packages/depot/src/conveyer/keys.rs`, `engine/packages/depot/tests/inspect.rs`, `scripts/ralph/prd.json`, `scripts/ralph/progress.txt`.
- Verification: `cargo check -p depot -p rivet-api-peer`; `cargo test -p depot --test inspect`; `cargo test -p rivet-api-peer --test depot_inspect`; `cargo test -p depot`.
- **Learnings for future iterations:**
  - Keep Depot inspect scan/decode behavior in the Depot crate so tests can exercise pagination and decoding without going through HTTP.
  - Use unpadded base64url for raw Depot inspect keys and cursors; branch row cursors are raw FDB keys and must stay inside the requested family prefix.
  - `api-peer` is the only HTTP mount point for Depot inspect routes; do not add these internal diagnostics to api-public or generated SDKs.
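
For reference, unpadded base64url (the RFC 4648 URL-safe alphabet without trailing `=`) can be produced with the standard library alone. The sketch below is illustrative and is not Depot's encoder; `encode_base64url_unpadded` and `cursor_in_family` are hypothetical helper names, and the sample keys in the usage are made up.

```rust
/// RFC 4648 URL-safe alphabet: `-` and `_` replace `+` and `/`.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode bytes as base64url without `=` padding.
pub fn encode_base64url_unpadded(bytes: &[u8]) -> String {
    let mut out = String::with_capacity((bytes.len() * 4 + 2) / 3);
    for chunk in bytes.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let group = (b0 << 16) | (b1 << 8) | b2;
        // A chunk of k bytes yields k + 1 characters; the rest would be padding.
        for i in 0..=chunk.len() {
            let index = (group >> (18 - 6 * i)) & 0x3f;
            out.push(ALPHABET[index as usize] as char);
        }
    }
    out
}

/// A raw-key cursor is only acceptable if it stays inside the family prefix.
pub fn cursor_in_family(cursor: &[u8], family_prefix: &[u8]) -> bool {
    cursor.starts_with(family_prefix)
}
```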
---
