The roadmap.

This roadmap separates reliability, isolation, runtime, triggers, and Cloud. It does not hide risky work inside optimistic dates. Every big promise has a gate and a test contract.

The shape

Track                         Gate / estimate
M0 Plan cleanup               1 week
M1a Durable turns             4-6 weeks after schema lock
M1b Detached turn worker      3-4 weeks if "never die" is a product promise
M2 Scopes + credentials       3-5 weeks, can overlap M1 with a separate owner
M3a Runtime spike             1-2 weeks before any M3 date is promised
M3b Runtime build             3-4 months only if spike passes
M4 Inbound triggers           after M1a, before Cloud
M5 Cloud private beta         3-5 months with 2 engineers, separate RFC
Apple entitlement             start immediately, async weeks to months

M0 — Make the plan true (1 week)

M1a — Durable turn acceptance and replay (4-6 weeks)

Promise after M1a: accepted turns are recorded before side effects. Committed stream chunks replay after refresh or network reconnect. Stale work surfaces retry/cancel instead of disappearing.

This is not "conversations never die." It is "Houston never lies about accepted work, committed chunks, or retry state."
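The M1a promise can be sketched as a small write-ahead log. This is a minimal illustration under stated assumptions, not Houston's actual schema: the table names (turns, chunks), the status values, and every function name below are hypothetical, and SQLite stands in for whatever store the schema lock settles on. It shows the ordering the promise depends on: record acceptance first, persist each chunk with a sequence number, replay from the last sequence the client saw.

```python
import sqlite3

# Hypothetical schema; table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS turns (
    turn_id TEXT PRIMARY KEY,
    status  TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS chunks (
    turn_id TEXT NOT NULL,
    seq     INTEGER NOT NULL,
    body    TEXT NOT NULL,
    PRIMARY KEY (turn_id, seq)
);
"""


def open_log(path=":memory:"):
    """Open (or create) the log and make sure the tables exist."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def accept_turn(db, turn_id):
    """Record the turn as accepted. Call this before any side effect runs."""
    with db:  # commits on success, rolls back on exception
        db.execute(
            "INSERT INTO turns (turn_id, status) VALUES (?, 'accepted')",
            (turn_id,),
        )


def commit_chunk(db, turn_id, seq, body):
    """Persist one stream chunk; only committed chunks are replayable."""
    with db:
        db.execute(
            "INSERT INTO chunks (turn_id, seq, body) VALUES (?, ?, ?)",
            (turn_id, seq, body),
        )


def replay(db, turn_id, after_seq=-1):
    """Return committed chunks with seq greater than after_seq, in order."""
    rows = db.execute(
        "SELECT seq, body FROM chunks WHERE turn_id = ? AND seq > ? ORDER BY seq",
        (turn_id, after_seq),
    )
    return rows.fetchall()


def mark_stale(db, turn_id):
    """Flag a turn so the UI can offer retry/cancel instead of dropping it."""
    with db:
        db.execute("UPDATE turns SET status = 'stale' WHERE turn_id = ?", (turn_id,))


def status(db, turn_id):
    """Return the recorded status, or None if the turn was never accepted."""
    row = db.execute(
        "SELECT status FROM turns WHERE turn_id = ?", (turn_id,)
    ).fetchone()
    return row[0] if row else None
```

A client that reconnects after seeing chunk 0 would call replay(db, turn_id, after_seq=0) and receive only the later chunks. Nothing in this sketch keeps an in-flight provider turn alive across an engine restart; that is the M1b promise.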

M1b — Detached turn worker (3-4 weeks)

Promise after M1b: app close, engine restart, and engine bounce do not kill an in-flight provider turn.

If the product says "conversations never die," M1b is mandatory. If not, make the promise weaker and ship M1a first.

M2 — Scopes and per-agent credentials (3-5 weeks)

Promise after M2: engine API calls are scoped, and provider credentials are separated per agent. This is still not kernel-level filesystem isolation.
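One way to picture the M2 promise is a scope check in front of a credential store keyed by agent. This is a sketch under assumptions: the Token and CredentialStore types, the "credentials:read" scope string, and the provider name are all hypothetical and do not describe the real engine API.

```python
from dataclasses import dataclass, field


class ScopeError(PermissionError):
    """Raised when a caller's token does not carry the required scope."""


@dataclass(frozen=True)
class Token:
    # Hypothetical token shape: which agent is calling and what it may do.
    agent_id: str
    scopes: frozenset


def require_scope(token, scope):
    """Reject the call unless the token carries the named scope."""
    if scope not in token.scopes:
        raise ScopeError(f"agent {token.agent_id!r} lacks scope {scope!r}")


@dataclass
class CredentialStore:
    # Secrets are stored per agent, then per provider.
    _by_agent: dict = field(default_factory=dict)

    def put(self, agent_id, provider, secret):
        self._by_agent.setdefault(agent_id, {})[provider] = secret

    def get(self, token, provider):
        require_scope(token, "credentials:read")
        # The lookup uses the caller's own agent id, so a token for one
        # agent cannot name another agent's credentials.
        return self._by_agent.get(token.agent_id, {}).get(provider)
```

The separation here is enforced by application code only. A process with filesystem access to the store could still read every secret, which is why this falls short of the isolation M3b targets.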

M3a — Runtime spike (1-2 weeks)

Gate: do this before promising Linux runtime dates.

If the spike fails the gate, do not build M3b yet. Keep the native engine and ship the M1/M2 wins.

M3b — Runtime build (3-4 months, only if spike passes)

Promise after M3b: per-agent Linux users provide real filesystem isolation on supported machines.

M4 — Inbound triggers (after M1a)

Promise after M4: external systems can safely start agent turns without opening a memory DoS or bypassing scopes.
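The two hazards named in the M4 promise, unbounded memory growth and scope bypass, can be illustrated with a bounded intake gate. This is a sketch, not the planned design: the TriggerGate class, the "turns:start:<agent>" scope format, and the default limits are assumptions chosen for the example.

```python
import queue


class TriggerRejected(Exception):
    """Raised when an inbound trigger is refused at the gate."""


class TriggerGate:
    def __init__(self, max_pending=100, max_payload_bytes=64 * 1024):
        # A bounded queue caps how many triggers can wait in memory.
        self._pending = queue.Queue(maxsize=max_pending)
        self._max_payload_bytes = max_payload_bytes

    def submit(self, scopes, agent_id, payload):
        """Admit a trigger only if it is scoped, small enough, and there is room."""
        # Hypothetical scope format: one scope per agent the caller may start.
        if f"turns:start:{agent_id}" not in scopes:
            raise TriggerRejected("missing scope")
        if len(payload) > self._max_payload_bytes:
            raise TriggerRejected("payload too large")
        try:
            self._pending.put_nowait((agent_id, payload))
        except queue.Full:
            # Refuse instead of buffering without limit.
            raise TriggerRejected("queue full") from None

    def next_trigger(self):
        """Hand the oldest pending trigger to a worker; raises queue.Empty if none."""
        return self._pending.get_nowait()
```

Rejecting at the gate keeps memory use proportional to max_pending times max_payload_bytes regardless of how fast external systems send requests. A real design would also need the accepted trigger to flow through the same durable acceptance path as any other turn, which is why M4 comes after M1a.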

M5 — Cloud private beta (3-5 months, 2 engineers)

Promise after M5: Cloud can host private-beta customers with provisioning, wake, backup, restore, billing, support, and observability.

Parallel track: Apple entitlement paperwork

Start immediately. Do not wait for M3. Using Virtualization.framework in a Developer ID app requires the com.apple.security.virtualization entitlement, which Apple grants by request. Timeline: a few weeks if clean, longer if Apple has questions. Filing late blocks the entire M3 release.

Action: open a ticket at developer.apple.com (Code Signing & Entitlements). Describe Houston as a desktop app that runs user-owned agent workloads inside a Linux guest for isolation. Reference Virtualization.framework by name. Then wait, and add the entitlement to app/src-tauri/entitlements.plist the moment it is granted.
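The final step, adding the key once it is granted, can be sketched with Python's standard plistlib. This assumes the entitlements file is an ordinary property list whose top level is a dictionary; the helper name is hypothetical, and the sketch works on bytes so it makes no claim about what the file currently contains.

```python
import plistlib

VIRTUALIZATION_KEY = "com.apple.security.virtualization"


def add_virtualization_entitlement(plist_bytes):
    """Return plist bytes with the virtualization entitlement set to true.

    Existing keys are preserved. Empty input is treated as an empty
    entitlements dictionary.
    """
    entitlements = plistlib.loads(plist_bytes) if plist_bytes.strip() else {}
    entitlements[VIRTUALIZATION_KEY] = True
    # plistlib.dumps writes the XML plist format by default.
    return plistlib.dumps(entitlements)
```

A caller would read app/src-tauri/entitlements.plist as bytes, pass it through this function, and write the result back. Adding the key to the file changes nothing until the app is re-signed with the updated entitlements.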

Decisions that must be explicit

M1: canonical turn_stream writes with one-release read fallback, or hard migration only?

M1b: do we make "never die" a product promise now, or delay it and say "retryable" after engine crash?

M2: brokered desktop default plus strict mode, or strict per-agent by default everywhere?

M3: what happens when Windows virtualization is disabled by BIOS or corporate policy?

M5: which Cloud platform, backup objective, region promise, support SLA, and vendor-exit plan?

What "shipped" means by milestone

Minimum serious schedule: M1a and M2 can overlap with two engineers. M1b, M3b, and M5 each need focused ownership. With one engineer, the tracks run back to back and the plan stretches linearly, so do not promise Cloud plus runtime in the same half-year.