One agent, one tiny computer.

That is the whole architecture. Every agent in Houston Cloud gets its own miniature Linux machine, sealed off from everything else, asleep until someone talks to it. This chapter is the one-page version. Every chapter after this zooms in on one piece.

The core decision

Each agent runs in its own Kata container, which is a fancy way of saying its own little virtual computer powered by Firecracker. Those little computers run on a cluster managed by Kubernetes. When an agent isn't being used, its computer is turned off and costs nothing in compute. When you send a message, the computer boots in under a second, does the work, then turns off again.
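As a rough illustration of what that decision could look like on the cluster, the sketch below builds a Knative Service manifest for a single agent as a plain Python dictionary. The agent name, namespace, image, and the `kata-fc` runtime class name are all made up for this example, and capping the agent at one pod is an assumption drawn from "one agent, one tiny computer". Knative may also need a feature flag enabled before it accepts `runtimeClassName` on a Service.

```python
import json


def agent_service(agent_id: str, namespace: str, image: str,
                  runtime_class: str = "kata-fc") -> dict:
    """Build an illustrative Knative Service manifest for one agent.

    All names here are hypothetical; "kata-fc" stands in for whatever
    RuntimeClass the cluster maps to Kata Containers with Firecracker.
    """
    return {
        "apiVersion": "serving.knative.dev/v1",
        "kind": "Service",
        "metadata": {"name": f"agent-{agent_id}", "namespace": namespace},
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        # Let the autoscaler scale the agent down to zero pods when idle.
                        "autoscaling.knative.dev/min-scale": "0",
                        # Assumption: one agent never runs as more than one pod.
                        "autoscaling.knative.dev/max-scale": "1",
                    }
                },
                "spec": {
                    # Ask Kubernetes to run this pod inside a micro-VM runtime.
                    "runtimeClassName": runtime_class,
                    "containers": [{"image": image}],
                },
            }
        },
    }


if __name__ == "__main__":
    manifest = agent_service("sales-bot", "team-acme", "example.com/houston-engine:dev")
    print(json.dumps(manifest, indent=2))
```

The point of the sketch is that both halves of the decision, VM-level isolation and scale to zero, are a few declarative fields per agent rather than custom machinery.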

Everything else in this guide flows from that one decision.

The pieces

Frontend

Control plane

The brain in front of every agent. About 3 to 5 copies running at all times.

Agent pods (the meat)

Per team isolation

Per agent isolation

The pod boundary is the agent boundary.

State

Permissions

The stack at a glance

| Layer | What we use | Plain English |
| --- | --- | --- |
| Cluster | EKS or GKE | Rented Kubernetes from Amazon or Google. |
| Runtime isolation | Kata Containers with Firecracker | Tiny VMs that wrap each pod with a real hardware wall. |
| Scale to zero | Knative Serving | Turns pods off when idle, boots on demand. |
| Network policy | Cilium or Calico | Decides who is allowed to talk to whom. |
| Auth | Supabase | Handles login, sessions, SSO. |
| Metadata DB | Postgres | Long term memory. Users, agents, permissions, billing. |
| Session cache | Redis | Short term memory. Live chat state. |
| Object storage | S3 or GCS | Files, attachments, backups. |
| Analytics | PostHog | Who clicked what. |
| Errors | Sentry | What broke and where. |
| Frontend | React (@houston-ai/*) | The same components the desktop already uses. |
| Agent runtime | houston-engine + CLIs | The same Rust binary that runs on the desktop today. |

Why this shape

Problem 1

Agents can read each other's stuff. If two agents share a machine and one of them is told "read the other guy's files," it can. Solution: each agent gets its own machine. Done.

Problem 2

Idle agents cost money. A sales bot used once a day shouldn't cost the same as one running 24/7. Solution: turn off the box when nobody's talking. Knative does it for us.
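To make the difference concrete, here is a back-of-the-envelope sketch comparing an always-on agent with one that only runs while it is working. The hourly rate and usage figures are invented for illustration, not real prices or measurements, and the model ignores storage, the control plane, and boot overhead.

```python
DAYS_PER_MONTH = 30  # simplification: every month has 30 days


def always_on_cost(rate_per_hour: float) -> float:
    """Monthly compute cost of a pod that never turns off."""
    return DAYS_PER_MONTH * 24 * rate_per_hour


def scale_to_zero_cost(rate_per_hour: float, active_seconds_per_day: float) -> float:
    """Monthly compute cost when the pod only runs while it is working."""
    active_hours = DAYS_PER_MONTH * active_seconds_per_day / 3600
    return active_hours * rate_per_hour


if __name__ == "__main__":
    rate = 0.10  # hypothetical price per pod-hour
    print("always on:    ", always_on_cost(rate))
    print("scale to zero:", scale_to_zero_cost(rate, active_seconds_per_day=120))
```

Under these assumptions the cost tracks how long an agent actually works, not how long it exists.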

Problem 3

Different teams can't share a machine. Acme's HR agent should never accidentally route to Globex. Solution: each team in its own Kubernetes namespace, walled off with NetworkPolicy.
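One way the namespace wall could be expressed is sketched below: a function that builds a Kubernetes NetworkPolicy for a team namespace, again as a plain Python dictionary. The namespace names are hypothetical, and the list of allowed namespaces is an assumption. A real setup would probably need to admit the control plane and whatever system components route traffic to the pods.

```python
def team_network_policy(team_namespace: str, allowed_namespaces: list[str]) -> dict:
    """Build an illustrative NetworkPolicy that walls off one team namespace.

    Pods in the team namespace accept ingress only from the namespaces
    listed in allowed_namespaces. Everything else, including other
    teams, is denied because no rule matches it.
    """
    peers = [
        # kubernetes.io/metadata.name is the label Kubernetes sets on every
        # namespace with the namespace's own name.
        {"namespaceSelector": {"matchLabels": {"kubernetes.io/metadata.name": ns}}}
        for ns in allowed_namespaces
    ]
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "team-isolation", "namespace": team_namespace},
        "spec": {
            "podSelector": {},           # empty selector: every pod in the namespace
            "policyTypes": ["Ingress"],  # this sketch only restricts inbound traffic
            "ingress": [{"from": peers}],
        },
    }


if __name__ == "__main__":
    import json
    policy = team_network_policy("team-acme", ["houston-control-plane"])
    print(json.dumps(policy, indent=2))
```

Because the policy lists who may come in rather than who may not, a new team namespace is locked out of every existing team by default.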

Problem 4

Agent state has to survive a restart: the agent's notes, memory, and OAuth tokens. Solution: a per-agent persistent volume. The disk lives on even when the pod doesn't.
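A minimal sketch of the per-agent disk, assuming it is requested through an ordinary Kubernetes PersistentVolumeClaim. The naming scheme and the default size are invented for this example, and mounting the claim into a Knative-managed pod may need extra configuration that is not shown here.

```python
def agent_state_claim(agent_id: str, namespace: str, size: str = "1Gi") -> dict:
    """Build an illustrative PersistentVolumeClaim for one agent's state.

    The claim exists independently of any pod, so the data survives
    while the agent is scaled to zero.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": f"agent-{agent_id}-state", "namespace": namespace},
        "spec": {
            # One agent, one pod: a single-node read-write volume is enough.
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    import json
    print(json.dumps(agent_state_claim("sales-bot", "team-acme"), indent=2))
```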

What we are not doing

Three architectural rejects, so we don't relitigate them later.

What this buys us

How to read the rest of this guide

Chapters 2 through 9 explain each piece of the stack from the ground up. Chapter 10 walks through a single message end to end. Chapter 11 is the build order with gates. If you only have ten minutes, you have just read the most important chapter.