Kubernetes is a robot landlord.
Houston runs ten thousand tiny programs. Something has to decide where each one lives, restart them when they crash, and move them around when a machine dies. That thing is Kubernetes. It is a robot landlord for software.
The 30 second version
You have a bunch of computers in a datacenter. You have a bunch of programs you want to run. You don't want to log into each computer and start each program by hand. So you tell Kubernetes "here are 50 servers, here are 5,000 programs, figure it out." Kubernetes figures it out. When a program crashes, Kubernetes restarts it. When a server dies, Kubernetes moves the programs to a different server. When you need more programs, Kubernetes finds room.
That's the whole pitch. Everything else is jargon.
The words
Pod
A pod is one running program. In our case, one pod equals one agent. If we have 10,000 agents online, we have 10,000 pods. A pod is the smallest thing Kubernetes manages.
Node
A node is one physical computer (or one rented virtual computer from a cloud provider like Amazon or Google). Many pods live on one node. Think of a node as an apartment building and pods as the tenants.
Cluster
A cluster is the whole set of nodes managed together. One cluster might have 50 nodes running 5,000 pods. The cluster is the city the robot landlord runs.
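With pods, nodes, and the cluster defined, the landlord's core job can be sketched in a few lines of Python. This is a toy model, not the real Kubernetes scheduler (which weighs CPU, memory, and many other constraints). The function names, node names, and the per-node capacity are all made up for illustration.

```python
from collections import defaultdict


def least_loaded(load, nodes, capacity):
    """Pick the node with the fewest pods that still has a free slot."""
    candidates = [n for n in nodes if len(load[n]) < capacity]
    if not candidates:
        raise RuntimeError("no node has room")
    return min(candidates, key=lambda n: len(load[n]))


def place(pods, nodes, capacity):
    """Toy scheduler: give each pod a home on the least-loaded node."""
    load = defaultdict(list)
    for pod in pods:
        load[least_loaded(load, nodes, capacity)].append(pod)
    return {n: load[n] for n in nodes}


def evict_node(placement, dead_node, capacity):
    """A node died: re-home its pods on the surviving nodes."""
    survivors = {n: list(p) for n, p in placement.items() if n != dead_node}
    for pod in placement[dead_node]:
        target = least_loaded(survivors, list(survivors), capacity)
        survivors[target].append(pod)
    return survivors


# Hypothetical cluster: 3 nodes, 6 agent pods, room for 4 pods per node.
nodes = ["node-a", "node-b", "node-c"]
pods = [f"agent-{i}" for i in range(6)]
placement = place(pods, nodes, capacity=4)
after_failure = evict_node(placement, "node-b", capacity=4)
```

After `node-b` dies, all six pods are still running, just squeezed onto two buildings instead of three. If the survivors had no free slots, the sketch raises an error. A real cluster would leave those pods waiting until room appears.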
Namespace
A namespace is a fence inside the cluster. Pods in different namespaces can be told to ignore each other entirely. Houston uses one namespace per team workspace. So Acme's pods and Globex's pods can run on the same physical machines but never see each other.
Service
A pod's address changes every time it restarts. A service is a permanent doorbell that points to whichever pod is currently alive. "Always knock here, I'll forward you to the right pod."
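The doorbell idea, as a minimal Python sketch. This is a toy model of the concept, not the Kubernetes API. The class, the pod names, and the IP addresses are invented for illustration.

```python
class Service:
    """A stable name that forwards to whichever backing pod is alive."""

    def __init__(self, name):
        self.name = name
        self.endpoints = {}  # pod name -> that pod's current IP

    def pod_started(self, pod, ip):
        self.endpoints[pod] = ip

    def pod_stopped(self, pod):
        self.endpoints.pop(pod, None)

    def resolve(self):
        """Return the IP of a live pod behind this doorbell."""
        if not self.endpoints:
            raise LookupError(f"{self.name}: no pod is alive")
        return next(iter(self.endpoints.values()))


# Hypothetical agent whose pod restarts and comes back with a new address.
svc = Service("agent-42")
svc.pod_started("agent-42-abc", "10.0.1.7")
first = svc.resolve()

svc.pod_stopped("agent-42-abc")
svc.pod_started("agent-42-xyz", "10.0.3.9")
second = svc.resolve()
```

Callers only ever know the name `agent-42`. The address behind it changed from `10.0.1.7` to `10.0.3.9` and nobody had to be told.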
Deployment
A wish. "I want 3 copies of this program running at all times." Kubernetes keeps that wish true. If a copy dies, it makes a new one. If you change the wish to 5 copies, it spawns 2 more. If you change the program version, it rolls out the new one a few pods at a time.
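"Keeping the wish true" is a loop that compares what you asked for with what exists and fixes the difference. A minimal Python sketch, assuming pods are just names in a list. The `reconcile` function and the pod naming are hypothetical, and rolling out a new version is left out.

```python
import itertools

_ids = itertools.count(1)  # source of fresh pod names for this sketch


def reconcile(desired, running):
    """One pass of the loop: make the running pods match the wish."""
    running = list(running)
    while len(running) < desired:   # too few copies: start more
        running.append(f"pod-{next(_ids)}")
    while len(running) > desired:   # too many copies: stop the extras
        running.pop()
    return running


pods = reconcile(3, [])      # wish for 3 copies, nothing running yet
pods.remove("pod-2")         # one copy dies
pods = reconcile(3, pods)    # a replacement is started
pods = reconcile(5, pods)    # wish changed to 5: two more are spawned
```

The dead `pod-2` never comes back. A fresh pod takes its place, which is why nothing should depend on one particular pod surviving.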
GKE? EKS? What are those?
Running Kubernetes yourself is a part-time job. You'd need people to update it, secure it, and fix it when it breaks. Nobody wants that. So Amazon, Google, and Microsoft rent it to you.
- GKE is Google's rented Kubernetes. "Google Kubernetes Engine." You give them a credit card, they give you a cluster.
- EKS is Amazon's version. "Elastic Kubernetes Service." Same idea.
- AKS is Microsoft's. Same again.
Pick one based on which cloud's invoices you already hate. We're planning on GKE because Houston already has Google Cloud credits from BigQuery and hosting.
What Kubernetes gives us specifically
- Restart for free. If houston-engine crashes inside an agent pod, Kubernetes restarts it without anyone noticing.
- Move for free. If a node dies in the middle of the night, Kubernetes spreads its pods across the surviving nodes. The on-call engineer's pager doesn't go off.
- Scale for free. When 1,000 new customers sign up tomorrow, we ask the cluster to add more nodes. New pods land on them. No code changes.
- Walls for free. Namespaces and NetworkPolicy keep Acme and Globex apart without us writing any code.
- Ecosystem for free. Knative (auto sleep), Kata (microVMs), Cilium (network), Helm (packages), cert-manager (TLS), Prometheus (metrics). Every piece of cloud infra plugs into K8s.
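To make the "walls" bullet concrete: a NetworkPolicy is a small declarative document, built here as a Python dict and printed as JSON. This is a sketch of one common pattern (only accept traffic from pods in the same namespace), not Houston's actual policy. The policy name and the `acme` namespace are hypothetical. It only restricts incoming traffic, and it only takes effect when the cluster's network plugin enforces NetworkPolicy.

```python
import json


def same_namespace_only(namespace):
    """Build a NetworkPolicy that only admits traffic from the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "same-namespace-only", "namespace": namespace},
        "spec": {
            # An empty selector means "every pod in this namespace".
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # Allowed senders: any pod in this same namespace, nothing else.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


manifest = same_namespace_only("acme")  # hypothetical tenant namespace
print(json.dumps(manifest, indent=2))
```

One such policy per tenant namespace is the whole wall. No application code changes.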
What Kubernetes does not give us
- Isolation between pods on the same node. Two pods on the same physical machine share the kernel. If one of them escapes, it can attack its neighbor. We fix this with Kata + Firecracker (Chapter 3).
- Scale to zero. Default Kubernetes keeps your pods running even when nobody uses them. We fix this with Knative (Chapter 4).
- An opinion about your code. K8s is dumb on purpose. It runs whatever you put in a container. We have to bring our own engine, our own control plane, our own everything.
What our cluster will roughly look like
For a 10-pod side project, Docker Compose wins. For a 10,000-pod multi-tenant system with per-agent isolation, audit logging, RBAC, and the ecosystem of every tool we want to use (Kata, Knative, Cilium), Kubernetes is the boring answer. Boring wins enterprise sales calls.
The EKS control plane is ~$72/month. GKE is comparable. Node cost is whatever VMs we use: likely a mix of small always-on nodes for our own control plane and bigger autoscaling nodes for agents. The real cost lives in the nodes, not the cluster fee.
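The arithmetic behind that claim, as a small Python sketch. The ~$72 figure corresponds to a $0.10 per hour cluster fee over a 30-day month. The node price and node count below are made-up placeholders, not a quote or a plan.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, which reproduces the ~$72 figure


def monthly_cost(cluster_fee_cents_per_hour, node_cents_per_hour, node_count):
    """Return (cluster fee, node cost) in dollars for one month.

    Prices are in whole cents per hour to keep the arithmetic exact.
    """
    cluster_fee = cluster_fee_cents_per_hour * HOURS_PER_MONTH / 100
    nodes = node_cents_per_hour * node_count * HOURS_PER_MONTH / 100
    return cluster_fee, nodes


# 10 cents/hour cluster fee; HYPOTHETICAL 50 nodes at 20 cents/hour each.
cluster_fee, nodes = monthly_cost(10, 20, 50)
print(f"cluster fee: ${cluster_fee:,.2f}/month")
print(f"nodes:       ${nodes:,.2f}/month")
```

With these placeholder numbers the nodes cost a hundred times the cluster fee. The exact ratio depends on the VMs we pick, but the cluster fee stays flat while the node bill grows with every node.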