StatefulSets

CKA prep • Stable identity, ordered rollout, headless Service, volumeClaimTemplates

Key terms

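   +----------------------+-------------------------------------------------------------+
   | Term                 | Meaning                                                     |
   +----------------------+-------------------------------------------------------------+
   | StatefulSet          | Workload for pods that need stable identity + storage       |
   | Stable network ID    | Each pod gets a fixed ordinal name and DNS record           |
   | Headless Service     | clusterIP: None Service that gives per-pod DNS              |
   | volumeClaimTemplates | Per-pod PVCs created automatically and kept on rescheduling |
   | Ordinal index        | The -0, -1, -2 suffix that orders pods                      |
   | OrderedReady         | Default: create/scale/update pods one at a time, in order   |
   | Partition            | rollingUpdate field to stage an update by ordinal           |
   +----------------------+-------------------------------------------------------------+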

Problem & solution

Deployments treat pods as interchangeable cattle — random names, shared or no persistent storage, parallel rollout. That breaks stateful systems (databases, queues, clustered stores) which need a stable name, their own durable disk, and a predictable start order (e.g. bring up the primary before replicas).

Solution: A StatefulSet gives each pod a stable ordinal identity, its own PVC via volumeClaimTemplates, stable DNS through a headless Service, and ordered, one-at-a-time deploy/scale/update.

The analogy

Picture a port with a line of numbered ships: Ship-0, Ship-1, Ship-2. Each keeps a fixed berth and a name posted on the harbor board so anyone can find it directly, and each owns its own named warehouse unit on the quay. When a ship sails and returns, it gets the same number, the same berth listing, and the same warehouse reattached; nothing is shuffled. A StatefulSet works the same way: pods get stable ordinal names like web-0, a headless Service gives each a fixed DNS name, and volumeClaimTemplates give each its own PVC that survives and reattaches by name across restarts.

StatefulSet vs Deployment

The quickest way to grasp StatefulSets is to compare them with Deployments side by side. The differences all stem from identity and storage:

   +---------------------+----------------------+--------------------------+
   |                     | Deployment           | StatefulSet              |
   +---------------------+----------------------+--------------------------+
   | pod names           | random hash          | ordinal: web-0, web-1    |
   | pod identity        | interchangeable      | stable + sticky          |
   | DNS                 | via Service VIP      | per-pod via headless svc |
   | storage             | shared / ephemeral   | one PVC per pod          |
   | rollout / scale     | parallel             | ordered, one at a time   |
   | data on pod delete  | lost unless on a PVC | retained + reattached    |
   | use for             | stateless web/api    | databases, clustered apps|
   +---------------------+----------------------+--------------------------+

Stable identity and DNS

Pods are named <set>-<ordinal> (web-0, web-1, ...). With a headless Service each pod gets a deterministic DNS name, so peers can find each other by name regardless of IP changes.

   pod DNS:   web-0.web.default.svc.cluster.local
              web-1.web.default.svc.cluster.local
              <pod>.<headless-svc>.<namespace>.svc.cluster.local
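
The naming pattern above is mechanical, so a peer list can be generated rather than typed. The following is a minimal sketch, not part of any Kubernetes tooling: pod_fqdns is a hypothetical helper name, and it assumes the default cluster domain cluster.local.

```shell
#!/bin/sh
# Hypothetical helper: print the stable DNS name of every pod in a StatefulSet.
# Usage: pod_fqdns <set> <headless-svc> <namespace> <replicas>
# Assumption: the cluster uses the default domain "cluster.local".
pod_fqdns() {
  fq_set=$1; fq_svc=$2; fq_ns=$3; fq_replicas=$4
  fq_i=0
  while [ "$fq_i" -lt "$fq_replicas" ]; do
    # <pod>.<headless-svc>.<namespace>.svc.cluster.local
    printf '%s-%s.%s.%s.svc.cluster.local\n' "$fq_set" "$fq_i" "$fq_svc" "$fq_ns"
    fq_i=$((fq_i + 1))
  done
}

# Peer list for the 3-replica "web" set used in these notes
pod_fqdns web web default 3
```

A clustered app could feed this list into its own peer or seed configuration; the names stay valid even when pod IPs change.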

The headless Service

A headless Service (clusterIP: None) returns the pod A records directly instead of a single VIP — that is what powers per-pod DNS.

apiVersion: v1
kind: Service
metadata:
  name: web                # serviceName referenced by the StatefulSet
spec:
  clusterIP: None          # headless -> per-pod DNS
  selector: { app: web }
  ports:
    - port: 80
      name: http

A StatefulSet with volumeClaimTemplates

Each replica gets its own PVC rendered from the template and named <template>-<pod> (data-web-0, data-web-1, ...). Those PVCs survive pod deletion and reattach by name.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web          # MUST match the headless Service name
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          ports:
            - { containerPort: 80, name: http }
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
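
To make the claim naming concrete, here is a small sketch that maps a pod to the PVC it will mount. pvc_for_pod is a hypothetical helper written for these notes; it only applies the <template>-<pod> naming rule and does not talk to a cluster.

```shell
#!/bin/sh
# Hypothetical helper: name of the PVC a StatefulSet pod mounts.
# Usage: pvc_for_pod <volumeClaimTemplate-name> <pod-name>
pvc_for_pod() {
  # Rule: <template>-<pod>, where <pod> is already <set>-<ordinal>
  printf '%s-%s\n' "$1" "$2"
}

# Claims for the manifest above (template "data", pods web-0..web-2)
for pod in web-0 web-1 web-2; do
  pvc_for_pod data "$pod"
done
```

Because the name depends only on the template and the pod name, a recreated web-0 looks for data-web-0 again, which is why the data follows the ordinal rather than the node.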

Ordered deploy, scale, and update

StatefulSets change pods in a strict order rather than all at once. This is how create, scale, and update sequence the ordinals:

   create / scale up:   web-0 (Ready) -> web-1 (Ready) -> web-2     (in order)
   scale down:          web-2 -> web-1 -> web-0                     (reverse order)
   rolling update:      highest ordinal first, down to web-0, one at a time

kubectl get statefulset web
kubectl get pods -l app=web -o wide        # web-0, web-1, web-2 in order
kubectl scale statefulset web --replicas=5 # adds web-3 then web-4
kubectl get pvc                            # data-web-0 ... persist independently

# stage an update: only ordinals >= partition are updated
kubectl patch statefulset web -p \
  '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":2}}}}'
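
The effect of a partition is easy to reason about on paper before patching anything. The sketch below lists which pods a rolling update would replace and in what order. update_order is a hypothetical helper; it only models the rule stated above (highest ordinal first, stopping at the partition) and does not query the cluster.

```shell
#!/bin/sh
# Hypothetical helper: pods a RollingUpdate replaces, in replacement order.
# Usage: update_order <set> <replicas> <partition>
update_order() {
  uo_set=$1; uo_replicas=$2; uo_partition=$3
  uo_i=$((uo_replicas - 1))            # start at the highest ordinal
  while [ "$uo_i" -ge "$uo_partition" ]; do
    printf '%s-%s\n' "$uo_set" "$uo_i"
    uo_i=$((uo_i - 1))                 # walk down; ordinals < partition are skipped
  done
}

# 5 replicas with partition=2: web-4, web-3, web-2 are updated; web-0 and web-1 keep the old template
update_order web 5 2
```

Lowering the partition step by step (2, then 1, then 0) turns this into a staged, canary-style rollout; a partition equal to or above the replica count updates nothing.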

Operational details that bite

A few behaviors surprise people in practice, especially around PVCs and ordering. Keep these in mind:

   - by default, PVCs are NOT deleted when you delete or scale down the StatefulSet (delete them by hand)
   - deleting a pod recreates it with the SAME name and reattaches its PVC
   - podManagementPolicy: Parallel skips ordering on create/scale for order-agnostic apps (updates stay ordered)
   - updateStrategy: OnDelete = you delete pods manually to pick up a new template
   - a stuck web-0 (not Ready) blocks web-1 from ever starting (OrderedReady)
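
Since leftover PVCs have to be removed by hand, it can help to generate the delete commands and review them before running anything. This is a minimal sketch under stated assumptions: pvc_cleanup_cmds is a hypothetical helper, the namespace defaults to default, and it only prints commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands that remove a range of per-pod PVCs.
# Usage: pvc_cleanup_cmds <template> <set> <first-ordinal> <last-ordinal> [namespace]
pvc_cleanup_cmds() {
  pc_tmpl=$1; pc_set=$2; pc_first=$3; pc_last=$4; pc_ns=${5:-default}
  pc_i=$pc_first
  while [ "$pc_i" -le "$pc_last" ]; do
    # PVC name follows <template>-<set>-<ordinal>
    printf 'kubectl delete pvc %s-%s-%s -n %s\n' "$pc_tmpl" "$pc_set" "$pc_i" "$pc_ns"
    pc_i=$((pc_i + 1))
  done
}

# After deleting the 3-replica "web" StatefulSet: review, then paste the lines you want
pvc_cleanup_cmds data web 0 2
```

Printing first is a deliberate choice: deleting a PVC can destroy the underlying data (depending on the volume's reclaim policy), so a human should confirm the list.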

End-to-end example: a 3-replica StatefulSet with stable DNS

A full walkthrough: deploy a headless Service plus a 3-replica StatefulSet with volumeClaimTemplates, watch ordered creation, resolve a peer by its stable DNS name, write to one pod's PVC, scale, and prove the PVC reattaches by name.

  1. Apply the headless Service and the StatefulSet together:
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None
  selector: { app: web }
  ports:
    - { port: 80, name: http }
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          ports: [{ containerPort: 80, name: http }]
          volumeMounts:
            - { name: data, mountPath: /usr/share/nginx/html }
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
  2. Watch the pods come up one at a time, in order:
kubectl get pods -l app=web -w
# expected order: web-0 Running -> web-1 Running -> web-2 Running
  3. Confirm each replica got its own PVC, rendered from the template:
kubectl get pvc
# expected: data-web-0, data-web-1, data-web-2  (each Bound, 1Gi)
  4. Write a unique file into web-0's volume, then resolve web-0 by its stable DNS name from another pod:
kubectl exec web-0 -- sh -c 'echo "I am web-0" > /usr/share/nginx/html/index.html'

kubectl run client --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://web-0.web.default.svc.cluster.local
# expected: I am web-0
  5. Scale out and confirm the new ordinals append in order:
kubectl scale statefulset web --replicas=5
kubectl get pods -l app=web
# expected: web-3, then web-4, created in order while web-0..web-2 stay Ready
kubectl get pvc
# expected: data-web-3, data-web-4 now also Bound
  6. Prove sticky identity: delete web-0, watch it return with the SAME name and its data intact:
kubectl delete pod web-0
# note: if wait reports NotFound, web-0 has not been recreated yet; rerun the command
kubectl wait --for=condition=Ready pod/web-0 --timeout=60s
kubectl run client --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://web-0.web.default.svc.cluster.local
# expected: I am web-0   (PVC data-web-0 reattached by name)
  7. Scale down and observe reverse-order termination (PVCs remain):
kubectl scale statefulset web --replicas=3
kubectl get pods -l app=web
# expected: web-4 terminates, then web-3 (reverse order); web-0..web-2 remain
kubectl get pvc
# expected: data-web-3, data-web-4 still present (StatefulSet keeps PVCs)

Common pitfalls

These are the failures you are most likely to meet with StatefulSets:

   - pods stuck Pending          -> PVC unbound: no default StorageClass or matching PV
   - web-1 never starts          -> web-0 not Ready blocks ordered rollout
   - storage shared by mistake   -> use volumeClaimTemplates, not one shared PVC
   - PVCs linger after delete    -> StatefulSet leaves PVCs; clean up manually
   - no per-pod DNS              -> Service must be clusterIP: None and match serviceName
   - scale-up too slow           -> set podManagementPolicy: Parallel if order-agnostic

Key takeaways

  • StatefulSets give pods stable ordinal names and sticky identity.
  • A headless Service (clusterIP: None) provides per-pod DNS.
  • volumeClaimTemplates create one durable PVC per pod that reattaches by name.
  • Deploy/scale is ordered (up in order, down in reverse); a blocked pod stalls the rest.
  • PVCs persist after pod or StatefulSet deletion — clean them up deliberately.
  • Use partition to stage updates and Parallel policy when ordering is unneeded.

Checklist

  • [ ] Explained StatefulSet vs Deployment (identity, storage, ordering)
  • [ ] Created a headless Service and matched serviceName
  • [ ] Deployed a StatefulSet with volumeClaimTemplates and saw per-pod PVCs
  • [ ] Scaled up/down and observed ordered create/reverse delete
  • [ ] Resolved a pod by its web-0.web...svc DNS name
  • [ ] Staged a rolling update with a partition