Health Probes: Liveness vs Readiness (vs Startup)

Video: Day 18/40 — Kubernetes Health Probes | Liveness vs Readiness Probes • https://www.youtube.com/watch?v=x2e6pIBLKzw • Duration: ~29 min

Published 21 Jun 2026

Key terms

Term	Meaning
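Liveness probe	Restarts the container when the check keeps failing
Readiness probe	Keeps the pod out of Service endpoints whenever it is not ready
Startup probe	Holds off liveness and readiness checks until a slow-starting app has booted
httpGet/tcpSocket/exec	The three probe check methods covered here (Kubernetes also offers a grpc check)
initialDelaySeconds	Seconds to wait after container start before the first probe
periodSeconds/failureThreshold	Probe interval and how many consecutive failures are tolerated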

Problem & solution

A container can be "running" yet broken (deadlocked) or still warming up. Without health checks, Kubernetes will route traffic to bad pods and never restart hung ones, causing silent outages.

Solution: Add liveness (restart if stuck), readiness (gate traffic), and startup probes so Kubernetes only sends traffic to healthy containers.

The analogy

When a ship arrives, the harbor doctor boards it and asks three questions. Is the ship still alive at all? If not, send it back for repairs. Is it ready to load cargo right now? If not, hold the trucks back but leave it docked. Has it finished starting up after a long voyage? Until it has, the other two checks wait, so a slow boat gets time before it is judged. In Kubernetes the doctor is the kubelet, the alive check is the liveness probe that restarts the container, the ready check is the readiness probe that gates Service traffic, and the boot check is the startup probe that protects slow starters.

Where this fits in the cluster

Probes are defined per container. The kubelet on the node runs them; results drive container restarts (liveness) and Service endpoint membership (readiness).

Why probes?

A container can be running but broken (deadlocked) or not yet ready (still warming up). Probes let the kubelet check actual health and react.

The three probes

Kubernetes offers three probe kinds, each answering a different question and triggering a different reaction when it fails.

   Liveness   -> "Is it alive?"     fail -> RESTART the container
   Readiness  -> "Can it serve?"    fail -> remove from Service endpoints
   Startup    -> "Has it booted?"   fail -> kill and restart; holds off the other two until it passes

Liveness vs Readiness (the key distinction)

The two probes act on different axes: readiness decides whether a pod gets traffic, while liveness decides whether the container is restarted.

   Readiness controls TRAFFIC   (in/out of the load balancer)
   Liveness   controls LIFECYCLE (restart the container)

A pod can be Live but not Ready (alive, still warming up -> no traffic yet).

Probe types (how the check is done)

Each probe can run the health check in one of the three ways covered here, depending on what the app exposes. (Kubernetes also has a built-in grpc check, which these notes do not cover.)

   httpGet   -> GET a path/port; status 200-399 = pass
   tcpSocket -> can we open the TCP port? = pass
   exec      -> run a command in container; exit 0 = pass

YAML — all three

A single container can declare all three probes together; the startupProbe gates the others until the app has booted.

apiVersion: v1
kind: Pod
metadata:
  name: petclinic
spec:
  containers:
    - name: petclinic
      image: springcommunity/spring-petclinic-rest:3.0.2   # real Spring Boot app
      ports:
        - containerPort: 8080
      startupProbe:                                          # gate the slow JVM boot first
        httpGet: { path: /actuator/health, port: 8080 }
        failureThreshold: 30
        periodSeconds: 10                                    # allows up to 300s to boot
      readinessProbe:
        httpGet: { path: /actuator/health/readiness, port: 8080 }  # Spring Boot readiness group
        initialDelaySeconds: 5
        periodSeconds: 10
      livenessProbe:
        httpGet: { path: /actuator/health/liveness, port: 8080 }   # Spring Boot liveness group
        initialDelaySeconds: 10
        periodSeconds: 10

tcpSocket and exec variants

Instead of an HTTP check you can probe by opening a TCP port or running a command inside the container.

      readinessProbe:
        tcpSocket: { port: 5432 }                       # Postgres accepting connections
      livenessProbe:
        exec:
          command: ["pg_isready", "-U", "postgres"]     # Postgres liveness check

Tuning fields

These timing fields control how patient or aggressive a probe is before it declares success or failure.

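   initialDelaySeconds  wait this long after container start before the first check (default 0)
   periodSeconds        how often to check (default 10)
   timeoutSeconds       per-check timeout (default 1)
   successThreshold     consecutive passes to count as healthy (default 1; must be 1 for liveness and startup)
   failureThreshold     consecutive fails before acting: restart or mark unready (default 3)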
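
To make the arithmetic concrete, here is a minimal sketch of a liveness probe with every tuning field written out. The `/healthz` path, the port, and the chosen values are illustrative assumptions, not taken from the video.

```yaml
# Fragment of a container spec; path, port and values are illustrative assumptions.
livenessProbe:
  httpGet:
    path: /healthz          # hypothetical health endpoint
    port: 8080
  initialDelaySeconds: 10   # first check 10s after the container starts
  periodSeconds: 5          # check every 5s
  timeoutSeconds: 2         # a check taking longer than 2s counts as a failure
  successThreshold: 1       # must be 1 for liveness and startup probes
  failureThreshold: 3       # act after 3 consecutive failures
# Rough detection time once the app hangs:
#   at most about failureThreshold x periodSeconds = 3 x 5s = 15s,
#   plus the per-check timeout, before the kubelet restarts the container.
```

Lower values detect failures faster but make restarts more likely during brief slowdowns, so the choice is a trade-off between detection speed and tolerance for transient blips.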

Why startupProbe exists (slow apps)

Apps with long boot times need a startupProbe. Without one, liveness checks can start failing before the app is up, so the kubelet kills the container mid-startup and it ends up in a restart loop.
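
One way to see the benefit is to compare a startupProbe with the older workaround of a large `initialDelaySeconds` on the liveness probe. The sketch below assumes a hypothetical app that may take anywhere from a few seconds to four minutes to boot; the image name, path and numbers are illustrative only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-starter                    # hypothetical name
spec:
  containers:
    - name: app
      image: example.com/slow-app:1.0   # hypothetical image
      ports:
        - containerPort: 8080
      # Workaround without a startupProbe (commented out): the delay must cover
      # the worst-case boot, and a hang during that window goes unnoticed.
      # livenessProbe:
      #   httpGet: { path: /healthz, port: 8080 }
      #   initialDelaySeconds: 240
      startupProbe:
        httpGet: { path: /healthz, port: 8080 }
        periodSeconds: 10
        failureThreshold: 24            # boot budget: 24 x 10s = 240s
      livenessProbe:                    # only starts after the startupProbe succeeds
        httpGet: { path: /healthz, port: 8080 }
        periodSeconds: 5
        failureThreshold: 3             # after boot, a hang is caught in about 15s
```

A fast boot passes the startup check early and moves straight to the tight liveness settings, while a slow boot still gets the full budget.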

Timeline (ASCII)

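This is the order in which the probes activate over a pod's life, from boot to serving traffic to a possible restart.

   container starts
     -> startupProbe runs (readiness and liveness are held back)
          pass                    -> readiness and liveness begin
          fail x failureThreshold -> container killed and restarted
     -> readinessProbe
          pass -> pod added to Service endpoints (traffic flows)
          fail -> pod removed from endpoints (no restart)
     -> livenessProbe
          fail x failureThreshold -> kubelet restarts the container; the cycle starts again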

Inspect

Use these commands to see probe results, restart counts, and which pods are currently in a Service's endpoints.

kubectl describe pod app        # see probe results + restart reasons
kubectl get pod app             # READY column, RESTARTS count
kubectl get endpoints <svc>     # readiness controls who is listed here
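
As a rough illustration of what the last command reflects, the Endpoints object behind a Service keeps Ready and NotReady pods in separate lists. The sketch below is hand-written rather than captured output; the Service name and pod IPs are hypothetical.

```yaml
# Illustrative shape of `kubectl get endpoints <svc> -o yaml` (trimmed, hand-written).
apiVersion: v1
kind: Endpoints
metadata:
  name: my-svc                 # hypothetical Service name
subsets:
  - addresses:                 # readiness passing: these pods receive traffic
      - ip: 10.244.1.12        # hypothetical pod IP
    notReadyAddresses:         # readiness failing: still running, but no traffic
      - ip: 10.244.2.7         # hypothetical pod IP
    ports:
      - port: 8080
        protocol: TCP
```

A readiness failure moves a pod's IP from `addresses` to `notReadyAddresses`; it does not restart anything.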

End-to-end example: Spring Boot PetClinic readiness gates traffic

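Deploy the real Spring PetClinic REST app behind a Service with all three probes. The walkthrough relies on Spring Boot Actuator reporting the readiness group DOWN when the Postgres database is unreachable while the liveness group stays UP. Spring Boot only behaves this way if the readiness health group includes the database indicator, because by default that group contains just the app's own readiness state. With that in place, breaking the DB drops the pod out of the Service endpoints without restarting it.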
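
A minimal sketch of Actuator settings that would tie readiness to the database is shown below. Whether the PetClinic image already ships such a configuration is not stated in these notes, so treat this file as an assumption rather than the project's actual config.

```yaml
# application.yaml (Spring Boot Actuator); a sketch, not PetClinic's actual configuration.
management:
  endpoint:
    health:
      probes:
        enabled: true                  # expose /actuator/health/liveness and /readiness
      group:
        readiness:
          include: readinessState,db   # DB down -> readiness DOWN -> probe gets 503
        liveness:
          include: livenessState       # no external dependencies, so it stays UP
```

Keeping external dependencies out of the liveness group avoids restarting otherwise healthy containers when a shared database goes down.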

Graph legend — each node maps to a concrete PetClinic probe outcome:

Graph node	Maps to	What it does
readiness pass	`readinessProbe.httpGet /actuator/health/readiness` returns 200	Marks the pod Ready so the Service can route to it
petclinic pod in Service endpoints	the Service's `Endpoints` object	Lists the Ready pod IP as a backend
readiness fail (Postgres down)	Spring Boot readiness group flips `DOWN`	The probe returns 503; the pod becomes NotReady
pod removed from endpoints	the endpoints controller drops the IP from the ready addresses in `Endpoints`	Service stops sending traffic, pod keeps running
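liveness fail	`livenessProbe.httpGet /actuator/health/liveness` returns a status outside 200-399 (Actuator reports `DOWN` as 503)	After `failureThreshold` consecutive failures, the kubelet kills and restarts the container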

apiVersion: apps/v1
kind: Deployment
metadata: { name: petclinic }
spec:
  replicas: 1
  selector: { matchLabels: { app: petclinic } }
  template:
    metadata: { labels: { app: petclinic } }
    spec:
      containers:
        - name: petclinic
          image: springcommunity/spring-petclinic-rest:3.0.2
          ports: [{ containerPort: 8080 }]
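          startupProbe:                                         # gate the slow JVM boot first
            httpGet: { path: /actuator/health, port: 8080 }
            failureThreshold: 30
            periodSeconds: 10                                   # allows up to 300s to boot
          readinessProbe:                                       # DOWN when Postgres is unreachable
            httpGet: { path: /actuator/health/readiness, port: 8080 }
            periodSeconds: 5
          livenessProbe:                                        # stays UP even when the DB is down
            httpGet: { path: /actuator/health/liveness, port: 8080 }
            initialDelaySeconds: 30
            periodSeconds: 10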
---
apiVersion: v1
kind: Service
metadata: { name: petclinic }
spec:
  selector: { app: petclinic }
  ports: [{ port: 80, targetPort: 8080 }]

kubectl apply -f petclinic-probes.yaml
kubectl get endpoints petclinic            # pod IP listed once readiness passes
kubectl scale deploy postgres --replicas=0 # break the DB dependency
kubectl get pod -l app=petclinic           # READY 0/1, but still Running (liveness OK)
kubectl get endpoints petclinic            # pod IP removed -> Service sends no traffic
kubectl scale deploy postgres --replicas=1 # restore -> readiness passes, pod re-added
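
The walkthrough above only makes readiness fail. For the liveness item on the checklist, one common approach is an exec probe against a file that the container deletes after a while, similar to the example in the Kubernetes documentation. The pod name, image tag and timings below are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo            # hypothetical name
spec:
  containers:
    - name: demo
      image: busybox:1.36
      # Healthy for 30s, then the file disappears and the probe starts failing.
      args: ["/bin/sh", "-c", "touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600"]
      livenessProbe:
        exec:
          command: ["cat", "/tmp/healthy"]   # exit 0 while the file exists
        initialDelaySeconds: 5
        periodSeconds: 5                     # the default failureThreshold of 3 applies
```

Within a minute or two, `kubectl get pod liveness-demo` should show the RESTARTS count go up, and `kubectl describe pod liveness-demo` should list the failed liveness checks in its events.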

End-to-end flow

Probes gate a container's life: startup first, then readiness controls traffic and liveness controls restarts.

Graph legend — each node maps to a concrete probe field or PetClinic outcome:

Graph node	Maps to	What it does
startupProbe GET /actuator/health	`startupProbe.httpGet`	Gates the slow JVM boot before the other probes run
readiness and liveness begin	first successful startup probe	Startup passed, so readiness and liveness checks start
kill and restart (JVM never booted)	startup exhausts `failureThreshold`	Container is killed and restarted if it never boots
readinessProbe /actuator/health/readiness	`readinessProbe.httpGet`	Decides whether the pod receives Service traffic
added to Service petclinic endpoints	the Service `Endpoints`	Ready pod IP becomes a routable backend
livenessProbe /actuator/health/liveness	`livenessProbe.httpGet`	Decides whether the container is restarted
kubelet restarts the container	the node's kubelet	Restarts the container on repeated liveness failure

Key takeaways

Liveness restarts; Readiness gates traffic; Startup protects slow boots.
A pod can be Live but not Ready (running, no traffic).
Probe via httpGet / tcpSocket / exec; tune with delay/period/thresholds.

Checklist

[ ] Added readiness + liveness probes to a pod
[ ] Forced readiness to fail and saw it drop from Service endpoints
[ ] Forced liveness to fail and saw RESTARTS increment
[ ] Used a startupProbe for a slow-starting app