Cluster Version Upgrade with kubeadm

Video: Day 34/40 — Perform a version upgrade on a Kubernetes cluster using kubeadm • 55 Days of Kubernetes playlist: • https://www.youtube.com/playlist?list=PLl4APkPHzsUUOkOv3i62UidrLmSB8DcGC

Published 21 Jun 2026

Key terms

Term	Meaning
kubeadm upgrade	Drives the cluster version bump
upgrade plan/apply	Preview / perform the upgrade
Skew policy	Upgrade one minor version at a time
drain/cordon/uncordon	The node maintenance flow
apt-mark hold	Pins package versions
kubelet	Upgraded per node after the plan applies

Problem & solution

Kubernetes ships a new minor release every ~4 months and only supports the last three. Clusters must be upgraded — safely, with no downtime, and in the right order. Skipping minors or upgrading nodes before the control plane breaks the cluster. kubeadm upgrade orchestrates this correctly.

Solution: Upgrade one minor at a time with kubeadm, control plane first (upgrade apply), then drain, upgrade, and uncordon each worker.

The analogy

You refit a busy port without ever closing it: you modernise the harbor master's office first, then take one berth out of service at a time, move its ships to other berths, refit it, and reopen it before touching the next berth. Because only one berth is ever shut, the port keeps handling cargo the whole time. Kubernetes upgrades the same way: bump the control plane first, then drain, upgrade, and uncordon each worker node one by one so the cluster never goes fully offline.

Where this fits in the cluster

The same cluster entities appear in every day's notes; the diagram below shows where this day's topic fits.

The golden rules

Break any of these and the upgrade can leave the cluster broken or unsupported, so treat them as hard constraints before you start.

   1. ONE minor at a time:   1.29 -> 1.30 -> 1.31  (never 1.29 -> 1.31)
   2. Control plane FIRST, then workers.
   3. components must stay within ONE minor of the api-server (kubelet skew: -3).
   4. Back up etcd BEFORE you start (see Day 35).

End-to-end: the upgrade flow

The diagram below shows the whole upgrade in order: the control plane goes first, then each worker is drained, upgraded, and made schedulable again one at a time, so the cluster never goes fully offline.

One minor at a time, control plane first, then nodes one by one.

Step 0 — point the package repo at the new minor (every node)

The community package repo (pkgs.k8s.io) is versioned per minor: the v1.29 repo only carries 1.29 packages. Before installing anything you must edit the repo file to the target minor on each node you upgrade (control plane and workers), or apt-get install kubeadm=1.30.x fails with "no installable candidate".

# change 1.29 -> 1.30 in the repo URL, then refresh the package list
sudo vi /etc/apt/sources.list.d/kubernetes.list   # .../core:/stable:/v1.29/... -> v1.30
sudo apt-get update
sudo apt-cache madison kubeadm                     # confirm 1.30.x is now available

Step 1 — first control-plane node

Start on the first control-plane node, because the control plane must lead the version bump. You install the new kubeadm, preview and apply the upgrade, then upgrade this node's own kubelet and kubectl.

# repo already bumped to v1.30 (Step 0); unhold + install the new kubeadm only
sudo apt-mark unhold kubeadm
sudo apt-get update && sudo apt-get install -y kubeadm=1.30.2-1.1
sudo apt-mark hold kubeadm

# see the plan, then apply
sudo kubeadm upgrade plan
sudo kubeadm upgrade apply v1.30.2        # upgrades apiserver/scheduler/etc + etcd

# now upgrade this node's kubelet + kubectl, then restart kubelet
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.30.2-1.1 kubectl=1.30.2-1.1
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload && sudo systemctl restart kubelet

Step 2 — additional control-plane nodes

Same as above but use kubeadm upgrade node (not apply):

sudo kubeadm upgrade node        # then upgrade kubelet/kubectl + restart as above

Step 3 — each worker (one at a time)

Only after the control plane is upgraded do you touch the workers, and just one at a time to keep capacity up. You drain it to move its pods elsewhere, upgrade the node, then allow scheduling again.

# from the control plane: evict workloads safely
kubectl drain <worker> --ignore-daemonsets --delete-emptydir-data

# on the worker: bump the repo to v1.30 (Step 0), then upgrade kubeadm, node config, kubelet
sudo vi /etc/apt/sources.list.d/kubernetes.list   # v1.29 -> v1.30, then apt-get update
sudo apt-mark unhold kubeadm && sudo apt-get install -y kubeadm=1.30.2-1.1 && sudo apt-mark hold kubeadm
sudo kubeadm upgrade node
sudo apt-mark unhold kubelet kubectl
sudo apt-get install -y kubelet=1.30.2-1.1 kubectl=1.30.2-1.1
sudo apt-mark hold kubelet kubectl
sudo systemctl daemon-reload && sudo systemctl restart kubelet

# back on the control plane: allow scheduling again
kubectl uncordon <worker>

Verify

After the upgrade, confirm every component actually moved to the target version and that nothing broke in the process.

kubectl version                       # server == target
kubectl get nodes                     # every node VERSION = v1.30.2, all Ready
kubectl get pods -A                    # nothing crashing after the upgrade

drain / cordon / uncordon

These three commands are the node-maintenance flow used in every worker upgrade. Here is exactly what each one does to the node.

   cordon    mark node unschedulable (no NEW pods); existing pods stay
   drain     cordon + EVICT existing pods (respecting PodDisruptionBudgets)
   uncordon  mark schedulable again (after the node is upgraded)

Common pitfalls

These are the mistakes that most often derail a kubeadm upgrade, each with the symptom it produces so you can recognise it quickly.

   - repo still on the old minor   -> apt can't find 1.30 packages; bump the
                                      repo file first (Step 0), on every node
   - skipping a minor              -> unsupported; upgrade one minor at a time
   - upgrading workers first       -> version skew breaks the kubelet/api-server
   - forgetting apt-mark hold      -> an unattended apt upgrade skews versions
   - no etcd backup first          -> a failed upgrade with no rollback path
   - drain hangs                   -> a PodDisruptionBudget blocks eviction; check PDBs
   - kubelet not restarted         -> node still reports the old version

End-to-end flow

A safe upgrade: back up etcd, do the control plane first, then drain and upgrade each worker.

Graph legend — each node is a real command or phase of the kubeadm upgrade:

Graph node	Maps to	What it does
Back up etcd first	`etcdctl snapshot save` (Day 35)	Gives a rollback point before any change
Install new kubeadm on the first control-plane node	`apt-get install -y kubeadm=1.30.2-1.1`	Brings in the new kubeadm only
kubeadm upgrade apply v1.30.x	`kubeadm upgrade apply`	Upgrades apiserver/scheduler/controller-manager + etcd
Upgrade kubelet and kubectl; restart kubelet	`apt-get install kubelet kubectl` + `systemctl restart kubelet`	Moves this node's components to the target version
Other control-plane nodes	`kubeadm upgrade node`	Upgrades remaining control-plane nodes
Per worker: kubectl drain	`kubectl drain <worker>`	Evicts pods respecting PodDisruptionBudgets
kubeadm upgrade node; upgrade kubelet; restart	`kubeadm upgrade node` + kubelet upgrade	Upgrades the drained worker
kubectl uncordon the worker	`kubectl uncordon <worker>`	Makes the node schedulable again
All nodes show the target version	the api-server	Confirms the upgrade completed cleanly

Key takeaways

Upgrade one minor at a time, control plane before workers.
kubeadm upgrade apply on the first control-plane node; upgrade node elsewhere.
Per node: install kubeadm -> upgrade -> install kubelet/kubectl -> restart.
drain before, uncordon after, for every worker.
Back up etcd first (Day 35); keep versions held between upgrades.

Checklist

[ ] Backed up etcd before starting
[ ] Bumped the package repo to the new minor and ran apt-cache madison kubeadm (each node)
[ ] Ran kubeadm upgrade plan and apply on the first control-plane node
[ ] Upgraded kubelet/kubectl and restarted the kubelet on each control-plane node
[ ] Drained, upgraded, and uncordoned each worker one at a time
[ ] Verified every node shows the target version and is Ready