Upgrading
Note: Kubernetes 1.24 includes some major changes. It is important to read and understand the changes before you attempt to upgrade.
Some of the important changes to note for this release:
- The kubernetes-master charm has been renamed kubernetes-control-plane in line with upstream inclusive naming initiatives. This upgrade will take you through the process of upgrading to the new charm, but because Juju will not rename a deployed unit, it will still appear in your model as kubernetes-master.
- All the charms have relocated from the old Juju Charm Store to the new Charmhub. This means that upgrading each charm will require the use of --switch during the upgrade, as detailed in the following instructions.
- The default CNI for new installs is now Calico, instead of Flannel. The Flannel charm is still supported and you can upgrade to the latest version as normal.
Before you begin
As with all upgrades, unforeseen difficulties are possible. It is highly recommended that you make a backup of any important data, including any running workloads. For more details on creating backups, see the separate documentation on backups.
You should also make sure:
- The machine from which you will perform the upgrade has sufficient internet access to retrieve updated software
- Your cluster is running normally
- Your Juju client and controller/models are running the latest versions (see the Juju docs)
- You have read the Upgrade notes to see if any caveats apply to the versions you are upgrading to/from
- You have read the Release notes for the version you are upgrading to, which will alert you to any important changes to the operation of your cluster
- You have read the Upstream release notes for details of deprecation notices and API changes for Kubernetes 1.24 which may impact your workloads.
It is also important to understand that Charmed Kubernetes will only upgrade, and if necessary migrate, components relating specifically to elements of Kubernetes installed and configured as part of Charmed Kubernetes. This may not include any customised configuration of Kubernetes, user-generated objects (e.g. storage classes), or deployments which rely on deprecated APIs.
Infrastructure updates
The applications which run alongside the core Kubernetes components can be upgraded at any time. These applications are widely used and may frequently receive upgrades outside of the cycle of new releases of Kubernetes.
This includes:
- containerd
- easyrsa
- etcd
- calico, flannel or other CNI charms
Note that this may include other applications which you may have installed, such as Elasticsearch, Prometheus, Nagios, Helm, etc.
Upgrading Containerd
By default, Charmed Kubernetes 1.15 and later use Containerd as the container runtime. This subordinate charm can be upgraded with the command:
juju upgrade-charm containerd --switch ch:containerd --channel 1.24/stable
Migrating to Containerd
Upstream support for the Docker container runtime was removed in the 1.24 release. Thus, the docker subordinate charm will no longer function from Charmed Kubernetes 1.24 onwards.
If you are upgrading from a version of Charmed Kubernetes that uses the docker subordinate charm for the container runtime, transition to containerd by following the steps outlined in this section of the upgrade notes.
Upgrading etcd
As etcd manages critical data for the cluster, it is advisable to create a snapshot of this data before running an upgrade. This is covered in more detail in the documentation on backups, but the basic steps are:
1. Run the snapshot action on the charm
juju run-action etcd/0 snapshot --wait
You should see confirmation of the snapshot being created, and the command needed to download the snapshot from the etcd unit. See the following truncated, example output:
...
copy:
cmd: juju scp etcd/40:/home/ubuntu/etcd-snapshots/etcd-snapshot-2020-11-18-21.37.11.tar.gz .
...
2. Fetch a local copy of the snapshot
You can use the juju scp command from the output above to download a local copy. For example:
juju scp etcd/40:/home/ubuntu/etcd-snapshots/etcd-snapshot-2020-11-18-21.37.11.tar.gz .
Substitute in your own etcd unit number and filename, or copy and paste the command from the previous output. Remember to add the ` .` at the end to copy to your local directory!
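If you prefer not to copy the command by hand, the line can be pulled out of the action output. The following is a minimal sketch, assuming the output contains a single "cmd:" line in the form shown above; the function name snapshot_copy_cmd is hypothetical and not part of Juju or the etcd charm.

```shell
# Print the download command found on the "cmd:" line of the action output.
# Reads the action output on stdin; prints nothing if no "cmd:" line exists.
# (snapshot_copy_cmd is a hypothetical helper name used for illustration.)
snapshot_copy_cmd() {
    sed -n 's/^[[:space:]]*cmd:[[:space:]]*//p'
}
```

For example, `juju run-action etcd/0 snapshot --wait | snapshot_copy_cmd` should print the juju scp command, which you can review before running it.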
3. Upgrade the charm
You can now upgrade the etcd charm:
juju upgrade-charm etcd --switch ch:etcd --channel 1.24/stable
4. Upgrade etcd
To upgrade etcd itself, you will need to set the etcd charm’s channel config.
For 1.24, the etcd charm is configured to use the 3.4/stable channel as in the previous release, but it is worth checking the configuration:
juju config etcd
If you need to update it, you can set the etcd charm’s channel config:
juju config etcd channel=3.4/stable
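The check-then-set step can be combined so the option is only changed when needed. This is a sketch under the assumption that `juju config <application> channel` prints just the current value; ensure_channel is a hypothetical helper name.

```shell
# Set an application's "channel" option only if it differs from the desired value.
# Usage: ensure_channel <application> <channel>
# (ensure_channel is a hypothetical helper, not a Juju command.)
ensure_channel() {
    app=$1
    want=$2
    # Asking for a single key prints that key's current value.
    have=$(juju config "$app" channel)
    if [ "$have" = "$want" ]; then
        echo "$app already uses channel $want"
    else
        juju config "$app" channel="$want"
    fi
}
```

For the etcd case described above, this would be called as `ensure_channel etcd 3.4/stable`.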
Upgrading additional components
The other infrastructure applications can be upgraded by running the upgrade-charm command. However, unlike previous upgrades, you will need to use --switch to reset the source to charmhub.io:
juju upgrade-charm easyrsa --switch ch:easyrsa --channel 1.24/stable
Any other infrastructure charms should be upgraded in a similar way. For example, if you are using the flannel CNI:
juju upgrade-charm flannel --switch ch:flannel --channel 1.24/stable
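Because every charm follows the same pattern, it may help to generate the commands and review them before running anything. The sketch below only prints the commands; switch_cmd is a hypothetical helper, and the default channel of 1.24/stable is taken from the examples on this page.

```shell
# Print the upgrade-charm command for an application, without running it.
# Usage: switch_cmd <application> [charmhub-name] [channel]
# (switch_cmd is a hypothetical helper used for illustration.)
switch_cmd() {
    app=$1
    charm=${2:-$app}            # Charmhub name defaults to the application name
    channel=${3:-1.24/stable}   # default taken from the examples on this page
    printf 'juju upgrade-charm %s --switch ch:%s --channel %s\n' "$app" "$charm" "$channel"
}

# Infrastructure charms named in this section; adjust the list to match your model.
for name in easyrsa flannel; do
    switch_cmd "$name"
done
```

The optional second argument covers applications whose Charmhub name differs from the deployed name, for example `switch_cmd kubernetes-master kubernetes-control-plane`.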
Note: Some services may be briefly interrupted during the upgrade process. Upgrading your CNI (e.g. flannel) will cause a small amount of network downtime. Upgrading easyrsa will not cause any downtime. The behaviour of other components you have added to your cluster may vary - check individual documentation for these charms for more information on upgrades.
Upgrading Kubernetes
Before you upgrade the Kubernetes components, you should be aware of the exact release you wish to upgrade to.
The Kubernetes charms use snap channels to manage the version of Kubernetes to use. Channels are explained in more detail in the official snap documentation, but in terms of Kubernetes all you need to know are the major and minor version numbers and the ‘risk-level’:
Risk level | Description |
---|---|
stable | The latest stable released version of Kubernetes |
candidate | Release candidate versions of Kubernetes |
beta | Latest alpha/beta of Kubernetes for the specified release |
edge | Nightly builds of the specified release of Kubernetes |
For most use cases, it is strongly recommended to use the ‘stable’ version of charms.
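A channel value is easy to mistype, so a quick format check before passing it to a charm can be useful. This sketch only checks the major.minor/risk-level form described here and the four risk levels in the table; it does not confirm that the channel actually exists. The name valid_channel is hypothetical.

```shell
# Return success if the argument looks like <major>.<minor>/<risk-level>.
# Usage: valid_channel 1.24/stable
# (valid_channel is a hypothetical helper; it checks the format only.)
valid_channel() {
    track=${1%%/*}   # text before the first slash
    risk=${1#*/}     # text after the first slash
    case $risk in
        stable|candidate|beta|edge) ;;
        *) return 1 ;;
    esac
    case $track in
        *[!0-9.]*|*.*.*|.*|*.|'') return 1 ;;  # non-digits, two dots, or an empty part
        *.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `valid_channel 1.24/stable && echo ok` prints ok, while `valid_channel 1.24` fails because the risk level is missing.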
Upgrading the kubeapi-load-balancer
A core part of Charmed Kubernetes is the kubeapi-load-balancer component. To ensure API service continuity this upgrade should precede any upgrades to the Kubernetes master and worker units.
juju upgrade-charm kubeapi-load-balancer --switch ch:kubeapi-load-balancer --channel 1.24/stable
The load balancer itself is based on NGINX, and the version reported by juju status is that of NGINX rather than Kubernetes. Unlike the other Kubernetes components, there is no need to configure a specific Kubernetes channel or version for this charm.
Upgrading the kubernetes-master units
As noted at the beginning of this page, kubernetes-master has been renamed kubernetes-control-plane. Following the upgrade, the deployed charm will STILL be known as kubernetes-master to Juju, as it is impossible to change the name of deployed charms.
To start upgrading the Kubernetes master units, first upgrade the charm:
juju upgrade-charm kubernetes-master --switch ch:kubernetes-control-plane --channel 1.24/stable
Once the charm has been upgraded, it can be configured to select the desired Kubernetes channel, which takes the form Major.Minor/risk-level. This is then passed as a configuration option to the charm. So, for example, to select the stable 1.24 version of Kubernetes, you would enter:
juju config kubernetes-master channel=1.24/stable
Note that although the kubernetes-control-plane charm was used, it is still referred to as kubernetes-master by Juju, and you will need to use that name for any config, scaling or relation operations.
Once the desired version has been configured, the upgrades should be performed. This is done by running the upgrade action on each master unit in the cluster:
juju run-action kubernetes-master/0 upgrade
juju run-action kubernetes-master/1 upgrade
If you have more master units in your cluster, you should continue and run this process on every one of them.
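With several units, a small loop avoids repeating the command by hand. This is a sketch, not the documented procedure: upgrade_units is a hypothetical helper, and it adds --wait so each action finishes before the next starts. It stops only if the juju command itself returns an error, so you should still read each action's output for failures.

```shell
# Run the "upgrade" action on each unit given as an argument, one at a time.
# Stops if the juju command exits with an error.
# (upgrade_units is a hypothetical helper used for illustration.)
upgrade_units() {
    for unit in "$@"; do
        echo "Upgrading $unit"
        juju run-action "$unit" upgrade --wait || return 1
    done
}
```

It would be called with the unit names from your own model, for example `upgrade_units kubernetes-master/0 kubernetes-master/1`.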
You can check the progress of the upgrade by running:
juju status | grep master
Ensure that all the master units have upgraded and are reporting normal status before continuing to upgrade the worker units.
ceph-storage relation deprecated
The kubernetes-control-plane:ceph-storage relation has been deprecated. After upgrading the kubernetes-control-plane charm, the charm may enter blocked status with the message:
ceph-storage relation deprecated, use ceph-client instead
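If you see this message, you can resolve it by removing the ceph-storage relation (if the application still appears as kubernetes-master in your model, use that name in the command instead):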
juju remove-relation kubernetes-control-plane:ceph-storage ceph-mon
Upgrading the kubernetes-worker units
For a running cluster, there are two different ways to proceed:
- Blue-green upgrade - This requires more resources, but should ensure a safe, zero-downtime transition of workloads to an updated cluster
- In-place upgrade - This simply upgrades the workers in situ, which may involve some service interruption but doesn’t require extra resources
Both methods are outlined below. The blue-green method is recommended for production systems.
Blue-green upgrade
To begin, upgrade the kubernetes-worker charm itself:
juju upgrade-charm kubernetes-worker --switch ch:kubernetes-worker --channel 1.24/stable
Next, run the command to configure the workers for the version of Kubernetes you wish to run (as you did previously for the master units). For example:
juju config kubernetes-worker channel=1.24/stable
Now add additional kubernetes-worker units. You should add as many units as you are replacing. For example, to add three additional units:
juju add-unit kubernetes-worker -n 3
This will create new units to migrate the existing workload to. As you configured the version prior to adding the units, they will be using the newly-selected version of Kubernetes.
Now we can pause the existing workers, which will cause the workloads to migrate to the new units recently added. A worker unit is paused by running the corresponding action on that unit:
juju run-action kubernetes-worker/0 pause
juju run-action kubernetes-worker/1 pause
juju run-action kubernetes-worker/2 pause
...
Continue until all the ‘old’ units have been paused. You can check on the workload status by running the command:
kubectl get pod -o wide
Once the workloads are running on the new units, it is safe to remove the old units:
juju remove-unit kubernetes-worker/0
Removing these units from the model will also release the underlying machines/instances they were running on, so no further clean up is required.
Note: A variation on this method is to add, pause, remove and recycle units one at a time. This reduces the resource overhead to a single extra instance.
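The one-at-a-time variation can be written out as a plan to review before running anything. The sketch below only prints the steps for a single old unit, using the commands already shown in this section; recycle_plan is a hypothetical helper, and the --wait flag on the pause action is an addition not shown above.

```shell
# Print, without running, the steps for recycling one old worker unit.
# Usage: recycle_plan kubernetes-worker/0
# (recycle_plan is a hypothetical helper used for illustration.)
recycle_plan() {
    old_unit=$1
    app=${old_unit%%/*}   # application name is the part before the slash
    echo "juju add-unit $app -n 1"
    echo "# wait until the new unit is active and idle in juju status"
    echo "juju run-action $old_unit pause --wait"
    echo "# confirm with kubectl get pod -o wide that workloads have moved"
    echo "juju remove-unit $old_unit"
}
```

Running `recycle_plan kubernetes-worker/0` prints the sequence for that unit; repeat for each remaining old unit.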
In-place upgrade
To proceed with an in-place upgrade, first upgrade the charm itself:
juju upgrade-charm kubernetes-worker --switch ch:kubernetes-worker --channel 1.24/stable
Next, run the command to configure the workers for the version of Kubernetes you wish to run (as you did previously for the control-plane units). For example:
juju config kubernetes-worker channel=1.24/stable
All the units can now be upgraded by running the upgrade action on each one:
juju run-action kubernetes-worker/0 upgrade
juju run-action kubernetes-worker/1 upgrade
...
Upgrading the Machine’s Series
All of the charms support upgrading the machine’s series via Juju.
As each machine is upgraded, the applications on that machine will be stopped and the unit will go into a blocked status until the upgrade is complete. For the worker units, pods will be drained from the node and onto one of the other nodes at the start of the upgrade, and the node will be removed from the pool until the upgrade is complete.
Verify an Upgrade
Once an upgrade is complete and units settle, the output from:
juju status
… should indicate that all units are active and the correct version of Kubernetes is running.
It is recommended that you run a cluster validation to ensure that the cluster is fully functional.
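On a large model it can be quicker to list only the units that are not yet active. This is a rough sketch that assumes the default tabular layout of juju status, where the unit table starts with a "Unit  Workload ..." header row and ends at a blank line; units_not_active is a hypothetical helper name.

```shell
# List units whose workload status is not "active".
# Reads tabular juju status output on stdin and prints "<unit> <status>".
# (units_not_active is a hypothetical helper; it assumes the default table layout.)
units_not_active() {
    awk '
        $1 == "Unit" && $2 == "Workload" { in_units = 1; next }  # header row of the unit table
        NF == 0 { in_units = 0 }                                   # a blank line ends the table
        in_units && $2 != "active" {
            name = $1
            sub(/\*$/, "", name)   # drop the leader marker, if present
            print name, $2
        }
    '
}
```

For example, `juju status | units_not_active` prints nothing once every unit reports active.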
Known Issues
A current bug in Kubernetes could prevent the upgrade from properly deleting old pods. You can see such an issue here:
kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default nginx-ingress-kubernetes-worker-controller-r8d2v 0/1 Terminating 0 17m
ingress-nginx-kubernetes-worker default-http-backend-kubernetes-worker-5d9bb77bc5-76c8w 1/1 Running 0 10m
ingress-nginx-kubernetes-worker nginx-ingress-controller-kubernetes-worker-5dcf47fc4c-q9mh6 1/1 Running 0 10m
kube-system heapster-v1.6.0-beta.1-6db4b87d-phjvb 4/4 Running 0 16m
kube-system kube-dns-596fbb8fbd-bp8lz 3/3 Running 0 18m
kube-system kubernetes-dashboard-67d4c89764-nwxss 1/1 Running 0 18m
kube-system metrics-server-v0.3.1-67bb5c8d7-x9nzx 2/2 Running 0 17m
kube-system monitoring-influxdb-grafana-v4-65cc9bb8c8-mwvcm 2/2 Running 0 17m
In this case the nginx-ingress-kubernetes-worker-controller-r8d2v has been stuck in the Terminating state for roughly 10 minutes. The workaround for such a problem is to force a deletion:
kubectl delete po/nginx-ingress-kubernetes-worker-controller-r8d2v --force --grace-period=0
This will result in output similar to the following:
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
pod "nginx-ingress-kubernetes-worker-controller-r8d2v" force deleted
You should verify that the pod has been successfully removed:
kubectl get po --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
ingress-nginx-kubernetes-worker default-http-backend-kubernetes-worker-5d9bb77bc5-76c8w 1/1 Running 0 11m
ingress-nginx-kubernetes-worker nginx-ingress-controller-kubernetes-worker-5dcf47fc4c-q9mh6 1/1 Running 0 11m
kube-system heapster-v1.6.0-beta.1-6db4b87d-phjvb 4/4 Running 0 17m
kube-system kube-dns-596fbb8fbd-bp8lz 3/3 Running 0 19m
kube-system kubernetes-dashboard-67d4c89764-nwxss 1/1 Running 0 19m
kube-system metrics-server-v0.3.1-67bb5c8d7-x9nzx 2/2 Running 0 18m
kube-system monitoring-influxdb-grafana-v4-65cc9bb8c8-mwvcm 2/2 Running 0 18m
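To spot stuck pods without reading the whole listing, the output can be filtered on its STATUS column. This sketch assumes the six-column layout shown above (NAMESPACE, NAME, READY, STATUS, RESTARTS, AGE); terminating_pods is a hypothetical helper name.

```shell
# Print "<namespace> <name>" for every pod whose STATUS column is Terminating.
# Reads the output of kubectl get po --all-namespaces on stdin.
# (terminating_pods is a hypothetical helper used for illustration.)
terminating_pods() {
    # NR > 1 skips the header row; STATUS is the fourth column.
    awk 'NR > 1 && $4 == "Terminating" { print $1, $2 }'
}
```

For example, `kubectl get po --all-namespaces | terminating_pods` lists candidates for the forced deletion described above. A pod that has only just begun terminating will also appear, so check its age before forcing anything.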