Skip to content

CRI Purge - Cleanup of Cached Kubernetes CRI Images

Notifications You must be signed in to change notification settings

reefland/cri-purge

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 
 
 
 
 
 
 

Repository files navigation

CRI Purge - Cleanup Cached Kubernetes Container Images

CRI Purge, is a BASH script I wrote to help cleanup disk space of cached Kubernetes container images. The script interacts with crictl images command to generate a list of cached images. Unlike crictl images --prune option which will simply delete all cached images (requiring all active images to be downloaded again on next pod restart), the cri-purge.sh script tries to add a little intelligence to how it purges images.

  • cri-purge.sh will process each image's list of cached versions in a best effort to determine semantic version from oldest to newest (v1.7.1, v1.8.0, v1.8.1, v1.8.2, v1.9.0) with the goal to determine and retain only the newest cached version downloaded -- Keeping the image most likely to be needed on pod restart.

  • Images with multiple tag <none> (dangling) can optionally be purged. At least one image will be retained. If one of the images hashes can be associated to a running container it be selected to be retained and remainder of the images purged.

  • cri-purge.sh makes no attempt to determine if the cached images are still deployed. If you have uninstalled an application, its cached image may still be retained. cri-purge.sh job is just to determine the latest of whatever is currently cached, keep the latest and purge the older version(s).

  • All interactions with images is via crictl. The cri-purge.sh script does not do any kind of direct manipulation of images stored or any other file system changes.

Keep in mind the latest cached downloaded image is specific to an individual node. Other nodes may have the same version or newer or older. cri-purge.sh would be executed on each node individually.

  • Each nodes perspective on what the latest version is will be dependant on how pods were scheduled in the past.

Lastly, not all containers use semantic versioning. Worst case, if cri-purge.sh gets it wrong and the latest version is purged, then the image is simply downloaded again upon next pod restart. No worse than using crictl images --prune.


Installation

curl -LO https://raw.githubusercontent.com/reefland/cri-purge/master/cri-purge.sh

sudo install -o root -g root -m 0755 cri-purge.sh  /usr/local/sbin/cri-purge

Usage

$ cri-purge

* ERROR: ROOT privilege required to access CRICTL binaries.

  cri-purge.sh | Version: 0.1.2 | 06/10/2024 | Richard J. Durso

  List and Purge downloaded cached images from containerd.
  -----------------------------------------------------------------------------

  This script requires sudo access to CRICTL (or CRIO-STATUS for OpenShift) to
  obtain a list of cached downloaded images and remove specific older images.
  It will do best effort to honor semantic versioning always leaving the newest
  version of the downloaded image and only purge previous version(s).

  -h,   --help            : This usage statement.
  -dp,  --dry-run         : Dry run --purge, Do not actually purge any images.
  -dpd, --dry-run-dangling: Same as --dry-run, but include dangling images.
  -l,   --list            : List cached images and purgable versions.
  -p,   --purge           : List images and PURGE/PRUNE older versioned images.
  -pd,  --purge-dangling  : Same as --purge, but include dangling images.
  -s,   --show-dangling   : List dangling images with '<none>' tag.

Kubernetes Garbage Collection

Kubelet which is the primary node agent that runs on each node has two parameters which can help maintain disk space:

  • --image-gc-high-threshold
  • --image-gc-low-threshold

By default these ranges are set to 85% and 80%, meaning when the disk is 85% full start garbage collection of images until space drops below 80%. These values are too high for my comfort, I reduced them to 65% and 50% to keep them well below Prometheus monitoring levels.

I still use the cri-purge.sh script to review which images are cached and manually execute it when needed, such as while doing node maintenance before applying Ubuntu patches.


To list Cached Images

sudo cri-purge --list

The output will show the Total Images cached (all versions) and Unique Images Names cached:

Total Images: 129 Unique Images Names: 67

The output will then present the images. It will indicate the version tag to KEEP and indicate which version tags could be purged. Below shows where Grafana is image 7 of 67 Unique Image Names:

7 / 67 : Image: docker.io/grafana/grafana - Keep TAG: 9.3.1 (97.9MB)
- Purgeable TAG: 9.1.4 (92.4MB)
- Purgeable TAG: 9.2.3 (108MB)
- Purgeable TAG: 9.2.4 (106MB)
- Purgeable TAG: 9.3.0 (97.8MB)

Images that do not have older versions to purge will simply list the current version to KEEP:

62 / 67 : Image: registry.k8s.io/sig-storage/csi-attacher - Keep TAG: v4.0.0 (23.8MB)
63 / 67 : Image: registry.k8s.io/sig-storage/csi-node-driver-registrar - Keep TAG: v2.5.1 (9.13MB)
64 / 67 : Image: registry.k8s.io/sig-storage/csi-provisioner - Keep TAG: v3.3.0 (25.5MB)
65 / 67 : Image: registry.k8s.io/sig-storage/csi-resizer - Keep TAG: v1.6.0 (24.1MB)
66 / 67 : Image: registry.k8s.io/sig-storage/csi-snapshotter - Keep TAG: v6.1.0 (23.9MB)

To purge Cached Images

sudo cri-purge --purge

The output will look much the same as list however, this time it will indicate the version tag(s) PURGED:

7 / 67 : Image: docker.io/grafana/grafana - Keep TAG: 9.3.1 (97.9MB)
- Purge TAG: 9.1.4 (92.4MB)
- Purge TAG: 9.2.3 (108MB)
- Purge TAG: 9.2.4 (106MB)
- Purge TAG: 9.3.0 (97.8MB)

Here cri-purge.sh has instructed crictl to remove older Grafana images 9.1.4, 9.2.3, 9.2.4 and 9.3.0 but keep image tag 9.3.1.

At the end cri-purge.sh tries to estimate how much disk space was cleaned up:

Disk Space Change:  9.78G

To purge Cached and Dangling Images

sudo cri-purge --purge-dangling

Dangling images have a <none> tag and not a version number tag. The output will look much the same as purge however, this time it will include processing of dangling images as wall as versioned images.

39 / 67 : Image: quay.io/ceph/ceph - Keep TAG: 3c937764e6f5d (446MB)
- Purge TAG: c250cbc76d9d8 (446MB)
- Purge TAG: 92ed5260ce1f6 (446MB)
- Purge TAG: 9476b199af81e (446MB)

Here cri-purge.sh has checked crictl and determined hash 3c937764e6f5d is associated with a running container and will be retained and the remaining hashes are purged.


To dry-run or dry-run-dangling

Dry run will go through all the steps to determine images to be purged, but will NOT actually purge any images.

sudo cri-purge --dry-run
  • Dry run of versioned image purge process.
sudo cri-purge --dry-run-dangling
  • Dry run of dangling images (tag: <none>) purge process.

The dry-run process will include header output of DRY-RUN - No images will be purged.


Using it with OpenShift (Code Ready Containers)

The steps below have been verified on OpenShift CRC 4.13 on RedHat CoreOS:

CRC version: 2.27.0+71615e
OpenShift version: 4.13.12
Podman version: 4.4.4

Snapshot the CRC VM:

sudo virsh --connect=qemu:///system detach-device-alias crc fs0 --live

sleep 3

sudo virsh --connect=qemu:///system snapshot-create-as --atomic --domain crc --name backup

Log-in to the CRC host:

ssh -F/dev/null -i ~/.crc/machines/crc/id_ecdsa [email protected]

Install the script utility as described above. Then purge unversioned (dangling) images:

sudo cri-purge --purge

Shutdown the CRC vm, free up the released disk space, then start it back:

crc stop

sudo virt-sparsify --in-place ~/.crc/machines/crc/crc.qcow2

crc start