Troubleshooting

Node bootstrap failures when using CABPK with cloud-init

Failures during Node bootstrapping can have many different causes. For example, Cluster API resources might be misconfigured, or there might be problems with the network. The following steps describe how to troubleshoot bootstrap failures systematically.

1. Access the Node via ssh.
2. Take a look at the cloud-init logs via `less /var/log/cloud-init-output.log` or `journalctl -u cloud-init --since "1 day ago"`. (Note: cloud-init persists the logs of the commands it executes, such as kubeadm, only after they have returned.)
3. It might also be helpful to take a look at `journalctl --since "1 day ago"`.
4. If you see that kubeadm times out waiting for the static Pods to come up, take a look at:
   1. containerd: `crictl ps -a`, `crictl logs <container-id>`, `journalctl -u containerd`
   2. Kubelet: `journalctl -u kubelet --since "1 day ago"` (Note: it might be helpful to increase the Kubelet log level, e.g. by setting `--v=8` via `systemctl edit --full kubelet && systemctl restart kubelet`)
5. If Node bootstrapping consistently fails and the kubeadm logs are not verbose enough, the kubeadm verbosity can be increased via `KubeadmConfigSpec.Verbosity`.
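
The log sources named in the steps above can be gathered in one pass, which is convenient when attaching them to a bug report. The following is a minimal sketch, not part of Cluster API: the helper name `collect_bootstrap_logs`, the default output directory, and the file names are assumptions made for illustration. It is meant to be run as root on the affected Node.

```shell
#!/usr/bin/env bash
# Sketch: copy the bootstrap-related logs into a single directory.
# collect_bootstrap_logs is a hypothetical helper, not a Cluster API tool.
# Usage: collect_bootstrap_logs [output-dir] [since]

collect_bootstrap_logs() {
  local out_dir="${1:-/tmp/bootstrap-logs}"
  local since="${2:-1 day ago}"
  mkdir -p "$out_dir"

  # cloud-init output; kubeadm output shows up here once the command has returned
  if [ -r /var/log/cloud-init-output.log ]; then
    cp /var/log/cloud-init-output.log "$out_dir/cloud-init-output.log"
  fi

  # systemd journal, once per relevant unit and once unfiltered
  local unit
  for unit in cloud-init containerd kubelet; do
    journalctl -u "$unit" --since "$since" --no-pager \
      > "$out_dir/journal-$unit.log" 2>&1 || true
  done
  journalctl --since "$since" --no-pager > "$out_dir/journal-all.log" 2>&1 || true

  # container state as seen by the CRI runtime
  crictl ps -a > "$out_dir/crictl-ps.log" 2>&1 || true

  # print the directory so callers can archive or copy it
  echo "$out_dir"
}
```

For example, `collect_bootstrap_logs /tmp/node-debug "2 hours ago"` writes the files to `/tmp/node-debug`. Failures of individual commands are ignored on purpose so that a missing unit or a missing `crictl` binary does not stop the collection.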

Labeling nodes with reserved labels such as `node-role.kubernetes.io` fails with kubeadm error during bootstrap

Self-assigning Node labels such as `node-role.kubernetes.io` using the kubelet `--node-labels` flag (see `kubeletExtraArgs` in the CABPK examples) is not possible due to a security measure imposed by the `NodeRestriction` admission controller that kubeadm enables by default.

Assigning such labels to Nodes must be done after the bootstrap process has completed:

kubectl label nodes <name> node-role.kubernetes.io/worker=""

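For convenience, here is an example one-liner to do this after installation. It selects every Node that carries neither control plane role label (kubeadm v1.24 and later set only node-role.kubernetes.io/control-plane, while older versions set node-role.kubernetes.io/master):

kubectl get nodes --no-headers -l '!node-role.kubernetes.io/master,!node-role.kubernetes.io/control-plane' -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | xargs -I{} kubectl label node {} node-role.kubernetes.io/worker=''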
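
Before changing labels across a whole cluster, it can help to preview the result. The following sketch wraps the same idea in a small function; `label_workers` is a hypothetical helper name, and the sketch assumes that control plane Nodes carry either the `node-role.kubernetes.io/master` or the `node-role.kubernetes.io/control-plane` label. Any extra arguments are passed through to `kubectl label`.

```shell
# Sketch: label every Node that has neither control plane role label.
# label_workers is a hypothetical helper; extra arguments go to `kubectl label`.
label_workers() {
  kubectl get nodes \
    -l '!node-role.kubernetes.io/master,!node-role.kubernetes.io/control-plane' \
    -o name |
  while read -r node; do
    # $node has the form node/<name>, which `kubectl label` accepts directly
    kubectl label "$node" node-role.kubernetes.io/worker='' --overwrite "$@" < /dev/null
  done
}
```

Running `label_workers --dry-run=client` prints what would be labeled without modifying anything; running `label_workers` without arguments applies the labels.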

Cluster API with Docker

When provisioning workload clusters using Cluster API with Docker infrastructure, provisioning might be stuck:

1. if there are stopped containers on your machine from previous runs. Remove unused containers with `docker rm -f <container>`.
2. if the Docker disk space on your machine is exhausted.
   - Run `docker system df` to inspect the disk space consumed by Docker resources.
   - Run `docker system prune --volumes` to prune dangling images, containers, volumes and networks.
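
The two checks above can be combined into one cleanup pass. This is a minimal sketch under stated assumptions: `cleanup_docker` is a hypothetical helper name, and the function removes all exited containers on the machine, not only those created by Cluster API, so review the list first if other workloads share the Docker daemon.

```shell
# Sketch: remove exited containers, report disk usage, optionally prune.
# cleanup_docker is a hypothetical helper name.
# Usage: cleanup_docker [--prune]
cleanup_docker() {
  # IDs of containers that have exited, e.g. left over from earlier runs
  stopped="$(docker ps -aq --filter status=exited)"
  if [ -n "$stopped" ]; then
    # word splitting is intentional: one argument per container ID
    docker rm -f $stopped
  fi

  # show the space used by images, containers, volumes and build cache
  docker system df

  # pruning is destructive, so it only runs when requested explicitly
  if [ "${1:-}" = "--prune" ]; then
    docker system prune --volumes -f   # -f skips the confirmation prompt
  fi
}
```

Calling `cleanup_docker` alone is the non-destructive variant apart from removing exited containers; `cleanup_docker --prune` additionally reclaims space.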
