Failures during Node bootstrapping can have many different causes: for example, Cluster API resources might be misconfigured, or there might be problems with the network. The following steps describe how to troubleshoot bootstrap failures systematically.
- Access the Node via SSH.
- Take a look at the cloud-init logs via `less /var/log/cloud-init-output.log` or `journalctl -u cloud-init --since "1 day ago"`. (Note: cloud-init persists the logs of the commands it executes, like kubeadm, only after they have returned.)
- It might also be helpful to take a look at `journalctl --since "1 day ago"`.
- If you see that kubeadm times out waiting for the static Pods to come up, take a look at:
  - containerd: `crictl ps -a`, `crictl logs`, `journalctl -u containerd`
  - Kubelet: `journalctl -u kubelet --since "1 day ago"` (Note: it might be helpful to increase the Kubelet log level by e.g. setting `--v=8` via `systemctl edit --full kubelet && systemctl restart kubelet`; see the first sketch after this list.)
- If Node bootstrapping consistently fails and the kubeadm logs are not verbose enough, the kubeadm verbosity can be increased via `KubeadmConfigSpec.Verbosity` (see the second sketch after this list).
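For the Kubelet log level, a minimal sketch of the environment-file approach (an alternative to editing the full unit) is shown below. It assumes a kubeadm-provisioned node whose systemd drop-in sources `/etc/default/kubelet`; the exact path varies by distro (e.g. `/etc/sysconfig/kubelet` on RHEL-based systems):

```bash
# Assumption: the kubelet systemd drop-in (10-kubeadm.conf) sources this
# environment file and passes $KUBELET_EXTRA_ARGS to the kubelet binary.
# Note: this overwrites the file; append instead if it already has content.
echo 'KUBELET_EXTRA_ARGS="--v=8"' | sudo tee /etc/default/kubelet

# Restart the kubelet so it picks up the new flag, then follow its logs.
sudo systemctl restart kubelet
journalctl -u kubelet -f
```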
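And here is a minimal sketch of raising the kubeadm verbosity from the Cluster API side. The resource name is made up; `verbosity` is the `KubeadmConfigSpec` field mentioned above and is passed through to kubeadm as `--v`:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: example-control-plane     # hypothetical name
spec:
  kubeadmConfigSpec:
    verbosity: 8                  # kubeadm init/join will run with --v=8
  # remaining fields (machineTemplate, replicas, version, ...) omitted
```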
## Labeling nodes with reserved labels such as `node-role.kubernetes.io` fails with kubeadm error during bootstrap
Self-assigning Node labels such as `node-role.kubernetes.io` using the kubelet `--node-labels` flag (see `kubeletExtraArgs` in the CABPK examples) is not possible due to a security measure imposed by the [`NodeRestriction` admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) that kubeadm enables by default.
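To illustrate, a hypothetical CABPK snippet of the kind that triggers this restriction is sketched below; the template name is invented, and the map-style `kubeletExtraArgs` is the `v1beta1` API shape:

```yaml
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  name: example-workers           # hypothetical name
spec:
  template:
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            # Rejected: the NodeRestriction admission plugin forbids kubelets
            # from self-assigning node-role.kubernetes.io/* labels.
            node-labels: "node-role.kubernetes.io/worker="
```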
Assigning such labels to Nodes must be done after the bootstrap process has completed:

```bash
kubectl label nodes <name> node-role.kubernetes.io/worker=""
```
For convenience, here is an example one-liner to do this post installation:

```bash
kubectl get nodes --no-headers -l '!node-role.kubernetes.io/master' -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' | xargs -I{} kubectl label node {} node-role.kubernetes.io/worker=''
```
When provisioning workload clusters using Cluster API with Docker infrastructure, provisioning might be stuck:

- if there are stopped containers on your machine from previous runs. Clean unused containers with `docker rm -f <container-id>` (see the sketch after this list).
- if the Docker disk space is being exhausted:
  - Run `docker system df` to inspect the disk space consumed by Docker resources.
  - Run `docker system prune --volumes` to prune dangling images, containers, volumes and networks.
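As a sketch, the following sequence covers both cleanup steps; `--filter status=exited` selects only stopped containers, and `xargs -r` (GNU xargs) skips the `docker rm` if there is nothing to remove:

```bash
# Remove stopped containers left over from previous runs.
docker ps -aq --filter status=exited | xargs -r docker rm -f

# Inspect Docker's disk usage, then reclaim space.
docker system df
docker system prune --volumes
```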