diff --git a/docs/book/src/user/troubleshooting.md b/docs/book/src/user/troubleshooting.md
index 2039fa18c1d7..11ef656fc24f 100644
--- a/docs/book/src/user/troubleshooting.md
+++ b/docs/book/src/user/troubleshooting.md
@@ -1,5 +1,23 @@
 # Troubleshooting
 
+## Node bootstrap failures when using CABPK with cloud-init
+
+Failures during Node bootstrapping can have a lot of different causes. For example, Cluster API resources might be
+misconfigured or there might be problems with the network. The following steps describe how bootstrap failures can
+be troubleshooted systematically.
+
+1. Access the Node via ssh.
+1. Take a look at cloud-init logs via `less /var/log/cloud-init-output.log` or `journalctl -u cloud-init --since "1 day ago"`.
+   (Note: cloud-init persists logs of the commands it executes (like kubeadm) only after they have returned.)
+1. It might also be helpful to take a look at `journalctl --since "1 day ago"`.
+1. If you see that kubeadm times out waiting for the static Pods to come up, take a look at:
+    1. containerd: `crictl ps -a`, `crictl logs`, `journalctl -u containerd`
+    1. Kubelet: `journalctl -u kubelet --since "1 day ago"`
+       (Note: it might be helpful to increase the Kubelet log level by e.g. setting `--v=8` via
+       `systemctl edit --full kubelet && systemctl restart kubelet`)
+1. If Node bootstrapping consistently fails and the kubeadm logs are not verbose enough, the `kubeadm` verbosity
+   can be increased via `KubeadmConfigSpec.Verbosity`.
+
 ## Labeling nodes with reserved labels such as `node-role.kubernetes.io` fails with kubeadm error during bootstrap
 
 Self-assigning Node labels such as `node-role.kubernetes.io` using the kubelet `--node-labels` flag