Solved: Request for help with debugging on Raspberry Pi - 64bit / arm64 #2193

Closed
egandro opened this issue Apr 11, 2021 · 21 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@egandro

egandro commented Apr 11, 2021

Hello,

I need a bit of help debugging an arm64 kindest/node that does not start on a 64-bit Raspberry Pi.

I am using the most recent 64-bit (beta!) version of RaspiOS: https://downloads.raspberrypi.org/raspios_arm64/images/raspios_arm64-2021-04-09/

As kind doesn't provide us with an official kindest/node image, I am using this one: https://hub.docker.com/r/rossgeorgiev/kind-node-arm64. At the time of posting it is about 1-2 days old and has the most recent version of Kubernetes (1.21 at the moment).

Please note: the rossgeorgiev version is built and run on 64-bit Ubuntu for the Pi, whereas I am using 64-bit RaspiOS.

I will attach the docker logs output and the kind output with -v 99.

It would be great if you could point me in a direction on how to debug this.

(I tried to use the official image of rossgeorgiev and also to build my own version with rossgeorgiev patch to kind - but the logs are equivalent).

@egandro egandro added the kind/support Categorizes issue or PR as a support question. label Apr 11, 2021
@egandro
Author

egandro commented Apr 11, 2021

$ docker logs kind-control-plane

There is no further log output here until kind times out.

INFO: ensuring we can execute mount/umount even with userns-remap
INFO: remounting /sys read-only
INFO: making mounts shared
INFO: fix cgroup mounts for all subsystems
INFO: clearing and regenerating /etc/machine-id
Initializing machine ID from random generator.
INFO: setting iptables to detected mode: nft
INFO: Detected IPv4 address: 172.18.0.2
INFO: Detected IPv6 address: fc00:f853:ccd:e793::2
systemd 246.2-1ubuntu1 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
Detected virtualization docker.
Detected architecture arm64.

Welcome to Ubuntu Groovy Gorilla (development branch)!

...

Edit: removed logfile because it didn't add any value.

@egandro
Author

egandro commented Apr 11, 2021

$ kind create cluster -v 99 --image kindest/node # also tried with a self-built node image and with rossgeorgiev/kind-node-arm64

Creating cluster "kind" ...
 • Ensuring node image (kindest/node) 🖼  ...
DEBUG: docker/images.go:58] Image: kindest/node present locally
 ✓ Ensuring node image (kindest/node) 🖼
 • Preparing nodes 📦   ...
 ✓ Preparing nodes 📦 
 • Writing configuration 📜  ...
DEBUG: config/config.go:96] Using the following kubeadm config for node kind-control-plane:
apiServer:
  certSANs:
  - localhost
 ...

Edit: removed logfile because it didn't add any value.

@BenTheElder
Member

As kind doesn't provide us with an official kindest/node image, I am using this one: https://hub.docker.com/r/rossgeorgiev/kind-node-arm64. At the time of posting it is about 1-2 days old and has the most recent version of Kubernetes (1.21 at the moment).

1.21 will require a kind binary from HEAD (build from latest sources), as Kubernetes made a breaking change: #2189

@egandro
Author

egandro commented Apr 11, 2021

Thanks for pointing me in that direction. Unfortunately I get the same error with rossgeorgiev/kind-node-arm64 1.21.
I will try 1.20.

@egandro
Author

egandro commented Apr 11, 2021

rossgeorgiev/kind-node-arm64:1.20 doesn't work with kind built from head either.

I get the same error.

Edit: I will add --wait 30m; if that doesn't work, I will try 1.19.

@egandro
Author

egandro commented Apr 11, 2021

I can confirm the following:
rossgeorgiev/kind-node-arm64:1.19, rossgeorgiev/kind-node-arm64:1.20 and rossgeorgiev/kind-node-arm64:1.21

do not work with either kind "head" or kind v0.10.0.

I get the same timeout as mentioned in the logfiles.

@BenTheElder
Member

This suggests kubelet is still unhealthy for some other reason.

kind create cluster --retain will prevent cleanup, and then kind export logs will give us the kubelet logs etc.

That will provide a lot of info that is missing here, like the container runtime used.

Please also check the known issues page against your host configuration.

@egandro
Author

egandro commented Apr 12, 2021

@BenTheElder Thank you!

I don't use this sentence very often - but this is a "whatever it takes" - because it's a 99% thing at this moment.

We want to get rid of AMD64/Xeons and use ARM64, purely for energy reasons and to protect the environment. So yes :)

I will try kind-head with rossgeorgiev/kind-node-arm64:1.21...

@egandro
Author

egandro commented Apr 12, 2021

@BenTheElder would it be ok for you if I apply the rossgeorgiev patch to main?

This is outdated and needs to be moved to master.

main...rosti:v0.9.0-build-node-image-binary

Basically, it adds a --type bindir option to the kind image builder, which allows us to use the official Kubernetes binaries:

wget https://dl.k8s.io/v1.20.0/kubernetes-server-linux-arm64.tar.gz
tar zxf kubernetes-server-linux-arm64.tar.gz
kind build node-image --type bindir --kube-root kubernetes/server/bin

@BenTheElder
Member

--type is deprecated, so you can apply it locally but we won't be merging it. We will be shipping arm images at head shortly and I'll be focusing my energy there: #2176

@egandro
Author

egandro commented Apr 12, 2021

Here are the logfiles:

$ kind-head create cluster --retain --image rossgeorgiev/kind-node-arm64:v1.21 
$ kind export logs 

https://pastebin.com/bM1z4eY9

$ docker start kind-control-plane
$ kind export logs

https://pastebin.com/Ycapr0us
(shortened because of the 512 kB limit)


I think this is the relevant part:

kind-control-plane/kubelet.log:Apr 12 17:18:42 kind-control-plane systemd[1]: Started kubelet: The Kubernetes Node Agent.
kind-control-plane/kubelet.log:Apr 12 17:18:45 kind-control-plane kubelet[125]: Flag --fail-swap-on has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
kind-control-plane/kubelet.log:Apr 12 17:18:45 kind-control-plane kubelet[125]: E0412 17:18:45.319267     125 server.go:204] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory" path="/var/lib/kubelet/config.yaml"
kind-control-plane/kubelet.log:Apr 12 17:18:45 kind-control-plane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
kind-control-plane/kubelet.log:Apr 12 17:18:45 kind-control-plane systemd[1]: kubelet.service: Failed with result 'exit-code'.
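Lines like these can be pulled out of an exported log directory in one pass. The following is only a minimal sketch: the function name kubelet_failures and the ./kind-logs directory are assumptions for illustration, and the pattern relies on kubelet prefixing error and fatal messages with E or F followed by the date, as in the excerpt above.

```shell
#!/bin/sh
# Print kubelet error (E) and fatal (F) lines found anywhere under an
# exported kind log directory. "kubelet_failures" is a hypothetical helper
# name, not part of kind.
kubelet_failures() {
    log_dir=${1:-.}
    # Matches e.g. "kubelet[125]: E0412 ..." or "kubelet[757]: F0412 ..."
    grep -rhE 'kubelet\[[0-9]+\]: [EF][0-9]{4} ' "$log_dir"
}

# Example (assumes logs were exported with: kind export logs ./kind-logs):
# kubelet_failures ./kind-logs
```

The function returns a non-zero status when nothing matches, so it can also be used in a condition.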

@egandro
Author

egandro commented Apr 12, 2021

It looks like we have no /var/lib/kubelet in the "rossgeorgiev/kind-node-arm64:v1.21"

pi@raspberrypi:~ $ docker exec -it kind-control-plane bash
root@kind-control-plane:/# ls -la /var/lib/kubelet
ls: cannot access '/var/lib/kubelet': No such file or directory
root@kind-control-plane:/# ls -la /var/lib/
total 48
drwxr-xr-x 12 root root 4096 Apr 12 17:15 .
drwxr-xr-x 11 root root 4096 Apr 12 17:15 ..
drwxr-xr-x  5 root root 4096 Apr 12 17:13 apt
drwx--x--x 11 root root 4096 Apr 12 17:14 containerd
drwxr-xr-x  7 root root 4096 Apr 12 17:15 dpkg
drwxr-xr-x  2 root root 4096 Jul 27  2020 misc
drwxr-xr-x  4 root root 4096 Apr 12 17:15 nfs
drwxr-xr-x  2 root root 4096 Apr 12 17:15 pam
drwxr-xr-x  3 root root 4096 Apr 12 17:15 polkit-1
drwx------  2 root root 4096 Aug 26  2020 private
drwxr-xr-x  6 root root 4096 Apr 12 17:15 systemd
drwxr-xr-x  3 root root 4096 Apr 12 17:15 ucf

@egandro
Author

egandro commented Apr 12, 2021

Cool - I will give "bentheelder/kind-node:arm-test" a test!

@egandro
Author

egandro commented Apr 12, 2021

@BenTheElder please have a look

$ kind-head create cluster -v 99 --image bentheelder/kind-node:arm-test
$ kind export logs

I uploaded the logfile here: https://workupload.com/file/DqqjxYCRcfm

It also fails - however - we have the config :)

$ docker exec -it kind-control-plane  ls -la /var/lib/kubelet/config.yaml
-rw-r--r-- 1 root root 1031 Apr 12 17:52 /var/lib/kubelet/config.yaml

@BenTheElder
Member

kind-control-plane/journal.log:Apr 12 17:55:39 kind-control-plane kubelet[757]: F0412 17:55:39.429621 757 kubelet.go:1350] Failed to start ContainerManager system validation failed - Following Cgroup subsystem not mounted: [memory]

is your distro missing cgroups ...?

@egandro
Author

egandro commented Apr 21, 2021

is your distro missing cgroups ...?

How do I check this?

I am currently using the latest Raspberry Pi OS (64-bit is still in beta): https://downloads.raspberrypi.org/raspios_arm64/images/raspios_arm64-2021-04-09/

Happy to check the kernel config and to recompile it - can you point me in the right direction?
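One way to check from the host, as a minimal sketch: the kernel lists its cgroup controllers in /proc/cgroups, with an "enabled" flag in the fourth column. The function name memory_cgroup_enabled is hypothetical, and the optional path argument exists only so a saved copy of the file can be inspected.

```shell
#!/bin/sh
# Report whether the memory cgroup controller is enabled.
# Reads /proc/cgroups by default; pass another path to inspect a saved copy.
# "memory_cgroup_enabled" is a hypothetical helper name.
memory_cgroup_enabled() {
    cgroups_file=${1:-/proc/cgroups}
    # Columns: subsys_name hierarchy num_cgroups enabled
    awk '$1 == "memory" { found = 1; enabled = $4 }
         END { if (found && enabled == 1) exit 0; exit 1 }' "$cgroups_file"
}

if memory_cgroup_enabled; then
    echo "memory cgroup: enabled"
else
    echo "memory cgroup: missing or disabled"
fi
```

If the controller is reported as missing or disabled, comparing against the boot parameters in /proc/cmdline shows whether anything cgroup-related was passed to the kernel.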

@egandro
Author

egandro commented Apr 21, 2021

It looks like you pointed me to the right tool:

https://www.raspberrypi.org/forums/viewtopic.php?t=203128

You need to enable this on the Pi via /boot/cmdline.txt, plus some quirks:

https://blog.codybunch.com/2020/07/31/Fixing-cgroup-memory-on-Raspbian-Buster-for-Kernel-54x/

Happy to test / document this.

@BenTheElder
Member

that last link looks like what you need -- most distros include cgroups mounted OOTB now, and you shouldn't need to compile the kernel for any of them, "just" changing the boot cmdline.

hopefully it will work after that 🤞

@egandro
Author

egandro commented Apr 22, 2021

I can confirm this works.

@BenTheElder big thx! I'll buy you a beer if you are ever in DE or DK.

Here is some documentation.

# Raspberry Pi OS - 64bit
# get a 64bit beta version from here -  https://downloads.raspberrypi.org/raspios_arm64/images/raspios_arm64-2021-04-09/
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo sed -i '$ s/$/ cgroup_memory=1 swapaccount=1 cgroup_enable=memory dwc_otg.lpm_enable=0/' /boot/cmdline.txt
$ sudo rpi-update
$ sudo reboot

To create the cluster with kind:

$ docker pull bentheelder/kind-node:arm-test
$ kind-head create cluster --image bentheelder/kind-node:arm-test

@egandro egandro changed the title Request for help with debugging on Raspberry Pi - 64bit / arm64 Solved: Request for help with debugging on Raspberry Pi - 64bit / arm64 Apr 22, 2021
@BenTheElder
Member

Thanks for following up! That looks right.

This is probably worth a guide somewhere, perhaps after we officially land OOTB arm support 😅

@BenTheElder
Member

https://github.com/kubernetes-sigs/kind/releases/tag/v0.11.0#contributors is out now. We should probably still add an rpi guide.
