Stuck after reboot #1589
Comments
Hello @jknipper, thanks for raising this. Can you tell us since when (which release) you have noticed this issue? It seems to be an issue with the disk; do you have a specific configuration for your instances?
From what I can see in our alerts/logging, it started occurring regularly in July or August. We are running the stable release and update all instances shortly after a new release is published. It's hard to tell whether this is on the hypervisor side or an issue with Flatcar. There was no real change in how the instances are provisioned; maybe something changed on the VMware side, I'll try to find out.
Do all the instances on the hypervisor reboot at the same time?
No, reboots are managed by an operator in Kubernetes and are pretty much random. What is interesting in that screenshot is that the filesystem check for that same device seems to succeed, but the mount afterwards fails. Looking at the release history of Flatcar, our observations seem to coincide with the switch from kernel version 6.1.x to 6.6.x, but that could also be a coincidence.
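One way to check whether the kernel correlation is more than a coincidence would be to tally stuck boots per kernel series from whatever reboot records are available. The following is only a rough sketch: the record format (a kernel release string plus a flag for whether the boot got stuck), the helper names, and the sample data are all assumptions made for illustration, not something taken from our alerting setup.

```python
from collections import Counter


def kernel_series(release: str) -> str:
    """Return the 'major.minor' series of a kernel release string.

    Example input (illustrative): '6.6.43-flatcar' gives '6.6'.
    """
    major, minor = release.split(".")[:2]
    # Keep only the leading digits of the minor part, in case a suffix
    # is attached directly to it (e.g. '6.6-rc1').
    digits = ""
    for ch in minor:
        if not ch.isdigit():
            break
        digits += ch
    return f"{major}.{digits}"


def stuck_counts_by_series(reboots):
    """Map each kernel series to a (stuck_boots, total_reboots) tuple.

    `reboots` is an iterable of (kernel_release, got_stuck) pairs.
    This record shape is hypothetical; adapt it to the real data source.
    """
    total = Counter()
    stuck = Counter()
    for release, got_stuck in reboots:
        series = kernel_series(release)
        total[series] += 1
        if got_stuck:
            stuck[series] += 1
    return {series: (stuck[series], total[series]) for series in total}


# Made-up sample records, for illustration only.
sample = [
    ("6.1.96-flatcar", False),
    ("6.1.96-flatcar", False),
    ("6.6.43-flatcar", True),
    ("6.6.43-flatcar", False),
]

if __name__ == "__main__":
    for series, (n_stuck, n_total) in sorted(stuck_counts_by_series(sample).items()):
        print(f"{series}: {n_stuck} stuck out of {n_total} reboots")
```

With enough real records, a clear difference between the 6.1 and 6.6 rows would support the kernel theory, while similar rates would point back towards the hypervisor side.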
Description
From time to time we see servers stuck in the boot process after a reboot. This happens only in rare cases, after an OS update has been applied.
Impact
The server is stuck after reboot and needs to be rebooted manually a second time to bring it up.
Environment and steps to reproduce
Expected behavior
The machine boots without interruption.
Additional information
From the attached screenshot of the server console it seems that the boot process got stuck while trying to mount sysroot.
This issue started some time ago and is hard for us to debug. Any suggestions on how we could investigate this further are greatly appreciated!