minikube start fails with btrfs #12569

Closed
dabljues opened this issue Sep 25, 2021 · 11 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. triage/duplicate Indicates an issue is a duplicate of other open issue.

Comments

@dabljues

Steps to reproduce the issue:

  1. Have a working docker installation
  2. Install minikube
  3. minikube start

Run minikube logs --file=logs.txt and drag and drop the log file into this issue

Full output of failed command if not minikube start:

minikube start

😄 minikube v1.23.2 on Arch
✨ Automatically selected the docker driver. Other choices: ssh, none
👍 Starting control plane node minikube in cluster minikube
🚜 Pulling base image ...
💾 Downloading Kubernetes v1.22.2 preload ...
> preloaded-images-k8s-v13-v1...: 511.84 MiB / 511.84 MiB 100.00% 78.83 Mi
🔥 Creating docker container (CPUs=2, Memory=7800MB) ...
🐳 Preparing Kubernetes v1.22.2 on Docker 20.10.8 ...
▪ Generating certificates and keys ...
▪ Booting up control plane ...
💢 initialization failed, will try again: wait: /bin/bash -c "sudo env PATH=/var/lib/minikube/binaries/v1.22.2:$PATH kubeadm init --config /var/tmp/minikube/kubeadm.yaml --ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests,DirAvailable--var-lib-minikube,DirAvailable--var-lib-minikube-etcd,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml,Port-10250,Swap,Mem,SystemVerification,FileContent--proc-sys-net-bridge-bridge-nf-call-iptables": Process exited with status 1
stdout:
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/var/lib/minikube/certs"
[certs] Using existing ca certificate authority
[certs] Using existing apiserver certificate and key on disk
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost minikube] and IPs [192.168.49.2 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost minikube] and IPs [192.168.49.2 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
	timed out waiting for the condition

This error is likely caused by:
	- The kubelet is not running
	- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
	- 'systemctl status kubelet'
	- 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.

Here is one example how you may list all Kubernetes containers running in docker:
	- 'docker ps -a | grep kube | grep -v pause'
	Once you have found the failing container, you can inspect its logs with:
	- 'docker logs CONTAINERID'

stderr:
[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher

So now I'll list the things I've tried (a rough sketch of these checks follows the list):

  1. The obvious: reinstall, reboot, etc.
  2. Installed kubelet, kubectl, even kubeadm, and enabled and/or started the kubelet service
  3. Checked different docker storage drivers (my system is on btrfs, so the driver was btrfs by default; I changed it to overlay2)
  4. Checked different cgroup drivers - systemd and cgroupfs (for both docker and the kubelet)
  5. Tried minikube start with older Kubernetes versions such as 1.21.0 and 1.20.0
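A rough sketch of those checks (hypothetical command forms, not a transcript; exact flags may differ on your setup):

  docker info --format '{{.Driver}} / {{.CgroupDriver}}'   # storage driver and cgroup driver docker is actually using
  minikube delete --all --purge                            # wipe any half-created cluster state before retrying
  minikube start --kubernetes-version=v1.21.0              # retry against an older Kubernetes release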

Then I went and installed everything on a virtual machine with the same OS configuration etc. There I just installed docker, enabled its service, ran minikube start and boom, everything worked (same version of docker, same filesystem, same minikube version etc.). I didn't need to install kubelet or kubectl. The only thing that was different was the kernel - I have 5.14.7, on the VM I had 5.13.9.

So I even downgraded the kernel, rebooted and restarted the docker service (so I could see the downgraded kernel under docker info). The same thing happens.

I don't know if this matters (I've tested it, and you don't need kubelet installed on the host to run minikube start), but after I installed kubelet I was checking journalctl frequently and spotted this error occurring all the time:

failed to collect filesystem stats - rootDiskErr: could not stat "/var/lib/docker/overlay2/<container_id>

This shows up on minikube's stdout when kubelet.service is enabled but not started (or when I stop it):

"Failed to start ContainerManager" err="failed to get rootfs info: failed to get device for dir \"/var/lib/kubelet\"

I've been debugging this for 8 straight hours now and have searched all the related issues and SO questions with no luck, at least for me (though many of them have been open for a long time). Does anyone know what may be happening here? If there's anything I forgot to attach - logs, system info - I can provide it.

@RA489

RA489 commented Sep 27, 2021

/kind support

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Sep 27, 2021
@slabko

slabko commented Sep 29, 2021

I have a very similar situation on Fedora, with two machines: a desktop and a laptop. Minikube works perfectly on the desktop, which I configured a couple of weeks ago. Today I tried to configure it on my laptop and it fails to start. When I ran docker exec minikube sh -c "journalctl -xeu kubelet", I noticed this message:

Sep 29 14:42:47 minikube kubelet[67785]: W0929 14:42:47.666843   67785 fs.go:588] stat failed on /dev/mapper/luks-ae886ed0-14ab-482b-a2ed-7db07cf1d34a with error: no such file or directory
Sep 29 14:42:47 minikube kubelet[67785]: E0929 14:42:47.666856   67785 kubelet.go:1423] "Failed to start ContainerManager" err="failed to get rootfs info: failed to get device for dir \"/var/lib/kubelet\": could not find device with major: 0, minor: 35 in cached partitions map"
Sep 29 14:42:47 minikube systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
-- Subject: Unit process exited
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- An ExecStart= process belonging to unit kubelet.service has exited.
-- 
-- The process' exit code is 'exited' and its exit status is 1.
Sep 29 14:42:47 minikube systemd[1]: kubelet.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- The unit kubelet.service has entered the 'failed' state with result 'exit-code'.
Sep 29 14:42:48 minikube systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 227.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- Automatic restarting of the unit kubelet.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Sep 29 14:42:48 minikube systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
-- Subject: A stop job for unit kubelet.service has finished
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- A stop job for unit kubelet.service has finished.
-- 
-- The job identifier is 15588 and the job result is done.
Sep 29 14:42:48 minikube systemd[1]: Started kubelet: The Kubernetes Node Agent.
-- Subject: A start job for unit kubelet.service has finished successfully
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
-- 
-- A start job for unit kubelet.service has finished successfully.
-- 
-- The job identifier is 15588.

On my working machine, by contrast, I don't see any restarts of the kubelet.

Kernel versions are the same. The differences that come to mind: the good machine has the nvidia runtime for docker enabled and its hard drive is not encrypted, while the non-working laptop has no nvidia runtime and its hard drive is encrypted. The good machine has had the btrfs driver enabled since day one and that has never changed; I tried both btrfs and overlay2 on the broken machine and it didn't help.
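A hypothetical check for the stat failed on /dev/mapper/luks-... line (not something from this thread): compare the device-mapper nodes on the host with what exists inside the minikube container; on an encrypted setup the LUKS node is typically missing inside the container.

  ls -l /dev/mapper/                         # on the host: the luks-... node should be listed
  docker exec minikube ls -l /dev/mapper/    # inside the container it is usually absent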

Logs, just in case, from minikube logs --file=minikube.log:
minikube.log

and /tmp/minikube_logs_3d5a90a900b10e78f2089db95ed137817b8ee6e6_0.log (the error message recommended to add this one as well)
minikube_logs_3d5a90a900b10e78f2089db95ed137817b8ee6e6_0.log

@dabljues
Author

dabljues commented Sep 30, 2021

Basically, what fixed the issue for me was reinstalling the OS with ext4 as the filesystem instead of the btrfs I previously had. @slabko it doesn't matter which storage driver you specify for docker; from what I understand, it's the host filesystem that matters here. If you don't want to reinstall, this workaround supposedly fixes the issue: #7923 (comment).

Oh, and by the way, the encryption might be the real issue here (I also encountered this problem on an encrypted btrfs filesystem): when I tested btrfs on VMs without encryption, minikube worked without any problems. So for me it's either ext4, or btrfs without encryption, to make it work.

@slabko

slabko commented Oct 2, 2021

@dabljues Thank you very much for your help, the following command actually solved my problem:

minikube start  --feature-gates="LocalStorageCapacityIsolation=false

In the meantime I also tried kind, which seems to work nicely with btrfs and encryption as well.
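For reference, a minimal kind run looks roughly like this (assuming kind and kubectl are already installed):

  kind create cluster                          # creates a single-node cluster named "kind"
  kubectl cluster-info --context kind-kind     # verify the new cluster responds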

@sharifelgamal
Collaborator

Yeah, btrfs is a known issue with minikube: #7923.

I'm glad there's a workaround!

@sharifelgamal sharifelgamal added triage/duplicate Indicates an issue is a duplicate of other open issue. kind/bug Categorizes issue or PR as related to a bug. and removed kind/support Categorizes issue or PR as a support question. labels Nov 3, 2021
@pslq

pslq commented Dec 12, 2021

@dabljues Thank you very much for your help, the following command actually solved my problem:

minikube start  --feature-gates="LocalStorageCapacityIsolation=false

In the meantime I also tried kind, which seems to work nicely with btrfs and encryption as well.

Thank you

@spowelljr spowelljr changed the title minikube start fails due to kubelet minikube start fails with btrfs Dec 29, 2021
@spowelljr
Member

Kubernetes 1.23 will support btrfs

kubernetes/system-validators#26

@HerHde

HerHde commented Jan 15, 2022

@slabko Just to mention, your command lacks a closing " at the end, but thanks, it helped me too ;-)
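For copy-paste, the complete command with the closing quote:

  minikube start --feature-gates="LocalStorageCapacityIsolation=false"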

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 15, 2022
@HerHde

HerHde commented Apr 21, 2022

No problem with kernel 5.15.32 and minikube v1.25.2 anymore

@spowelljr
Member

Based on @HerHde's comment, this seems to be resolved, so I'm going to close this issue.
