could not find a log line that matches "Reached target .*Multi-User System.*|detected cgroup v1" #2972
Comments
Is it possible that your host is becoming constrained with both clusters running, causing the second one to be much slower and fail? |
Nope, since it fails pretty fast, at least in the 1st case of failure. The 2nd one, after a restart, looks totally different. The system is a laptop that has enough CPU (8 cores) and RAM (32 GB), with a fast (SSD) disk. |
Your system may have lots of physical resources, but it can still become constrained on kernel limit dimensions like the number of open files, number of inotify watches, etc. Is this the most minimal cluster size and set of actions that produces this result? |
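For reference, a quick way to inspect the host-side limits mentioned above (a minimal sketch; which limits actually matter can vary by distro) is:

# Soft limit on open file descriptors for the current shell
ulimit -n
# System-wide file handle limit
cat /proc/sys/fs/file-max
# Per-user inotify limits (watches and instances)
cat /proc/sys/fs/inotify/max_user_watches
cat /proc/sys/fs/inotify/max_user_instances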
Generally speaking, sure. Different logical limits may be reached. Not sure what's specific to this system, as on the other system the max num of open files (aka …).

To get to concrete and actionable facts: the number-of-open-files case is not applicable anymore.

Error level entries:

…/1990444523 ❯ grep -Ri "level=error" *
istioinaction-control-plane/journal.log:Oct 19 08:33:45 istioinaction-control-plane containerd[105]: time="2022-10-19T08:33:45.617670890Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
istioinaction-control-plane/journal.log:Oct 19 08:33:45 istioinaction-control-plane containerd[105]: time="2022-10-19T08:33:45.618582558Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
istioinaction-control-plane/containerd.log:Oct 19 08:33:45 istioinaction-control-plane containerd[105]: time="2022-10-19T08:33:45.617670890Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
istioinaction-control-plane/containerd.log:Oct 19 08:33:45 istioinaction-control-plane containerd[105]: time="2022-10-19T08:33:45.618582558Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
istioinaction-worker/journal.log:Oct 19 08:33:45 istioinaction-worker containerd[105]: time="2022-10-19T08:33:45.578211772Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
istioinaction-worker/journal.log:Oct 19 08:33:45 istioinaction-worker containerd[105]: time="2022-10-19T08:33:45.579058794Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
istioinaction-worker/containerd.log:Oct 19 08:33:45 istioinaction-worker containerd[105]: time="2022-10-19T08:33:45.578211772Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
istioinaction-worker/containerd.log:Oct 19 08:33:45 istioinaction-worker containerd[105]: time="2022-10-19T08:33:45.579058794Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
istioinaction-worker2/journal.log:Oct 19 08:33:45 istioinaction-worker2 containerd[105]: time="2022-10-19T08:33:45.620740099Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
istioinaction-worker2/journal.log:Oct 19 08:33:45 istioinaction-worker2 containerd[105]: time="2022-10-19T08:33:45.621716545Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
istioinaction-worker2/containerd.log:Oct 19 08:33:45 istioinaction-worker2 containerd[105]: time="2022-10-19T08:33:45.620740099Z" level=error msg="failed to initialize a tracing processor \"otlp\"" error="no OpenTelemetry endpoint: skip plugin"
istioinaction-worker2/containerd.log:Oct 19 08:33:45 istioinaction-worker2 containerd[105]: time="2022-10-19T08:33:45.621716545Z" level=error msg="failed to load cni during init, please check CRI plugin status before setting up network for pods" error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"
…/1990444523 ❯

What else should I do to investigate this? Meanwhile, I ran these tests:

Test 1: Another cluster like the 1st one (2-worker-node cluster) - NOK

Creating another cluster like the 1st one (a 2-worker-node cluster, without any ingress) failed as well.
Test 1 output:

…/istioinaction_cluster ❯ kind create cluster --name test-same-cluster --config 2workers_kind_cluster --retain
Creating cluster "test-same-cluster" ...
✓ Ensuring node image (kindest/node:v1.25.2) 🖼
✓ Preparing nodes 📦 📦 📦
✓ Writing configuration 📜
✗ Starting control-plane 🕹️
ERROR: failed to create cluster: failed to init node with kubeadm: command "docker exec --privileged test-same-cluster-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1
Command Output: I1019 15:45:57.462473 137 initconfiguration.go:254] loading configuration from "/kind/kubeadm.conf"
W1019 15:45:57.464177 137 initconfiguration.go:331] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
[init] Using Kubernetes version: v1.25.2
... (content omitted) ...
I1019 15:45:59.739224 137 loader.go:374] Config loaded from file: /etc/kubernetes/admin.conf
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I1019 15:45:59.740158 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
I1019 15:46:00.243205 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
...
I1019 15:46:39.243528 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
[kubelet-check] Initial timeout of 40s passed.
I1019 15:46:39.742480 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1019 15:46:40.242046 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
I1019 15:46:40.743596 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 1 milliseconds
...
I1019 15:47:54.742625 137 round_trippers.go:553] GET https://test-same-cluster-control-plane:6443/healthz?timeout=10s in 0 milliseconds
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI.
Here is one example how you may list all running Kubernetes containers by using crictl:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock ps -a | grep kube | grep -v pause'
Once you have found the failing container, you can inspect its logs with:
- 'crictl --runtime-endpoint unix:///run/containerd/containerd.sock logs CONTAINERID'
couldn't initialize a Kubernetes cluster
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/init.runWaitControlPlanePhase
cmd/kubeadm/app/cmd/phases/init/waitcontrolplane.go:108
...
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
…/istioinaction_cluster took 2m ❯

It seems that it's timing out, waiting for the control plane components to start. But indeed, a …

…/istioinaction_cluster ❯ docker exec -it test-same-cluster-control-plane ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 15:45 ? 00:00:03 /sbin/init
root 92 1 0 15:45 ? 00:00:02 /lib/systemd/systemd-journald
root 105 1 0 15:45 ? 00:00:05 /usr/local/bin/containerd
root 15697 0 0 15:57 pts/1 00:00:00 ps -ef
…/istioinaction_cluster ❯

Compared with the 1st cluster (that was created just fine):

A fully functional control-plane container's processes:

…/istioinaction_cluster ❯ docker exec -it dxps-cluster-control-plane ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 08:28 ? 00:00:00 /sbin/init
root 92 1 0 08:28 ? 00:00:00 /lib/systemd/systemd-journal
root 105 1 0 08:28 ? 00:00:23 /usr/local/bin/containerd
root 339 1 0 08:28 ? 00:00:00 /usr/local/bin/containerd-sh
root 340 1 0 08:28 ? 00:00:00 /usr/local/bin/containerd-sh
root 381 1 0 08:28 ? 00:00:00 /usr/local/bin/containerd-sh
root 399 1 0 08:28 ? 00:00:00 /usr/local/bin/containerd-sh
65535 427 340 0 08:28 ? 00:00:00 /pause
65535 434 339 0 08:28 ? 00:00:00 /pause
65535 440 399 0 08:28 ? 00:00:00 /pause
65535 448 381 0 08:28 ? 00:00:00 /pause
root 520 399 0 08:28 ? 00:00:09 kube-scheduler --authenticat
root 584 381 0 08:28 ? 00:00:49 kube-controller-manager --al
root 585 339 0 08:28 ? 00:02:03 kube-apiserver --advertise-a
root 667 340 0 08:28 ? 00:01:06 etcd --advertise-client-urls
root 736 1 0 08:28 ? 00:01:06 /usr/bin/kubelet --bootstrap
root 850 1 0 08:29 ? 00:00:00 /usr/local/bin/containerd-sh
root 872 1 0 08:29 ? 00:00:00 /usr/local/bin/containerd-sh
65535 895 850 0 08:29 ? 00:00:00 /pause
65535 902 872 0 08:29 ? 00:00:00 /pause
root 943 850 0 08:29 ? 00:00:00 /usr/local/bin/kube-proxy --
root 980 872 0 08:29 ? 00:00:00 /bin/kindnetd
root 1248 1 0 08:29 ? 00:00:00 /usr/local/bin/containerd-sh
root 1249 1 0 08:29 ? 00:00:00 /usr/local/bin/containerd-sh
65535 1288 1248 0 08:29 ? 00:00:00 /pause
65535 1295 1249 0 08:29 ? 00:00:00 /pause
root 1359 1 0 08:29 ? 00:00:00 /usr/local/bin/containerd-sh
65535 1378 1359 0 08:29 ? 00:00:00 /pause
root 1429 1249 0 08:29 ? 00:00:07 /coredns -conf /etc/coredns/
root 1438 1248 0 08:29 ? 00:00:06 /coredns -conf /etc/coredns/
root 1513 1359 0 08:29 ? 00:00:01 local-path-provisioner --deb
root 2028 0 0 15:57 pts/1 00:00:00 ps -ef
…/istioinaction_cluster ❯

Test 2: A simple cluster - OK

Again, deleted the 2nd (and failed to be properly created) cluster; after that, creating a simple cluster (one single node, the control-plane one) using … worked fine.

Test 3: A 1-worker-node cluster - NOK

Testing a smaller setup with two nodes: one control-plane and one worker (compared with Test 1, this time we have one worker node instead of two). In this case, it fails later, timing out while waiting for the worker node to join.

Test 3 output:

…/istioinaction_cluster ❯ kind create cluster --name one-worker-cluster --config 1worker_kind_cluster --retain
Creating cluster "one-worker-cluster" ...
✓ Ensuring node image (kindest/node:v1.25.2) 🖼
✓ Preparing nodes 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✗ Joining worker nodes 🚜
ERROR: failed to create cluster: failed to join node with kubeadm: command "docker exec --privileged one-worker-cluster-worker kubeadm join --config /kind/kubeadm.conf --skip-phases=preflight --v=6" failed with error: exit status 1
Command Output: I1019 16:07:05.985642 137 join.go:416] [preflight] found NodeName empty; using OS hostname as NodeName
I1019 16:07:05.985672 137 joinconfiguration.go:76] loading configuration from "/kind/kubeadm.conf"
...
I1019 16:07:28.574547 137 kubelet.go:219] [kubelet-start] preserving the crisocket information for the node
I1019 16:07:28.574601 137 patchnode.go:31] [patchnode] Uploading the CRI Socket information "unix:///run/containerd/containerd.sock" to the Node API object "one-worker-cluster-worker" as an annotation
I1019 16:07:29.081678 137 round_trippers.go:553] GET https://one-worker-cluster-control-plane:6443/api/v1/nodes/one-worker-cluster-worker?timeout=10s 404 Not Found in 5 milliseconds
I1019 16:07:29.577307 137 round_trippers.go:553] GET https://one-worker-cluster-control-plane:6443/api/v1/nodes/one-worker-cluster-worker?timeout=10s 404 Not Found in 1 milliseconds
...
I1019 16:07:58.077944 137 round_trippers.go:553] GET https://one-worker-cluster-control-plane:6443/api/v1/nodes/one-worker-cluster-worker?timeout=10s 404 Not Found in 1 milliseconds
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1019 16:07:58.579523 137 round_trippers.go:553] GET https://one-worker-cluster-control-plane:6443/api/v1/nodes/one-worker-cluster-worker?timeout=10s 404 Not Found in 4 milliseconds
...
I1019 16:09:28.582278 137 round_trippers.go:553] GET https://one-worker-cluster-control-plane:6443/api/v1/nodes/one-worker-cluster-worker?timeout=10s 404 Not Found in 2 milliseconds
nodes "one-worker-cluster-worker" not found
error uploading crisocket
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runKubeletStartJoinPhase
cmd/kubeadm/app/cmd/phases/join/kubelet.go:221
...
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
…/istioinaction_cluster took 2m40s ❯

As before, comparing the processes running in a functional worker node with this newer (but failed-to-join) one, there are differences.

Worker node processes differences:

…/istioinaction_cluster ❯ docker exec -it dxps-cluster-worker ps -ef

Hope it helps. |
Are you able to share the rest of the cluster logs ( |
Absolutely, Ben! You guys are so "kind" to help me, and I really want to use KinD instead of the alternatives, so yeah. I had to reproduce it again, since I did a clean-up last night.

Interestingly, if I initially create a simple 3-worker-node cluster and then the classic 1-node cluster (…

Deleting this 2nd one, and creating a 1-worker-node cluster fails as before (presented in Test 3 above).

Relevant output:

…/istioinaction_cluster ❯ kind create cluster --name oneworker --config 1worker_kind_cluster --retain ; kind export logs --name=oneworker
Creating cluster "oneworker" ...
✓ Ensuring node image (kindest/node:v1.25.2) 🖼
✓ Preparing nodes 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✗ Joining worker nodes 🚜
ERROR: failed to create cluster: failed to join node with kubeadm: command "docker exec --privileged oneworker-worker kubeadm join --config /kind/kubeadm.conf --skip-phases=preflight --v=6" failed with error: exit status 1
Command Output: I1020 17:45:53.106540 139 join.go:416] [preflight] found NodeName empty; using OS hostname as NodeName
I1020 17:45:53.106567 139 joinconfiguration.go:76] loading configuration from "/kind/kubeadm.conf"
I1020 17:45:53.107116 139 controlplaneprepare.go:220] [download-certs] Skipping certs download
I1020 17:45:53.107121 139 join.go:533] [preflight] Discovering cluster-info
I1020 17:45:53.107134 139 token.go:80] [discovery] Created cluster-info discovery client, requesting info from "oneworker-control-plane:6443"
I1020 17:45:53.110783 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s 200 OK in 3 milliseconds
I1020 17:45:53.110969 139 token.go:223] [discovery] The cluster-info ConfigMap does not yet contain a JWS signature for token ID "abcdef", will try again
I1020 17:45:59.045697 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s 200 OK in 1 milliseconds
I1020 17:45:59.045849 139 token.go:223] [discovery] The cluster-info ConfigMap does not yet contain a JWS signature for token ID "abcdef", will try again
I1020 17:46:05.485438 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s 200 OK in 1 milliseconds
I1020 17:46:05.486615 139 token.go:105] [discovery] Cluster info signature and contents are valid and no TLS pinning was specified, will use API Server "oneworker-control-plane:6443"
I1020 17:46:05.486624 139 discovery.go:52] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I1020 17:46:05.486634 139 join.go:547] [preflight] Fetching init configuration
I1020 17:46:05.486639 139 join.go:593] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
I1020 17:46:05.490973 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config?timeout=10s 200 OK in 4 milliseconds
I1020 17:46:05.492500 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/namespaces/kube-system/configmaps/kube-proxy?timeout=10s 200 OK in 0 milliseconds
I1020 17:46:05.493402 139 kubelet.go:74] attempting to download the KubeletConfiguration from ConfigMap "kubelet-config"
I1020 17:46:05.494312 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/namespaces/kube-system/configmaps/kubelet-config?timeout=10s 200 OK in 0 milliseconds
I1020 17:46:05.495577 139 interface.go:432] Looking for default routes with IPv4 addresses
I1020 17:46:05.495582 139 interface.go:437] Default route transits interface "eth0"
I1020 17:46:05.495643 139 interface.go:209] Interface eth0 is up
I1020 17:46:05.495679 139 interface.go:257] Interface "eth0" has 3 addresses :[172.22.0.6/16 fc00:f853:ccd:e793::6/64 fe80::42:acff:fe16:6/64].
I1020 17:46:05.495697 139 interface.go:224] Checking addr 172.22.0.6/16.
I1020 17:46:05.495702 139 interface.go:231] IP found 172.22.0.6
I1020 17:46:05.495709 139 interface.go:263] Found valid IPv4 address 172.22.0.6 for interface "eth0".
I1020 17:46:05.495712 139 interface.go:443] Found active IP 172.22.0.6
I1020 17:46:05.499512 139 kubelet.go:120] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I1020 17:46:05.499994 139 kubelet.go:135] [kubelet-start] writing CA certificate at /etc/kubernetes/pki/ca.crt
I1020 17:46:05.500269 139 loader.go:374] Config loaded from file: /etc/kubernetes/bootstrap-kubelet.conf
I1020 17:46:05.500465 139 kubelet.go:156] [kubelet-start] Checking for an existing Node in the cluster with name "oneworker-worker" and status "Ready"
I1020 17:46:05.501613 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 1 milliseconds
I1020 17:46:05.501775 139 kubelet.go:171] [kubelet-start] Stopping the kubelet
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I1020 17:46:10.658421 139 loader.go:374] Config loaded from file: /etc/kubernetes/kubelet.conf
I1020 17:46:10.659462 139 cert_rotation.go:137] Starting client certificate rotation controller
I1020 17:46:10.659911 139 loader.go:374] Config loaded from file: /etc/kubernetes/kubelet.conf
I1020 17:46:10.660124 139 kubelet.go:219] [kubelet-start] preserving the crisocket information for the node
I1020 17:46:10.660142 139 patchnode.go:31] [patchnode] Uploading the CRI Socket information "unix:///run/containerd/containerd.sock" to the Node API object "oneworker-worker" as an annotation
I1020 17:46:11.164704 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 4 milliseconds
I1020 17:46:11.662697 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 2 milliseconds
I1020 17:46:12.166844 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 4 milliseconds
...
I1020 17:46:45.163304 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 1 milliseconds
[kubelet-check] Initial timeout of 40s passed.
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
I1020 17:46:45.665926 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 4 milliseconds
I1020 17:46:46.167110 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 4 milliseconds
...
I1020 17:48:10.662571 139 round_trippers.go:553] GET https://oneworker-control-plane:6443/api/v1/nodes/oneworker-worker?timeout=10s 404 Not Found in 0 milliseconds
nodes "oneworker-worker" not found
error uploading crisocket
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/join.runKubeletStartJoinPhase
cmd/kubeadm/app/cmd/phases/join/kubelet.go:221
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:234
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
cmd/kubeadm/app/cmd/join.go:181
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:974
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
error execution phase kubelet-start
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run.func1
cmd/kubeadm/app/cmd/phases/workflow/runner.go:235
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).visitAll
cmd/kubeadm/app/cmd/phases/workflow/runner.go:421
k8s.io/kubernetes/cmd/kubeadm/app/cmd/phases/workflow.(*Runner).Run
cmd/kubeadm/app/cmd/phases/workflow/runner.go:207
k8s.io/kubernetes/cmd/kubeadm/app/cmd.newCmdJoin.func1
cmd/kubeadm/app/cmd/join.go:181
github.com/spf13/cobra.(*Command).execute
vendor/github.com/spf13/cobra/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
vendor/github.com/spf13/cobra/command.go:974
github.com/spf13/cobra.(*Command).Execute
vendor/github.com/spf13/cobra/command.go:902
k8s.io/kubernetes/cmd/kubeadm/app.Run
cmd/kubeadm/app/kubeadm.go:50
main.main
cmd/kubeadm/kubeadm.go:25
runtime.main
/usr/local/go/src/runtime/proc.go:250
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1594
Exporting logs for cluster "oneworker" to:
/tmp/2171614989
…/istioinaction_cluster took 2m33s❯
And here is the archive with the kind exported logs. |
Sorry, this has been unfortunate timing for me. Looking at the logs now for the kubelet on the worker node:
You're now hitting inotify limits, a variation on: |
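For anyone else hitting this: the usual workaround, per the kind known-issues documentation, is to raise the host's inotify limits (the values below are just the example values from those docs):

# Raise the per-user inotify limits for the current boot
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512
# To persist across reboots, put the same two settings in a sysctl drop-in
# file, e.g. /etc/sysctl.d/99-inotify.conf, and reload with: sudo sysctl --system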
Thanks, Benjamin Elder! Appreciate the feedback! |
Reporting the same issue on Mac M1 w/ DOCKER_DEFAULT_PLATFORM=linux/amd64
|
TL;DR: that's not supported; the platform needs to match to run Kubernetes. We still need to determine the best option for handling this. |
Also running into the same issue on Mac M1: my command:
the error:
my config:
Interesting items of note: the image was found here. kind version:
Ah geez. I just realized that I was using the … Rather than deleting this post, I'll leave it in case anyone else runs into the same issue I did. The proper link I should have used is here. |
Summarizing:
|
NOTE: we publish the multi-arch digests in our release notes, so you can use those instead of the docker hub digests. Docker hub's UI only exposes single-arch digests, but there is a digest for the multi-arch manifest as well. |
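For example, pinning the node image to one of those multi-arch digests would look roughly like this (the digest here is only a placeholder, not a real value; take the actual one from the release notes for your kind version):

kind create cluster --image "kindest/node:v1.25.2@sha256:<multi-arch-digest-from-release-notes>"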
FTR, mine wasn't the same as @dxps because I wasn't trying to use |
OK, it's an old issue, yet I faced the same issue as @notjames today. The solution was to run the kind create command as root (sudo ...), and the creation completed. But the cluster is now created under root, so I have to execute every command as root.
If you're trying to run as non-root, please see the docs: https://kind.sigs.k8s.io/docs/user/rootless/. Rootless containers are still a bit "fun", but kind mostly works if you take the additional steps outlined in the docs. |
Dear KinD community,
I'd like to readdress this error (from here; btw @BenTheElder, thanks a lot for the feedback):
What happened:
What you expected to happen:
Have it all running fine: the second cluster should be created as well.
How to reproduce it (as minimally and precisely as possible):
Starting from a clean state (ok, not purely clean; I deleted any existing kind clusters), I create a first multi-node cluster using this config:
and the cluster is successfully created.
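(The actual config files aren't reproduced in this thread. As a rough sketch only, with a made-up file name, cluster name, and node count, a multi-node kind config of this general shape would be created like so:)

# Hypothetical example, not the reporter's actual config
cat > first_cluster.yaml <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF
kind create cluster --name first-cluster --config first_cluster.yaml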
Now, trying to create a second multi-node cluster with an ingress (as per this nice kind doc) based on this config:
the cluster creation fails:
The logs export (to that /tmp/297015354 folder) collected these files and folders: docker-info.txt, istioinaction-control-plane, istioinaction-worker, istioinaction-worker2, kind-version.txt.

A brief look for errors in it shows different details:
I'll try to address such errors and get back here with updates:
Anything else we need to know?:
The interesting thing is that it happens on one of my two systems (a workstation and a laptop), both running the same up-to-date Linux distro, with the same versions of the tools (captured in the Environment section below).
Environment:
- kind version (kind version): v0.16.0
- Kubernetes version (kubectl version): client v1.25.3, server v1.25.2
- Docker version (docker info): 20.10.12
- OS (/etc/os-release): Pop!_OS 22.04 LTS (nothing custom, all stock and up to date, running kernel 5.19.0-76051900-generic)

Thanks
Update

Tried again after increasing the max number of open files, at least to eliminate one type of issue. ulimit -n now returns 4096 (instead of 1024).
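(For reference, a minimal sketch of how such a limit can be checked and raised; the value is just an example:)

# Check the current soft limit on open file descriptors
ulimit -n
# Raise it for the current shell session only (example value)
ulimit -n 4096
# For a persistent limit on login sessions, a line of the form
# "<domain> <type> <item> <value>" can be added to /etc/security/limits.conf, e.g.:
#   *  soft  nofile  4096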