
multi-node: Kubernetes cluster does not start after Docker re-assigns node's IP addresses after (Docker) restart #2045

Closed
hadrabap opened this issue Jan 30, 2021 · 34 comments
Labels: kind/bug Categorizes issue or PR as related to a bug. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete.


hadrabap commented Jan 30, 2021

A kind Kubernetes cluster does not survive a Docker restart. Docker appears to assign new IPs to the containers on each start-up, but the kind nodes keep the original IP addresses in their generated configuration files, so the Kubernetes services cannot talk to each other. The most affected components are the scheduler and the controller-manager.

What happened:

Kubernetes starts in a broken state even though kubectl get pods -A reports otherwise (everything 1/1). The cluster is unable to start pods deployed before the restart and cannot deploy anything new, because the scheduler is not connected to the apiserver.

What you expected to happen:

The Kubernetes cluster continues working as expected even after a Docker restart.

How to reproduce it (as minimally and precisely as possible):

  1. Create a kind cluster by issuing:
cat <<EOF | kind create cluster --name kind --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF
  2. Restart Docker (the sketch below shows how to confirm that the node IPs changed)
  3. Deploy anything, e.g.: kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml
  4. Check that the dnsutils pod stays in Pending state
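
To confirm the trigger, compare the node IPs before and after the restart. A minimal sketch, assuming the default cluster name kind and the Docker CLI on the host:

docker inspect -f '{{.Name}} {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $(kind get nodes)

Run it once before restarting Docker and once after; the per-node addresses will usually differ.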

Anything else we need to know?:

  1. Log files: kind-cluster-logs.tar.gz
  2. I tried to change the IP addresses in the /kind and /etc/kubernetes files, but then the services start complaining about the certificate not being issued for the IP address. Changing IP addresses each time the cluster starts is therefore not a solution. (A rough sketch of that attempt follows below.)
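
For illustration, the manual fix-up attempt described in 2. roughly amounts to the following (a hypothetical sketch, not the exact commands used; the addresses are placeholders):

OLD_IP=172.18.0.2   # placeholder: address written into the files at install time
NEW_IP=172.18.0.5   # placeholder: address Docker assigned after the restart
docker exec kind-control-plane sh -c \
  "grep -rl $OLD_IP /etc/kubernetes /kind 2>/dev/null | xargs -r sed -i s/$OLD_IP/$NEW_IP/g"

After this the services can reach each other again, but they reject the connections because the certificates were issued for the old IP.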

Environment:

  • kind version: (use kind version):
kind v0.10.0 go1.15.7 darwin/amd64
  • Kubernetes version: (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.3", GitCommit:"1e11e4a2108024935ecfcb2912226cedeafd99df", GitTreeState:"clean", BuildDate:"2020-10-14T12:50:19Z", GoVersion:"go1.15.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-21T01:11:42Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
  • Docker version: (use docker info):
Client:
 Version:           20.10.0
 API version:       1.41
 Go version:        go1.15.6
 Git commit:        03fa4b8
 Built:             Sat Dec 12 20:00:39 2020
 OS/Arch:           darwin/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.2
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       8891c58
  Built:            Mon Dec 28 16:15:28 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.3
  GitCommit:        269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc:
  Version:          1.0.0-rc92
  GitCommit:        ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
  • OS (e.g. from /etc/os-release):
    macOS Catalina 10.15.7 Intel
@hadrabap hadrabap added the kind/bug Categorizes issue or PR as related to a bug. label Jan 30, 2021

markush81 commented Jan 31, 2021

I was just about to open the same issue; my information is as follows.

Environment:

  • macOS Big Sur 11.1
  • Docker 20.10.2
  • kind v0.10.0 go1.15.7 darwin/amd64

Kind setup

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.19.1@sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600
- role: worker
  image: kindest/node:v1.19.1@sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600
- role: worker
  image: kindest/node:v1.19.1@sha256:98cf5288864662e37115e362b23e4369c8c4a408f99cbc06e58ac30ddc721600

After creating it, docker network inspect on the related Docker network shows:

...
"Containers": {
            "780b60602e52be16b47f464e861ad065ac9737e9d2f330f18a65dbe242effe91": {
                "Name": "my-k8s-control-plane",
                "EndpointID": "6a14b15c13f74cf39df539c7e85b9e43b0af9d87626fad7f848bf815e40bfb6e",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": "fc00:f853:ccd:e793::2/64"
            },
            "8c1f003761a65e4dba126bd027440ed24050c8ff9730a37e0ac36b37aace61ba": {
                "Name": "my-k8s-worker2",
                "EndpointID": "93fb4c3abbb2cf3fe8470d8a01b779eae12b488088113ae8b9bd25eea2060daf",
                "MacAddress": "02:42:ac:12:00:04",
                "IPv4Address": "172.18.0.4/16",
                "IPv6Address": "fc00:f853:ccd:e793::4/64"
            },
            "b0623460e508b90eedc47a65e8d10218f70f6cac2d3d4695419dd06496faa9db": {
                "Name": "my-k8s-worker",
                "EndpointID": "1d2a0903a81aa6f8074809c37c9d69ad839c61b0521a5ef345041f4c22a69a50",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": "fc00:f853:ccd:e793::3/64"
            }
        }
...

After a reboot of the whole machine or a Docker restart, the IPs have usually changed:

LAST SEEN   TYPE      REASON                    OBJECT                      MESSAGE
14h         Normal    Starting                  node/my-k8s-control-plane   Starting kubelet.
14h         Normal    NodeHasSufficientMemory   node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasSufficientMemory
14h         Normal    NodeHasNoDiskPressure     node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasNoDiskPressure
14h         Normal    NodeHasSufficientPID      node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasSufficientPID
14h         Normal    NodeAllocatableEnforced   node/my-k8s-control-plane   Updated Node Allocatable limit across pods
14h         Normal    Starting                  node/my-k8s-control-plane   Starting kubelet.
14h         Normal    NodeAllocatableEnforced   node/my-k8s-control-plane   Updated Node Allocatable limit across pods
14h         Normal    NodeHasSufficientMemory   node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasSufficientMemory
14h         Normal    NodeHasNoDiskPressure     node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasNoDiskPressure
14h         Normal    NodeHasSufficientPID      node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasSufficientPID
14h         Normal    RegisteredNode            node/my-k8s-control-plane   Node my-k8s-control-plane event: Registered Node my-k8s-control-plane in Controller
14h         Normal    Starting                  node/my-k8s-control-plane   Starting kube-proxy.
14h         Normal    NodeReady                 node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeReady
13m         Normal    Starting                  node/my-k8s-control-plane   Starting kubelet.
13m         Normal    NodeHasSufficientMemory   node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasSufficientMemory
13m         Normal    NodeHasNoDiskPressure     node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasNoDiskPressure
13m         Normal    NodeHasSufficientPID      node/my-k8s-control-plane   Node my-k8s-control-plane status is now: NodeHasSufficientPID
13m         Normal    NodeAllocatableEnforced   node/my-k8s-control-plane   Updated Node Allocatable limit across pods
13m         Normal    Starting                  node/my-k8s-control-plane   Starting kube-proxy.
14h         Normal    Starting                  node/my-k8s-worker          Starting kubelet.
14h         Normal    NodeHasSufficientMemory   node/my-k8s-worker          Node my-k8s-worker status is now: NodeHasSufficientMemory
14h         Normal    NodeHasNoDiskPressure     node/my-k8s-worker          Node my-k8s-worker status is now: NodeHasNoDiskPressure
14h         Normal    NodeHasSufficientPID      node/my-k8s-worker          Node my-k8s-worker status is now: NodeHasSufficientPID
14h         Normal    NodeAllocatableEnforced   node/my-k8s-worker          Updated Node Allocatable limit across pods
14h         Normal    RegisteredNode            node/my-k8s-worker          Node my-k8s-worker event: Registered Node my-k8s-worker in Controller
14h         Normal    Starting                  node/my-k8s-worker          Starting kube-proxy.
14h         Normal    NodeReady                 node/my-k8s-worker          Node my-k8s-worker status is now: NodeReady
13m         Normal    Starting                  node/my-k8s-worker          Starting kubelet.
13m         Normal    NodeHasSufficientMemory   node/my-k8s-worker          Node my-k8s-worker status is now: NodeHasSufficientMemory
13m         Normal    NodeHasNoDiskPressure     node/my-k8s-worker          Node my-k8s-worker status is now: NodeHasNoDiskPressure
13m         Normal    NodeHasSufficientPID      node/my-k8s-worker          Node my-k8s-worker status is now: NodeHasSufficientPID
13m         Normal    NodeAllocatableEnforced   node/my-k8s-worker          Updated Node Allocatable limit across pods
13m         Warning   Rebooted                  node/my-k8s-worker          Node my-k8s-worker has been rebooted, boot id: de45af38-2b9c-4634-b4aa-5ef1dab149df
13m         Normal    Starting                  node/my-k8s-worker          Starting kube-proxy.
14h         Normal    Starting                  node/my-k8s-worker2         Starting kubelet.
14h         Normal    NodeHasSufficientMemory   node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeHasSufficientMemory
14h         Normal    NodeHasNoDiskPressure     node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeHasNoDiskPressure
14h         Normal    NodeHasSufficientPID      node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeHasSufficientPID
14h         Normal    NodeAllocatableEnforced   node/my-k8s-worker2         Updated Node Allocatable limit across pods
14h         Normal    RegisteredNode            node/my-k8s-worker2         Node my-k8s-worker2 event: Registered Node my-k8s-worker2 in Controller
14h         Normal    Starting                  node/my-k8s-worker2         Starting kube-proxy.
14h         Normal    NodeReady                 node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeReady
13m         Normal    Starting                  node/my-k8s-worker2         Starting kubelet.
13m         Normal    NodeHasSufficientMemory   node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeHasSufficientMemory
13m         Normal    NodeHasNoDiskPressure     node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeHasNoDiskPressure
13m         Normal    NodeHasSufficientPID      node/my-k8s-worker2         Node my-k8s-worker2 status is now: NodeHasSufficientPID
13m         Normal    NodeAllocatableEnforced   node/my-k8s-worker2         Updated Node Allocatable limit across pods
13m         Warning   Rebooted                  node/my-k8s-worker2         Node my-k8s-worker2 has been rebooted, boot id: de45af38-2b9c-4634-b4aa-5ef1dab149df
13m         Normal    Starting                  node/my-k8s-worker2         Starting kube-proxy.
 "Containers": {
            "780b60602e52be16b47f464e861ad065ac9737e9d2f330f18a65dbe242effe91": {
                "Name": "my-k8s-control-plane",
                "EndpointID": "40ac0af6846bce624d6e20f3fe816a699999491ec006908ae3dfdbff3ed34027",
                "MacAddress": "02:42:ac:12:00:03",
                "IPv4Address": "172.18.0.3/16",
                "IPv6Address": "fc00:f853:ccd:e793::3/64"
            },
            "8c1f003761a65e4dba126bd027440ed24050c8ff9730a37e0ac36b37aace61ba": {
                "Name": "my-k8s-worker2",
                "EndpointID": "b36d5e2db63cecdf960519521de6e2bb32402bf2e0db0b07d505344f368a17db",
                "MacAddress": "02:42:ac:12:00:02",
                "IPv4Address": "172.18.0.2/16",
                "IPv6Address": "fc00:f853:ccd:e793::2/64"
            },
            "b0623460e508b90eedc47a65e8d10218f70f6cac2d3d4695419dd06496faa9db": {
                "Name": "my-k8s-worker",
                "EndpointID": "9c38c3f4c91b97cb213d520eda7b1af60fe682584f35c62c0d51773bd24ab282",
                "MacAddress": "02:42:ac:12:00:04",
                "IPv4Address": "172.18.0.4/16",
                "IPv6Address": "fc00:f853:ccd:e793::4/64"
            }
        },

Changed IP addresses:


my-k8s-control-plane: 172.18.0.2/16 -> 172.18.0.3/16
my-k8s-worker: 172.18.0.3/16 -> 172.18.0.4/16
my-k8s-worker2: 172.18.0.4/16 -> 172.18.0.2/16

This mix-up confuses the cluster internally, since it does not pick up these updates.

The pods themselves all get into Running state, but the cluster is non-functional, e.g. you can't start up a new pod.

kubectl run -i --tty busybox --image=busybox --restart=Never -- sh
error: timed out waiting for the condition

kubectl get pods
NAME      READY   STATUS    RESTARTS   AGE
busybox   0/1     Pending   0          10m

kubectl describe pod busybox --namespace='default'

Name:         busybox
Namespace:    default
Priority:     0
Node:         <none>
Labels:       run=busybox
Annotations:  <none>
Status:       Pending
IP:
IPs:          <none>
Containers:
  busybox:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Args:
      sh
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-wghwb (ro)
Volumes:
  default-token-wghwb:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-wghwb
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>

Looking through various pods:

kube-controller-manager-my-k8s-control-plane

 E0131 08:54:15.863481       1 leaderelection.go:321] error retrieving resource lock kube-system/kube-controller-manager: Get "https://172.18.0.2:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s": dial tcp 172.18.0.2:6443: connect: connection refused
 E0131 08:54:19.186263       1 leaderelection.go:321] error retrieving resource lock kube-system/kube-controller-manager: Get "https://172.18.0.2:6443/api/v1/namespaces/kube-system/endpoints/kube-controller-manager?timeout=10s": dial tcp 172.18.0.2:6443: connect: connection refused

coredns

 I0131 08:45:57.742283       1 trace.go:116] Trace[939984059]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125 (started: 2021-01-31 08:45:27.736738195 +0000 UTC m=+0.425505076) (total time: 30.004062522s):
Trace[939984059]: [30.004062522s] [30.004062522s] END
E0131 08:45:57.742331       1 reflector.go:178] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:125: Failed to list *v1.Namespace: Get "https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0": dial tcp 10.96.0.1:443: i/o timeout

kube-apiserver-my-k8s-control-plane

 Trace[1533160523]: ---"Transformed response object" 14383ms (08:54:00.090)
 Trace[1533160523]: [14.386428279s] [14.386428279s] END
 I0131 08:54:50.261412       1 trace.go:205] Trace[230667356]: "Get" url:/api/v1/namespaces/kube-system/pods/coredns-f9fd979d6-p4tgv/log,user-agent:kubernetic-backend/v0.0.0 (darwin/amd64) kubernetes/$Format,client:172.18.0.1 (31-Jan-2021 08:54:47.025) (total time: 3235ms):
 Trace[230667356]: ---"Transformed response object" 3234ms (08:54:00.261)
 Trace[230667356]: [3.235978319s] [3.235978319s] END
 I0131 08:55:00.223341       1 client.go:360] parsed scheme: "passthrough"
 I0131 08:55:00.223424       1 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{https://127.0.0.1:2379  <nil> 0 <nil>}] <nil> <nil>}
 I0131 08:55:00.223437       1 clientconn.go:948] ClientConn switching balancer to "pick_first"
 I0131 08:55:00.441118       1 trace.go:205] Trace[980747837]: "Get" url:/api/v1/namespaces/kube-system/pods/kindnet-drjnp/log,user-agent:kubernetic-backend/v0.0.0 (darwin/amd64) kubernetes/$Format,client:172.18.0.1 (31-Jan-2021 08:54:53.639) (total time: 6801ms):
 Trace[980747837]: ---"Transformed response object" 6799ms (08:55:00.441)
 Trace[980747837]: [6.801401461s] [6.801401461s] END

kube-scheduler-my-k8s-control-plane

 E0131 08:55:45.455623       1 reflector.go:127] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:188: Failed to watch *v1.Pod: failed to list *v1.Pod: Get "https://172.18.0.2:6443/api/v1/pods?fieldSelector=status.phase%21%3DFailed%2Cstatus.phase%21%3DSucceeded&limit=500&resourceVersion=0": dial tcp 172.18.0.2:6443: connect: connection refused
 E0131 08:55:45.840158       1 reflector.go:127] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: Get "https://172.18.0.2:6443/apis/storage.k8s.io/v1/storageclasses?limit=500&resourceVersion=0": dial tcp 172.18.0.2:6443: connect: connection refused
 E0131 08:55:58.745228       1 reflector.go:127] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1beta1.PodDisruptionBudget: failed to list *v1beta1.PodDisruptionBudget: Get "https://172.18.0.2:6443/apis/policy/v1beta1/poddisruptionbudgets?limit=500&resourceVersion=0": dial tcp 172.18.0.2:6443: connect: connection refused
 E0131 08:56:01.444706       1 reflector.go:127] k8s.io/apiserver/pkg/server/dynamiccertificates/configmap_cafile_content.go:206: Failed to watch *v1.ConfigMap: failed to list *v1.ConfigMap: Get "https://172.18.0.2:6443/api/v1/namespaces/kube-system/configmaps?fieldSelector=metadata.name%3Dextension-apiserver-authentication&limit=500&resourceVersion=0": dial tcp 172.18.0.2:6443: connect: connection refused

@hadrabap hadrabap reopened this Jan 31, 2021
@hadrabap (Author)

OK, I've realized that kind creates its own docker network. Thank you @markush81!

I've patched the kind Docker provider so that it (currently) generates sequential IP addresses for each node and forces them with docker ... --ip XXX.

This seems to solve the issue. Docker reuses the IPs after restart.
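
For reference, the forced-IP approach maps onto Docker's static address option for user-defined networks; a hand-run sketch (the container name pin-test, the busybox image, and the address 172.18.0.200 are illustrative assumptions, and the kind network's subnet may differ locally):

docker network inspect kind -f '{{(index .IPAM.Config 0).Subnet}}'                  # check the network's actual subnet first
docker run -d --name pin-test --network kind --ip 172.18.0.200 busybox sleep 3600   # --ip pins the address on a user-defined network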

Take a look at my branch.

I'll try to generalize it and make it more flexible, but I might fail as I have no idea about Go. :-/

aojea (Contributor) commented Jan 31, 2021

/assign

Custom IP allocation was discussed before and it is not easy to implement; consider that you can have multiple clusters created at the same time ...

It will be interesting to understand what the root cause is; we switched to DNS in most parts to allow restarts ... can this be a kubernetes limitation or are we missing something?

hadrabap (Author) commented Jan 31, 2021

So, I've implemented a simple mechanism which:

  1. Gets the network address from the kind Docker network
  2. Obtains all IP addresses already assigned in that network
  3. Sequentially generates IP addresses for the network; the first one not found in the assigned list is used (a rough shell equivalent is sketched below).

The mechanism has one disadvantage, however: it fills possible gaps. But it works even for subsequent additional cluster creations.
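
For illustration, a rough shell equivalent of that allocation idea (a hypothetical sketch, not the actual Go patch; it assumes the kind network uses 172.18.0.0/16):

# collect addresses already assigned on the kind network
used=$(docker network inspect kind -f '{{range .Containers}}{{.IPv4Address}} {{end}}')
# walk the host range sequentially and report the first address not in use
for i in $(seq 2 254); do
  candidate="172.18.0.$i"
  case " $used" in
    *" $candidate/"*) continue ;;
    *) echo "next free IP: $candidate"; break ;;
  esac
done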

Check out the code but don't take it too seriously; these are the first lines of Go code I've written in my life.

P.S.: I'll take a look at how difficult it would be for me to implement a config parameter so one can specify the docker IP manually, e.g.:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  docker-ip: 172.18.0.10
- role: worker
  docker-ip: 172.18.0.11
- role: worker
  docker-ip: 172.18.0.12
- role: worker
  docker-ip: 172.18.0.13

P.P.S.: I've uploaded a compiled binary for anybody who wants to test it: https://github.com/hadrabap/kind-test-snapshots/blob/main/kind

@hadrabap (Author)

It will be interesting to understand what is the root cause, we switched to dns in most parts to allow restarts ... can this be a kubernets limitation or are we missing something?

I think this is Docker related. The Kubernetes cluster itself uses DNS, but it never gets that far, because kubeadm is launched by systemd with configuration files that use IPs only. These files are generated by kind during installation. That's why the cluster works right after installation: the IPs are the ones already assigned by Docker. After a restart, new IPs are assigned unless the containers were originally run with the --ip parameter. I found a lot of reports on stackoverflow.com (for example) saying that with the --ip parameter one can "fix" the IP and, for example, permanently register it in DNS.

@BenTheElder (Member)

These files are generated by kind during installation.

Not strictly; this is a bug in:

# fixup IPs in manifests ...

@BenTheElder (Member)

I've patched kind Docker provider so it (currently) generates sequential IP addresses for each node and forces them with docker ... --ip XXX.

Actually, Docker does not guarantee this unless you create a network that excludes the chosen IPs from the auto-allocated range.
In order to do that we would need to predict how many addresses users need for kind versus for containers alongside kind, and it generally complicates things. Even then it's only best effort, not guaranteed. Docker does not provide guaranteed static IP allocation.

You can see more on this in the previous discussion in this repo.

@BenTheElder (Member)

This also smells like a bug somewhere between kubeadm and kind; there's no reason I shouldn't be able to put the API server behind DNS and have components respect that.

hadrabap (Author) commented Feb 1, 2021

I hope I understand it better now. Let me summarize and please correct me if I'm wrong:

  1. Kind installs the cluster with all its initialization and writes IP addresses into the /kind/ and /etc/kubernetes/ directories.
  2. Each time the cluster starts, the entrypoint tries to manage the IP addresses somehow.

What I've found is that the second step does something silly. Originally I wrote myself a shell script which re-configures the IP addresses to match the current state, but that fails as well, because all the security certificates generated in step 1 are based on the IP addresses in their subject names. This leads to a situation where the services are finally able to contact each other but reject the certificates, and we are back at square one.

I see only two ways to solve this issue permanently:

  1. Use static IP addresses (which is problematic to do with Docker), or
  2. use host names everywhere from the very beginning.

I hope I got the idea.
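
To see why rewriting the IPs is not enough, one can inspect which names and addresses the API server certificate was issued for; a minimal sketch, run from the host and assuming the default node name kind-control-plane:

docker exec kind-control-plane cat /etc/kubernetes/pki/apiserver.crt \
  | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'

The SAN list contains the node's hostname and the IP the node had at creation time, not the address Docker assigns after a restart.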

@BenTheElder (Member)

Yeah I think that's pretty much it.

We could maybe fix 2) from the first list by regenerating the certs (a bit of a headache, plus at bootup the kind binary orchestrated this between nodes).

I think we could also take some additional approaches re: the second list:

  1. Setting aside etcd, there are the node IPs (not a problem: core components are not listening on these, the nodes just need to report them) and the apiserver address (this is a problem). For the apiserver, if we couldn't do DNS we could do a VIP (which has its own problems) similar to the existing in-cluster kubernetes.default backing IP, and perhaps similar for etcd.

Regarding 1), we could also limit this to just control-plane nodes. One additional problematic thing for kind is that we support provisioning N clusters in parallel in separate invocations against the same docker. Currently some users depend on this in an even worse extended circumstance: the dockerd is not on the host the kind binary is on, so coordinating IPAM will be a headache.


I think the cleanest solution is getting everything to use DNS names, but I don't remember why that's not happening right now. I'm not sure how soon I can dig into this deeper ... lots going on right now.

@BenTheElder (Member)

And to further clarify: the entrypoint IP update is intended to handle places where we bind to an IP, in addition to the node reporting its IP via kubelet. That intention should be fine, because certs should also be signed for the hostname and the hostname should be correctly mapped to the new IP. So the problem with that approach is things still connecting to IPs instead of domains; if we can fix that, we still want to update the local node's references to its old IP on restart.

@BenTheElder BenTheElder added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Feb 3, 2021
aojea (Contributor) commented Feb 3, 2021

but that fails as well, because all the security certificates generated in step 1 are based on the IP addresses in their subject names. This leads to a situation where the services are finally able to contact each other but reject the certificates, and we are back at square one

Do we know exactly which components are failing, or are ALL the components failing?
kubelet node registration? apiserver? controller-manager?

@neolit123 do you see cases in kubeadm of people switching IPs on their cluster after it is installed?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 4, 2021
@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 12, 2021
@markush81

/remove-lifecycle stale

@BenTheElder BenTheElder changed the title Kubernetes cluster does not start after Docker re-assigns node's IP addresses after (Docker) restart multi-node: Kubernetes cluster does not start after Docker re-assigns node's IP addresses after (Docker) restart Jul 1, 2021
@BenTheElder (Member)

One thing to clarify: It's perfectly fine for the nodes to have changing IPs and be fixing that up in some places in the config, notably we have kubelet registering and setting the node IP on the node object, which is used for things like routing traffic to blocks of pod IPs, which is fine to do dynamically.

What is problematic are the parts that use fixed certificates / contact the control-plane components. All of that should be using DNS, but it's not.

Unfortunately rebootable multi-node clusters are something with a somewhat limited (but certainly not none!) use case and personally this is not a priority versus other work (mostly Kubernetes things which is not even kind related to begin with at the moment...)

The bot will not close issues in this repo now, I've disabled it for us.

A good start if someone wants to see progress on this would be identifying where IP addresses are being used.
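
A starting point for that survey could be something like the following sketch (the 172.18.0. prefix is the usual kind network subnet and only an example; adjust it to the local network):

for n in $(kind get nodes); do
  echo "== $n =="
  # look for literal node addresses baked into kubeadm/kubelet files on each node
  docker exec "$n" grep -rn '172\.18\.0\.' /etc/kubernetes /var/lib/kubelet/kubeadm-flags.env 2>/dev/null
done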

gagipro commented Jul 2, 2021

Hi,

thanks for disabling the bot !

The use case is simple: you work on a project and need a stable multi-node env, but you have to rebuild the cluster each day or after each reboot.

The goal of kind is to simulate an env, but if one needs to rebuild it on every reboot, that is clearly a deal breaker for kind.

I stopped using kind and built a real cluster until this issue is solved.

thanks.
regards.

BenTheElder (Member) commented Jul 2, 2021

Hi, yes if a persistent "real" cluster suits your needs kind is not what you're looking for.

The goal of kind is to simulate an env, but if one needs to rebuild each time you reboot, it is clearly a game breaker for kind.

Regarding goals, see: https://kind.sigs.k8s.io/docs/contributing/project-scope/

For application testing long persistence is a nice to have. kind is not intended to be a persistent workload cluster and it would be insecure to keep it permanently (persistent clusters should be upgraded regularly to keep ahead of security fixes ...)

See also: https://kind.sigs.k8s.io/docs/user/configuration/#api-server

Most apps should not need multi-node at all to test, we need that to test some particular Kubernetes expectations around rolling node behavior for a few tests.

@pablodgonzalez

No news here, which is a pity! It is necessary in many kinds of contexts, for example creating/teaching/taking courses.

aojea (Contributor) commented Dec 20, 2021

No news here, which is a pity! It is necessary in many kinds of contexts, for example creating/teaching/taking courses.

Can you expand on that?

Why do you need to restart docker?
Why is it not possible to create the cluster from scratch?

@pablodgonzalez

Of course, just imagine:

I start a course that lasts 2 days (when it could well be 5):

  • I created the cluster
  • I created pods, replica sets, deployments, services, etc., each one with its YAML configuration file
  • I uploaded custom images to the cluster
  • I got the cluster into a certain state ...
    And then the day is over and I turn off the laptop.
    So ... the next morning I have to recreate all of yesterday's work to continue with today's course, just because on reboot I lost connectivity to the cluster and can't get it back online.
    Even just having a workaround available would be fine.
    But for now, to teach I first have to explain VirtualBox (or something else) in order to use minikube.
    And yes, I agree that deleting everything with kind and rebuilding it all is good practice, but not for long courses where I am often explaining a feature across multiple days.

I think kind is almost perfect for teaching (and learning), but this issue continues to be a bit of a headache.

Best Regards

aojea (Contributor) commented Dec 20, 2021

I think kind is almost perfect for teaching (and learning), but this issue continues to be a bit of a headache.

Some people have implemented creative solutions to work around it; I think some of them were shared in the Slack channels, using bash scripts with docker pause and other commands ...
Maybe it is time to look for something more standard; I have to check how to solve the IP assignment problem 🤔

aojea (Contributor) commented Dec 21, 2021

  • I created the cluster

@pablodgonzalez this happens with a multi-node cluster, right?
I'm able to stop and restart Docker and keep working with single-node clusters.

@pablodgonzalez

@aojea Yes! The issue is with a multi-node HA cluster with 2 or more control planes.
I thought it was because the haproxy image does not support the restart; the problem is that haproxy loses access to the control-plane containers. I haven't researched it in depth; I just saw the logs returning a bad health check.
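
For the HA case there is an extra container worth checking; a hedged sketch (assuming the cluster is named kind, so kind names the load balancer kind-external-load-balancer):

docker ps --filter name=external-load-balancer        # the haproxy container kind adds when there are multiple control planes
docker logs kind-external-load-balancer 2>&1 | tail   # the failing health checks against the stale control-plane IPs show up here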

aojea (Contributor) commented Dec 21, 2021

I see, because plain multi-node works after restarts too:

$ docker inspect -f '{{.Name}} - {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $(docker ps -aq)
/kind-control-plane - 172.18.0.4
/kind-worker - 172.18.0.6
/kind-worker2 - 172.18.0.5
/vigilant_ptolemy - 172.18.0.2
/trusting_tharp - 172.18.0.3
$ sudo systemctl restart docker
$ docker inspect -f '{{.Name}} - {{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' $(docker ps -aq)
/kind-control-plane - 172.18.0.3
/kind-worker - 172.18.0.4
/kind-worker2 - 172.18.0.2
$ kubectl get pods -A
NAMESPACE            NAME                                         READY   STATUS    RESTARTS       AGE
kube-system          coredns-78fcd69978-cfck2                     1/1     Running   1 (110s ago)   6m23s
kube-system          coredns-78fcd69978-nl848                     1/1     Running   1 (110s ago)   6m23s
kube-system          etcd-kind-control-plane                      1/1     Running   0              97s
kube-system          kindnet-4mxrm                                1/1     Running   1 (110s ago)   6m23s
kube-system          kindnet-qj2f5                                1/1     Running   1 (110s ago)   6m4s
kube-system          kindnet-tbsrh                                1/1     Running   1 (111s ago)   6m5s
kube-system          kube-apiserver-kind-control-plane            1/1     Running   0              97s
kube-system          kube-controller-manager-kind-control-plane   1/1     Running   1 (110s ago)   6m36s
kube-system          kube-proxy-9kkcx                             1/1     Running   1 (110s ago)   6m4s
kube-system          kube-proxy-fxc8w                             1/1     Running   1 (110s ago)   6m5s
kube-system          kube-proxy-p7qf4                             1/1     Running   1 (110s ago)   6m23s
kube-system          kube-scheduler-kind-control-plane            1/1     Running   1 (110s ago)   6m43s
local-path-storage   local-path-provisioner-85494db59d-6sh9l      1/1     Running   2 (61s ago)    6m23s

what is the requirement for HA?

@kubernetes-sigs kubernetes-sigs deleted a comment from k8s-ci-robot Dec 21, 2021
@pablodgonzalez

Mainly I use it to show how the cluster responds when a control plane fails; multi-node is simple to show, but multiple control planes are an extra.
I always recommend building the cluster in HA mode when not using a cloud provider's managed offering, so the exercise of looking at the info in the containers and showing how it works is the cherry on top.
So by always working with an HA cluster I get the students' attention and curiosity about the feature.

aojea (Contributor) commented Dec 21, 2021

OK, the problem is that the etcd cluster has the IPs hardcoded:

  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.18.0.2:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.18.0.2:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://172.18.0.2:2380
    - --initial-cluster=kind-control-plane=https://172.18.0.2:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.18.0.2:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://172.18.0.2:2380

Hence, when the cluster reboots and the nodes change their IPs, it blows up.
/cc @neolit123

neolit123 (Member) commented Dec 21, 2021 via email

aojea (Contributor) commented Dec 21, 2021

Changing the IPs of CP nodes would require updating static pods on disk, and also changing annotations on the mirror pods. Perhaps there is a way to reserve Docker IPs on restart instead of adapting to IP changes.

And using DNS names instead of IPs?

@neolit123 (Member)

kube-* components have flags that only work with IPs. Those could be set to 0.0.0.0, but the mirror pod annotations still have to be updated after IP detection.
Kubeadm uses static etcd bootstrap, which uses IPs. DNS bootstrap is not supported by kubeadm with a toggle, but users can opt into it using flags.

https://etcd.io/docs/v3.5/dev-internal/discovery_protocol/
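
For context on "updating static pods on disk": those manifests live under /etc/kubernetes/manifests on the control-plane node; a quick look (a sketch, assuming the default node name kind-control-plane and the standard kubeadm file names):

docker exec kind-control-plane ls /etc/kubernetes/manifests
# etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml
docker exec kind-control-plane grep advertise /etc/kubernetes/manifests/etcd.yaml
# shows the hardcoded --advertise-client-urls / --initial-advertise-peer-urls from the snippet above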

@boldandbusted

Of course, just imagine: I start a course that lasts 2 days ... I think kind is almost perfect for teaching (and learning), but this issue continues to be a bit of a headache.

Howdy. I was just pointed here from Slack because I had a KinD cluster with two control-plane nodes and was perplexed as to why it didn't survive a reboot of the host OS. However, I saw your comment, and while it is a bit of a pain to set up on the first day, you and your students could use Vagrant to manage suspending and bringing up a VM with a KinD cluster in the guest OS as a workaround. I shared what I use to manage the setup here: https://github.com/boldandbusted/vagrant-kind . Cheers.

@pablodgonzalez

@boldandbusted Thanks for sharing your repo, but the idea is to avoid installing any VMs or taking time to explain other tools.
This is because of the target audience: many students are developers, architects, and sometimes decision makers, so I want to focus on Kubernetes and its benefits and not add noise from other tools or setups.
For now I have a multi-control-plane config to use toward the end of the course, but it loses the charm of working hands-on from the beginning and discovering it for yourself.

BenTheElder (Member) commented May 3, 2022

Unfortunately it's been difficult to track this issue as discussion has veered off topic and across many issues/threads.

See: #2671 for recent discussion on possible solutions.

Some discussions are in #2045

@BenTheElder (Member)

This should be fixed for most multi-node clusters in the latest sources at HEAD, and in the forthcoming v0.15.0 (TBD, we'll want to wrap up some other things and make sure this is working widely before cutting a release).

#1689 remains for tracking clusters with multiple control-plane nodes ("HA") which we haven't dug into yet.

@BenTheElder BenTheElder assigned BenTheElder and unassigned aojea May 26, 2022
@BenTheElder (Member)

(Thanks @tnqn !)
