[BUG] CoreDNS NodeHosts lost after adding a new node #1009

Open
lerminou opened this issue Mar 10, 2022 · 11 comments
Labels: bug, priority/high

@lerminou

lerminou commented Mar 10, 2022

What did you do

  • How was the cluster created?
    • k3d cluster create
INFO[0016] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap...

What did you do afterwards?

  • Check configmap for coredns

The host.k3d.internal entry is present right after cluster creation:

  NodeHosts: |
    172.40.0.1 host.k3d.internal
    172.40.0.2 k3d-local-serverlb
    172.40.0.3 k3d-local-server-0
kind: ConfigMap
  • Add a new node with k3d node create newserver --cluster local --role server
INFO[0000] Adding 1 node(s) to the runtime local cluster 'local'... 
INFO[0000] Using the k3d-tools node to gather environment information 
INFO[0000] Starting new tools node...                   
INFO[0000] Starting Node 'k3d-local-tools'              
INFO[0000] HostIP: using network gateway 172.40.0.1 address 
INFO[0001] Starting Node 'k3d-newserver-0'              
INFO[0015] Updating loadbalancer config to include new server node(s) 
INFO[0015] Successfully configured loadbalancer k3d-local-serverlb! 
INFO[0016] Successfully created 1 node(s)!   
  • Check the coreDNS configmap
  NodeHosts: |
    172.40.0.4 k3d-newserver-0
kind: ConfigMap

What did you expect to happen

The new host is added, but the other NodeHosts entries are not lost.
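
A minimal sketch of the expected merge behaviour, in Python, assuming NodeHosts is plain hosts-file text (one `IP hostname` pair per line) as in the ConfigMap excerpts above. The function name `merge_node_hosts` is hypothetical and is not part of k3d or K3s; it only illustrates adding a new entry while keeping the existing ones.

```python
def merge_node_hosts(existing, new_entries):
    """Merge two hosts-file style blocks, keyed by hostname.

    Existing entries are kept; a new entry replaces an old one only when
    it names the same host. Order of first appearance is preserved.
    """
    merged = {}  # hostname -> IP
    for block in (existing, new_entries):
        for line in block.splitlines():
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            ip, host = parts
            merged[host] = ip
    return "\n".join(f"{ip} {host}" for host, ip in merged.items())


# Values taken from the ConfigMap excerpts in this report.
existing = """\
172.40.0.1 host.k3d.internal
172.40.0.2 k3d-local-serverlb
172.40.0.3 k3d-local-server-0
"""
print(merge_node_hosts(existing, "172.40.0.4 k3d-newserver-0"))
```

With this kind of merge, the result would list all four hosts instead of only k3d-newserver-0.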

Screenshots or terminal output

Which OS & Architecture

  • Linux, Fedora 35

Which version of k3d

  • output of k3d version:
    k3d version v5.3.0
    k3s version v1.22.6-k3s1 (default)

Which version of docker

  • output of docker version
 Version:           20.10.12
 API version:       1.41
 Go version:        go1.16.12
 Git commit:        e91ed57
 Built:             Mon Dec 13 11:46:03 2021
 OS/Arch:           linux/amd64
 Context:           default
 Experimental:      true

Server: Docker Engine - Community
 Engine:
  Version:          20.10.12
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.16.12
  Git commit:       459d0df
  Built:            Mon Dec 13 11:43:48 2021
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.4.12
  GitCommit:        7b11cfaabd73bb80907dd23182b9347b4245eb5d
 runc:
  Version:          1.0.2
  GitCommit:        v1.0.2-0-g52b36a2
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0
@lerminou lerminou added the bug Something isn't working label Mar 10, 2022
@iwilltry42 iwilltry42 added this to the v5.4.0 milestone Mar 15, 2022
@iwilltry42 iwilltry42 self-assigned this Mar 15, 2022
@iwilltry42
Member

Hi @lerminou , thanks for opening this issue!
This is indeed pretty weird and I haven't seen this before.
This would in fact mean that K3s even deletes its own entries 🤔
I put this on the list to be checked before the next release 👍

@lerminou
Author

Hi @iwilltry42
Thanks for your time.
One more detail: my cluster was created with a 1.21 k3s image.
My second node was created with a 1.22 version (I submitted another bug about that).

If I check the k3s changelog, there are some changes to the CoreDNS config.

I will try to reproduce it in a full 1.22 cluster

@iwilltry42
Member

Hey @lerminou in context of #1032 I just did several node adds and my CoreDNS configmap now looks like this:

NodeHosts: |
    172.21.0.2 k3d-test-server-0
    172.21.0.5 k3d-testnode2-0
    172.21.0.1 host.k3d.internal
    172.21.0.6 k3d-testnode3-0
    172.21.0.3 k3d-test-agent-0
    172.21.0.4 k3d-test-serverlb

So I assume that the issue is limited to creating a new server node.

Which, after taking another look at your original post, raises a question:

You just ran k3d cluster create without any additional flags (meaning: 1 server and no --cluster-init flag), right? 🤔
That would mean that the new server node shouldn't even be able to join the existing cluster (or would induce some split-brain mode), since the original server-0 does not use etcd and thus cannot add another server node.

@iwilltry42 iwilltry42 modified the milestones: v5.4.0, v5.5.0 Mar 26, 2022
@lerminou
Author

lerminou commented Mar 26, 2022

Hi @iwilltry42
I didn't fill the yaml config file but the cluster was created with the correct flag to use etcd.

I agree it only appears when adding a new node to an existing cluster

@iwilltry42
Member

> I didn't fill the yaml config file but the cluster was created with the correct flag to use etcd.

Huh? The --cluster-init flag should only be set when creating more than 1 server at cluster creation or when you set it manually 🤔
Can you share the output of docker inspect <server-0-container>? (inspecting the initial server container created by k3d cluster create)

> I agree it only appears when adding a new node to an existing cluster

Also when adding an agent node?

@lerminou
Author

Here is the relevant information from the config.yaml I used when creating the cluster:

apiVersion: k3d.io/v1alpha4
kind: Simple
metadata:
  name: local
servers: 1
agents: 0
image: rancher/k3s:v1.21.10-k3s1
options:
  k3s: # options passed on to K3s itself
    extraArgs: # additional arguments passed to the `k3s server|agent` command; same as `--k3s-arg`
      - arg: --tls-san=back-local.kube.com
        nodeFilters:
          - all
      - arg: --cluster-init
        nodeFilters:
          - server:0

And the docker inspect output for the container:

[
    {
        "Id": "f03c57557f5e0e74efc7d9ac6c6bf672af1a9b35dbbb0e009e288cc8d1e2a153",
        "Created": "2022-03-26T14:35:51.512776582Z",
        "Path": "/bin/k3s",
        "Args": [
            "server",
            "--tls-san=back-local.kube.com",
            "--cluster-init",
            "--cluster-domain=kube.com",
            "--disable=traefik",
            "--resolv-conf=/var/resolv.conf",
            "--tls-san",
            "back-local.kube.com"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 45282,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2022-03-26T14:35:52.169325626Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:ce2cc3439feeb2a54dc01daf414754a64897f815e4c9a5c59c02b6d9d5b8d38c",
        "ResolvConfPath": "/home/docker-images/containers/f03c57557f5e0e74efc7d9ac6c6bf672af1a9b35dbbb0e009e288cc8d1e2a153/resolv.conf",
        "HostnamePath": "/home/docker-images/containers/f03c57557f5e0e74efc7d9ac6c6bf672af1a9b35dbbb0e009e288cc8d1e2a153/hostname",
        "HostsPath": "/home/docker-images/containers/f03c57557f5e0e74efc7d9ac6c6bf672af1a9b35dbbb0e009e288cc8d1e2a153/hosts",
        "LogPath": "/home/docker-images/containers/f03c57557f5e0e74efc7d9ac6c6bf672af1a9b35dbbb0e009e288cc8d1e2a153/f03c57557f5e0e74efc7d9ac6c6bf672af1a9b35dbbb0e009e288cc8d1e2a153-json.log",
        "Name": "/k3d-local-server-0",
        "RestartCount": 0,
        "Driver": "overlay2",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/home/projects/k3d/manifests:/var/lib/rancher/k3s/server/manifests/",
                "/home/projects/k3d/CA-kube.crt:/etc/ssl/certs/ca.pem",
                "/home/projects/k3d/storage:/var/lib/rancher/k3s/storage",
                "/home/projects/k3d/resolv.conf:/var/resolv.conf",
                "/home:/projects",
                "k3d-local-images:/k3d/images"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": null,
            "RestartPolicy": {
                "Name": "unless-stopped",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "CgroupnsMode": "host",
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": null,
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": true,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [
                "label=disable"
            ],
            "Tmpfs": {
                "/run": "",
                "/var/run": ""
            },
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": false,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": null,
            "ReadonlyPaths": null,
            "Init": true
        },
        "GraphDriver": {
            "Data": {
                "LowerDir": "/home/docker-images/overlay2/b99aeab69d19a2a6c3837ac92b6fb0b56724d203459a9f4de47a75b636a7e2d2-init/diff:/home/nicolasdubut/docker-images/overlay2/2a7d00212eac6ae53ab10ddee65285eb88f69c80f1b4fbc539787da3719a7001/diff:/home/nicolasdubut/docker-images/overlay2/5ac2e683a098eb7db25da3239d0b11620528e0cc7f208a9cb0c63f96392d1c80/diff:/home/nicolasdubut/docker-images/overlay2/a781511d2cea6e430e927f862a3b443862af1b4ef74b12029b03d499760cf968/diff",
                "MergedDir": "/home/docker-images/overlay2/b99aeab69d19a2a6c3837ac92b6fb0b56724d203459a9f4de47a75b636a7e2d2/merged",
                "UpperDir": "/home/docker-images/overlay2/b99aeab69d19a2a6c3837ac92b6fb0b56724d203459a9f4de47a75b636a7e2d2/diff",
                "WorkDir": "/home/docker-images/overlay2/b99aeab69d19a2a6c3837ac92b6fb0b56724d203459a9f4de47a75b636a7e2d2/work"
            },
            "Name": "overlay2"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/home/projects/k3d/storage",
                "Destination": "/var/lib/rancher/k3s/storage",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "volume",
                "Name": "3c31da469f55de07251b661d5f1c42a88ee6a479cb08cc35397037894eabe302",
                "Source": "/home/docker-images/volumes/3c31da469f55de07251b661d5f1c42a88ee6a479cb08cc35397037894eabe302/_data",
                "Destination": "/var/lib/cni",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "volume",
                "Name": "678124af0cdcc59b2714c59b91e94a8ac9a3bdd1042ca25dda118742b3aa2a39",
                "Source": "/home/docker-images/volumes/678124af0cdcc59b2714c59b91e94a8ac9a3bdd1042ca25dda118742b3aa2a39/_data",
                "Destination": "/var/lib/rancher/k3s",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "bind",
                "Source": "/home/projects/k3d/manifests",
                "Destination": "/var/lib/rancher/k3s/server/manifests",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/home/projects/k3d/resolv.conf",
                "Destination": "/var/resolv.conf",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/home",
                "Destination": "/projects",
                "Mode": "",
                "RW": true,
                "Propagation": "rslave"
            },
            {
                "Type": "volume",
                "Name": "k3d-local-images",
                "Source": "/home/docker-images/volumes/k3d-local-images/_data",
                "Destination": "/k3d/images",
                "Driver": "local",
                "Mode": "z",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "volume",
                "Name": "6554fec529be600aebc4219f07891c27ca6a12d1d51d36566b104ed65afd4996",
                "Source": "/home/docker-images/volumes/6554fec529be600aebc4219f07891c27ca6a12d1d51d36566b104ed65afd4996/_data",
                "Destination": "/var/lib/kubelet",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "volume",
                "Name": "e86a826c0420a7033f933d396b60ed6948c1d4a28b897b6a3e64dc47871827db",
                "Source": "/home/docker-images/volumes/e86a826c0420a7033f933d396b60ed6948c1d4a28b897b6a3e64dc47871827db/_data",
                "Destination": "/var/log",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            },
            {
                "Type": "bind",
                "Source": "/home/projects/k3d/CA-kube.crt",
                "Destination": "/etc/ssl/certs/ca.pem",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "k3d-local-server-0",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "bar=baz",
                "K3S_TOKEN=superSecretToken",
                "K3S_KUBECONFIG_OUTPUT=/output/kubeconfig.yaml",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/bin/aux",
                "CRI_CONFIG_FILE=/var/lib/rancher/k3s/agent/etc/crictl.yaml"
            ],
            "Cmd": [
                "server",
                "--tls-san=back-local.kube.com",
                "--cluster-init",
                "--cluster-domain=kube.com",
                "--disable=traefik",
                "--resolv-conf=/var/resolv.conf",
                "--tls-san",
                "back-local.kube.com"
            ],
            "Image": "rancher/k3s:v1.21.10-k3s1",
            "Volumes": {
                "/var/lib/cni": {},
                "/var/lib/kubelet": {},
                "/var/lib/rancher/k3s": {},
                "/var/log": {}
            },
            "WorkingDir": "",
            "Entrypoint": [
                "/bin/k3s"
            ],
            "OnBuild": null,
            "Labels": {
                "app": "k3d",
                "k3d.cluster": "local",
                "k3d.cluster.imageVolume": "k3d-local-images",
                "k3d.cluster.network": "k3d-local",
                "k3d.cluster.network.external": "false",
                "k3d.cluster.network.id": "d29ad6e8de48162f149e07fd956c5deb59f335dea88ccd6c49475ecb368e6507",
                "k3d.cluster.network.iprange": "172.40.0.0/16",
                "k3d.cluster.token": "superSecretToken",
                "k3d.cluster.url": "https://k3d-local-server-0:6443",
                "k3d.node.staticIP": "172.40.0.3",
                "k3d.role": "server",
                "k3d.server.api.host": "back-local.kube.com",
                "k3d.server.api.hostIP": "0.0.0.0",
                "k3d.server.api.port": "36145",
                "k3d.version": "v5.3.0",
                "org.opencontainers.image.created": "2022-02-24T23:35:30Z",
                "org.opencontainers.image.revision": "471f5eb3dbfeaee6b6dd6ed9ab4037c10cc39680",
                "org.opencontainers.image.source": "https://github.com/k3s-io/k3s.git",
                "org.opencontainers.image.url": "https://github.com/k3s-io/k3s"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "e0e8f724abe0e51bcf123e5b21e07b40f64c28ec2cc7294eae16668f1b9b5367",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/e0e8f724abe0",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "k3d-local": {
                    "IPAMConfig": {
                        "IPv4Address": "172.40.0.3"
                    },
                    "Links": null,
                    "Aliases": [
                        "f03c57557f5e",
                        "k3d-local-server-0"
                    ],
                    "NetworkID": "d29ad6e8de48162f149e07fd956c5deb59f335dea88ccd6c49475ecb368e6507",
                    "EndpointID": "95d2c18a81c178b1beac39b6be429755409c9bb7c9c76a05301a1c63a805664e",
                    "Gateway": "172.40.0.1",
                    "IPAddress": "172.40.0.3",
                    "IPPrefixLen": 16,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:ac:28:00:03",
                    "DriverOpts": null
                }
            }
        }
    }
]
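
The inspect output above can also be checked for the flag in question programmatically. This is a sketch in Python, assuming the JSON layout shown above (a list of objects with an `Args` array); the helper name `has_cluster_init` is made up for illustration.

```python
import json


def has_cluster_init(inspect_json):
    """Return True if any inspected container was started with --cluster-init."""
    containers = json.loads(inspect_json)
    # "Args" may be missing or null, so fall back to an empty list.
    return any("--cluster-init" in (c.get("Args") or []) for c in containers)


# Trimmed-down sample with the same shape as the output above.
sample = '[{"Name": "/k3d-local-server-0", "Args": ["server", "--cluster-init"]}]'
print(has_cluster_init(sample))
```

In practice the input would be the text printed by docker inspect for the server-0 container.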

When I create a new AGENT node in this cluster, NodeHosts is not modified:

k3d node create newserver --cluster local --role agent
INFO[0000] Adding 1 node(s) to the runtime local cluster 'local'... 
INFO[0000] Using the k3d-tools node to gather environment information 
INFO[0000] Starting new tools node...                   
INFO[0000] Starting Node 'k3d-local-tools'              
INFO[0000] HostIP: using network gateway 172.40.0.1 address 
INFO[0000] Starting Node 'k3d-newserver-0'              
INFO[0009] Successfully created 1 node(s)! 

When I create a new SERVER node, the ConfigMap is broken.

@iwilltry42
Member

Okay, now that makes more sense. From your initial post I couldn't see that you were using a config file.
I can verify that that's an issue. It's quite weird though that K3s overwrites the already deployed CoreDNS config 🤔
Maybe that's worth an upstream issue.

@iwilltry42
Member

This is indeed the intended behavior on K3s' end (thanks @brandond for the explanation 🙏 )
However, there are many ways this can be tackled on k3d's side, so let's see what nice solution we can come up with for the next release 👍

@jracabado

Hi, I found this issue while investigating why K3s containers created via k3d do not keep the CoreDNS NodeHosts ConfigMap after a host machine reboot. Could it be the same issue?

From my testing, the issue comes down to the K3s containers starting by themselves without k3d orchestration: K3s then rewrites the CoreDNS ConfigMap to only contain k3d-test-server-0 for my 1-server cluster.
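
One way to spot this after a reboot is to compare the hostnames k3d is expected to have injected with what NodeHosts currently holds. The Python sketch below assumes the hosts-file format seen earlier in this thread; `missing_hosts`, the expected set, and the sample IP are hypothetical, and fetching the ConfigMap content is left out.

```python
def missing_hosts(expected, node_hosts):
    """Return the expected hostnames that have no entry in NodeHosts."""
    present = set()
    for line in node_hosts.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            present.update(parts[1:])  # a hosts line may list several names
    return set(expected) - present


# Hypothetical sample: only the server entry survived the restart.
expected = {"host.k3d.internal", "k3d-test-serverlb", "k3d-test-server-0"}
after_reboot = "172.21.0.2 k3d-test-server-0\n"
print(sorted(missing_hosts(expected, after_reboot)))
```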

@lerminou
Author

Hi @jracabado yes. It's exactly the same problem for me.

@fragolinux

I think I have this issue too, in a different scenario: just a k3s container restart, or a docker restart, causes the ConfigMap to be emptied. Issue here: #1112

@iwilltry42 iwilltry42 modified the milestones: v5.5.0, v5.6.0 May 17, 2023
@iwilltry42 iwilltry42 modified the milestones: v5.6.0, v5.8.0 Sep 28, 2023