
kubeadm join does not explicitly wait for etcd to have grown when joining secondary control plane #1353

Closed
ereslibre opened this issue Jan 16, 2019 · 25 comments · Fixed by kubernetes/kubernetes#72984 or kubernetes/kubernetes#73112
Labels: area/HA, kind/bug, priority/critical-urgent

Comments

@ereslibre
Contributor

ereslibre commented Jan 16, 2019

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version): 1.13 onwards.

What happened?

When joining a secondary control plane (kubeadm join --experimental-control-plane, or by using the controlPlane configuration in the kubeadm YAML) and etcd needs to grow from 1 instance to 2, there is a temporary blackout that we are not explicitly waiting for.

This is a spin-off of #1321 (comment).

Different issues were leading to the #1321 error on the newly joining master: error uploading configuration: Get https://10.11.0.1:6443/api/v1/namespaces/kube-system/configmaps/kubeadm-config: unexpected EOF.

  1. Kubeadm fails to bring up a HA cluster due to EOF error when uploading configmap #1321

    • Bad certificates when growing etcd. More certificates than the required ones are copied between master machines; kubeadm won't recreate them (as they already exist), but their SANs are wrong and etcd will fail to grow to 2 nodes. The previous error happens because uploadconfig is the immediate next step after we write the etcd static pod manifests on the new master. (A quick way to check the SANs of a copied certificate is shown right after this list.)
  2. kubeadm join controlplane not pulling images and fails #1341

    • When joining a new control plane we are not pre-pulling the images
    • The uploadconfig phase can time out with the previously mentioned error because the system is still pulling images while we are trying to upload the new cluster configuration (with the new node)
  3. This issue

    • When we grow etcd from 1 instance to 2 there's a blackout of etcd that we should explicitly wait for. Instead, at the moment we are implicitly waiting for it in the uploadconfig phase. If this blackout is long enough (many index syncs required, slow I/O), the uploadconfig phase can time out with the error mentioned before.
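
For item 1, a quick way to check whether a copied etcd serving certificate carries the right SANs (a sketch, assuming the default kubeadm PKI path under /etc/kubernetes/pki) is:

  # The SANs must include the new master's hostname and IP, otherwise etcd fails to grow
  openssl x509 -in /etc/kubernetes/pki/etcd/server.crt -noout -text | grep -A1 "Subject Alternative Name"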

What you expected to happen?

When we grow etcd we should perform an intermediate step explicitly waiting for etcd to be available (or healthy). Then, we can continue with the next step, in this case uploadconfig.

The timeout happens here because uploadconfig happens to be the first phase after the static etcd pod creation that tries to reach the apiserver while the etcd blackout is still ongoing, when it calls GetClusterStatus.
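
For illustration only, the explicit wait boils down to something like this on the joining node (a sketch using etcdctl and the stacked-etcd certificate paths kubeadm generates; the real fix belongs inside kubeadm's own etcd client code):

  # Poll the local etcd member until it answers the health check,
  # instead of letting the next phase (uploadconfig) run into the blackout.
  until ETCDCTL_API=3 etcdctl endpoint health \
      --endpoints=https://127.0.0.1:2379 \
      --cacert=/etc/kubernetes/pki/etcd/ca.crt \
      --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
      --key=/etc/kubernetes/pki/etcd/healthcheck-client.key; do
    sleep 5
  done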

How to reproduce it (as minimally and precisely as possible)?

Please refer to #1321 (comment)

@ereslibre
Contributor Author

/assign

@oneoneonepig

oneoneonepig commented Mar 20, 2020

Hi, I'm still getting the same error message while performing kubeadm join --control-plane ...
I'll have to update the ConfigMap and apply the taint and labels to the second master node manually.
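
Roughly, the manual part looks like this (node2 stands for my second master; the missing apiEndpoints entry in the kubeadm-config ConfigMap also has to be added by hand):

  kubectl -n kube-system edit cm kubeadm-config   # add the node under ClusterStatus.apiEndpoints
  kubectl label node node2 node-role.kubernetes.io/master=""
  kubectl taint node node2 node-role.kubernetes.io/master:NoSchedule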

I'm following the steps specified here -> kubeadm/high-availability
Images are pre-pulled and the only operation done before the kubeadm command was setting up HAProxy and Keepalived.

Error message (partially omitted):

  • Second control-plane node to join
[---omitted---]
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-03-20T06:14:36.711Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://10.20.131.51:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
error execution phase control-plane-join/update-status: error uploading configuration: unable to create ConfigMap: Post https://10.20.131.60:16443/api/v1/namespaces/kube-system/configmaps?timeout=10s: unexpected EOF
To see the stack trace of this error execute with --v=5 or higher
[---omitted---]

For a more detailed log, I've made another attempt with verbosity set to 10 -> HERE

  • Third control-plane node to join
[---omitted---]
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-03-20T06:15:23.315Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://10.20.131.52:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node node3 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node node3 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[---omitted---]
  • kubectl get node
NAME    STATUS   ROLES    AGE   VERSION
node1   Ready    master   18m   v1.17.4
node2   Ready    <none>   14m   v1.17.4
node3   Ready    master   13m   v1.17.4
  • kubectl get cm -n kube-system -o yaml
[---omitted---]
  ClusterStatus: |
    apiEndpoints:
      node1:
        advertiseAddress: 10.20.131.50
        bindPort: 6443
      node3:
        advertiseAddress: 10.20.131.52
        bindPort: 6443
[---omitted---]

@zedtux

zedtux commented Apr 6, 2020

I'm in the same boat as @oneoneonepig.

I'm writing a Chef cookbook and my tests are failing with kubeadm 1.18, with the described error when the second master joins the first one.
With kubeadm 1.15 everything works fine; I'll test with 1.16 and 1.17 soon.

@neolit123
Member

neolit123 commented Apr 6, 2020

we are not seeing etcd-related e2e failures in our most recent signal:
https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-1-18
https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-1-17

[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
error execution phase control-plane-join/update-status: error uploading configuration: unable to create ConfigMap: Post https://10.20.131.60:16443/api/v1/namespaces/kube-system/configmaps?timeout=10s: unexpected EOF

this will hopefully get fixed once we have proper retries for join:
#2094
#2093
#2092

@zedtux

zedtux commented Apr 8, 2020

Here is the result of my test suite:

kubeadm version CNI Result
1.18 flannel Failed
1.17 flannel Failed
1.16 flannel Failed
1.15 flannel Passed
1.14 flannel Passed

My test is doing the following steps:

  1. Create, converge and verify a first control plane
  2. Update the node's fixture file with the kubeadm token and the discovery token CA cert hash
  3. Create, converge and verify a worker node
  4. Create, converge and verify a second control plane

The tests are all failing at the same step:

kubeadm join 172.28.128.10:6443 --token=ovvf16.0w86uz15yskxijg8 --discovery-token-ca-cert-hash=sha256:b819ff154c0e5ae070857e1e77a98acdea61813fc259d4b7b72dbff4f55aea67 --certificate-key=bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9 --apiserver-advertise-address=172.28.128.201 --control-plane
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-2-flannel localhost] and IPs [172.28.128.201 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-2-flannel localhost] and IPs [172.28.128.201 127.0.0.1 ::1]
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-2-flannel kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.28.128.201 172.28.128.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
{"level":"warn","ts":"2020-04-08T07:25:28.225Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://172.28.128.201:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master-2-flannel as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master-2-flannel as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
This node has joined the cluster and a new control plane instance was created:
* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.
To start administering your cluster from this node, you need to run the following as a regular user:
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
Run 'kubectl get nodes' to see this node join the cluster.
An error has been detected from the output of the "kubeadm join 172.28.128.10:6443 --token=ovvf16.0w86uz15yskxijg8 --discovery-token-ca-cert-hash=sha256:b819ff154c0e5ae070857e1e77a98acdea61813fc259d4b7b72dbff4f55aea67 --certificate-key=bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9 --apiserver-advertise-address=172.28.128.201 --control-plane\n" command.

@neolit123
Member

@zedtux

does this fix work for flannel?
https://github.com/kubernetes/website/pull/19995/files

please note that if one CNI plugin fails you should switch to another - kubeadm recommends Calico or WeaveNet.

@zedtux

zedtux commented Apr 8, 2020

Thank you @neolit123 for pointing me to this, but I don't think this is useful for me as the Flannel pods are fine.

My cookbook waits for the Flannel pods to be in the "Running" state, and it also waits for the master node to be in the "Ready" state.

I have a test on the flannel pods as well as on the master node status.

kubeadm recommends Calico or WeaveNet.

Oh ... I remember reading in the kubernetes.io documentation that flannel should work in many cases, but I can't find it now.

Anyway, I want to support more CNI drivers, so I can give Calico or WeaveNet a try, but I'm not sure that would solve the issue, as everything works well with kubeadm 1.14 and 1.15.

@neolit123
Member

we saw a number of issues related to flannel in recent k8s versions.

@zedtux

zedtux commented Apr 8, 2020

Oh, okay, then I will give Calico a try 😃

@vijayk13

vijayk13 commented Apr 9, 2020

I am seeing the same issue as reported by @oneoneonepig and @zedtux on fresh cluster installs on Ubuntu 18.04 machines with both v1.18.0 and v1.18.1. Have tried with both Weave and Calico ...

I am following the instructions at https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/

I am using a "stacked control plane and etcd node" setup, and HAProxy as my load balancer. Have tried changing the haproxy config to just refer to the first control plane node (in case there is some race with the LB switching to the second control plane node before it is fully configured ...)

  • kubeadm init on the first control plane node works well
  • Am able to do a kubeadm join on the worker nodes, and these nodes are correctly added to the cluster
  • The problem happens when attempting a kubeadm join --control-plane on the second control plane node ... I get the same error that both @oneoneonepig and @zedtux have reported.
  • Additionally, kubectl on the first control plane node is no longer responsive
  • To rule out any races, have repeated this experiment multiple times ... same error
  • journalctl -u kubelet on the first control plane node shows the following:
Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"etcd-k8s-master1.160361462790264c", GenerateName:"", Namespace:"kube-system", SelfLink:"", UID:"", ResourceVersion:"", G$
Apr 07 00:14:16 k8s-master1 kubelet[8177]: I0407 00:14:16.919190    8177 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: d99c6e817fab4552a959835efcc5a85ac1f7cffe0e68d76ec226595e82852b19
Apr 07 00:14:16 k8s-master1 kubelet[8177]: I0407 00:14:16.924621    8177 topology_manager.go:219] [topologymanager] RemoveContainer - Container ID: 6341e35b5e3df33bb24041a966196ef22537a80ece28e0f0ca5616d7d0fa5361
Apr 07 00:14:20 k8s-master1 kubelet[8177]: E0407 00:14:20.487457    8177 controller.go:178] failed to update node lease, error: etcdserver: request timed out, possibly due to connection lost
Apr 07 00:14:21 k8s-master1 kubelet[8177]: E0407 00:14:21.717815    8177 event.go:260] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"kube-apiserver-k8s-master1.160361$

Apr 07 00:14:27 k8s-master1 kubelet[8177]: E0407 00:14:27.495727    8177 controller.go:178] failed to update node lease, error: etcdserver: request timed out
Apr 07 00:14:27 k8s-master1 kubelet[8177]: E0407 00:14:27.497272    8177 event.go:269] Unable to write event: 'Post https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/events: read tcp 240.204.0.106:52706->240.205.0.21:443: use o$
Apr 07 00:14:34 k8s-master1 kubelet[8177]: E0407 00:14:34.535374    8177 controller.go:178] failed to update node lease, error: etcdserver: request timed out, possibly due to connection lost
Apr 07 00:14:41 k8s-master1 kubelet[8177]: E0407 00:14:41.565301    8177 controller.go:178] failed to update node lease, error: etcdserver: request timed out, possibly due to connection lost
Apr 07 00:14:41 k8s-master1 kubelet[8177]: E0407 00:14:41.566842    8177 event.go:269] Unable to write event: 'Post https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/events: read tcp 240.204.0.106:59096->240.205.0.21:443: use o$
Apr 07 00:14:48 k8s-master1 kubelet[8177]: E0407 00:14:48.596349    8177 controller.go:178] failed to update node lease, error: etcdserver: request timed out, possibly due to connection lost
Apr 07 00:14:48 k8s-master1 kubelet[8177]: I0407 00:14:48.598492    8177 controller.go:106] failed to update lease using latest lease, fallback to ensure lease, err: failed 5 attempts to update node lease
Apr 07 00:14:57 k8s-master1 kubelet[8177]: W0407 00:14:57.264040    8177 status_manager.go:556] Failed to get status for pod "kube-scheduler-k8s-master1_kube-system(5795d0c442cb997ff93c49feeb9f6386)": etcdserver: request timed out
Apr 07 00:14:57 k8s-master1 kubelet[8177]: E0407 00:14:57.266879    8177 controller.go:136] failed to ensure node lease exists, will retry in 200ms, error: etcdserver: request timed out
  • I was able to look through the etcdserver logs in one of the runs by using docker-level commands to get to the container (before the system reclaimed it/tried to restart it); the exact commands are shown just below. It seems as if the join request from the second node triggers an etcd leader election, which does not complete successfully for some reason.
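
(The docker-level commands were essentially the following; the container ID is whatever docker ps reports for the etcd container on k8s-master1.)

  docker ps -a | grep etcd
  docker logs <etcd-container-id>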

Am a little stuck with how to make progress here, so would appreciate some guidance from the community.

@vijayk13

vijayk13 commented Apr 9, 2020

Am wondering if a new issue should be opened for these recent problems, given that this issue is in "Closed" state?

@neolit123
Member

we have e2e tests using HAProxy as the LB and we are not seeing issues, so it must be specific to your setup:
https://k8s-testgrid.appspot.com/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-1-18

a bit hard to advise on how to resolve it, as kubeadm is LB agnostic.

Am wondering if a new issue should be opened for these recent problems, given that this issue is in "Closed" state?

i think solving this new issue may resolve the issues that you are seeing:
#2092

Additionally, kubectl on the first control plane node is no longer responsive

is this problem permanent or only for a period of time during LB reconfiguration?

(in case there is some race with the LB switching to the second control plane node before it is fully configured ...)

to my understanding this is where the problem lies. once you reconfigure the LB you are going to see an LB blackout, but it should be temporary.

@vijayk13

vijayk13 commented Apr 9, 2020

Wanted to clarify that I was not changing the HAProxy configuration midstream ... in one experiment, I had listed both the control plane nodes as my backend servers, and in the other, just the first node.

Is there a timeline for #2092?

Also, in your e2e tests, are you using a physical cluster (with potentially additional network delays between the components), or a single host setup with multiple containers? I am not very familiar with the kind setup, hence this question. My experiments are being run on a cluster with multiple physical nodes (the LB and the two control-plane nodes are on different hosts).

Are there some logs I can collect/share that might shed light on this issue?
Also, @zedtux can perhaps confirm whether or not the recommended switch to Calico fixed his problems ...

@neolit123
Member

Is there a timeline for #2092?

should be fixed in 1.19 and backported to 1.18, 1.17...hopefully.

Also, in your e2e tests, are you using a physical cluster (with potentially additional network delays between the components), or a single host setup with multiple containers? I am not very familiar with the kind setup, hence this question. My experiments are being run on a cluster with multiple physical nodes (the LB and the two control-plane nodes are on different hosts).

it's a docker in docker setup, so yes the nodes are local containers and we don't simulate network latency.

yet the issue that is being discussed in this thread is apparently very rare. are the networking delays significant on your side? and are the control plane machines spec-ed well on CPU/RAM?

Are there some logs I can collect/share that might shed light on this issue?

@zedtux's logs for kubeadm join are missing, so if you provide the output of your kubeadm join --control-plane --v=5 it might help us understand this a little better.

@ereslibre might have more suggestions.

@vijayk13

vijayk13 commented Apr 9, 2020

The control plane nodes are KVM VMs on Ubuntu hosts. Each control plane node VM is provisioned with 4 CPUs and 8 GB of DRAM. The inter-VM latencies don't show anything out of place.

I have repeated the experiment starting from a clean install and will share the following logs below:

  • The output of kubeadm init --v=5 on the first control node (k8s-master1)
  • The output of several kubectl commands after the first control node has been initialized with Calico
  • The output of kubeadm join --control-plane --v=5 on the second control node (k8s-master2)
  • An attempt to run kubectl commands on the first control node after the failed join above ... showing that the kubectl client appears to have lost connectivity with the server
  • Logs retrieved using docker ps to find the etcd container (on k8s-master1)

Appreciate any insights you might have.

--- Output of kubeadm init --v=5 ---

I0409 02:46:25.937531    4963 initconfiguration.go:103] detected and using CRI socket: /var/run/dockershim.sock
I0409 02:46:25.938160    4963 interface.go:400] Looking for default routes with IPv4 addresses
I0409 02:46:25.938194    4963 interface.go:405] Default route transits interface "enp5s0"
I0409 02:46:25.938467    4963 interface.go:208] Interface enp5s0 is up
I0409 02:46:25.938606    4963 interface.go:256] Interface "enp5s0" has 2 addresses :[240.204.0.235/8 fe80::216:3eff:fe6b:c874/64].
I0409 02:46:25.938658    4963 interface.go:223] Checking addr  240.204.0.235/8.
I0409 02:46:25.938682    4963 interface.go:230] IP found 240.204.0.235
I0409 02:46:25.938718    4963 interface.go:262] Found valid IPv4 address 240.204.0.235 for interface "enp5s0".
I0409 02:46:25.938738    4963 interface.go:411] Found active IP 240.204.0.235 
I0409 02:46:25.938851    4963 version.go:183] fetching Kubernetes version from URL: https://dl.k8s.io/release/stable-1.txt
W0409 02:46:26.340956    4963 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[init] Using Kubernetes version: v1.18.1
[preflight] Running pre-flight checks
I0409 02:46:26.341493    4963 checks.go:577] validating Kubernetes and kubeadm version
I0409 02:46:26.341548    4963 checks.go:166] validating if the firewall is enabled and active
I0409 02:46:26.354902    4963 checks.go:201] validating availability of port 6443
I0409 02:46:26.355251    4963 checks.go:201] validating availability of port 10259
I0409 02:46:26.355329    4963 checks.go:201] validating availability of port 10257
I0409 02:46:26.355398    4963 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
I0409 02:46:26.355429    4963 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
I0409 02:46:26.355462    4963 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
I0409 02:46:26.355482    4963 checks.go:286] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0409 02:46:26.355510    4963 checks.go:432] validating if the connectivity type is via proxy or direct
I0409 02:46:26.355557    4963 checks.go:471] validating http connectivity to first IP address in the CIDR
I0409 02:46:26.355596    4963 checks.go:471] validating http connectivity to first IP address in the CIDR
I0409 02:46:26.355627    4963 checks.go:102] validating the container runtime
I0409 02:46:26.480648    4963 checks.go:128] validating if the service is enabled and active
I0409 02:46:26.616401    4963 checks.go:335] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0409 02:46:26.616498    4963 checks.go:335] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0409 02:46:26.616545    4963 checks.go:649] validating whether swap is enabled or not
I0409 02:46:26.616612    4963 checks.go:376] validating the presence of executable conntrack
I0409 02:46:26.616672    4963 checks.go:376] validating the presence of executable ip
I0409 02:46:26.616748    4963 checks.go:376] validating the presence of executable iptables
I0409 02:46:26.616785    4963 checks.go:376] validating the presence of executable mount
I0409 02:46:26.616820    4963 checks.go:376] validating the presence of executable nsenter
I0409 02:46:26.616851    4963 checks.go:376] validating the presence of executable ebtables
I0409 02:46:26.616879    4963 checks.go:376] validating the presence of executable ethtool
I0409 02:46:26.616914    4963 checks.go:376] validating the presence of executable socat
I0409 02:46:26.616937    4963 checks.go:376] validating the presence of executable tc
I0409 02:46:26.616968    4963 checks.go:376] validating the presence of executable touch
I0409 02:46:26.616991    4963 checks.go:520] running all checks
I0409 02:46:26.723760    4963 checks.go:406] checking whether the given node name is reachable using net.LookupHost
I0409 02:46:26.723812    4963 checks.go:618] validating kubelet version
I0409 02:46:26.804357    4963 checks.go:128] validating if the service is enabled and active
I0409 02:46:26.819966    4963 checks.go:201] validating availability of port 10250
I0409 02:46:26.820111    4963 checks.go:201] validating availability of port 2379
I0409 02:46:26.820179    4963 checks.go:201] validating availability of port 2380
I0409 02:46:26.820247    4963 checks.go:249] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0409 02:46:26.897440    4963 checks.go:844] pulling k8s.gcr.io/kube-apiserver:v1.18.1
I0409 02:46:31.383971    4963 checks.go:844] pulling k8s.gcr.io/kube-controller-manager:v1.18.1
I0409 02:46:34.548733    4963 checks.go:844] pulling k8s.gcr.io/kube-scheduler:v1.18.1
I0409 02:46:36.830587    4963 checks.go:844] pulling k8s.gcr.io/kube-proxy:v1.18.1
I0409 02:46:40.171454    4963 checks.go:844] pulling k8s.gcr.io/pause:3.2
I0409 02:46:41.577193    4963 checks.go:844] pulling k8s.gcr.io/etcd:3.4.3-0
I0409 02:46:47.851089    4963 checks.go:844] pulling k8s.gcr.io/coredns:1.6.7
I0409 02:46:50.017369    4963 kubelet.go:64] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0409 02:46:50.289461    4963 certs.go:103] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master1 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-lb.lxd] and IPs [10.96.0.1 240.204.0.235]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0409 02:46:51.821707    4963 certs.go:103] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
I0409 02:46:52.392514    4963 certs.go:103] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master1 localhost] and IPs [240.204.0.235 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master1 localhost] and IPs [240.204.0.235 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0409 02:46:53.804695    4963 certs.go:69] creating new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0409 02:46:54.104212    4963 kubeconfig.go:79] creating kubeconfig file for admin.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
I0409 02:46:54.700568    4963 kubeconfig.go:79] creating kubeconfig file for kubelet.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0409 02:46:55.080364    4963 kubeconfig.go:79] creating kubeconfig file for controller-manager.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0409 02:46:55.424075    4963 kubeconfig.go:79] creating kubeconfig file for scheduler.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0409 02:46:55.719743    4963 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 02:46:55.720397    4963 manifests.go:104] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0409 02:46:55.720415    4963 manifests.go:104] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0409 02:46:55.720426    4963 manifests.go:104] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0409 02:46:55.720440    4963 manifests.go:104] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0409 02:46:55.720453    4963 manifests.go:104] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0409 02:46:55.733531    4963 manifests.go:121] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0409 02:46:55.733560    4963 manifests.go:91] [control-plane] getting StaticPodSpecs
W0409 02:46:55.733642    4963 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
I0409 02:46:55.734005    4963 manifests.go:104] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0409 02:46:55.734028    4963 manifests.go:104] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0409 02:46:55.734041    4963 manifests.go:104] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0409 02:46:55.734055    4963 manifests.go:104] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0409 02:46:55.734066    4963 manifests.go:104] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0409 02:46:55.734079    4963 manifests.go:104] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0409 02:46:55.734093    4963 manifests.go:104] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
I0409 02:46:55.735467    4963 manifests.go:121] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0409 02:46:55.735498    4963 manifests.go:91] [control-plane] getting StaticPodSpecs
W0409 02:46:55.735579    4963 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
I0409 02:46:55.735881    4963 manifests.go:104] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0409 02:46:55.736603    4963 manifests.go:121] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0409 02:46:55.737692    4963 local.go:72] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0409 02:46:55.737720    4963 waitcontrolplane.go:87] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
I0409 02:47:00.752822    4963 request.go:907] Got a Retry-After 1s response for attempt 1 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:01.761282    4963 request.go:907] Got a Retry-After 1s response for attempt 2 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:02.769232    4963 request.go:907] Got a Retry-After 1s response for attempt 3 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:03.777253    4963 request.go:907] Got a Retry-After 1s response for attempt 4 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:04.784954    4963 request.go:907] Got a Retry-After 1s response for attempt 5 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:06.293405    4963 request.go:907] Got a Retry-After 1s response for attempt 1 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:07.301146    4963 request.go:907] Got a Retry-After 1s response for attempt 2 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:08.309775    4963 request.go:907] Got a Retry-After 1s response for attempt 3 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:09.319291    4963 request.go:907] Got a Retry-After 1s response for attempt 4 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:10.324572    4963 request.go:907] Got a Retry-After 1s response for attempt 5 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:11.330258    4963 request.go:907] Got a Retry-After 1s response for attempt 6 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:12.337988    4963 request.go:907] Got a Retry-After 1s response for attempt 7 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:13.345403    4963 request.go:907] Got a Retry-After 1s response for attempt 8 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:14.352571    4963 request.go:907] Got a Retry-After 1s response for attempt 9 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:15.808261    4963 request.go:907] Got a Retry-After 1s response for attempt 1 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:16.816547    4963 request.go:907] Got a Retry-After 1s response for attempt 2 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:17.824911    4963 request.go:907] Got a Retry-After 1s response for attempt 3 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:18.833085    4963 request.go:907] Got a Retry-After 1s response for attempt 4 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:19.841251    4963 request.go:907] Got a Retry-After 1s response for attempt 5 to https://k8s-lb.lxd:443/healthz?timeout=10s
I0409 02:47:20.849079    4963 request.go:907] Got a Retry-After 1s response for attempt 6 to https://k8s-lb.lxd:443/healthz?timeout=10s
[apiclient] All control plane components are healthy after 27.635538 seconds
I0409 02:47:23.377026    4963 uploadconfig.go:108] [upload-config] Uploading the kubeadm ClusterConfiguration to a ConfigMap
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
I0409 02:47:23.402507    4963 uploadconfig.go:122] [upload-config] Uploading the kubelet component config to a ConfigMap
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
I0409 02:47:23.412081    4963 uploadconfig.go:127] [upload-config] Preserving the CRISocket information for the control-plane node
I0409 02:47:23.412119    4963 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "k8s-master1" as an annotation
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
8a1d92c8ac62f60562ea128cb223bd1e480168d4ff1a1e5c6f0b5216f08f120a
[mark-control-plane] Marking the node k8s-master1 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node k8s-master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: vdp92a.zblpzq3slbvg3irw
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
I0409 02:47:24.480765    4963 clusterinfo.go:45] [bootstrap-token] loading admin kubeconfig
I0409 02:47:24.481285    4963 clusterinfo.go:53] [bootstrap-token] copying the cluster from admin.conf to the bootstrap kubeconfig
I0409 02:47:24.481620    4963 clusterinfo.go:65] [bootstrap-token] creating/updating ConfigMap in kube-public namespace
I0409 02:47:24.484339    4963 clusterinfo.go:79] creating the RBAC rules for exposing the cluster-info ConfigMap in the kube-public namespace
I0409 02:47:24.489291    4963 kubeletfinalize.go:88] [kubelet-finalize] Assuming that kubelet client certificate rotation is enabled: found "/var/lib/kubelet/pki/kubelet-client-current.pem"
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
I0409 02:47:24.490532    4963 kubeletfinalize.go:132] [kubelet-finalize] Restarting the kubelet to enable client certificate rotation
I0409 02:47:24.912570    4963 request.go:557] Throttling request took 184.455361ms, request: POST:https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/serviceaccounts?timeout=10s
I0409 02:47:25.112691    4963 request.go:557] Throttling request took 143.281461ms, request: POST:https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/services?timeout=10s
[addons] Applied essential addon: CoreDNS
I0409 02:47:25.312567    4963 request.go:557] Throttling request took 186.795081ms, request: POST:https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/serviceaccounts?timeout=10s
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
I0409 02:47:25.512709    4963 request.go:557] Throttling request took 192.927223ms, request: POST:https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/configmaps?timeout=10s
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

--- Output of kubectl commands on first control node ---

kubectl version:
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:38:50Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.1", GitCommit:"7879fc12a63337efff607952a323df90cdc7a335", GitTreeState:"clean", BuildDate:"2020-04-08T17:30:47Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

kubectl cluster-info:
Kubernetes master is running at https://k8s-lb.lxd:443
KubeDNS is running at https://k8s-lb.lxd:443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

kubectl get nodes:
NAME          STATUS   ROLES    AGE     VERSION
k8s-master1   Ready    master   9m10s   v1.18.1

kubectl get pods -n kube-system:
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-555fc8cc5c-5r6gd   1/1     Running   0          6m20s
calico-node-6hlnc                          1/1     Running   0          6m20s
coredns-66bff467f8-4bp9c                   1/1     Running   0          9m28s
coredns-66bff467f8-x7mzk                   1/1     Running   0          9m28s
etcd-k8s-master1                           1/1     Running   0          9m26s
kube-apiserver-k8s-master1                 1/1     Running   0          9m26s
kube-controller-manager-k8s-master1        1/1     Running   0          9m26s
kube-proxy-gb6cp                           1/1     Running   0          9m28s
kube-scheduler-k8s-master1                 1/1     Running   0          9m26s

--- Output of kubeadm join --control-plane --v=5 on k8s-master2 ---

I0409 02:59:15.410344    8947 join.go:371] [preflight] found NodeName empty; using OS hostname as NodeName
I0409 02:59:15.410436    8947 join.go:375] [preflight] found advertiseAddress empty; using default interface's IP address as advertiseAddress
I0409 02:59:15.410499    8947 initconfiguration.go:103] detected and using CRI socket: /var/run/dockershim.sock
I0409 02:59:15.410718    8947 interface.go:400] Looking for default routes with IPv4 addresses
I0409 02:59:15.410730    8947 interface.go:405] Default route transits interface "enp5s0"
I0409 02:59:15.410898    8947 interface.go:208] Interface enp5s0 is up
I0409 02:59:15.410976    8947 interface.go:256] Interface "enp5s0" has 2 addresses :[240.197.0.10/8 fe80::216:3eff:fe8b:3ac1/64].
I0409 02:59:15.411006    8947 interface.go:223] Checking addr  240.197.0.10/8.
I0409 02:59:15.411017    8947 interface.go:230] IP found 240.197.0.10
I0409 02:59:15.411032    8947 interface.go:262] Found valid IPv4 address 240.197.0.10 for interface "enp5s0".
I0409 02:59:15.411041    8947 interface.go:411] Found active IP 240.197.0.10 
[preflight] Running pre-flight checks
I0409 02:59:15.411149    8947 preflight.go:90] [preflight] Running general checks
I0409 02:59:15.411230    8947 checks.go:249] validating the existence and emptiness of directory /etc/kubernetes/manifests
I0409 02:59:15.411309    8947 checks.go:286] validating the existence of file /etc/kubernetes/kubelet.conf
I0409 02:59:15.411330    8947 checks.go:286] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
I0409 02:59:15.411362    8947 checks.go:102] validating the container runtime
I0409 02:59:15.605932    8947 checks.go:128] validating if the service is enabled and active
I0409 02:59:15.760479    8947 checks.go:335] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0409 02:59:15.760687    8947 checks.go:335] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0409 02:59:15.760796    8947 checks.go:649] validating whether swap is enabled or not
I0409 02:59:15.760902    8947 checks.go:376] validating the presence of executable conntrack
I0409 02:59:15.760980    8947 checks.go:376] validating the presence of executable ip
I0409 02:59:15.761125    8947 checks.go:376] validating the presence of executable iptables
I0409 02:59:15.761173    8947 checks.go:376] validating the presence of executable mount
I0409 02:59:15.761239    8947 checks.go:376] validating the presence of executable nsenter
I0409 02:59:15.761308    8947 checks.go:376] validating the presence of executable ebtables
I0409 02:59:15.761366    8947 checks.go:376] validating the presence of executable ethtool
I0409 02:59:15.761463    8947 checks.go:376] validating the presence of executable socat
I0409 02:59:15.761551    8947 checks.go:376] validating the presence of executable tc
I0409 02:59:15.761628    8947 checks.go:376] validating the presence of executable touch
I0409 02:59:15.761679    8947 checks.go:520] running all checks
I0409 02:59:15.899094    8947 checks.go:406] checking whether the given node name is reachable using net.LookupHost
I0409 02:59:15.899362    8947 checks.go:618] validating kubelet version
I0409 02:59:15.980502    8947 checks.go:128] validating if the service is enabled and active
I0409 02:59:15.998322    8947 checks.go:201] validating availability of port 10250
I0409 02:59:15.998699    8947 checks.go:432] validating if the connectivity type is via proxy or direct
I0409 02:59:15.998797    8947 join.go:441] [preflight] Discovering cluster-info
I0409 02:59:15.998868    8947 token.go:78] [discovery] Created cluster-info discovery client, requesting info from "k8s-lb.lxd:443"
I0409 02:59:21.029789    8947 token.go:116] [discovery] Requesting info from "k8s-lb.lxd:443" again to validate TLS against the pinned public key
I0409 02:59:21.046609    8947 token.go:133] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "k8s-lb.lxd:443"
I0409 02:59:21.046638    8947 discovery.go:51] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I0409 02:59:21.046671    8947 join.go:455] [preflight] Fetching init configuration
I0409 02:59:21.046682    8947 join.go:493] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I0409 02:59:21.063482    8947 interface.go:400] Looking for default routes with IPv4 addresses
I0409 02:59:21.063511    8947 interface.go:405] Default route transits interface "enp5s0"
I0409 02:59:21.063695    8947 interface.go:208] Interface enp5s0 is up
I0409 02:59:21.063810    8947 interface.go:256] Interface "enp5s0" has 2 addresses :[240.197.0.10/8 fe80::216:3eff:fe8b:3ac1/64].
I0409 02:59:21.063846    8947 interface.go:223] Checking addr  240.197.0.10/8.
I0409 02:59:21.063865    8947 interface.go:230] IP found 240.197.0.10
I0409 02:59:21.063879    8947 interface.go:262] Found valid IPv4 address 240.197.0.10 for interface "enp5s0".
I0409 02:59:21.063895    8947 interface.go:411] Found active IP 240.197.0.10 
I0409 02:59:21.063996    8947 preflight.go:101] [preflight] Running configuration dependant checks
[preflight] Running pre-flight checks before initializing the new control plane instance
I0409 02:59:21.064057    8947 checks.go:577] validating Kubernetes and kubeadm version
I0409 02:59:21.064092    8947 checks.go:166] validating if the firewall is enabled and active
I0409 02:59:21.079122    8947 checks.go:201] validating availability of port 6443
I0409 02:59:21.079236    8947 checks.go:201] validating availability of port 10259
I0409 02:59:21.079304    8947 checks.go:201] validating availability of port 10257
I0409 02:59:21.079369    8947 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
I0409 02:59:21.079411    8947 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
I0409 02:59:21.079435    8947 checks.go:286] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
I0409 02:59:21.079456    8947 checks.go:286] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0409 02:59:21.079477    8947 checks.go:432] validating if the connectivity type is via proxy or direct
I0409 02:59:21.079542    8947 checks.go:471] validating http connectivity to first IP address in the CIDR
I0409 02:59:21.079587    8947 checks.go:471] validating http connectivity to first IP address in the CIDR
I0409 02:59:21.079603    8947 checks.go:201] validating availability of port 2379
I0409 02:59:21.079676    8947 checks.go:201] validating availability of port 2380
I0409 02:59:21.079738    8947 checks.go:249] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0409 02:59:21.157165    8947 checks.go:844] pulling k8s.gcr.io/kube-apiserver:v1.18.1
I0409 02:59:25.602170    8947 checks.go:844] pulling k8s.gcr.io/kube-controller-manager:v1.18.1
I0409 02:59:28.828122    8947 checks.go:844] pulling k8s.gcr.io/kube-scheduler:v1.18.1
I0409 02:59:31.124127    8947 checks.go:844] pulling k8s.gcr.io/kube-proxy:v1.18.1
I0409 02:59:34.272751    8947 checks.go:844] pulling k8s.gcr.io/pause:3.2
I0409 02:59:35.668368    8947 checks.go:844] pulling k8s.gcr.io/etcd:3.4.3-0
I0409 02:59:42.359004    8947 checks.go:844] pulling k8s.gcr.io/coredns:1.6.7
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0409 02:59:44.545013    8947 certs.go:38] creating PKI assets
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master2 localhost] and IPs [240.197.0.10 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master2 localhost] and IPs [240.197.0.10 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master2 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local k8s-lb.lxd] and IPs [10.96.0.1 240.197.0.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
I0409 02:59:48.142303    8947 certs.go:69] creating new public/private key files for signing service account users
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0409 02:59:50.743225    8947 manifests.go:91] [control-plane] getting StaticPodSpecs
W0409 02:59:50.743550    8947 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
I0409 02:59:50.744414    8947 manifests.go:104] [control-plane] adding volume "ca-certs" for component "kube-apiserver"
I0409 02:59:50.744484    8947 manifests.go:104] [control-plane] adding volume "etc-ca-certificates" for component "kube-apiserver"
I0409 02:59:50.744512    8947 manifests.go:104] [control-plane] adding volume "k8s-certs" for component "kube-apiserver"
I0409 02:59:50.744533    8947 manifests.go:104] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-apiserver"
I0409 02:59:50.744554    8947 manifests.go:104] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-apiserver"
I0409 02:59:50.762594    8947 manifests.go:121] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0409 02:59:50.762686    8947 manifests.go:91] [control-plane] getting StaticPodSpecs
W0409 02:59:50.763069    8947 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
I0409 02:59:50.763811    8947 manifests.go:104] [control-plane] adding volume "ca-certs" for component "kube-controller-manager"
I0409 02:59:50.763871    8947 manifests.go:104] [control-plane] adding volume "etc-ca-certificates" for component "kube-controller-manager"
I0409 02:59:50.763903    8947 manifests.go:104] [control-plane] adding volume "flexvolume-dir" for component "kube-controller-manager"
I0409 02:59:50.763922    8947 manifests.go:104] [control-plane] adding volume "k8s-certs" for component "kube-controller-manager"
I0409 02:59:50.763941    8947 manifests.go:104] [control-plane] adding volume "kubeconfig" for component "kube-controller-manager"
I0409 02:59:50.763960    8947 manifests.go:104] [control-plane] adding volume "usr-local-share-ca-certificates" for component "kube-controller-manager"
I0409 02:59:50.763979    8947 manifests.go:104] [control-plane] adding volume "usr-share-ca-certificates" for component "kube-controller-manager"
I0409 02:59:50.766680    8947 manifests.go:121] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0409 02:59:50.766762    8947 manifests.go:91] [control-plane] getting StaticPodSpecs
W0409 02:59:50.766910    8947 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
I0409 02:59:50.767428    8947 manifests.go:104] [control-plane] adding volume "kubeconfig" for component "kube-scheduler"
I0409 02:59:50.768489    8947 manifests.go:121] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[check-etcd] Checking that the etcd cluster is healthy
I0409 02:59:50.771121    8947 local.go:78] [etcd] Checking etcd cluster health
I0409 02:59:50.771189    8947 local.go:81] creating etcd client that connects to etcd pods
I0409 02:59:50.771231    8947 etcd.go:178] retrieving etcd endpoints from "kubeadm.kubernetes.io/etcd.advertise-client-urls" annotation in etcd Pods
I0409 02:59:50.808354    8947 etcd.go:102] etcd endpoints read from pods: https://240.204.0.235:2379
I0409 02:59:50.827229    8947 etcd.go:250] etcd endpoints read from etcd: https://240.204.0.235:2379
I0409 02:59:50.827328    8947 etcd.go:120] update etcd endpoints: https://240.204.0.235:2379
I0409 02:59:50.861852    8947 kubelet.go:111] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I0409 02:59:50.863755    8947 kubelet.go:145] [kubelet-start] Checking for an existing Node in the cluster with name "k8s-master2" and status "Ready"
I0409 02:59:50.867588    8947 kubelet.go:159] [kubelet-start] Stopping the kubelet
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I0409 02:59:53.666658    8947 cert_rotation.go:137] Starting client certificate rotation controller
I0409 02:59:53.671366    8947 kubelet.go:194] [kubelet-start] preserving the crisocket information for the node
I0409 02:59:53.671427    8947 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "k8s-master2" as an annotation
I0409 03:00:00.191405    8947 local.go:130] creating etcd client that connects to etcd pods
I0409 03:00:00.191490    8947 etcd.go:178] retrieving etcd endpoints from "kubeadm.kubernetes.io/etcd.advertise-client-urls" annotation in etcd Pods
I0409 03:00:00.197876    8947 etcd.go:102] etcd endpoints read from pods: https://240.204.0.235:2379
I0409 03:00:00.215578    8947 etcd.go:250] etcd endpoints read from etcd: https://240.204.0.235:2379
I0409 03:00:00.215632    8947 etcd.go:120] update etcd endpoints: https://240.204.0.235:2379
I0409 03:00:00.215648    8947 local.go:139] Adding etcd member: https://240.197.0.10:2380
[etcd] Announced new etcd member joining to the existing etcd cluster
I0409 03:00:00.247695    8947 local.go:145] Updated etcd member list: [{k8s-master1 https://240.204.0.235:2380} {k8s-master2 https://240.197.0.10:2380}]
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
I0409 03:00:00.249146    8947 etcd.go:500] [etcd] attempting to see if all cluster endpoints ([https://240.204.0.235:2379 https://240.197.0.10:2379]) are available 1/8
[kubelet-check] Initial timeout of 40s passed.
I0409 03:00:40.276206    8947 etcd.go:480] Failed to get etcd status for https://240.197.0.10:2379: failed to dial endpoint https://240.197.0.10:2379 with maintenance client: context deadline exceeded
I0409 03:01:20.331362    8947 etcd.go:480] Failed to get etcd status for https://240.197.0.10:2379: failed to dial endpoint https://240.197.0.10:2379 with maintenance client: context deadline exceeded
I0409 03:02:00.434365    8947 etcd.go:480] Failed to get etcd status for https://240.197.0.10:2379: failed to dial endpoint https://240.197.0.10:2379 with maintenance client: context deadline exceeded
I0409 03:02:40.642867    8947 etcd.go:480] Failed to get etcd status for https://240.197.0.10:2379: failed to dial endpoint https://240.197.0.10:2379 with maintenance client: context deadline exceeded

--- Output of attempted kubectl command on first control node after above ---

kubectl get nodes:
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding
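
Since kubectl cannot reach the apiserver at this point, etcd health can be checked directly on the first control node by exec'ing into the etcd container. This is only a sketch, assuming kubeadm's default stacked-etcd certificate paths; the container ID is a placeholder:

# find the etcd container
docker ps --filter "name=etcd" --format "{{.ID}} {{.Names}}"

# list members and check endpoint health from inside the container
docker exec <etcd-container-id> etcdctl \
  --endpoints=https://240.204.0.235:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  member list

docker exec <etcd-container-id> etcdctl \
  --endpoints=https://240.204.0.235:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  endpoint health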

--- Output of docker logs <etcdserver-container> on the first control node ---

[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2020-04-09 02:47:09.589345 I | etcdmain: etcd Version: 3.4.3
2020-04-09 02:47:09.589396 I | etcdmain: Git SHA: 3cf2f69b5
2020-04-09 02:47:09.589400 I | etcdmain: Go Version: go1.12.12
2020-04-09 02:47:09.589405 I | etcdmain: Go OS/Arch: linux/amd64
2020-04-09 02:47:09.589409 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
[WARNING] Deprecated '--logger=capnslog' flag is set; use '--logger=zap' flag instead
2020-04-09 02:47:09.589542 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = 
2020-04-09 02:47:09.590605 I | embed: name = k8s-master1
2020-04-09 02:47:09.590623 I | embed: data dir = /var/lib/etcd
2020-04-09 02:47:09.590630 I | embed: member dir = /var/lib/etcd/member
2020-04-09 02:47:09.590634 I | embed: heartbeat = 100ms
2020-04-09 02:47:09.590637 I | embed: election = 1000ms
2020-04-09 02:47:09.590641 I | embed: snapshot count = 10000
2020-04-09 02:47:09.590660 I | embed: advertise client URLs = https://240.204.0.235:2379
2020-04-09 02:47:09.595665 I | etcdserver: starting member 36cea8585d2d08ae in cluster a89b8c7b3ac41e4f
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae switched to configuration voters=()
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae became follower at term 0
raft2020/04/09 02:47:09 INFO: newRaft 36cea8585d2d08ae [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae became follower at term 1
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae switched to configuration voters=(3949279020724324526)
2020-04-09 02:47:09.606123 W | auth: simple token is not cryptographically signed
2020-04-09 02:47:09.622736 I | etcdserver: starting server... [version: 3.4.3, cluster version: to_be_decided]
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae switched to configuration voters=(3949279020724324526)
2020-04-09 02:47:09.624935 I | etcdserver/membership: added member 36cea8585d2d08ae [https://240.204.0.235:2380] to cluster a89b8c7b3ac41e4f
2020-04-09 02:47:09.625008 I | etcdserver: 36cea8585d2d08ae as single-node; fast-forwarding 9 ticks (election ticks 10)
2020-04-09 02:47:09.626821 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file = 
2020-04-09 02:47:09.626995 I | embed: listening for metrics on http://127.0.0.1:2381
2020-04-09 02:47:09.627092 I | embed: listening for peers on 240.204.0.235:2380
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae is starting a new election at term 1
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae became candidate at term 2
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae received MsgVoteResp from 36cea8585d2d08ae at term 2
raft2020/04/09 02:47:09 INFO: 36cea8585d2d08ae became leader at term 2
raft2020/04/09 02:47:09 INFO: raft.node: 36cea8585d2d08ae elected leader 36cea8585d2d08ae at term 2
2020-04-09 02:47:09.903398 I | etcdserver: setting up the initial cluster version to 3.4
2020-04-09 02:47:09.903974 N | etcdserver/membership: set the initial cluster version to 3.4
2020-04-09 02:47:09.904358 I | etcdserver/api: enabled capabilities for version 3.4
2020-04-09 02:47:09.904518 I | etcdserver: published {Name:k8s-master1 ClientURLs:[https://240.204.0.235:2379]} to cluster a89b8c7b3ac41e4f
2020-04-09 02:47:09.904657 I | embed: ready to serve client requests
2020-04-09 02:47:09.904844 I | embed: ready to serve client requests
2020-04-09 02:47:09.908220 I | embed: serving client requests on 127.0.0.1:2379
2020-04-09 02:47:09.919930 I | embed: serving client requests on 240.204.0.235:2379
2020-04-09 02:57:10.095531 I | mvcc: store.index: compact 1177
2020-04-09 02:57:10.115078 I | mvcc: finished scheduled compaction at 1177 (took 18.779693ms)
raft2020/04/09 03:00:00 INFO: 36cea8585d2d08ae switched to configuration voters=(3949279020724324526 18000103369015100633)
2020-04-09 03:00:00.238784 I | etcdserver/membership: added member f9cd36a53e7504d9 [https://240.197.0.10:2380] to cluster a89b8c7b3ac41e4f
2020-04-09 03:00:00.238850 I | rafthttp: starting peer f9cd36a53e7504d9...
2020-04-09 03:00:00.238903 I | rafthttp: started HTTP pipelining with peer f9cd36a53e7504d9
2020-04-09 03:00:00.240424 I | rafthttp: started streaming with peer f9cd36a53e7504d9 (writer)
2020-04-09 03:00:00.244430 I | rafthttp: started streaming with peer f9cd36a53e7504d9 (writer)
2020-04-09 03:00:00.253166 I | rafthttp: started peer f9cd36a53e7504d9
2020-04-09 03:00:00.253235 I | rafthttp: started streaming with peer f9cd36a53e7504d9 (stream MsgApp v2 reader)
2020-04-09 03:00:00.253300 I | rafthttp: started streaming with peer f9cd36a53e7504d9 (stream Message reader)
2020-04-09 03:00:00.253381 I | rafthttp: added peer f9cd36a53e7504d9
raft2020/04/09 03:00:01 WARN: 36cea8585d2d08ae stepped down to follower since quorum is not active
raft2020/04/09 03:00:01 INFO: 36cea8585d2d08ae became follower at term 2
raft2020/04/09 03:00:01 INFO: raft.node: 36cea8585d2d08ae lost leader 36cea8585d2d08ae at term 2
raft2020/04/09 03:00:03 INFO: 36cea8585d2d08ae is starting a new election at term 2
raft2020/04/09 03:00:03 INFO: 36cea8585d2d08ae became candidate at term 3
raft2020/04/09 03:00:03 INFO: 36cea8585d2d08ae received MsgVoteResp from 36cea8585d2d08ae at term 3
raft2020/04/09 03:00:03 INFO: 36cea8585d2d08ae [logterm: 2, index: 2479] sent MsgVote request to f9cd36a53e7504d9 at term 3
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-04-09 03:00:03.200510 W | etcdserver: read-only range request "key:\"/registry/pods/kube-system/kube-scheduler-k8s-master2\" " with result "error:context canceled" took too long (1.385200931s) to execute
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-04-09 03:00:04.593820 I | rafthttp: peer f9cd36a53e7504d9 became active
2020-04-09 03:00:04.593869 I | rafthttp: established a TCP streaming connection with peer f9cd36a53e7504d9 (stream MsgApp v2 reader)
2020-04-09 03:00:04.594253 I | rafthttp: established a TCP streaming connection with peer f9cd36a53e7504d9 (stream Message reader)
2020-04-09 03:00:04.634136 I | rafthttp: established a TCP streaming connection with peer f9cd36a53e7504d9 (stream MsgApp v2 writer)
2020-04-09 03:00:04.634762 I | rafthttp: established a TCP streaming connection with peer f9cd36a53e7504d9 (stream Message writer)
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae is starting a new election at term 3
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae became candidate at term 4
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae received MsgVoteResp from 36cea8585d2d08ae at term 4
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae [logterm: 2, index: 2479] sent MsgVote request to f9cd36a53e7504d9 at term 4
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae received MsgVoteResp from f9cd36a53e7504d9 at term 4
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae has received 2 MsgVoteResp votes and 0 vote rejections
raft2020/04/09 03:00:04 INFO: 36cea8585d2d08ae became leader at term 4
raft2020/04/09 03:00:04 INFO: raft.node: 36cea8585d2d08ae elected leader 36cea8585d2d08ae at term 4
2020-04-09 03:00:04.814835 W | etcdserver: read-only range request "key:\"/registry/services/endpoints/kube-system/kube-scheduler\" " with result "error:etcdserver: leader changed" took too long (3.35793109s) to execute
2020-04-09 03:00:06.674871 W | etcdserver: read-only range request "key:\"/registry/health\" " with result "error:context canceled" took too long (1.999626736s) to execute
WARNING: 2020/04/09 03:00:06 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"

@zedtux

zedtux commented Apr 9, 2020

Here are the command outputs with --v=5:

A few notes:

Before running the kubeadm init command, keepalived and HAProxy are installed and configured.

keepalived is configured to provide the 172.28.128.10 virtual IP address, and HAProxy listens on port 9443.

The first control plane's kubeadm init command uses port 6443; the other nodes then join through port 9443.
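
Before running kubeadm, the VIP and the load balancer can be sanity-checked with something like the following (just a sketch; the interface name is environment-specific, and the /healthz probe only makes sense once the first apiserver is up):

# confirm keepalived has assigned the virtual IP to this node
ip addr show | grep 172.28.128.10

# confirm HAProxy is listening on the load-balancer port
ss -tlnp | grep 9443

# once the first control plane is up, the apiserver should answer through the VIP
curl -k https://172.28.128.10:9443/healthz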

kubeadm init (first control plane)

kubeadm config file kubeadm-config.yaml:

apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.28.128.10
  bindPort: 6443
certificateKey: bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9
---
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: 1.16.8
controlPlaneEndpoint: "172.28.128.10:9443"
clusterName: kubernetes
apiServer:
  extraArgs:
    advertise-address: 172.28.128.10
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
  serviceSubnet: 10.96.0.0/12
"kubeadm init --config=kubeadm-config.yaml --v=5 --upload-certs\n" ...
I0409 07:29:45.814412   17397 initconfiguration.go:190] loading configuration from "kubeadm-config.yaml"
I0409 07:29:45.816890   17397 initconfiguration.go:102] detected and using CRI socket: /var/run/dockershim.sock
I0409 07:29:45.817079   17397 feature_gate.go:216] feature gates: &{map[]}
[init] Using Kubernetes version: v1.16.8
[preflight] Running pre-flight checks
I0409 07:29:45.817996   17397 checks.go:578] validating Kubernetes and kubeadm version
I0409 07:29:45.818050   17397 checks.go:167] validating if the firewall is enabled and active
I0409 07:29:45.825366   17397 checks.go:202] validating availability of port 6443
I0409 07:29:45.825470   17397 checks.go:202] validating availability of port 10251
I0409 07:29:45.825488   17397 checks.go:202] validating availability of port 10252
I0409 07:29:45.825504   17397 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
I0409 07:29:45.825520   17397 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
I0409 07:29:45.825527   17397 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
I0409 07:29:45.825533   17397 checks.go:287] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0409 07:29:45.825539   17397 checks.go:433] validating if the connectivity type is via proxy or direct
I0409 07:29:45.825568   17397 checks.go:472] validating http connectivity to first IP address in the CIDR
I0409 07:29:45.825579   17397 checks.go:472] validating http connectivity to first IP address in the CIDR
I0409 07:29:45.825585   17397 checks.go:103] validating the container runtime
I0409 07:29:45.982324   17397 checks.go:129] validating if the service is enabled and active
I0409 07:29:46.047712   17397 checks.go:336] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0409 07:29:46.047814   17397 checks.go:336] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0409 07:29:46.047854   17397 checks.go:650] validating whether swap is enabled or not
I0409 07:29:46.047943   17397 checks.go:377] validating the presence of executable ip
I0409 07:29:46.048010   17397 checks.go:377] validating the presence of executable iptables
I0409 07:29:46.048026   17397 checks.go:377] validating the presence of executable mount
I0409 07:29:46.048047   17397 checks.go:377] validating the presence of executable nsenter
I0409 07:29:46.048076   17397 checks.go:377] validating the presence of executable ebtables
I0409 07:29:46.048092   17397 checks.go:377] validating the presence of executable ethtool
I0409 07:29:46.048108   17397 checks.go:377] validating the presence of executable socat
I0409 07:29:46.048120   17397 checks.go:377] validating the presence of executable tc
I0409 07:29:46.048137   17397 checks.go:377] validating the presence of executable touch
I0409 07:29:46.048150   17397 checks.go:521] running all checks
I0409 07:29:46.105325   17397 checks.go:407] checking whether the given node name is reachable using net.LookupHost
I0409 07:29:46.105546   17397 checks.go:619] validating kubelet version
I0409 07:29:46.168774   17397 checks.go:129] validating if the service is enabled and active
I0409 07:29:46.177907   17397 checks.go:202] validating availability of port 10250
I0409 07:29:46.177978   17397 checks.go:202] validating availability of port 2379
I0409 07:29:46.178006   17397 checks.go:202] validating availability of port 2380
I0409 07:29:46.178035   17397 checks.go:250] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0409 07:29:46.244594   17397 checks.go:845] pulling k8s.gcr.io/kube-apiserver:v1.16.8
I0409 07:29:57.457406   17397 checks.go:845] pulling k8s.gcr.io/kube-controller-manager:v1.16.8
I0409 07:30:05.142795   17397 checks.go:845] pulling k8s.gcr.io/kube-scheduler:v1.16.8
I0409 07:30:09.388201   17397 checks.go:845] pulling k8s.gcr.io/kube-proxy:v1.16.8
I0409 07:30:13.381608   17397 checks.go:845] pulling k8s.gcr.io/pause:3.1
I0409 07:30:14.675174   17397 checks.go:845] pulling k8s.gcr.io/etcd:3.3.15-0
I0409 07:30:34.318311   17397 checks.go:845] pulling k8s.gcr.io/coredns:1.6.2
I0409 07:30:39.279687   17397 kubelet.go:61] Stopping the kubelet
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
I0409 07:30:39.417248   17397 kubelet.go:79] Starting the kubelet
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0409 07:30:39.579148   17397 certs.go:104] creating a new certificate authority for ca
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-1-flannel kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.28.128.10 172.28.128.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
I0409 07:30:40.120311   17397 certs.go:104] creating a new certificate authority for front-proxy-ca
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
I0409 07:30:40.810967   17397 certs.go:104] creating a new certificate authority for etcd-ca
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-1-flannel localhost] and IPs [172.28.128.10 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-1-flannel localhost] and IPs [172.28.128.10 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
I0409 07:30:43.115525   17397 certs.go:70] creating a new public/private key files for signing service account users
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
I0409 07:30:44.542850   17397 kubeconfig.go:79] creating kubeconfig file for admin.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
I0409 07:30:44.674571   17397 kubeconfig.go:79] creating kubeconfig file for kubelet.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
I0409 07:30:45.013083   17397 kubeconfig.go:79] creating kubeconfig file for controller-manager.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
I0409 07:30:45.385774   17397 kubeconfig.go:79] creating kubeconfig file for scheduler.conf
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0409 07:30:45.537852   17397 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 07:30:45.555047   17397 manifests.go:116] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0409 07:30:45.555590   17397 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 07:30:45.557924   17397 manifests.go:116] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0409 07:30:45.558369   17397 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 07:30:45.565161   17397 manifests.go:116] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
I0409 07:30:45.581741   17397 local.go:69] [etcd] wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
I0409 07:30:45.581817   17397 waitcontrolplane.go:80] [wait-control-plane] Waiting for the API server to be healthy
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 20.532794 seconds
I0409 07:31:06.118641   17397 uploadconfig.go:108] [upload-config] Uploading the kubeadm ClusterConfiguration to a ConfigMap
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
I0409 07:31:06.158756   17397 uploadconfig.go:122] [upload-config] Uploading the kubelet component config to a ConfigMap
[kubelet] Creating a ConfigMap "kubelet-config-1.16" in namespace kube-system with the configuration for the kubelets in the cluster
I0409 07:31:06.205452   17397 uploadconfig.go:127] [upload-config] Preserving the CRISocket information for the control-plane node
I0409 07:31:06.205496   17397 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master-1-flannel" as an annotation
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9
[mark-control-plane] Marking the node master-1-flannel as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master-1-flannel as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 06vuqf.b83ty4csi91j9ftv
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
I0409 07:31:07.867360   17397 clusterinfo.go:45] [bootstrap-token] loading admin kubeconfig
I0409 07:31:07.872938   17397 clusterinfo.go:53] [bootstrap-token] copying the cluster from admin.conf to the bootstrap kubeconfig
I0409 07:31:07.873923   17397 clusterinfo.go:65] [bootstrap-token] creating/updating ConfigMap in kube-public namespace
I0409 07:31:07.908264   17397 clusterinfo.go:79] creating the RBAC rules for exposing the cluster-info ConfigMap in the kube-public namespace
I0409 07:31:08.015837   17397 request.go:538] Throttling request took 59.103471ms, request: POST:https://172.28.128.10:9443/api/v1/namespaces/kube-system/configmaps
I0409 07:31:08.206290   17397 request.go:538] Throttling request took 164.932064ms, request: POST:https://172.28.128.10:9443/api/v1/namespaces/kube-system/serviceaccounts
I0409 07:31:08.406452   17397 request.go:538] Throttling request took 133.87725ms, request: POST:https://172.28.128.10:9443/api/v1/namespaces/kube-system/services
[addons] Applied essential addon: CoreDNS
I0409 07:31:08.606403   17397 request.go:538] Throttling request took 185.390418ms, request: POST:https://172.28.128.10:9443/api/v1/namespaces/kube-system/serviceaccounts
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
I0409 07:31:08.806962   17397 request.go:538] Throttling request took 192.76346ms, request: POST:https://172.28.128.10:9443/api/v1/namespaces/kube-system/configmaps
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 172.28.128.10:9443 --token 06vuqf.b83ty4csi91j9ftv \
    --discovery-token-ca-cert-hash sha256:59c336de05e4ac959293f398ae40a19edb6c1ce1587e82a8c7b9d5ad6a795c1a \
    --control-plane --certificate-key bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use 
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.28.128.10:9443 --token 06vuqf.b83ty4csi91j9ftv \
    --discovery-token-ca-cert-hash sha256:59c336de05e4ac959293f398ae40a19edb6c1ce1587e82a8c7b9d5ad6a795c1a 

Worker join

kubeadm join 172.28.128.10:9443 --token=06vuqf.b83ty4csi91j9ftv --discovery-token-ca-cert-hash=sha256:59c336de05e4ac959293f398ae40a19edb6c1ce1587e82a8c7b9d5ad6a795c1a --v=5
I0409 07:35:00.013157   14430 join.go:363] [preflight] found NodeName empty; using OS hostname as NodeName
I0409 07:35:00.013450   14430 initconfiguration.go:102] detected and using CRI socket: /var/run/dockershim.sock
[preflight] Running pre-flight checks
I0409 07:35:00.013813   14430 preflight.go:90] [preflight] Running general checks
I0409 07:35:00.014201   14430 checks.go:250] validating the existence and emptiness of directory /etc/kubernetes/manifests
I0409 07:35:00.014574   14430 checks.go:287] validating the existence of file /etc/kubernetes/kubelet.conf
I0409 07:35:00.014603   14430 checks.go:287] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
I0409 07:35:00.014734   14430 checks.go:103] validating the container runtime
I0409 07:35:00.110262   14430 checks.go:129] validating if the service is enabled and active
I0409 07:35:00.165895   14430 checks.go:336] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0409 07:35:00.165992   14430 checks.go:336] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0409 07:35:00.166048   14430 checks.go:650] validating whether swap is enabled or not
I0409 07:35:00.166117   14430 checks.go:377] validating the presence of executable ip
I0409 07:35:00.166157   14430 checks.go:377] validating the presence of executable iptables
I0409 07:35:00.166181   14430 checks.go:377] validating the presence of executable mount
I0409 07:35:00.166216   14430 checks.go:377] validating the presence of executable nsenter
I0409 07:35:00.167716   14430 checks.go:377] validating the presence of executable ebtables
I0409 07:35:00.167769   14430 checks.go:377] validating the presence of executable ethtool
I0409 07:35:00.167816   14430 checks.go:377] validating the presence of executable socat
I0409 07:35:00.167835   14430 checks.go:377] validating the presence of executable tc
I0409 07:35:00.167893   14430 checks.go:377] validating the presence of executable touch
I0409 07:35:00.167925   14430 checks.go:521] running all checks
I0409 07:35:00.226230   14430 checks.go:407] checking whether the given node name is reachable using net.LookupHost
I0409 07:35:00.226677   14430 checks.go:619] validating kubelet version
I0409 07:35:00.286265   14430 checks.go:129] validating if the service is enabled and active
I0409 07:35:00.294724   14430 checks.go:202] validating availability of port 10250
I0409 07:35:00.294827   14430 checks.go:287] validating the existence of file /etc/kubernetes/pki/ca.crt
I0409 07:35:00.294842   14430 checks.go:433] validating if the connectivity type is via proxy or direct
I0409 07:35:00.294869   14430 join.go:433] [preflight] Discovering cluster-info
I0409 07:35:00.294916   14430 token.go:188] [discovery] Trying to connect to API Server "172.28.128.10:9443"
I0409 07:35:00.295330   14430 token.go:73] [discovery] Created cluster-info discovery client, requesting info from "https://172.28.128.10:9443"
I0409 07:35:00.307042   14430 token.go:134] [discovery] Requesting info from "https://172.28.128.10:9443" again to validate TLS against the pinned public key
I0409 07:35:00.315660   14430 token.go:152] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.28.128.10:9443"
I0409 07:35:00.316125   14430 token.go:194] [discovery] Successfully established connection with API Server "172.28.128.10:9443"
I0409 07:35:00.316675   14430 discovery.go:51] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I0409 07:35:00.317031   14430 join.go:447] [preflight] Fetching init configuration
I0409 07:35:00.317290   14430 join.go:485] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I0409 07:35:00.336825   14430 interface.go:384] Looking for default routes with IPv4 addresses
I0409 07:35:00.336857   14430 interface.go:389] Default route transits interface "eth0"
I0409 07:35:00.336947   14430 interface.go:196] Interface eth0 is up
I0409 07:35:00.336990   14430 interface.go:244] Interface "eth0" has 2 addresses :[10.0.2.15/24 fe80::a00:27ff:fe8d:c04d/64].
I0409 07:35:00.337011   14430 interface.go:211] Checking addr  10.0.2.15/24.
I0409 07:35:00.337017   14430 interface.go:218] IP found 10.0.2.15
I0409 07:35:00.337023   14430 interface.go:250] Found valid IPv4 address 10.0.2.15 for interface "eth0".
I0409 07:35:00.337027   14430 interface.go:395] Found active IP 10.0.2.15 
I0409 07:35:00.337060   14430 preflight.go:101] [preflight] Running configuration dependant checks
I0409 07:35:00.337067   14430 controlplaneprepare.go:211] [download-certs] Skipping certs download
I0409 07:35:00.337074   14430 kubelet.go:107] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I0409 07:35:00.339878   14430 kubelet.go:115] [kubelet-start] writing CA certificate at /etc/kubernetes/pki/ca.crt
I0409 07:35:00.340854   14430 kubelet.go:133] [kubelet-start] Stopping the kubelet
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
I0409 07:35:00.408223   14430 kubelet.go:150] [kubelet-start] Starting the kubelet
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I0409 07:35:01.001598   14430 kubelet.go:168] [kubelet-start] preserving the crisocket information for the node
I0409 07:35:01.002061   14430 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "worker-flannel" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

kubeadm init (second control plane)

kubeadm join 172.28.128.10:9443 --token=06vuqf.b83ty4csi91j9ftv --discovery-token-ca-cert-hash=sha256:59c336de05e4ac959293f398ae40a19edb6c1ce1587e82a8c7b9d5ad6a795c1a --certificate-key=bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9 --apiserver-advertise-address=172.28.128.201 --v=5 --control-plane
I0409 07:38:25.240037   17388 join.go:363] [preflight] found NodeName empty; using OS hostname as NodeName
I0409 07:38:25.240142   17388 initconfiguration.go:102] detected and using CRI socket: /var/run/dockershim.sock
[preflight] Running pre-flight checks
I0409 07:38:25.240215   17388 preflight.go:90] [preflight] Running general checks
I0409 07:38:25.240253   17388 checks.go:250] validating the existence and emptiness of directory /etc/kubernetes/manifests
I0409 07:38:25.240835   17388 checks.go:287] validating the existence of file /etc/kubernetes/kubelet.conf
I0409 07:38:25.240878   17388 checks.go:287] validating the existence of file /etc/kubernetes/bootstrap-kubelet.conf
I0409 07:38:25.240898   17388 checks.go:103] validating the container runtime
I0409 07:38:25.336878   17388 checks.go:129] validating if the service is enabled and active
I0409 07:38:25.393901   17388 checks.go:336] validating the contents of file /proc/sys/net/bridge/bridge-nf-call-iptables
I0409 07:38:25.393992   17388 checks.go:336] validating the contents of file /proc/sys/net/ipv4/ip_forward
I0409 07:38:25.394029   17388 checks.go:650] validating whether swap is enabled or not
I0409 07:38:25.394080   17388 checks.go:377] validating the presence of executable ip
I0409 07:38:25.394173   17388 checks.go:377] validating the presence of executable iptables
I0409 07:38:25.394211   17388 checks.go:377] validating the presence of executable mount
I0409 07:38:25.394257   17388 checks.go:377] validating the presence of executable nsenter
I0409 07:38:25.394294   17388 checks.go:377] validating the presence of executable ebtables
I0409 07:38:25.394328   17388 checks.go:377] validating the presence of executable ethtool
I0409 07:38:25.394369   17388 checks.go:377] validating the presence of executable socat
I0409 07:38:25.394399   17388 checks.go:377] validating the presence of executable tc
I0409 07:38:25.394442   17388 checks.go:377] validating the presence of executable touch
I0409 07:38:25.394480   17388 checks.go:521] running all checks
I0409 07:38:25.453449   17388 checks.go:407] checking whether the given node name is reachable using net.LookupHost
I0409 07:38:25.453597   17388 checks.go:619] validating kubelet version
I0409 07:38:25.517433   17388 checks.go:129] validating if the service is enabled and active
I0409 07:38:25.537698   17388 checks.go:202] validating availability of port 10250
I0409 07:38:25.537846   17388 checks.go:433] validating if the connectivity type is via proxy or direct
I0409 07:38:25.537886   17388 join.go:433] [preflight] Discovering cluster-info
I0409 07:38:25.537957   17388 token.go:188] [discovery] Trying to connect to API Server "172.28.128.10:6443"
I0409 07:38:25.538426   17388 token.go:73] [discovery] Created cluster-info discovery client, requesting info from "https://172.28.128.10:6443"
I0409 07:38:25.550719   17388 token.go:134] [discovery] Requesting info from "https://172.28.128.10:6443" again to validate TLS against the pinned public key
I0409 07:38:25.557258   17388 token.go:152] [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "172.28.128.10:6443"
I0409 07:38:25.557292   17388 token.go:194] [discovery] Successfully established connection with API Server "172.28.128.10:6443"
I0409 07:38:25.557310   17388 discovery.go:51] [discovery] Using provided TLSBootstrapToken as authentication credentials for the join process
I0409 07:38:25.557344   17388 join.go:447] [preflight] Fetching init configuration
I0409 07:38:25.557387   17388 join.go:485] [preflight] Retrieving KubeConfig objects
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
I0409 07:38:25.573452   17388 interface.go:384] Looking for default routes with IPv4 addresses
I0409 07:38:25.573473   17388 interface.go:389] Default route transits interface "eth0"
I0409 07:38:25.573544   17388 interface.go:196] Interface eth0 is up
I0409 07:38:25.573585   17388 interface.go:244] Interface "eth0" has 2 addresses :[10.0.2.15/24 fe80::a00:27ff:fe8d:c04d/64].
I0409 07:38:25.573598   17388 interface.go:211] Checking addr  10.0.2.15/24.
I0409 07:38:25.573604   17388 interface.go:218] IP found 10.0.2.15
I0409 07:38:25.573609   17388 interface.go:250] Found valid IPv4 address 10.0.2.15 for interface "eth0".
I0409 07:38:25.573613   17388 interface.go:395] Found active IP 10.0.2.15 
I0409 07:38:25.573647   17388 preflight.go:101] [preflight] Running configuration dependant checks
[preflight] Running pre-flight checks before initializing the new control plane instance
I0409 07:38:25.573672   17388 checks.go:578] validating Kubernetes and kubeadm version
I0409 07:38:25.573684   17388 checks.go:167] validating if the firewall is enabled and active
I0409 07:38:25.582299   17388 checks.go:202] validating availability of port 6443
I0409 07:38:25.582384   17388 checks.go:202] validating availability of port 10251
I0409 07:38:25.582412   17388 checks.go:202] validating availability of port 10252
I0409 07:38:25.582429   17388 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-apiserver.yaml
I0409 07:38:25.582446   17388 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-controller-manager.yaml
I0409 07:38:25.582452   17388 checks.go:287] validating the existence of file /etc/kubernetes/manifests/kube-scheduler.yaml
I0409 07:38:25.582458   17388 checks.go:287] validating the existence of file /etc/kubernetes/manifests/etcd.yaml
I0409 07:38:25.582465   17388 checks.go:433] validating if the connectivity type is via proxy or direct
I0409 07:38:25.582481   17388 checks.go:472] validating http connectivity to first IP address in the CIDR
I0409 07:38:25.582490   17388 checks.go:472] validating http connectivity to first IP address in the CIDR
I0409 07:38:25.582505   17388 checks.go:202] validating availability of port 2379
I0409 07:38:25.582525   17388 checks.go:202] validating availability of port 2380
I0409 07:38:25.582541   17388 checks.go:250] validating the existence and emptiness of directory /var/lib/etcd
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
I0409 07:38:25.638121   17388 checks.go:845] pulling k8s.gcr.io/kube-apiserver:v1.16.8
I0409 07:38:36.992612   17388 checks.go:845] pulling k8s.gcr.io/kube-controller-manager:v1.16.8
I0409 07:38:44.814362   17388 checks.go:845] pulling k8s.gcr.io/kube-scheduler:v1.16.8
I0409 07:38:48.941571   17388 checks.go:845] pulling k8s.gcr.io/kube-proxy:v1.16.8
I0409 07:38:53.543264   17388 checks.go:845] pulling k8s.gcr.io/pause:3.1
I0409 07:38:54.923668   17388 checks.go:845] pulling k8s.gcr.io/etcd:3.3.15-0
I0409 07:39:14.592328   17388 checks.go:845] pulling k8s.gcr.io/coredns:1.6.2
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
I0409 07:39:19.073546   17388 certs.go:39] creating PKI assets
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [master-2-flannel kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.28.128.201 172.28.128.10]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [master-2-flannel localhost] and IPs [172.28.128.201 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [master-2-flannel localhost] and IPs [172.28.128.201 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
I0409 07:39:20.871447   17388 certs.go:70] creating a new public/private key files for signing service account users
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
I0409 07:39:21.412341   17388 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 07:39:21.420256   17388 manifests.go:116] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
I0409 07:39:21.420945   17388 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 07:39:21.422206   17388 manifests.go:116] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[control-plane] Creating static Pod manifest for "kube-scheduler"
I0409 07:39:21.422795   17388 manifests.go:91] [control-plane] getting StaticPodSpecs
I0409 07:39:21.423966   17388 manifests.go:116] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[check-etcd] Checking that the etcd cluster is healthy
I0409 07:39:21.435464   17388 local.go:75] [etcd] Checking etcd cluster health
I0409 07:39:21.435734   17388 local.go:78] creating etcd client that connects to etcd pods
I0409 07:39:21.450081   17388 etcd.go:107] etcd endpoints read from pods: https://172.28.128.10:2379
I0409 07:39:21.474262   17388 etcd.go:156] etcd endpoints read from etcd: https://172.28.128.10:2379
I0409 07:39:21.474700   17388 etcd.go:125] update etcd endpoints: https://172.28.128.10:2379
I0409 07:39:21.491970   17388 kubelet.go:107] [kubelet-start] writing bootstrap kubelet config file at /etc/kubernetes/bootstrap-kubelet.conf
I0409 07:39:21.493157   17388 kubelet.go:133] [kubelet-start] Stopping the kubelet
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.16" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
I0409 07:39:21.578838   17388 kubelet.go:150] [kubelet-start] Starting the kubelet
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
I0409 07:39:22.234598   17388 kubelet.go:168] [kubelet-start] preserving the crisocket information for the node
I0409 07:39:22.235352   17388 patchnode.go:30] [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "master-2-flannel" as an annotation
I0409 07:39:32.763719   17388 local.go:127] creating etcd client that connects to etcd pods
I0409 07:39:32.767983   17388 etcd.go:107] etcd endpoints read from pods: https://172.28.128.10:2379
I0409 07:39:32.784358   17388 etcd.go:156] etcd endpoints read from etcd: https://172.28.128.10:2379
I0409 07:39:32.784988   17388 etcd.go:125] update etcd endpoints: https://172.28.128.10:2379
I0409 07:39:32.785537   17388 local.go:136] Adding etcd member: https://172.28.128.201:2380
[etcd] Announced new etcd member joining to the existing etcd cluster
I0409 07:39:32.853320   17388 local.go:142] Updated etcd member list: [{master-1-flannel https://172.28.128.10:2380} {master-2-flannel https://172.28.128.201:2380}]
[etcd] Creating static Pod manifest for "etcd"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
I0409 07:39:32.859556   17388 etcd.go:372] [etcd] attempting to see if all cluster endpoints ([https://172.28.128.10:2379 https://172.28.128.201:2379]) are available 1/8
{"level":"warn","ts":"2020-04-09T07:39:43.203Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://172.28.128.201:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
I0409 07:39:43.204720   17388 etcd.go:377] [etcd] Attempt timed out
I0409 07:39:43.204734   17388 etcd.go:369] [etcd] Waiting 5s until next retry
I0409 07:39:48.205919   17388 etcd.go:372] [etcd] attempting to see if all cluster endpoints ([https://172.28.128.10:2379 https://172.28.128.201:2379]) are available 2/8
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node master-2-flannel as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node master-2-flannel as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

@neolit123
Member

@zedtux I'm not seeing errors when your worker and control-plane nodes are joining.
Am I missing something?

@vijayk13

2020-04-09 03:00:06.674871 W | etcdserver: read-only range request "key:"/registry/health" " with result "error:context canceled" took too long (1.999626736s) to execute

I think you might be facing this same etcd bug:
kubernetes/kubernetes#88111 (comment)

It's not directly related to RAID, but my understanding is that etcd does not tolerate slow disks.
I think I've seen multiple reports about this already.

@vijayk13

vijayk13 commented Apr 9, 2020

@neolit123 - thanks for the pointer to the etcd slow-disk issue.
I'll change my configuration, run some experiments, and report back.

@vijayk13

Okay, I checked my configuration and ran fio-based tests to see whether disk fdatasync latency is the issue ... it does not appear to be.

I followed the guidance in:
https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md, and
https://www.ibm.com/cloud/blog/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd

For the suggested fio command (--ioengine=sync --fdatasync=1 --size=22m --bs=2300), both of my control plane nodes show 99th-percentile fsync latency well below the required 10ms. I verified that the underlying disks are SSDs with data loss protection enabled (so Linux sees them as "write through" write-cache devices).
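
For reference, the full invocation along the lines of the IBM article looks roughly like this (the job name and test directory are illustrative; the directory should sit on the filesystem that backs /var/lib/etcd):

# write 22 MiB in 2300-byte blocks, issuing fdatasync after every write,
# then report the fsync/fdatasync latency percentiles
mkdir -p /var/lib/etcd-fio-test
fio --name=etcd-fsync-test --directory=/var/lib/etcd-fio-test \
    --rw=write --ioengine=sync --fdatasync=1 --size=22m --bs=2300
rm -rf /var/lib/etcd-fio-test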

Control plane node 1:

  fsync/fdatasync/sync_file_range:
    sync (usec): min=108, max=14712, avg=583.04, stdev=477.73
    sync percentiles (usec):
     |  1.00th=[  123],  5.00th=[  139], 10.00th=[  163], 20.00th=[  241],
     | 30.00th=[  338], 40.00th=[  400], 50.00th=[  469], 60.00th=[  537],
     | 70.00th=[  660], 80.00th=[  955], 90.00th=[ 1205], 95.00th=[ 1336],
     | 99.00th=[ 1516], 99.50th=[ 1598], 99.90th=[ 5407], 99.95th=[ 8979],
     | 99.99th=[12780]

Control plane node 2:

  fsync/fdatasync/sync_file_range:
    sync (usec): min=107, max=12111, avg=495.70, stdev=419.26
    sync percentiles (usec):
     |  1.00th=[  121],  5.00th=[  135], 10.00th=[  147], 20.00th=[  186],
     | 30.00th=[  265], 40.00th=[  359], 50.00th=[  396], 60.00th=[  445],
     | 70.00th=[  506], 80.00th=[  619], 90.00th=[ 1156], 95.00th=[ 1303],
     | 99.00th=[ 1467], 99.50th=[ 1582], 99.90th=[ 4228], 99.95th=[ 5407],
     | 99.99th=[ 9110]

Looking through the etcdserver logs I had shared yesterday, the warning about the range request taking too long comes after a sequence of gRPC errors (and only after the second node has been added to the cluster) ... so I don't know whether disk latency is the cause or something else:

WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
WARNING: 2020/04/09 03:00:03 grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
2020-04-09 03:00:03.200510 W | etcdserver: read-only range request "key:\"/registry/pods/kube-system/kube-scheduler-k8s-master2\" " with result "error:context canceled" took too long (1.385200931s) to execute

For what it is worth, I saw a post on StackOverflow from a few days back where someone else is reporting the same issue:
https://stackoverflow.com/questions/61063301/kubernets-1-18-stacked-control-plane-and-etcd-nodes-unable-to-add-second-etcd-no

Finally, I assume your internal e2e tests are doing this, but wanted to confirm that you are also testing the 'stacked control plane and etcd' configuration, not just the external etcd one ... I wanted to rule out any possible issues there.

Thanks!

@neolit123
Member

we have no solution for this on the kubeadm side. current indicators point towards an etcd issue.

but wanted to confirm that you are also testing the 'stacked control plane and etcd' configuration, not just the external etcd one

we are primarily testing stacked etcd, but also have some tests for external.

@zedtux

zedtux commented Apr 10, 2020

@neolit123 the issue is not when the worker joins the first control plane, it's when the second control plane tries to join the first one.

@neolit123
Member

@zedtux

your logs show:

kubeadm join 172.28.128.10:9443 --token=06vuqf.b83ty4csi91j9ftv --discovery-token-ca-cert-hash=sha256:59c336de05e4ac959293f398ae40a19edb6c1ce1587e82a8c7b9d5ad6a795c1a --certificate-key=bf9685289b0e0a8685cd412608f2d9ac33c99a57bd44fde3cf3ebcef59d928c9 --apiserver-advertise-address=172.28.128.201 --v=5 --control-plane

...

This node has joined the cluster and a new control plane instance was created:

so it says it passed.

@zedtux

zedtux commented Apr 10, 2020

Oh, actually you are right, I was focusing on the following line:

I0409 07:39:32.859556   17388 etcd.go:372] [etcd] attempting to see if all cluster endpoints ([https://172.28.128.10:2379 https://172.28.128.201:2379]) are available 1/8
{"level":"warn","ts":"2020-04-09T07:39:43.203Z","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"passthrough:///https://172.28.128.201:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
I0409 07:39:43.204720   17388 etcd.go:377] [etcd] Attempt timed out

But then there are those lines:

I0409 07:39:43.204734   17388 etcd.go:369] [etcd] Waiting 5s until next retry
I0409 07:39:48.205919   17388 etcd.go:372] [etcd] attempting to see if all cluster endpoints ([https://172.28.128.10:2379 https://172.28.128.201:2379]) are available 2/8

So you are right, it works. 🤔.

I'll review my test suite.

@vijayk13

@neolit123 - I wanted to provide an update ...

After a few days of debugging, I managed to root-cause and fix the issue. My setup had a misconfiguration: the control-plane Linux hosts had the network interface MTU set to the wrong value. Fixing the MTU allowed the control-plane cluster setup to work as intended.

What I did not anticipate, and why it took a while to debug, is the following:

  • There was no issue while I was downloading and setting up the required packages on the host (including the 'kubeadm config images pull' commands) ... I chalk this up to the fact that TCP MSS handling did what was expected and worked around the misconfiguration.
  • The issue manifested itself when the etcdserver pods on the first and second control-plane nodes attempted to communicate with each other; once the packet sizes went past a threshold, it appears the IP masquerading on the hosts caused the overly large packets to be silently dropped ...
    This looks related to the following (old) Calico GitHub issue (I ran into it with weave as well):
    MTU issue with IPIP and host-networking projectcalico/calico#1709

So, while I know how to fix this now, I wanted to request that this be documented in the setup requirements so others don't stumble into it and end up spending the same amount of time finding and fixing the issue as I did. I don't know how you would do it, but adding it to the pre-flight checks for kubeadm would also be helpful.
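
For anyone hitting the same thing, a couple of quick checks would have surfaced the problem in my case (a rough sketch; the interface name and peer address are placeholders for your own environment):

# inspect the MTU configured on the interface carrying control-plane traffic
ip link show <interface>

# send a 1472-byte payload with the don't-fragment bit set; with a standard
# 1500-byte MTU this should succeed, and a failure indicates that larger
# packets are being silently dropped on the path to the other control-plane node
ping -M do -s 1472 -c 3 <other-control-plane-ip>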

Thanks again for all the time you spent on this.

@neolit123
Member

interface MTU set to the wrong value

what value did you set?

that this be documented in the setup requirements

there are hundreds of different things that can go wrong with the networking of a Node, so I don't think kubeadm should document all of them.

(from a previous post)

The control plane nodes are KVM VMs on Ubuntu hosts.

This is the first case where I've seen a user report MTU issues.
Instead of documenting such problems, users should be able to just download k8s-ready OS images ...
... yet the k8s project does not maintain such images.
