
kubeadm init hangs on creating kube-discovery and fails to create kube-dns #68

Closed
akarasik opened this issue Nov 24, 2016 · 8 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. kind/support Categorizes issue or PR as a support question. priority/backlog Higher priority than priority/awaiting-more-evidence. state/needs-more-information

Comments

@akarasik

Hello,

I'm trying to create a new Kubernetes cluster using kubeadm.
I'm running Ubuntu 16.04 on Dell bare-metal servers, with Docker 1.11.2 pre-installed.
Following the kubeadm guide, I run the following commands:

curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF > /etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl kubernetes-cni
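After installing, it can help to confirm which versions actually landed before running `kubeadm init`. This quick sanity check is my addition, not part of the original report:

```shell
# Print the versions of the freshly installed tools; mismatched or
# unexpectedly old versions are a common source of init problems.
kubeadm version
kubelet --version
docker --version
```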

Then I try to set up the first Kubernetes master, which hangs for 10 minutes:

root@host.test.com:~# kubeadm init
Running pre-flight checks
<master/tokens> generated token: "308122.26f241052d981a03"
<master/pki> generated Certificate Authority key and certificate:
Issuer: CN=kubernetes | Subject: CN=kubernetes | CA: true
Not before: 2016-11-24 06:36:32 +0000 UTC Not After: 2026-11-22 06:36:32 +0000 UTC
Public: /etc/kubernetes/pki/ca-pub.pem
Private: /etc/kubernetes/pki/ca-key.pem
Cert: /etc/kubernetes/pki/ca.pem
<master/pki> generated API Server key and certificate:
Issuer: CN=kubernetes | Subject: CN=kube-apiserver | CA: false
Not before: 2016-11-24 06:36:32 +0000 UTC Not After: 2017-11-24 06:36:32 +0000 UTC
Alternate Names: [<ipaddresses> kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local]
Public: /etc/kubernetes/pki/apiserver-pub.pem
Private: /etc/kubernetes/pki/apiserver-key.pem
Cert: /etc/kubernetes/pki/apiserver.pem
<master/pki> generated Service Account Signing keys:
Public: /etc/kubernetes/pki/sa-pub.pem
Private: /etc/kubernetes/pki/sa-key.pem
<master/pki> created keys and certificates in "/etc/kubernetes/pki"
<util/kubeconfig> created "/etc/kubernetes/kubelet.conf"
<util/kubeconfig> created "/etc/kubernetes/admin.conf"
<master/apiclient> created API client configuration
<master/apiclient> created API client, waiting for the control plane to become ready
<master/apiclient> all control plane components are healthy after 50.344647 seconds
<master/apiclient> waiting for at least one node to register and become ready
<master/apiclient> first node is ready after 1.503290 seconds
<master/apiclient> attempting a test deployment
<master/apiclient> test deployment succeeded
<master/discovery> created essential addon: kube-discovery, waiting for it to become ready

Eventually it continues:

<master/discovery> created essential addon: kube-discovery, waiting for it to become ready
<master/discovery> kube-discovery is ready after 668.503635 seconds
<master/addons> created essential addon: kube-proxy
<master/addons> created essential addon: kube-dns

Kubernetes master initialised successfully!

You can now join any number of machines by running the following on each node:

kubeadm join --token=308122.26f241052d981a03 <ip_address>

While trying to understand why, I found that the kube-discovery pod's logs read:
2016/11/24 06:40:55 list of API endpoints does not exist: /tmp/secret/endpoint-list.json
I can also see errors about CNI in the kube-dns container. Here are the full events:

(root@host.test.com:~)# kubectl get pods -n kube-system
NAME                                                               READY     STATUS              RESTARTS   AGE
dummy-2088944543-7w6xa                                             1/1       Running             0          21m
etcd-host.test.com                      1/1       Running             0          20m
kube-apiserver-host.test.com            1/1       Running             0          21m
kube-controller-manager-host.test.com   1/1       Running             0          21m
kube-discovery-1150918428-1ik91                                    0/1       Error               9          21m
kube-dns-654381707-y05e7                                           0/3       ContainerCreating   0          10m
kube-proxy-h1snf                                                   1/1       Running             0          10m
kube-scheduler-host.test.com            1/1       Running             0          20m

(root@host.test.com:~)# kubectl logs kube-discovery-1150918428-1ik91 -n kube-system
2016/11/24 06:58:58 root CA certificate does not exist: /tmp/secret/ca.pem

(root@host.test.com:~)# kubectl get events -n kube-system
LASTSEEN   FIRSTSEEN   COUNT     NAME                                                               KIND         SUBOBJECT                                  TYPE      REASON              SOURCE                                               MESSAGE
21m        21m         1         dummy-2088944543-7w6xa                                             Pod                                                     Normal    Scheduled           {default-scheduler }                                 Successfully assigned dummy-2088944543-7w6xa to host.test.com
21m        21m         1         dummy-2088944543-7w6xa                                             Pod          spec.containers{dummy}                     Normal    Pulled              {kubelet host.test.com}   Container image "gcr.io/google_containers/pause-amd64:3.0" already present on machine
21m        21m         1         dummy-2088944543-7w6xa                                             Pod          spec.containers{dummy}                     Normal    Created             {kubelet host.test.com}   Created container with docker id 7b1ae84b73b7; Security:[seccomp=unconfined]
21m        21m         1         dummy-2088944543-7w6xa                                             Pod          spec.containers{dummy}                     Normal    Started             {kubelet host.test.com}   Started container with docker id 7b1ae84b73b7
21m        21m         1         dummy-2088944543                                                   ReplicaSet                                              Normal    SuccessfulCreate    {replicaset-controller }                             Created pod: dummy-2088944543-7w6xa
21m        21m         1         dummy                                                              Deployment                                              Normal    ScalingReplicaSet   {deployment-controller }                             Scaled up replica set dummy-2088944543 to 1
22m        22m         1         etcd-host.test.com                      Pod          spec.containers{etcd}                      Normal    Pulling             {kubelet host.test.com}   pulling image "gcr.io/google_containers/etcd-amd64:2.2.5"
22m        22m         1         etcd-host.test.com                      Pod          spec.containers{etcd}                      Normal    Pulled              {kubelet host.test.com}   Successfully pulled image "gcr.io/google_containers/etcd-amd64:2.2.5"
22m        22m         1         etcd-host.test.com                      Pod          spec.containers{etcd}                      Normal    Created             {kubelet host.test.com}   Created container with docker id 2c3edaeb457e; Security:[seccomp=unconfined]
22m        22m         1         etcd-host.test.com                      Pod          spec.containers{etcd}                      Normal    Started             {kubelet host.test.com}   Started container with docker id 2c3edaeb457e
22m        22m         1         kube-apiserver-host.test.com            Pod          spec.containers{kube-apiserver}            Normal    Pulling             {kubelet host.test.com}   pulling image "gcr.io/google_containers/kube-apiserver-amd64:v1.4.4"
21m        21m         1         kube-apiserver-host.test.com            Pod          spec.containers{kube-apiserver}            Normal    Pulled              {kubelet host.test.com}   Successfully pulled image "gcr.io/google_containers/kube-apiserver-amd64:v1.4.4"
21m        21m         1         kube-apiserver-host.test.com            Pod          spec.containers{kube-apiserver}            Normal    Created             {kubelet host.test.com}   Created container with docker id b8088a3372d0; Security:[seccomp=unconfined]
21m        21m         1         kube-apiserver-host.test.com            Pod          spec.containers{kube-apiserver}            Normal    Started             {kubelet host.test.com}   Started container with docker id b8088a3372d0
22m        22m         1         kube-controller-manager-host.test.com   Pod          spec.containers{kube-controller-manager}   Normal    Pulling             {kubelet host.test.com}   pulling image "gcr.io/google_containers/kube-controller-manager-amd64:v1.4.4"
21m        21m         1         kube-controller-manager-host.test.com   Pod          spec.containers{kube-controller-manager}   Normal    Pulled              {kubelet host.test.com}   Successfully pulled image "gcr.io/google_containers/kube-controller-manager-amd64:v1.4.4"
21m        21m         1         kube-controller-manager-host.test.com   Pod          spec.containers{kube-controller-manager}   Normal    Created             {kubelet host.test.com}   Created container with docker id 73f586ea582a; Security:[seccomp=unconfined]
21m        21m         1         kube-controller-manager-host.test.com   Pod          spec.containers{kube-controller-manager}   Normal    Started             {kubelet host.test.com}   Started container with docker id 73f586ea582a
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod                                                     Normal    Scheduled           {default-scheduler }                                 Successfully assigned kube-discovery-1150918428-1ik91 to host.test.com
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Pulling             {kubelet host.test.com}   pulling image "gcr.io/google_containers/kube-discovery-amd64:1.0"
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Pulled              {kubelet host.test.com}   Successfully pulled image "gcr.io/google_containers/kube-discovery-amd64:1.0"
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Created             {kubelet host.test.com}   Created container with docker id e455883be38a; Security:[seccomp=unconfined]
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Started             {kubelet host.test.com}   Started container with docker id e455883be38a
21s        21m         9         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Pulled              {kubelet host.test.com}   Container image "gcr.io/google_containers/kube-discovery-amd64:1.0" already present on machine
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Created             {kubelet host.test.com}   Created container with docker id 15aa023f8198; Security:[seccomp=unconfined]
21m        21m         1         kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Normal    Started             {kubelet host.test.com}   Started container with docker id 15aa023f8198
5s         21m         100       kube-discovery-1150918428-1ik91                                    Pod          spec.containers{kube-discovery}            Warning   BackOff             {kubelet host.test.com}   Back-off restarting failed docker container
21m        21m         2         kube-discovery-1150918428-1ik91                                    Pod                                                     Warning   FailedSync          {kubelet host.test.com}   Error syncing pod, skipping: failed to "StartContainer" for "kube-discovery" with CrashLoopBackOff: "Back-off 10s restarting failed container=kube-discovery pod=kube-discovery-1150918428-1ik91_kube-system(7ba022a8-b210-11e6-9dfd-008cfafea9f4)"

21m       21m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Created      {kubelet host.test.com}   Created container with docker id e242d2ba6a94; Security:[seccomp=unconfined]
21m       21m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Started      {kubelet host.test.com}   Started container with docker id e242d2ba6a94
21m       21m       2         kube-discovery-1150918428-1ik91   Pod                                         Warning   FailedSync   {kubelet host.test.com}   Error syncing pod, skipping: failed to "StartContainer" for "kube-discovery" with CrashLoopBackOff: "Back-off 20s restarting failed container=kube-discovery pod=kube-discovery-1150918428-1ik91_kube-system(7ba022a8-b210-11e6-9dfd-008cfafea9f4)"

20m       20m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Created      {kubelet host.test.com}   Created container with docker id 2ddd8167c48b; Security:[seccomp=unconfined]
20m       20m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Started      {kubelet host.test.com}   Started container with docker id 2ddd8167c48b
20m       20m       4         kube-discovery-1150918428-1ik91   Pod                                         Warning   FailedSync   {kubelet host.test.com}   Error syncing pod, skipping: failed to "StartContainer" for "kube-discovery" with CrashLoopBackOff: "Back-off 40s restarting failed container=kube-discovery pod=kube-discovery-1150918428-1ik91_kube-system(7ba022a8-b210-11e6-9dfd-008cfafea9f4)"

19m       19m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Created      {kubelet host.test.com}   Created container with docker id 6782291e11b2; Security:[seccomp=unconfined]
19m       19m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Started      {kubelet host.test.com}   Started container with docker id 6782291e11b2
18m       19m       7         kube-discovery-1150918428-1ik91   Pod                                         Warning   FailedSync   {kubelet host.test.com}   Error syncing pod, skipping: failed to "StartContainer" for "kube-discovery" with CrashLoopBackOff: "Back-off 1m20s restarting failed container=kube-discovery pod=kube-discovery-1150918428-1ik91_kube-system(7ba022a8-b210-11e6-9dfd-008cfafea9f4)"

18m       18m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Created      {kubelet host.test.com}   Created container with docker id 3a50f8b99f21; Security:[seccomp=unconfined]
18m       18m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Started      {kubelet host.test.com}   Started container with docker id 3a50f8b99f21
15m       18m       13        kube-discovery-1150918428-1ik91   Pod                                         Warning   FailedSync   {kubelet host.test.com}   Error syncing pod, skipping: failed to "StartContainer" for "kube-discovery" with CrashLoopBackOff: "Back-off 2m40s restarting failed container=kube-discovery pod=kube-discovery-1150918428-1ik91_kube-system(7ba022a8-b210-11e6-9dfd-008cfafea9f4)"

15m       15m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Created      {kubelet host.test.com}   Created container with docker id 07a98e7f4a72; Security:[seccomp=unconfined]
15m       15m       1         kube-discovery-1150918428-1ik91   Pod       spec.containers{kube-discovery}   Normal    Started      {kubelet host.test.com}   Started container with docker id 07a98e7f4a72
5s        15m       72        kube-discovery-1150918428-1ik91   Pod                                         Warning   FailedSync   {kubelet host.test.com}   Error syncing pod, skipping: failed to "StartContainer" for "kube-discovery" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-discovery pod=kube-discovery-1150918428-1ik91_kube-system(7ba022a8-b210-11e6-9dfd-008cfafea9f4)"

10m       10m       1         kube-discovery-1150918428-1ik91   Pod          spec.containers{kube-discovery}   Normal    Created             {kubelet host.test.com}   Created container with docker id 05bd8eefb9e8; Security:[seccomp=unconfined]
10m       10m       1         kube-discovery-1150918428-1ik91   Pod          spec.containers{kube-discovery}   Normal    Started             {kubelet host.test.com}   Started container with docker id 05bd8eefb9e8
5m        5m        1         kube-discovery-1150918428-1ik91   Pod          spec.containers{kube-discovery}   Normal    Created             {kubelet host.test.com}   Created container with docker id d44112fbafca; Security:[seccomp=unconfined]
5m        5m        1         kube-discovery-1150918428-1ik91   Pod          spec.containers{kube-discovery}   Normal    Started             {kubelet host.test.com}   Started container with docker id d44112fbafca
21s       21s       1         kube-discovery-1150918428-1ik91   Pod          spec.containers{kube-discovery}   Normal    Created             {kubelet host.test.com}   (events with common reason combined)
21s       21s       1         kube-discovery-1150918428-1ik91   Pod          spec.containers{kube-discovery}   Normal    Started             {kubelet host.test.com}   (events with common reason combined)
21m       21m       1         kube-discovery-1150918428         ReplicaSet                                     Normal    SuccessfulCreate    {replicaset-controller }                             Created pod: kube-discovery-1150918428-1ik91
21m       21m       1         kube-discovery                    Deployment                                     Normal    ScalingReplicaSet   {deployment-controller }                             Scaled up replica set kube-discovery-1150918428 to 1
10m       10m       1         kube-dns-654381707-y05e7          Pod                                            Normal    Scheduled           {default-scheduler }                                 Successfully assigned kube-dns-654381707-y05e7 to host.test.com
1s        10m       626       kube-dns-654381707-y05e7          Pod                                            Warning   FailedSync          {kubelet host.test.com}   Error syncing pod, skipping: failed to "SetupNetwork" for "kube-dns-654381707-y05e7_kube-system" with SetupNetworkError: "Failed to setup network for pod \"kube-dns-654381707-y05e7_kube-system(0a1fcd35-b212-11e6-9dfd-008cfafea9f4)\" using network plugins \"cni\": cni config unintialized; Skipping pod"

10m       10m       1         kube-dns-654381707                                        ReplicaSet                                     Normal    SuccessfulCreate    {replicaset-controller }                             Created pod: kube-dns-654381707-y05e7
10m       10m       1         kube-dns                                                  Deployment                                     Normal    ScalingReplicaSet   {deployment-controller }                             Scaled up replica set kube-dns-654381707 to 1
10m       10m       1         kube-proxy-h1snf                                          Pod          spec.containers{kube-proxy}       Normal    Pulling             {kubelet host.test.com}   pulling image "gcr.io/google_containers/kube-proxy-amd64:v1.4.4"
10m       10m       1         kube-proxy-h1snf                                          Pod          spec.containers{kube-proxy}       Normal    Pulled              {kubelet host.test.com}   Successfully pulled image "gcr.io/google_containers/kube-proxy-amd64:v1.4.4"
10m       10m       1         kube-proxy-h1snf                                          Pod          spec.containers{kube-proxy}       Normal    Created             {kubelet host.test.com}   Created container with docker id 1955d229e74e; Security:[seccomp=unconfined]
10m       10m       1         kube-proxy-h1snf                                          Pod          spec.containers{kube-proxy}       Normal    Started             {kubelet host.test.com}   Started container with docker id 1955d229e74e
10m       10m       1         kube-proxy                                                DaemonSet                                      Normal    SuccessfulCreate    {daemon-set }                                        Created pod: kube-proxy-h1snf
22m       22m       1         kube-scheduler-host.test.com   Pod          spec.containers{kube-scheduler}   Normal    Pulling             {kubelet host.test.com}   pulling image "gcr.io/google_containers/kube-scheduler-amd64:v1.4.4"
22m       22m       1         kube-scheduler-host.test.com   Pod          spec.containers{kube-scheduler}   Normal    Pulled              {kubelet host.test.com}   Successfully pulled image "gcr.io/google_containers/kube-scheduler-amd64:v1.4.4"
22m       22m       1         kube-scheduler-host.test.com   Pod          spec.containers{kube-scheduler}   Normal    Created             {kubelet host.test.com}   Created container with docker id 3ce9c0fa9876; Security:[seccomp=unconfined]
22m       22m       1         kube-scheduler-host.test.com   Pod          spec.containers{kube-scheduler}   Normal    Started             {kubelet host.test.com}   Started container with docker id 3ce9c0fa9876

Is there anything additional I should do to set up a cluster with the default config?
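For the record, the "cni config unintialized" error in the kube-dns events above usually means no CNI network configuration has been written yet. One way to check, assuming the kubelet's default CNI config directory:

```shell
# List the CNI config directory; an empty or missing directory explains
# the "cni config uninitialized" error for kube-dns.
ls -l /etc/cni/net.d/ 2>/dev/null || echo "no CNI config present"
```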

@luxas
Member

luxas commented Nov 24, 2016

cc @dgoodwin thoughts on the discovery part?

@dgoodwin

Presumably your secret exists:

(root@centos1 ~) $ kubectl get secret/clusterinfo --namespace kube-system -o json
{
    "kind": "Secret",
    "apiVersion": "v1",
    "metadata": {
        "name": "clusterinfo",
        "namespace": "kube-system",
        "selfLink": "/api/v1/namespaces/kube-system/secrets/clusterinfo",
        "uid": "056b9a94-b30e-11e6-b7f0-5254006cb728",
        "resourceVersion": "174",
        "creationTimestamp": "2016-11-25T12:52:28Z"
    },
    "data": {
        "ca.pem": "[snip]",
        "endpoint-list.json": "WyJodHRwczovLzE5Mi4xNjguMTIyLjE3Njo2NDQzIl0=",
        "token-map.json": "eyJjYmY1MDMiOiJkMjk0NTgxY2RiYzNhOTRhIn0="
    },
    "type": "Opaque"
}

That error looks like what you would see with selinux before we unconfined the discovery pod, but we did unconfine it, and this is Ubuntu which most likely isn't using selinux.

Also not great that the output claims kube-discovery is ready when it appears it actually is not. We might have some bad error handling there.

Regardless: if the secret exists on your cluster, could something in the OS be blocking it from being mounted into the container? I'd also try digging through the kubelet logs for information about that pod and secret.
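A sketch of that kind of digging, assuming a systemd-managed kubelet (the pod and secret names are the ones from this issue and may differ elsewhere):

```shell
# Search kubelet logs for activity related to the discovery pod and the
# clusterinfo secret mount, keeping only the most recent matches.
journalctl -u kubelet --no-pager \
  | grep -iE 'kube-discovery|clusterinfo|secret' \
  | tail -n 50
```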

@luxas luxas added kind/bug Categorizes issue or PR as related to a bug. kind/support Categorizes issue or PR as a support question. priority/backlog Higher priority than priority/awaiting-more-evidence. state/needs-more-information labels Nov 25, 2016
@akarasik
Author

It seems to be some kind of systemd issue (not sure why), caused by installing Docker with v2.9 of this docker cookbook.
Installing Docker via apt-get install docker-engine, or using the latest version of that cookbook, resolved both the hang and the kube-dns network issue.
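For anyone hitting the same thing, a rough sketch of switching from a cookbook-managed Docker to the apt package. The package name docker-engine matches Docker's 2016-era apt repo; adjust the removed package names for your setup:

```shell
# Remove the cookbook-installed Docker packages, install the apt package,
# then restart the affected services so kubelet picks up the new daemon.
apt-get remove -y docker docker.io
apt-get install -y docker-engine
systemctl restart docker
systemctl restart kubelet
```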

@luxas
Member

luxas commented Nov 30, 2016

Ok, cool to know, thanks!

Which version did you use before and which docker version do you use now?

@akarasik
Author

The Docker version was the same the whole time (I tried both 1.11.2, which is the recommended version, and 1.12.3).
What made the difference was the docker cookbook version.
Previously I used v2.9.8 of the cookbook (which failed to start the discovery pod); the latest, v2.13.0, was able to start it.

@luxas
Member

luxas commented Nov 30, 2016

Might it be something with the storage driver?
Have you used overlay2 or...?
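For reference, one way to check which storage driver the daemon is actually using (assumes a running Docker daemon):

```shell
# Print the storage driver currently in use by the Docker daemon.
docker info 2>/dev/null | grep -i 'storage driver'
```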

@akarasik
Author

akarasik commented Dec 1, 2016

No, I didn't use it.

@luxas
Member

luxas commented May 29, 2017

Fixed with v1.6
