coredns is always pending after apply flannel.yml #1906

Closed
oneslideicywater opened this issue Nov 11, 2019 · 5 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@oneslideicywater

What keywords did you search in kubeadm issues before filing this one?

I have viewed issue #1178; it didn't work for me.

Is this a BUG REPORT or FEATURE REQUEST?

BUG REPORT

Versions

kubeadm version (use kubeadm version):
kubeadm version: &version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:15:39Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
    private computer
  • OS (e.g. from /etc/os-release):
    CentOS-7
  • Kernel (e.g. uname -a):
Linux k8s-master 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

What happened?

CoreDNS is Pending after applying flannel.yml and modifying the CNI version.

What you expected to happen?

CoreDNS is Running.

How to reproduce it (as minimally and precisely as possible)?

With this init.yaml used for kubeadm init:

apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.0.148
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k8s-master
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
networking:
  dnsDomain: cluster.local
  podSubnet: 10.244.0.0/16
scheduler: {}
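
For reference, a minimal sketch of the init invocation, assuming the config above is saved as init.yaml (the filename is an assumption):

# hypothetical filename: the InitConfiguration/ClusterConfiguration above saved as init.yaml
kubeadm init --config=init.yaml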

and then apply flannel with:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml
# after finding issue #1178, modify the CNI version and apply the changes
kubectl apply -f kube-flannel.yml
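
For reference, a sketch of checks after applying (the DaemonSet name is taken from the pod names shown later; exact names may differ):

# confirm the flannel DaemonSet rolled out and its pods reached Running
kubectl -n kube-system get ds kube-flannel-ds-amd64
kubectl -n kube-system get pods -o wide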

Anything else we need to know?

I reset kubeadm several times, ran rm -rf $HOME/.kube after each reset, and then ran kubeadm init --config=xx.yaml again.
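
Roughly, each cycle looked like the sketch below (the -f flag and the init.yaml filename are assumptions; the actual config is the one shown above):

kubeadm reset -f
rm -rf $HOME/.kube
kubeadm init --config=init.yaml   # init.yaml: the config shown above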

I haven't joined any node to the cluster yet. My node info:

Name:               k8s-master
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=k8s-master
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Mon, 11 Nov 2019 08:15:29 -0500
Taints:             node.kubernetes.io/not-ready:NoExecute
                    node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 11 Nov 2019 08:45:30 -0500   Mon, 11 Nov 2019 08:15:26 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 11 Nov 2019 08:45:30 -0500   Mon, 11 Nov 2019 08:15:26 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 11 Nov 2019 08:45:30 -0500   Mon, 11 Nov 2019 08:15:26 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Mon, 11 Nov 2019 08:45:30 -0500   Mon, 11 Nov 2019 08:15:26 -0500   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:
  InternalIP:  192.168.0.148
  Hostname:    k8s-master
Capacity:
 cpu:                2
 ephemeral-storage:  17394Mi
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             2044472Ki
 pods:               110
Allocatable:
 cpu:                2
 ephemeral-storage:  16415037823
 hugepages-1Gi:      0
 hugepages-2Mi:      0
 memory:             1942072Ki
 pods:               110
System Info:
 Machine ID:                 d682bcf55261444ba9fdbc6b6c24b003
 System UUID:                F32D4D56-E4DC-839B-5A27-E139BFD484F3
 Boot ID:                    c97afef0-66da-445e-85e8-f4acf957c9bd
 Kernel Version:             3.10.0-957.el7.x86_64
 OS Image:                   CentOS Linux 7 (Core)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.3.0
 Kubelet Version:            v1.16.2
 Kube-Proxy Version:         v1.16.2
PodCIDR:                     10.244.0.0/24
PodCIDRs:                    10.244.0.0/24
Non-terminated Pods:         (6 in total)
  Namespace                  Name                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                  ----                                  ------------  ----------  ---------------  -------------  ---
  kube-system                etcd-k8s-master                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         29m
  kube-system                kube-apiserver-k8s-master             250m (12%)    0 (0%)      0 (0%)           0 (0%)         29m
  kube-system                kube-controller-manager-k8s-master    200m (10%)    0 (0%)      0 (0%)           0 (0%)         29m
  kube-system                kube-flannel-ds-amd64-lkclv           100m (5%)     100m (5%)   50Mi (2%)        50Mi (2%)      26m
  kube-system                kube-proxy-cz2hd                      0 (0%)        0 (0%)      0 (0%)           0 (0%)         30m
  kube-system                kube-scheduler-k8s-master             100m (5%)     0 (0%)      0 (0%)           0 (0%)         29m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                650m (32%)  100m (5%)
  memory             50Mi (2%)   50Mi (2%)
  ephemeral-storage  0 (0%)      0 (0%)
Events:
  Type    Reason                   Age                From                    Message
  ----    ------                   ----               ----                    -------
  Normal  Starting                 30m                kubelet, k8s-master     Starting kubelet.
  Normal  NodeHasSufficientMemory  30m (x8 over 30m)  kubelet, k8s-master     Node k8s-master status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    30m (x8 over 30m)  kubelet, k8s-master     Node k8s-master status is now: NodeHasNoDiskPressure
  Normal  NodeHasSufficientPID     30m (x7 over 30m)  kubelet, k8s-master     Node k8s-master status is now: NodeHasSufficientPID
  Normal  NodeAllocatableEnforced  30m                kubelet, k8s-master     Updated Node Allocatable limit across pods
  Normal  Starting                 30m                kube-proxy, k8s-master  Starting kube-proxy.

@neolit123
Member

neolit123 commented Nov 11, 2019

/triage support

try another CNI plugin, please.
flannel is not as well maintained as Calico or WeaveNet.

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network

if it still does not work:

are you getting any logs from the coredns pods?
any logs from the kubelet?
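
For reference, a sketch of the commands to gather those logs (the pod name is only an example; use kubectl get pods to find the real ones):

# list the coredns pods, then fetch logs from one of them
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system logs coredns-58cc8c89f4-2z55k
# kubelet logs on the node (systemd hosts)
journalctl -u kubelet --no-pager | tail -n 50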

@k8s-ci-robot added the kind/support label Nov 11, 2019
@oneslideicywater
Author

@neolit123

I don't want alternatives; just provide some workaround for that, bro.

I don't have any logs from the pods that run CoreDNS, but here is some more information from digging into the docs a bit:

[root@localhost ~]# kubectl describe pod coredns-58cc8c89f4-5kdx5 -n kube-system
Name:                 coredns-58cc8c89f4-5kdx5
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 <none>
Labels:               k8s-app=kube-dns
                      pod-template-hash=58cc8c89f4
Annotations:          <none>
Status:               Pending
IP:                   
IPs:                  <none>
Controlled By:        ReplicaSet/coredns-58cc8c89f4
Containers:
  coredns:
    Image:       registry.aliyuncs.com/google_containers/coredns:1.6.2
    Ports:       53/UDP, 53/TCP, 9153/TCP
    Host Ports:  0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-zwfww (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-zwfww:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-zwfww
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/1 nodes are available: 1 node(s) had taints that the pod didn't tolerate.

and

[root@localhost ~]# kubectl get pod  coredns-58cc8c89f4-2z55k  -o yaml -n kube-system
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2019-11-11T13:15:48Z"
  generateName: coredns-58cc8c89f4-
  labels:
    k8s-app: kube-dns
    pod-template-hash: 58cc8c89f4
  name: coredns-58cc8c89f4-2z55k
  namespace: kube-system
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: coredns-58cc8c89f4
    uid: 15d09292-516a-4bb4-be61-a61501d3b4dd
  resourceVersion: "360"
  selfLink: /api/v1/namespaces/kube-system/pods/coredns-58cc8c89f4-2z55k
  uid: 52aa274b-961f-4d34-a021-633b45418015
spec:
  containers:
  - args:
    - -conf
    - /etc/coredns/Corefile
    image: registry.aliyuncs.com/google_containers/coredns:1.6.2
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 5
      httpGet:
        path: /health
        port: 8080
        scheme: HTTP
      initialDelaySeconds: 60
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 5
    name: coredns
    ports:
    - containerPort: 53
      name: dns
      protocol: UDP
    - containerPort: 53
      name: dns-tcp
      protocol: TCP
    - containerPort: 9153
      name: metrics
      protocol: TCP
    readinessProbe:
      failureThreshold: 3
      httpGet:
        path: /ready
        port: 8181
        scheme: HTTP
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        memory: 170Mi
      requests:
        cpu: 100m
        memory: 70Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_BIND_SERVICE
        drop:
        - all
      readOnlyRootFilesystem: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/coredns
      name: config-volume
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: coredns-token-zwfww
      readOnly: true
  dnsPolicy: Default
  enableServiceLinks: true
  nodeSelector:
    beta.kubernetes.io/os: linux
  priority: 2000000000
  priorityClassName: system-cluster-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: coredns
  serviceAccountName: coredns
  terminationGracePeriodSeconds: 30
  tolerations:
  - key: CriticalAddonsOnly
    operator: Exists
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      items:
      - key: Corefile
        path: Corefile
      name: coredns
    name: config-volume
  - name: coredns-token-zwfww
    secret:
      defaultMode: 420
      secretName: coredns-token-zwfww
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-11-11T13:15:48Z"
    message: '0/1 nodes are available: 1 node(s) had taints that the pod didn''t tolerate.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable

but I got nothing from

kubectl logs  coredns-58cc8c89f4-2z55k -n kube-system

for either coredns pod.

[root@localhost ~]# kubectl get pods --all-namespaces
NAMESPACE     NAME                                 READY   STATUS                  RESTARTS   AGE
kube-system   coredns-58cc8c89f4-2z55k             0/1     Pending                 0          12h
kube-system   coredns-58cc8c89f4-5kdx5             0/1     Pending                 0          12h
kube-system   etcd-k8s-master                      1/1     Running                 0          12h
kube-system   kube-apiserver-k8s-master            1/1     Running                 0          12h
kube-system   kube-controller-manager-k8s-master   1/1     Running                 0          12h
kube-system   kube-flannel-ds-amd64-lkclv          0/1     Init:ImagePullBackOff   0          12h
kube-system   kube-proxy-cz2hd                     1/1     Running                 0          12h
kube-system   kube-scheduler-k8s-master            1/1     Running                 0          12h

I wonder

  • whether coredns is not allowed to run on a master?
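
For reference: the pod above does tolerate node-role.kubernetes.io/master:NoSchedule, but the node also carries node.kubernetes.io/not-ready:NoSchedule, which the pod has no toleration for, so the Pending state looks like a consequence of the node being NotReady (CNI not initialized) rather than a master-only restriction. A sketch of how to compare the two:

# compare the node's taints with the pod's tolerations
kubectl describe node k8s-master | grep -A3 Taints
kubectl -n kube-system get pod coredns-58cc8c89f4-5kdx5 -o jsonpath='{.spec.tolerations}'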

@oneslideicywater
Author

I got more information about my situation:

[root@localhost ~]# journalctl -fu kubelet
-- Logs begin at Sun 2019-11-10 14:25:12 EST. --
Nov 11 20:38:42 k8s-master kubelet[27869]: E1111 20:38:42.896019   27869 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Nov 11 20:38:46 k8s-master kubelet[27869]: W1111 20:38:46.496271   27869 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Nov 11 20:38:47 k8s-master kubelet[27869]: E1111 20:38:47.897181   27869 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Nov 11 20:38:48 k8s-master kubelet[27869]: E1111 20:38:48.199334   27869 pod_workers.go:191] Error syncing pod 0a620b01-49cd-498f-8f3f-70a6ed042484 ("kube-flannel-ds-amd64-lkclv_kube-system(0a620b01-49cd-498f-8f3f-70a6ed042484)"), skipping: failed to "StartContainer" for "install-cni" with ImagePullBackOff: "Back-off pulling image \"quay.io/coreos/flannel:v0.11.0-amd64\""
Nov 11 20:38:51 k8s-master kubelet[27869]: W1111 20:38:51.496883   27869 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Nov 11 20:38:52 k8s-master kubelet[27869]: E1111 20:38:52.899501   27869 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Nov 11 20:38:56 k8s-master kubelet[27869]: W1111 20:38:56.497534   27869 cni.go:237] Unable to update cni config: no networks found in /etc/cni/net.d
Nov 11 20:38:57 k8s-master kubelet[27869]: E1111 20:38:57.903069   27869 kubelet.go:2187] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Nov 11 20:39:01 k8s-master kubelet[27869]: E1111 20:39:01.200085   27869 pod_workers.go:191] Error syncing pod 0a620b01-49cd-498f-8f3f-70a6ed042484 ("kube-flannel-ds-amd64-lkclv_kube-system(0a620b01-49cd-498f-8f3f-70a6ed042484)"), skipping: failed to "StartContainer" for "install-cni" with ImagePullBackOff: "Back-off pulling image \"quay.io/coreos/flannel:v0.11.0-amd64\""
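
For reference, a sketch of checks on the node to confirm whether the flannel image can be pulled at all (docker is the runtime in this setup):

# is the image already present, and does a manual pull work?
docker images | grep flannel
docker pull quay.io/coreos/flannel:v0.11.0-amd64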

@oneslideicywater
Author

My flannel config file used with kubectl apply -f: I downloaded it and modified it. Here's the snippet:

data:
  cni-conf.json: |
    {
      "cniVersion": "0.3.1",
      "name": "cbr0",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
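
For reference, flannel's install-cni init container is what writes this config into /etc/cni/net.d on the node, so while the image pull keeps failing the directory stays empty. A sketch of the check (the conflist filename is a typical default, not confirmed here):

# on the node: has the CNI config actually been installed?
ls -l /etc/cni/net.d/
cat /etc/cni/net.d/10-flannel.conflist   # typical filename; an assumption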

@neolit123
Member

Nov 11 20:39:01 k8s-master kubelet[27869]: E1111 20:39:01.200085 27869 pod_workers.go:191] Error syncing pod 0a620b01-49cd-498f-8f3f-70a6ed042484 ("kube-flannel-ds-amd64-lkclv_kube-system(0a620b01-49cd-498f-8f3f-70a6ed042484)"), skipping: failed to "StartContainer" for "install-cni" with ImagePullBackOff: "Back-off pulling image "quay.io/coreos/flannel:v0.11.0-amd64""

looks like the flannel image cannot be pulled.
you can try downloading the image and uploading it to a registry that the kubelet can pull from - e.g. a local one.
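
For reference, a sketch of one way to do that with docker save/load, assuming a machine that can reach quay.io (paths are examples):

# on a machine with internet access
docker pull quay.io/coreos/flannel:v0.11.0-amd64
docker save quay.io/coreos/flannel:v0.11.0-amd64 -o flannel.tar
# copy flannel.tar to the node, then load it into the local docker cache
docker load -i flannel.tar
# alternatively, retag and push to a local registry the kubelet can reach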

i'm sorry but this is not a kubeadm issue.
