
Problem during driver installation: "path /var/lib/kubelet is mounted on / but it is not a shared mount" #335

Closed
ptitvert opened this issue Aug 10, 2021 · 5 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ptitvert

What happened:

We are trying to deploy this driver in our k8s environment, but it does not work.
We get the following error:

  Normal   Created  59m (x156 over 14h)   kubelet  Created container smb
  Warning  Failed   59m (x156 over 14h)   kubelet  Error: failed to start container "smb": Error response from daemon: path /var/lib/kubelet is mounted on / but it is not a shared mount
  Warning  BackOff  33m (x3669 over 14h)  kubelet  Back-off restarting failed container
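
For reference, the propagation of the mount backing /var/lib/kubelet can be checked directly on a node with something like the following (a sketch; since the error says the path is mounted on /, the result reflects the root filesystem):

# Show the mount that contains /var/lib/kubelet and its propagation flag.
# The error above implies it reports something other than "shared" on these nodes.
findmnt -o TARGET,PROPAGATION --target /var/lib/kubelet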

What you expected to happen:

I would expect the container smb to be started, as stated in the installation guide.

How to reproduce it:

We have followed the installation guide:

https://github.com/kubernetes-csi/csi-driver-smb/blob/master/docs/install-csi-driver-v1.2.0.m

Then:

kubectl -n kube-system get pod -o wide --watch -l app=csi-smb-controller

The output:

NAME                                 READY   STATUS    RESTARTS   AGE   IP           NODE                                   NOMINATED NODE   READINESS GATES
csi-smb-controller-c74858679-474m5   3/3     Running   0          98m   1.2.3.4   6c63deaf-6720-4b93-86f7-578e2f41021c   <none>           <none>
csi-smb-controller-c74858679-btbwd   3/3     Running   0          91m   1.2.3.4   dce37ac7-bfd9-4996-9c8d-a31002fec280   <none>           <none>

Now the node pods:

kubectl -n kube-system get pod -o wide --watch -l app=csi-smb-node

And the output:

NAME                 READY   STATUS              RESTARTS   AGE   IP           NODE                                   NOMINATED NODE   READINESS GATES
csi-smb-node-4tfgk   2/3     CrashLoopBackOff    172        14h   1.2.3.10    6c63deaf-6720-4b93-86f7-578e2f41021c   <none>           <none>
csi-smb-node-7zp6r   2/3     CrashLoopBackOff    172        14h   1.2.3.11    dce37ac7-bfd9-4996-9c8d-a31002fec280   <none>           <none>
csi-smb-node-8gzwl   2/3     RunContainerError   155        14h   1.2.3.12   0c743158-5d9d-404f-94b4-372146e8a5e8   <none>           <none>
csi-smb-node-dn9pl   2/3     CrashLoopBackOff    153        14h   1.2.3.13   aa3ffe51-7a2e-4ee7-a6b3-f69002407f89   <none>           <none>
csi-smb-node-dz5vc   2/3     CrashLoopBackOff    162        14h   1.2.3.14    7df31584-42ae-4f4f-a146-fb597baab05d   <none>           <none>
csi-smb-node-tl7zb   2/3     CrashLoopBackOff    170        14h   1.2.3.15    1a9e7673-b7f8-472d-a002-c6ed88bbca86   <none>           <none>
csi-smb-node-w5cgc   2/3     CrashLoopBackOff    157        14h   1.2.3.16    12eda9df-bb3e-40ed-b6bd-356dc476694a   <none>           <none>
csi-smb-node-xj7zz   2/3     CrashLoopBackOff    165        14h   1.2.3.17   d0c628ce-0e92-48cd-b14c-09779dfd0333   <none>           <none>

And if I check one of the crashed pods:

Name:                 csi-smb-node-7zp6r
Namespace:            output-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 dce37ac7-bfd9-4996-9c8d-a31002fec280/1.2.3.10
Start Time:           Mon, 09 Aug 2021 23:22:01 +0200
Labels:               app=csi-smb-node
                      controller-revision-hash=7fb95cd7bd
                      pod-template-generation=2
Annotations:          kubernetes.io/psp: cnbb-privileged
Status:               Running
IP:                   1.2.3.10
IPs:
  IP:           1.2.3.10
Controlled By:  DaemonSet/csi-smb-node
Containers:
  liveness-probe:
    Container ID:  docker://d09a153149749650fa43a395dc52aaa827682adc2154a0955c835179bc59d273
    Image:         k8s-grc-docker-remote.artifactory.example.com/sig-storage/livenessprobe:v2.3.0
    Image ID:      docker-pullable://k8s-grc-docker-remote.artifactory.example.com/sig-storage/livenessprobe@sha256:1b7c978a792a8fa4e96244e8059bd71bb49b07e2e5a897fb0c867bdc6db20d5d
    Port:          <none>
    Host Port:     <none>
    Args:
      --csi-address=/csi/csi.sock
      --probe-timeout=3s
      --health-port=29643
      --v=2
    State:          Running
      Started:      Mon, 09 Aug 2021 23:22:03 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:        10m
      memory:     20Mi
    Environment:  <none>
    Mounts:
      /csi from socket-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-smb-controller-sa-token-szcqh (ro)
  node-driver-registrar:
    Container ID:  docker://38eb4a83553a2783870bcc9ca4b84a68a80b4475c63f5613f1a3a739059cce33
    Image:         k8s-grc-docker-remote.artifactory.example.com/sig-storage/csi-node-driver-registrar:v2.2.0
    Image ID:      docker-pullable://k8s-grc-docker-remote.artifactory.example.com/sig-storage/csi-node-driver-registrar@sha256:2dee3fe5fe861bb66c3a4ac51114f3447a4cd35870e0f2e2b558c7a400d89589
    Port:          <none>
    Host Port:     <none>
    Args:
      --csi-address=$(ADDRESS)
      --kubelet-registration-path=$(DRIVER_REG_SOCK_PATH)
      --v=2
    State:          Running
      Started:      Mon, 09 Aug 2021 23:22:05 +0200
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:     10m
      memory:  20Mi
    Environment:
      ADDRESS:               /csi/csi.sock
      DRIVER_REG_SOCK_PATH:  /var/lib/kubelet/plugins/smb.csi.k8s.io/csi.sock
    Mounts:
      /csi from socket-dir (rw)
      /registration from registration-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-smb-controller-sa-token-szcqh (ro)
  smb:
    Container ID:  docker://95e22c0234770dd7a6e080e42fd189273650306ac1c55663f0c8dfcc5ad162b8
    Image:         artifactory-mirror.example.com/k8s/csi/smb-csi:v1.2.0
    Image ID:      docker-pullable://artifactory-mirror.example.com/k8s/csi/smb-csi@sha256:dedf9b4fbf860e0933210583ee4b6b41b0c2c551bf296370873689ee60df2644
    Port:          29643/TCP
    Host Port:     29643/TCP
    Args:
      --v=5
      --endpoint=$(CSI_ENDPOINT)
      --nodeid=$(KUBE_NODE_NAME)
      --metrics-address=0.0.0.0:29645
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       ContainerCannotRun
      Message:      path /var/lib/kubelet is mounted on / but it is not a shared mount
      Exit Code:    128
      Started:      Tue, 10 Aug 2021 13:42:36 +0200
      Finished:     Tue, 10 Aug 2021 13:42:36 +0200
    Ready:          False
    Restart Count:  173
    Limits:
      cpu:     400m
      memory:  200Mi
    Requests:
      cpu:     10m
      memory:  20Mi
    Liveness:  http-get http://:healthz/healthz delay=30s timeout=10s period=30s #success=1 #failure=5
    Environment:
      CSI_ENDPOINT:    unix:///csi/csi.sock
      KUBE_NODE_NAME:   (v1:spec.nodeName)
    Mounts:
      /csi from socket-dir (rw)
      /var/lib/kubelet/ from mountpoint-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from csi-smb-controller-sa-token-szcqh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  socket-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins/smb.csi.k8s.io
    HostPathType:  DirectoryOrCreate
  mountpoint-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/
    HostPathType:  DirectoryOrCreate
  registration-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/kubelet/plugins_registry/
    HostPathType:  DirectoryOrCreate
  csi-smb-controller-sa-token-szcqh:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  csi-smb-controller-sa-token-szcqh
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable:NoExecute
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Normal   Pulled   21m (x169 over 14h)    kubelet  Container image "artifactory-mirror.example.com/k8s/csi/smb-csi:v1.2.0" already present on machine
  Warning  BackOff  116s (x3994 over 14h)  kubelet  Back-off restarting failed container

Anything else we need to know?:

We are using VMware PKS for the Kubernetes cluster.

Environment:

  • CSI Driver version: V1.2
  • Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.12", GitCommit:"7cd5e9086de8ae25d6a1514d0c87bac67ca4a481", GitTreeState:"clean", BuildDate:"2020-11-12T09:18:55Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.9+vmware.1", GitCommit:"f856d899461199c512c21d0fdc67d49cc70a8963", GitTreeState:"clean", BuildDate:"2021-03-19T23:57:11Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

  • OS (e.g. from /etc/os-release): Ubuntu 16.04 LTS
  • Kernel (e.g. uname -a):

Linux docbase-deployment-8584bc6979-9fpbb 4.15.0-142-generic #146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

@andyzhangx
Member

Could you add /bin/mount --make-shared /var/lib/kubelet to your agent node config? That should solve the issue.

The full kubelet configuration on AKS looks like this:

[Unit]
Description=Kubelet
ConditionPathExists=/usr/local/bin/kubelet


[Service]
Restart=always
EnvironmentFile=/etc/default/kubelet
SuccessExitStatus=143
ExecStartPre=/bin/bash /opt/azure/containers/kubelet.sh
ExecStartPre=/bin/mkdir -p /var/lib/kubelet
ExecStartPre=/bin/mkdir -p /var/lib/cni
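# The next two lines are what prevent the "not a shared mount" error:
# bind-mounting /var/lib/kubelet onto itself makes it its own mount point,
# so that --make-shared applies to it rather than to the root filesystem.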
ExecStartPre=/bin/bash -c "if [ $(mount | grep \"/var/lib/kubelet\" | wc -l) -le 0 ] ; then /bin/mount --bind /var/lib/kubelet /var/lib/kubelet ; fi"
ExecStartPre=/bin/mount --make-shared /var/lib/kubelet

ExecStartPre=-/sbin/ebtables -t nat --list
ExecStartPre=-/sbin/iptables -t nat --numeric --list

ExecStart=/usr/local/bin/kubelet \
        --enable-server \
        --node-labels="${KUBELET_NODE_LABELS}" \
        --v=2 --container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock \
        --volume-plugin-dir=/etc/kubernetes/volumeplugins \
        --kubeconfig /var/lib/kubelet/kubeconfig \
        --bootstrap-kubeconfig /var/lib/kubelet/bootstrap-kubeconfig \
        $KUBELET_FLAGS \
        $KUBELET_REGISTER_NODE $KUBELET_REGISTER_WITH_TAINTS

[Install]
WantedBy=multi-user.target
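If you just want to verify or apply this on a running node first, something along these lines should work (a sketch, assuming root access on the agent node; the change only lasts until the next reboot, so the kubelet unit still needs the ExecStartPre lines above):

# Check the propagation of the mount backing /var/lib/kubelet ("shared" is required).
findmnt -o TARGET,PROPAGATION --target /var/lib/kubelet

# Make /var/lib/kubelet its own mount point, then mark it as shared.
mount --bind /var/lib/kubelet /var/lib/kubelet
mount --make-shared /var/lib/kubelet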

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 9, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 9, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
