importer pod Permission denied for Block storage #2638

Closed
yuntianfeijing opened this issue Mar 14, 2023 · 24 comments

@yuntianfeijing

yuntianfeijing commented Mar 14, 2023

What happened:
When I create a PVC like this:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/storage.import.certConfigMap: ca.crt
    cdi.kubevirt.io/storage.import.endpoint: https://xxx/debian-10.12.3-20220518-qga-amd64.qcow2
  name: img-debian-test
  namespace: kubevirt
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ceph-rbd
  volumeMode: Block

the importer pod fails with this error:

I0314 06:05:45.792984       1 importer.go:103] Starting importer
E0314 06:05:45.794803       1 importer.go:129] exit status 1, blockdev: cannot open /dev/cdi-block-volume: Permission denied

kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceBlock
pkg/util/util.go:139
kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceByVolumeMode
pkg/util/util.go:106
main.main
cmd/cdi-importer/importer.go:127
runtime.main
GOROOT/src/runtime/proc.go:250
runtime.goexit
GOROOT/src/runtime/asm_amd64.s:1571

What you expected to happen:
I think this is caused by the securityContext setting runAsNonRoot: true and runAsUser: 107.

How to reproduce it (as minimally and precisely as possible):
Steps to reproduce the behavior.

Additional context:
Add any other context about the problem here.

Environment:

  • CDI version (use kubectl get deployments cdi-deployment -o yaml): v1.56.0
  • Kubernetes version (use kubectl version): N/A
  • DV specification: v1.56.0
  • Cloud provider or hardware configuration: N/A
  • OS (e.g. from /etc/os-release): N/A
  • Kernel (e.g. uname -a): N/A
  • Install tools: N/A
  • Others: N/A
@alromeros
Collaborator

Hello @yuntianfeijing, thanks for opening this issue.

I think you are right in pointing out runAsNonRoot as the cause of your problem since some users have had similar issues. We have some documentation that helps in these cases (please, check #2458) and explains how to configure the CRI to properly set permissions on the device.

You can also check https://kubernetes.io/blog/2021/11/09/non-root-containers-and-devices/ for more context.

Let us know if that helps!

@caijian76

How can I configure device_ownership_from_security_context = true in Docker?

@alromeros
Collaborator

Hi @caijian76,

That really depends on your container runtime, probably containerd since you are using Docker.

You can check https://github.com/containerd/containerd/blob/main/docs/PLUGINS.md for more information about plugins in containerd. Remember, you need to add:

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    device_ownership_from_security_context = true

@caijian76

caijian76 commented Mar 15, 2023

Hi @alromeros
I edited /etc/containerd/config.toml and added:

version = 2
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    device_ownership_from_security_context = true

Then I ran:

systemctl restart containerd.service

but it still shows Permission denied.

@alromeros
Collaborator

Is your error the same as the one reported in this issue? (some information about your case and how you reproduce it could be helpful). If you do kubectl get nodes -o wide, which container runtime appears in your nodes?
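
The runtime check asked for above can be scripted. This is a minimal sketch, assuming kubectl is already configured for the cluster; `runtime_name` and `list_node_runtimes` are hypothetical helper names, not part of CDI or kubectl.

```shell
# Hypothetical helper: maps a node's containerRuntimeVersion string
# (for example "containerd://1.6.14") to the runtime name before "://".
runtime_name() {
  printf '%s\n' "${1%%://*}"
}

# Prints "<node name><TAB><runtime version>" for every node.
# Assumes kubectl can reach the cluster.
list_node_runtimes() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.nodeInfo.containerRuntimeVersion}{"\n"}{end}'
}
```

For example, `runtime_name "containerd://1.6.14"` prints `containerd`. A node reporting `docker://...` is not going through containerd's CRI plugin, which matters for the setting discussed in this thread.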

@moshuipan

moshuipan commented Mar 16, 2023

If your kubelet is using dockershim, this flag is not supported. containerd applies the flag in its CRI server, which Docker does not call.

@caijian76

Thanks, all. I recommend that the development team document the supported operating environment in the manual to avoid confusion.

@alromeros
Collaborator

Hi @caijian76, documenting all non-CRI compliant container runtimes might be out of scope for CDI, but I'll discuss with the team to see if we can improve that documentation.

@dimm0

dimm0 commented Mar 18, 2023

Why not use securityContext/fsGroup parameter to chown the volume?

@alromeros
Collaborator

It's not recommended, for reasons that may or may not be relevant to your case (see https://kubernetes.io/blog/2021/11/09/non-root-containers-and-devices/ for more information; the example there references issues with GPUs, not block devices). Still, I guess it's possible to work around this with fsGroup as you mentioned.

@awels
Member

awels commented Mar 20, 2023

So fsGroup does not apply to block devices. We do in fact set fsGroup and the appropriate securityContext for the pods. But the fsGroup doesn't apply to block devices as noted in the blog post.

@alromeros
Collaborator

Right, @awels thanks for the clarification!

@dimm0

dimm0 commented Mar 21, 2023

Are you suggesting changing the default CRI settings to make CDI work?
I still don't get it. Why not simply enable fsGroup? It works fine for all Ceph block devices in jupyterhub in our cluster.

@awels
Member

awels commented Mar 21, 2023

fsGroup like the name suggests applies to file systems. It does NOT apply to block devices. Also we set fsGroup as noted by the original poster.

@dimm0

dimm0 commented Mar 21, 2023

> fsGroup like the name suggests applies to file systems. It does NOT apply to block devices.

It perfectly applies to block devices in jupyterhub running in my cluster.

> Also we set fsGroup as noted by the original poster.

In which version? I just upgraded from 1.52.2 to 1.56.0, and I still don't see it in my securityContext:

    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      runAsNonRoot: true
      runAsUser: 107

@dimm0

dimm0 commented Mar 21, 2023

I see now: it works when setting volumeMode: Filesystem for a Ceph block volume.

@awels
Member

awels commented Mar 22, 2023

When I say block device, I mean volumeMode: Block on the PVC. All storage is ultimately on some block device; it just has a file system on it. I looked up jupyterhub and it appears to use only filesystem volume modes.

@a180285

a180285 commented Mar 29, 2023

Hello, I have the same issue and am blocked by it.

Here is my error:

I0329 13:30:49.456092       1 importer.go:103] Starting importer
E0329 13:30:49.458047       1 importer.go:129] exit status 1, blockdev: cannot open /dev/cdi-block-volume: Permission denied

kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceBlock
	pkg/util/util.go:139
kubevirt.io/containerized-data-importer/pkg/util.GetAvailableSpaceByVolumeMode
	pkg/util/util.go:106
main.main
	cmd/cdi-importer/importer.go:127
runtime.main
	GOROOT/src/runtime/proc.go:250
runtime.goexit
	GOROOT/src/runtime/asm_amd64.s:1571

@a180285

a180285 commented Mar 29, 2023

And here is the securityContext from my importer pod:

      securityContext:
        capabilities:
          drop:
            - ALL
        runAsUser: 107
        runAsNonRoot: true
        allowPrivilegeEscalation: false
        seccompProfile:
          type: RuntimeDefault

@awels
Member

awels commented Mar 29, 2023

Which CRI are you using? containerd? Also, which version of CDI?

@a180285

a180285 commented Mar 29, 2023

Hi @awels, thanks for your quick reply.

Here is my env info:

OS: linux (amd64)
OS Image: Ubuntu 22.04.1 LTS
Kernel version: 5.15.0-43-generic
Container runtime: containerd://1.6.14
Kubelet version: v1.25.2

CDI status:
status:
  observedVersion: v1.56.0
  operatorVersion: v1.56.0
  phase: Deployed
  targetVersion: v1.56.0

@awels
Member

awels commented Mar 29, 2023

Please read this article: https://kubernetes.io/blog/2021/11/09/non-root-containers-and-devices/. It is highly likely that this is your problem. Essentially, there is no fsGroup mechanism for block devices, and since 1.55.0 CDI runs rootless containers for its workloads. The article describes a workaround for various CRIs, including containerd, which is what you are using.
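
One way to check this diagnosis is to compare the numeric owner of the device node with the UID the container runs as (107 in the securityContext posted above). The following is only a sketch: it assumes you can get a shell in a container that has the same block volume attached and that GNU coreutils `stat` is available there. `device_owner` and `owned_by_uid` are hypothetical helper names, not part of CDI.

```shell
# Hypothetical helper: prints the numeric "uid:gid" owner of a path.
# Assumes GNU coreutils stat (the -c option).
device_owner() {
  stat -c '%u:%g' "$1"
}

# Returns 0 if the path is owned by the given numeric UID, non-zero otherwise.
owned_by_uid() {
  [ "$(stat -c '%u' "$1")" = "$2" ]
}
```

For example, `owned_by_uid /dev/cdi-block-volume 107` succeeding would suggest the runtime applied the pod's runAsUser to the device. An owner other than the container's UID is consistent with the Permission denied error, although the mode bits and group membership also affect access.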

@a180285

a180285 commented Mar 30, 2023

Hi @awels, adding the following plugin setting works for me. Thanks again.

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    device_ownership_from_security_context = true

For anyone who hits the same issue later:
The config path is /etc/containerd/config.toml.
Then restart the containerd service: sudo systemctl restart containerd
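
Before restarting containerd, it can help to confirm that the setting is actually present in the file. This is a rough sketch: `has_device_ownership_flag` is a hypothetical helper, and it does a plain text match rather than parsing TOML, so it does not verify that the key sits under the [plugins."io.containerd.grpc.v1.cri"] table.

```shell
# Hypothetical helper: returns 0 if the file has an uncommented line that
# sets device_ownership_from_security_context to true.
# Plain text match only; the enclosing TOML table is not checked.
has_device_ownership_flag() {
  grep -Eq '^[[:space:]]*device_ownership_from_security_context[[:space:]]*=[[:space:]]*true([[:space:]]|#|$)' "$1"
}
```

For example, `has_device_ownership_flag /etc/containerd/config.toml && echo "flag set"` prints a message only when a matching line is found.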

@aglitke
Member

aglitke commented Apr 10, 2023

It seems that this issue has been resolved, so I will close it.
