rbd: Snapshot creation fails when Ceph user is restricted to a rados namespace #3231

BenoitKnecht (Contributor) opened this issue Jul 5, 2022 · 0 comments · Closed, fixed by #3232
Describe the bug

Volume snapshots cannot be created if Ceph CSI is set up to use a rados namespace and its user only has permissions on that namespace (as opposed to the whole pool).

Environment details

  • Image/version of Ceph CSI driver: 3.6.1
  • Helm chart version: 3.6.1
  • Kernel version: 4.18
  • Mounter used for mounting PVC: krbd
  • Kubernetes cluster version: 1.23
  • Ceph cluster version: 16.2.9

Steps to reproduce

I have set up ceph-csi-rbd to write volumes into a namespace within the rbd pool of our Ceph cluster:

[
  {
    "clusterID": "56dfe7fc-83fe-42b1-8e7c-be9cc71c55a8",
    "monitors": [
      "10.0.0.1:3300",
      "10.0.0.2:3300",
      "10.0.0.3:3300"
    ],
    "rbd": {
      "radosNamespace": "k8s.test"
    }
  }
]
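
The rados namespace itself was created ahead of time; with the rbd CLI that would be something along these lines (pool and namespace names match the config above):

# Create the k8s.test namespace inside the rbd pool
rbd namespace create --pool rbd --namespace k8s.test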

And the user for ceph-csi-rbd only has permission to work in that namespace:

[client.k8s.test]
        key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
        caps mgr = "profile rbd pool=rbd namespace=k8s.test"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=rbd namespace=k8s.test"
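
These caps can be granted when creating the user, e.g. with a command like this (hypothetical, but it matches the caps shown above):

# Create the CSI user with caps restricted to the k8s.test namespace
ceph auth get-or-create client.k8s.test \
  mon 'profile rbd' \
  osd 'profile rbd pool=rbd namespace=k8s.test' \
  mgr 'profile rbd pool=rbd namespace=k8s.test'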

I'm able to create PVCs in k8s and use them without any issue.
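
For reference, the rbd-pvc referenced in the snapshot below is an ordinary PVC; a minimal sketch would look like this (the storage class name csi-rbd-sc is an assumption, not taken from the setup above):

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # Assumed storage class name; substitute the one configured for ceph-csi-rbd
  storageClassName: csi-rbd-sc
EOF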

But then I configured the external-snapshotter with the following VolumeSnapshotClass:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: default
driver: rbd.csi.ceph.com
parameters:
  clusterID: 56dfe7fc-83fe-42b1-8e7c-be9cc71c55a8

  csi.storage.k8s.io/snapshotter-secret-namespace: ceph-csi-rbd
  csi.storage.k8s.io/snapshotter-secret-name: csi-rbd-secret
deletionPolicy: Delete

When I try to create a snapshot like this:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot
spec:
  volumeSnapshotClassName: default
  source:
    persistentVolumeClaimName: rbd-pvc

I can see a snapshot being created and deleted repeatedly in Ceph, but from the Kubernetes cluster's perspective, it fails:

Status:
  Bound Volume Snapshot Content Name:  snapcontent-2e734aba-76ef-4879-b5b5-5d01ffffc77b
  Error:
    Message:     Failed to check and update snapshot content: failed to take snapshot of the volume 0001-0024-56dfe7fc-83fe-42b1-8e7c-be9cc71c55a8-0000000000000011-a989a43d-d1fa-11ec-8f1e-22c5cc23c07d: "rpc error: code = Internal desc = rbd: ret=-1, Operation not permitted"
    Time:        2022-07-05T13:38:47Z
  Ready To Use:  false
Events:
  Type    Reason            Age   From                 Message
  ----    ------            ----  ----                 -------
  Normal  CreatingSnapshot  17s   snapshot-controller  Waiting for a snapshot default/rbd-pvc-snapshot to be created by the CSI driver.
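
The create/delete churn on the Ceph side can be observed by listing snapshots on the backing image; the image name below is only illustrative, inferred from the volume handle in the error message (Ceph CSI typically names images csi-vol-<suffix of the volume handle>):

# Illustrative image name; substitute the actual csi-vol-* image for the PVC
rbd snap ls --pool rbd --namespace k8s.test --image csi-vol-a989a43d-d1fa-11ec-8f1e-22c5cc23c07d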

The only way I managed to make it work was to grant the client.k8s.test user caps beyond its namespace:

[client.k8s.test]
        key = XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX==
        caps mgr = "profile rbd pool=rbd"
        caps mon = "profile rbd"
        caps osd = "profile rbd pool=rbd"
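
That is, updating the caps with something like:

# Widen the osd/mgr caps from the k8s.test namespace to the whole rbd pool
ceph auth caps client.k8s.test \
  mon 'profile rbd' \
  osd 'profile rbd pool=rbd' \
  mgr 'profile rbd pool=rbd'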

After changing the caps and restarting the provisioner with kubectl -n ceph-csi-rbd delete pod -l app=ceph-csi-rbd,component=provisioner, I can successfully create a snapshot.
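
One way to confirm, assuming the VolumeSnapshot above:

# Prints "true" once the CSI driver has taken the snapshot
kubectl get volumesnapshot rbd-pvc-snapshot -o jsonpath='{.status.readyToUse}'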

Actual results

The volume snapshot is never properly created.

Expected behavior

The volume snapshot should be created even if the user is restricted to a single rados namespace. Creating volumes (PVCs) works fine with those permissions; only snapshots are broken.
