Concurrency issue when provisioning multiple encrypted CephFS instances #4592

Closed
NymanRobin opened this issue Apr 28, 2024 · 12 comments · Fixed by #4609
Labels
bug (Something isn't working) · component/cephfs (Issues related to CephFS) · rebase (update the version of an external component)

Comments

@NymanRobin
Contributor

When creating multiple pods with separate PVCs, ceph-csi may fail to provision some of the encrypted CephFS instances.

The error from the pod logs is the following:

MountVolume.SetUp failed for volume "pvc-443ef844-03ce-4b87-affe-a802009aa915" : rpc error: code = Internal desc = no fscrypt policy set on directory "/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.cephfs.csi.ceph.com/32418680d66053d8dac8f56697397c8f91f36ca386dbb8d0a0d642bb76609721/globalmount/ceph-csi-encrypted": file or directory "/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.cephfs.csi.ceph.com/32418680d66053d8dac8f56697397c8f91f36ca386dbb8d0a0d642bb76609721/globalmount/ceph-csi-encrypted" is not encrypted   

From this it seems that setting up the encryption fails inside ceph-csi. The following can be seen in the cephfs-csi plugin logs, which makes me suspect a concurrency issue:

E0426 12:53:04.134079       1 fscrypt.go:262] ID: 9782 Req-ID: 0001-0009-rook-ceph-0000000000000002-a3186144-9960-42dd-9fce-4cc125117e1a fscrypt: Failed to apply protector (see also kernel log): %!w(*actions.ErrDifferentFilesystem=&{0xc00157b130 0xc00208a140}) 

Environment

This problem is reproducible in the Rook development environment from master; kernel support can be set up, for example, by following the instructions provided in PR #3460.

The following bash script can be executed and the problem should appear. At least for me it happens every time without exception; the only variance is how many pods fail to deploy.

#!/bin/bash

for i in {1..10}
do
    # Generate unique names for the pod and PVC using the counter variable
    pod_name="pod-$i"
    pvc_name="pvc-$i"

    echo "${pvc_name}"
    echo "${pod_name}"

    # Create a pod with a unique name that mounts the PVC
    cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: $pod_name
spec:
  containers:
  - name: my-container
    image: nginx
    volumeMounts:
    - mountPath: /var/myVolume
      name: my-volume
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: $pvc_name
EOF

    cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: $pvc_name
spec:
  storageClassName: rook-cephfs
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
EOF
    # If sleep is uncommented this code runs without problems!
    # sleep 10 
done
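
Not part of the original report, but for anyone reproducing this, a couple of kubectl commands can surface the failing mounts (pod-3 here is just one of the pods created by the loop above):

# Affected pods typically stay stuck in ContainerCreating.
kubectl get pods

# The "MountVolume.SetUp failed ... no fscrypt policy set" error shows up in the pod events.
kubectl describe pod pod-3 | grep -A3 FailedMount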

I will try to debug this further, but if anyone has any ideas or pointers regarding it, or has seen this before, I am all ears. Thanks!

@NymanRobin
Contributor Author

Will keep digging into the issue, can also upload the full logs here if anyone is interested!

@nixpanic nixpanic added the component/cephfs Issues related to CephFS label Apr 29, 2024
@Madhu-1 Madhu-1 added the bug Something isn't working label Apr 30, 2024
@NymanRobin
Contributor Author

It appears to me that this would ideally be addressed within the fscrypt module. I'll open an issue there and see what their response is. Explanation below; the only other option I can currently think of would be to limit the concurrency.

The problem seems to arise from the fact that when a new mount is created, the UpdateMountInfo function in the fscrypt module is called. However, this function recreates the mountsByDevice map each time it's called. Consequently, the memory references of the mount objects get updated. This results in a mismatch in policy.apply, because we store the old reference in the fscryptContext (%!w(*actions.ErrDifferentFilesystem=&{0xc00157b130 0xc00208a140})).

So, when the lookup from the map is performed and compared to the context in the policy.apply, the memory addresses don't match, even though the device number and path are the same in the new mount object.

I tested that if we keep the old references in the map and only add new ones, this issue is resolved, so let's hope that is acceptable for fscrypt.
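
To make the failure mode concrete, here is a simplified, self-contained Go sketch. It is not the actual fscrypt code (Mount, updateMounts and the field names are stand-ins), but it shows how rebuilding the map with freshly allocated structs breaks an identity check even though device and path are unchanged:

package main

import "fmt"

type Mount struct {
    Path   string
    Device uint64
}

var mountsByDevice = map[uint64]*Mount{}

// updateMounts rebuilds the map with freshly allocated Mount values, so
// pointers handed out before the rebuild no longer match the map contents.
func updateMounts(mounts []Mount) {
    mountsByDevice = map[uint64]*Mount{}
    for i := range mounts {
        m := mounts[i]
        mountsByDevice[m.Device] = &m
    }
}

func main() {
    updateMounts([]Mount{{Path: "/mnt/cephfs", Device: 42}})
    saved := mountsByDevice[42] // reference kept in a long-lived context

    // A concurrent mount triggers another rescan, recreating the map.
    updateMounts([]Mount{{Path: "/mnt/cephfs", Device: 42}})
    current := mountsByDevice[42]

    // Same device and path, but different addresses: a comparison based on
    // object identity (which is effectively what failed here) reports a mismatch.
    fmt.Println(saved == current)                                             // false
    fmt.Println(saved.Device == current.Device && saved.Path == current.Path) // true
}

Keeping the previously created entries in the map, and only adding entries for newly discovered mounts, preserves the old references and avoids the mismatch.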

@nixpanic nixpanic added the rebase update the version of an external component label Apr 30, 2024
@nixpanic
Member

Great find, thanks for identifying the issue! Let us know if you need assistance with a fix for fscrypt.

@NymanRobin
Contributor Author

Thanks for the offer, but the change was small so I got it!
Here is the PR: google/fscrypt#413

@nixpanic should we keep this issue open to upgrade the fscrypt version once it is available?

@nixpanic
Member

nixpanic commented May 2, 2024

@nixpanic should we keep this issue open to upgrade the fscrypt version once it is available?

If you want to keep this open until Ceph-CSI has rebased the package, that is fine. I'll leave it up to you what you want to do.

@NymanRobin
Contributor Author

The fscrypt PR is merged, but the maintainers mentioned that it might be months until the next release.

@Madhu-1
Collaborator

Madhu-1 commented May 8, 2024

The fscrypt PR is merged, but the maintainers mentioned that it might be months until the next release.

@NymanRobin we can update the go.mod to point to the exact commit that we require until we get the next release.
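
For reference, pinning a Go module dependency to a specific commit works roughly like this (the commit hash below is a placeholder, not the actual fix commit):

# Point go.mod at the fscrypt commit containing the fix.
go get github.com/google/fscrypt@<commit-sha>
go mod tidy
# go.mod then records a pseudo-version of the form vX.Y.Z-0.<timestamp>-<short-sha>
# for that commit, which can be replaced once a tagged release is available.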

@NymanRobin
Contributor Author

@Madhu-1 Thanks for the information! I opened a PR for this; let me know if it looks okay to you.
Also, is it possible to get the cherry-pick to 3.11 once this PR gets merged?

@Madhu-1
Collaborator

Madhu-1 commented May 8, 2024

@Madhu-1 Thanks for the information! I opened a PR for this; let me know if it looks okay to you. Also, is it possible to get the cherry-pick to 3.11 once this PR gets merged?

Yes, we can do that if someone is blocked and needs this fix.

@NymanRobin
Contributor Author

Awesome, yes that would help us get further with this feature!

I see the PR cannot be merged because of this error:

Unable to update: user NymanRobin is unknown.
Please make sure NymanRobin has logged in Mergify dashboard.

@NymanRobin
Contributor Author

Thanks for the help getting the changes in, all!
Are there any plans for a patch release of 3.11? This is not super urgent, just thought I would ask in case there is a plan already.

@Madhu-1
Collaborator

Madhu-1 commented May 15, 2024

@NymanRobin not yet, we will plan it in the next 1 or 2 weeks.
