MountVolume.SetUp failed for volume "efs-pv" #192
Comments
I also have this issue.
You might be able to solve this problem by using an updated image of amazon/aws-efs-csi-driver, as shown in the Stack Overflow post below: https://stackoverflow.com/questions/62447132/mounting-efs-in-eks-cluster-example-deployment-fails
I have this issue as well.
We have this issue also. On Sunday our EFS CSI pod died and restarted; I've been using the following pvc and pv (manifests omitted).
I was able to solve the problem using an updated image of amazon/aws-efs-csi-driver, as shown in the Stack Overflow post below: https://stackoverflow.com/questions/62447132/mounting-efs-in-eks-cluster-example-deployment-fails

Here are all my steps for using EFS with EKS:

1. Follow this doc: https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html
2. Be careful when you install the CSI driver: make sure to use the image amazon/aws-efs-csi-driver:latest.
3. After installing the CSI driver, check your efs-csi-node pod name with `kubectl get pod -n kube-system`.
4. Then check the image version with `kubectl describe pod efs-csi-node-xxxxx -n kube-system`. If you don't see `Image: amazon/aws-efs-csi-driver:latest`, you need to update the manifest, changing the image from the pinned tag (v0.3.0 in my case) to `amazon/aws-efs-csi-driver:latest`.
5. Then apply with the `-k` option; because it's a kustomization kind, it must point to a folder, for example: `kubectl apply -k aws-efs-csi-driver/deploy/kubernetes/overlays/stable/`.

Then it should be OK. You can check by describing the pod again, and keep following the AWS documentation mentioned at the beginning.
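For reference, the same check-and-reapply workflow as shell commands (a sketch; the pod name suffix and the local clone path are placeholders):

```bash
# Find the EFS CSI node pods and confirm which image they are running
kubectl get pod -n kube-system | grep efs-csi-node
kubectl describe pod efs-csi-node-xxxxx -n kube-system | grep "Image:"

# After editing the DaemonSet manifest to use amazon/aws-efs-csi-driver:latest,
# re-apply the kustomization (-k must point at a directory)
kubectl apply -k aws-efs-csi-driver/deploy/kubernetes/overlays/stable/
```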
Still didn't resolve my issue... I am getting: `Mounting arguments: -t efs -o tls fs-xxxxxxx:/ /var/lib/kubelet/pods/xxxxx-xxxx-xxx-xxx/volumes/kubernetes.io~csi/efs-pv/mount`
@nomopo45 I feel 100% sure that I'm using the latest image. As you can see, I'm using: (output omitted)
I concur... I have pulled the latest and the issue still persists.
@nmtulloch27 I think I'm on to something; just to confirm, are you using the Helm chart for your installation or the kustomize-based solution that @nomopo45 suggested?
In the absence of a regular release cycle the community has started using `:latest`. `helm/values.yaml` provides a `pullPolicy` for the main container, however that is not threaded through to the DaemonSet; as such Kubernetes applies the default `IfNotPresent` value. From a user's point of view, you're locked into the image at the time you first installed the chart. Additionally, you can now specify a pull policy for each of the sidecar containers as well. Fixes kubernetes-sigs#192
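For illustration only, "threading through" means the DaemonSet template actually referencing the value; a sketch of what that looks like (key names and file paths are assumptions, not the chart's exact layout):

```yaml
# helm/values.yaml (assumed keys)
image:
  repository: amazon/aws-efs-csi-driver
  tag: latest
  pullPolicy: Always
---
# node DaemonSet template, container spec (excerpt) -- the fix is referencing
# .Values.image.pullPolicy here instead of leaving imagePullPolicy unset
containers:
  - name: efs-plugin
    image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
    imagePullPolicy: "{{ .Values.image.pullPolicy }}"
```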
I am using the kustomize-based solution that @nomopo45 suggested.
@nmtulloch27 hmm, OK. If you take a look at #193 you can see I was getting tripped up by imagePullPolicy not being set correctly. If you add `imagePullPolicy: Always` under (or around) this line, you'll pull the actual latest, not the version that you pulled when you first installed the driver.
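Concretely, the efs-plugin container entry in the node DaemonSet would end up looking roughly like this (an excerpt only; surrounding fields are trimmed):

```yaml
# deploy/kubernetes/base/node.yaml (excerpt)
containers:
  - name: efs-plugin
    image: amazon/aws-efs-csi-driver:latest
    imagePullPolicy: Always   # re-pull :latest whenever the pod starts
```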
I see, so then you would run `kubectl apply -k aws-efs-csi-driver/deploy/kubernetes/base` instead of `kubectl apply -k aws-efs-csi-driver/deploy/kubernetes/overlays/stable/`, is that correct?
Say, b3baff8 added a new mount in the efs-plugin container, and relies on it existing in the watchdog. If you bounced your workers, but didn't reapply the DaemonSet from latest, that could cause failures that might look like this.
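To illustrate what that kind of change involves (the volume name and mount path below are placeholders, not necessarily what b3baff8 actually added): a new mount in the efs-plugin container needs both a volumeMount and a matching volume in the DaemonSet, so nodes still running the old YAML won't have it.

```yaml
# DaemonSet pod spec (excerpt, illustrative names and paths)
containers:
  - name: efs-plugin
    volumeMounts:
      - name: efs-state-dir          # hypothetical volume name
        mountPath: /var/run/efs      # hypothetical path the watchdog expects
volumes:
  - name: efs-state-dir
    hostPath:
      path: /var/run/efs
      type: DirectoryOrCreate
```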
Since the [fix](kubernetes-sigs/aws-efs-csi-driver#185) for [issue #167](kubernetes-sigs/aws-efs-csi-driver#167) merged, the AWS EFS CSI driver overwrote the [`latest` image tag](sha256-962619a5deb34e1c4257f2120dd941ab026fc96adde003e27f70b65023af5a07?context=explore) to include it. For starters, this means we can update this operator to use the [new method of specifying access points](https://github.com/kubernetes-sigs/aws-efs-csi-driver/tree/0ae998c5a95fe6dbee7f43c182997e64872695e6/examples/kubernetes/access_points#edit-persistent-volume-spec) via a colon-delimited `volumeHandle` as opposed to in `mountOptions`. However, the same update to `latest` also brought in a [commit](kubernetes-sigs/aws-efs-csi-driver@b3baff8) that requires an additional mount in the `efs-plugin` container of the DaemonSet, so we need to update our YAML for that resource at the same time or everything is broken (this might be upstream [issue #192](kubernetes-sigs/aws-efs-csi-driver#192)). This update to the DaemonSet YAML also syncs with [upstream](https://github.com/kubernetes-sigs/aws-efs-csi-driver/blob/0ae998c5a95fe6dbee7f43c182997e64872695e6/deploy/kubernetes/base/node.yaml) by bumping the image versions for the other two containers (csi-node-driver-registrar: v1.1.0 => v1.3.0; livenessprobe: v1.1.0 => v2.0.0).
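To make the `volumeHandle` change concrete, the two PersistentVolume styles look roughly like this (a sketch; the filesystem and access point IDs are placeholders):

```yaml
# Older style (excerpt): access point supplied via mountOptions
spec:
  mountOptions:
    - tls
    - accesspoint=fsap-0123456789abcdef0   # placeholder ID
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-0123456789abcdef0     # placeholder ID
---
# Newer style (excerpt): colon-delimited volumeHandle, [FileSystemId]::[AccessPointId]
spec:
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-0123456789abcdef0::fsap-0123456789abcdef0
```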
@nmtulloch27 not necessarily. The base doesn't have an `imagePullPolicy` specified, so you'd need to add that. Unfortunately Kustomize doesn't allow changing the pull policy via a transformer, based on this thread: kubernetes-sigs/kustomize#1493 (caveat: I don't use Kustomize, so maybe I'm reading this incorrectly). So here's what I recommend for now: use the dev overlay instead of stable, and add `imagePullPolicy: Always` to the efs-plugin container. The dev overlay specifies the `:latest` image, and the imagePullPolicy change ensures you're pulling the actual latest rather than the latest from whenever you first ran that command.
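If you'd rather not edit the upstream manifests directly, a small strategic-merge patch referenced from your own kustomization can set the policy (a sketch; the DaemonSet and container names follow the upstream node manifest, the rest of the overlay wiring is assumed):

```yaml
# set-pull-policy.yaml -- listed under patchesStrategicMerge in your kustomization.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: efs-csi-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: efs-plugin
          imagePullPolicy: Always
```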
@nmtulloch27 I was intrigued enough to check it out myself; see #195 for a branch which implements setting the imagePullPolicy as discussed.
I made the changes and applied them as per your instructions. It seems to have resolved that issue. However, I am getting: `MountVolume.MountDevice failed for volume "efs-pv" : rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/efs.csi.aws.com/csi.sock: connect: connection refused"`. Did you experience this as well?
@nmtulloch27 I did not. What does:
Hey, I have placed the gist below:
@nmtulloch27 that isn't at all what I was expecting, but it clearly calls out that the image is not there to be fetched. I'll ask the appropriate question over in #195.
Okay, I'll move to that thread.
@nmtulloch27 I've pushed an update to my branch; give it a shot and update #195 with success / failure.
Okay cool... how do I update it with success or failure? I'm still fairly new to the site. Also, running the new build I am now getting this error: `error: json: cannot unmarshal object into Go struct field Kustomization.patchesStrategicMerge of type patch.StrategicMerge`. I think it's because of how the kustomization is structured.
@nmtulloch27 oh, interesting. This is going back to my earlier comment of "I'm not familiar with kustomize". It turns out there are two ways to call kustomize: the version bundled into kubectl (`kubectl apply -k`) and the standalone `kustomize` binary, and they don't necessarily support the same kustomization syntax.
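For reference, the two invocation styles (a sketch; the second assumes the standalone kustomize binary is installed, and the two tools don't always accept the same kustomization fields):

```bash
# 1. kustomize bundled into kubectl
kubectl apply -k aws-efs-csi-driver/deploy/kubernetes/overlays/dev/

# 2. standalone kustomize binary, piping the rendered manifests into kubectl
kustomize build aws-efs-csi-driver/deploy/kubernetes/overlays/dev/ | kubectl apply -f -
```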
RE: "update with success / failure" - I just meant write a comment saying "this works for me" or "this doesn't work for me" :) I'm done for the day, but I'll be back tomorrow to take a look at this issue.
Oh haha, and okay cool... have a good night!
@nmtulloch27 I can't help myself! I took a look and brought both options up to date in #195.
Haha nice, I'll do that right now! This is what I see now (output omitted): Did you see this as well? It's complaining about the latest_image file... under containers, maybe `efs-plugin:` should be `- name: efs-plugin`.
@nmtulloch27 thanks - that was super helpful.
Just for anyone running into this issue, in case I can save you some debugging time: I saw this issue when I used the incorrect security groups for my EFS filesystem, so it's worth double-checking. The SG applied to each mount target should be the one that "allows inbound NFS traffic from within the VPC", which is created in these steps: https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html
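A quick way to check that from the CLI (the security group ID and VPC CIDR below are placeholders): the group attached to each mount target needs an inbound rule for NFS, which is TCP port 2049.

```bash
# Inspect the rules on the security group attached to the EFS mount targets
aws ec2 describe-security-groups --group-ids sg-0123456789abcdef0

# Add an inbound NFS (TCP 2049) rule from the VPC CIDR if it is missing
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 2049 \
  --cidr 192.168.0.0/16
```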
So I closed the ticket because I thought the issue was solved, since it worked for me after using the latest image instead of v0.3.0, but after trying again today it's not working anymore, even with the latest image. I also checked my SG and fully opened it just to test, but it is still not working (same error as before). I also tried the fix proposed by @ossareh by cloning his repo and applying the change, but I run into the same issue each time. Here is the error I now have: `pod has unbound immediate PersistentVolumeClaims`
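That message means the PVC never bound to a PV, which is a different failure from the mount error earlier in the thread. A quick check (a sketch, assuming the example's resource names `efs-claim` and `efs-pv`):

```bash
# Is the claim Bound or still Pending, and which event explains it?
kubectl get pvc efs-claim
kubectl describe pvc efs-claim

# Does the PV exist, and do its storageClassName / accessModes match the claim?
kubectl get pv efs-pv -o yaml
```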
@nomopo45 Since #202 landed, and from now until we have a new release, I would suggest keeping your YAML and image tag in sync. For example, the current tip of the master branch is at commit 778131e, which corresponds to image tag 778131e. I do this in my operator by locking the YAML from that commit to that image tag.
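One way to do that pinning (a sketch; the commit hash is the one mentioned above, and the remote `?ref=` kustomization syntax is an assumption about what your kustomize version accepts):

```bash
# Apply the manifests exactly as they exist at commit 778131e...
kubectl apply -k "github.com/kubernetes-sigs/aws-efs-csi-driver/deploy/kubernetes/overlays/stable/?ref=778131e"

# ...and run the node plugin image built from that same commit
kubectl -n kube-system set image daemonset/efs-csi-node \
  efs-plugin=amazon/aws-efs-csi-driver:778131e
```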
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with `/close`. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/kind bug
Hello,
I followed this documentation: https://docs.aws.amazon.com/eks/latest/userguide/efs-csi.html. As written in the user guide, I'm trying to use the Multiple Pods Read Write Many example.
I have an EKS cluster, and I have an issue with my pod creation; apparently it is unable to mount the EFS volume.
Here are some logs that I found (log output omitted):
I checked many things: my SGs are fully open in and out, and my pv.yaml, pod1.yaml, classtorage.yml and claim.yaml are exactly the same as here: https://github.com/kubernetes-sigs/aws-efs-csi-driver/tree/master/examples/kubernetes/multiple_pods
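For context, the key pieces of that example look roughly like this (a sketch from memory, not a verbatim copy; the filesystem ID is a placeholder and must be replaced with your own):

```yaml
# storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
---
# pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-0123456789abcdef0   # placeholder; use your own EFS filesystem ID
---
# claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
```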
Environment:
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.8-eks-e16311", GitCommit:"e163110a04dcb2f39c3325af96d019b4925419eb", GitTreeState:"clean", BuildDate:"2020-03-27T22:37:12Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}
Driver version: I guess latest; I installed it today following the doc.
If you have any idea or recommendation it would be very nice; the user guide looks so simple that I'm frustrated I'm not able to make it work, and I can't see what I did wrong.
Thanks in advance