PVCs are not syncing after the PVs are expanded #226

Closed
rajendraindukuri opened this issue Nov 7, 2022 · 5 comments

@rajendraindukuri

Hi,

We are trying to resize PVCs and we see that the PVs are getting resized as expected, but the PVCs are not syncing to the new size. This does not happen consistently. Please find the event logs below:

  Type     Reason              Age                    From                                    Message
  ----     ------              ----                   ----                                    -------
  Normal   Resizing            5m17s (x616 over 15h)  external-resizer csi-unity.dellemc.com  External resizer is resizing volume staging1-p2-xxxxx
  Warning  VolumeResizeFailed  4m26s (x615 over 15h)  external-resizer csi-unity.dellemc.com  updating capacity of PV "staging1-p2-xxxxx" to 0 failed: update capacity of PV staging1-p2-xxxxxxx failed: PersistentVolume "staging1-p2-xxxxxx" is invalid: spec.capacity[storage]: Invalid value: "0": must be greater than zero

Also find the resizer logs below:

I1012 14:11:41.139731       1 controller.go:483] Resize volume succeeded for volume "staging1-p2-xxxx", start to update PV's capacity
I1012 14:11:41.139737       1 controller.go:590] Resize volume succeeded for volume "staging1-p2-xxxx", start to update PV's capacity
E1012 14:11:41.144484       1 controller.go:286] Error syncing PVC: updating capacity of PV "staging1-p2-xxxx" to 0 failed: update capacity of PV staging1-p2-xxxx failed: PersistentVolume "staging1-p2-xxxx" is invalid: spec.capacity[storage]: Invalid value: "0": must be greater than zero
I1012 14:11:41.144518       1 event.go:285] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"data", Name:"abc", UID:"XXXX", APIVersion:"v1", ResourceVersion:"1234", FieldPath:""}): type: 'Warning' reason: 'VolumeResizeFailed' updating capacity of PV "staging1-p2-xxxx" to 0 failed: update capacity of PV staging1-p2-xxxx failed: PersistentVolume "staging1-p2-xxxx" is invalid: spec.capacity[storage]: Invalid value: "0": must be greater than zero

More info: storage calls in our case are asynchronous, and under congestion they can take longer for provisioning/resize operations. Is there any retry logic in the resizer that will time out and retry the operation after some time?

Our assumption is that there is a timeout in the resizer logic: because the operation takes a long time, it times out, and by the time the resizer retries, the resize has already succeeded on the backend, which is why we see the above issue.
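
To illustrate what we mean, here is a rough sketch of the kind of per-attempt timeout plus retry pattern we are asking about. This is only our assumption, not the actual external-resizer code; the function names and the 10-second timeout are made up for the example:

```go
// Illustrative only: NOT the actual external-resizer code. expandVolume,
// expandUntilDone, and the 10-second per-attempt timeout are assumptions.
package sketch

import (
	"context"
	"log"
	"time"
)

// expandVolume stands in for the gRPC ControllerExpandVolume call to the CSI driver.
func expandVolume(ctx context.Context, volumeID string, requiredBytes int64) (int64, error) {
	// ... the real call would go over gRPC to the driver's controller plugin ...
	return requiredBytes, nil
}

// expandUntilDone retries the expansion until an attempt succeeds.
func expandUntilDone(volumeID string, requiredBytes int64) int64 {
	for {
		// Each attempt runs under its own deadline. A congested backend makes the
		// attempt fail with context.DeadlineExceeded, and the volume is retried
		// later -- by which time the backend resize may already have completed,
		// so the driver has to handle the retry idempotently.
		ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
		newSize, err := expandVolume(ctx, volumeID, requiredBytes)
		cancel()
		if err == nil {
			return newSize
		}
		log.Printf("expand of %s failed (%v), will retry", volumeID, err)
		time.Sleep(5 * time.Second) // a real sidecar uses a workqueue with exponential backoff
	}
}
```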

Can you please help us resolve this issue?

Environment Details:
Kubernetes: 1.24
OS: RHEL 8.x

@gnufied
Contributor

gnufied commented Nov 9, 2022

The resize operation is always retried if the previous operation times out. It appears that, in some cases, after a successful resize the CSI driver is returning "0" as the new expanded size, and that is why this error happens. Please note: ControllerExpandVolume should be idempotent, so when it is retried it should return the same result as a first-time successful expansion.

Can you double-check why the driver is returning "0" as the new size after expansion?
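
To make the idempotency point concrete, here is a minimal driver-side sketch. It is hypothetical, not the csi-unity code; backendCurrentSize and backendExpand merely stand in for the array-specific calls. Even when a previous, timed-out attempt already grew the volume, the response must carry the real current capacity, never 0:

```go
// Hypothetical driver-side sketch, not the actual csi-unity implementation.
// backendCurrentSize and backendExpand stand in for the array-specific calls.
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type service struct{}

// Stand-ins for the storage-array API; not real functions in any driver.
func backendCurrentSize(ctx context.Context, volID string) (int64, error)             { return 0, nil }
func backendExpand(ctx context.Context, volID string, sizeBytes int64) (int64, error) { return sizeBytes, nil }

func (s *service) ControllerExpandVolume(ctx context.Context, req *csi.ControllerExpandVolumeRequest) (*csi.ControllerExpandVolumeResponse, error) {
	requested := req.GetCapacityRange().GetRequiredBytes()

	current, err := backendCurrentSize(ctx, req.GetVolumeId())
	if err != nil {
		return nil, status.Errorf(codes.Internal, "size lookup failed: %v", err)
	}

	// Idempotency: if a previous, timed-out attempt already grew the volume,
	// report the real current size -- never 0 and never an early "nothing to do".
	if current >= requested {
		return &csi.ControllerExpandVolumeResponse{
			CapacityBytes:         current,
			NodeExpansionRequired: true, // assumed mounted filesystem volume
		}, nil
	}

	newSize, err := backendExpand(ctx, req.GetVolumeId(), requested)
	if err != nil {
		return nil, status.Errorf(codes.Internal, "expand failed: %v", err)
	}
	return &csi.ControllerExpandVolumeResponse{
		CapacityBytes:         newSize,
		NodeExpansionRequired: true,
	}, nil
}
```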

@rajendraindukuri
Author

rajendraindukuri commented Nov 11, 2022

Hi @gnufied
Thanks for responding.

Using the controller logs, I checked whether the driver is returning "0" as the new expanded size on retries, and it never passes "0" as the new size. I compared the resizer logs against the corresponding requests in the driver logs, and the driver never passes "0" as the expanded size. I checked all the requests in the controller logs and the observation is the same.

I put the driver controller and resizer logs together for a few requests to trace the scenario completely (matching the requests rather than the timing across the two logs). Please find the snippets below:

[Screenshot attachment: tracing_contrller_resizer_logs_1]

[Screenshot attachment: tracing_contrller_resizer_logs_2]

The above retry scenario repeats continuously, whereas the volume is expanded as expected on the storage side. This scenario does not happen consistently for all expansions.

Any pointers from your side would be helpful.

Thanks
Rajendra

@gashof

gashof commented Nov 11, 2022

Hi,
The customer experienced the issue with Dell CSI Driver 2.2 (resizer:v1.4.0) and also with 2.4 (resizer:v1.5.0). I will send the logs to your email and cc rajendraindukuri, who opened this issue.

@gnufied
Contributor

gnufied commented Nov 16, 2022

Check https://github.com/dell/csi-unity/blob/main/service/controller.go#L803; it is clear as day that the driver is not compliant with the CSI spec.
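
For reference, this is, in simplified form, what the sidecar does with the driver's reply; it is a sketch rather than the actual external-resizer source, but it shows why a returned CapacityBytes of 0 surfaces as the spec.capacity[storage] error in the events above:

```go
// Simplified illustration, not the actual external-resizer source: the sidecar
// copies the driver-reported CapacityBytes into pv.spec.capacity, and the
// apiserver rejects a zero quantity.
package sketch

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func updatePVCapacity(ctx context.Context, client kubernetes.Interface, pv *v1.PersistentVolume, driverReportedBytes int64) error {
	// driverReportedBytes comes from ControllerExpandVolumeResponse.CapacityBytes.
	// If the driver returns 0, the quantity below becomes "0" and the update fails
	// with: spec.capacity[storage]: Invalid value: "0": must be greater than zero.
	pv = pv.DeepCopy()
	pv.Spec.Capacity[v1.ResourceStorage] = *resource.NewQuantity(driverReportedBytes, resource.BinarySI)
	_, err := client.CoreV1().PersistentVolumes().Update(ctx, pv, metav1.UpdateOptions{})
	return err
}
```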

/close

@k8s-ci-robot
Contributor

@gnufied: Closing this issue.

In response to this:

Check https://github.com/dell/csi-unity/blob/main/service/controller.go#L803; it is clear as day that the driver is not compliant with the CSI spec.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
