Address comments about race condition
gnufied committed Jan 29, 2020
1 parent b55707c commit aad8bba
Showing 1 changed file with 30 additions and 8 deletions.
38 changes: 30 additions & 8 deletions keps/sig-storage/20200127-recover-from-resize.md
@@ -32,6 +32,7 @@ superseded-by:
- [User flow stories](#user-flow-stories)
- [Case 1 (controller+node expandable):](#case-1-controllernode-expandable)
- [Case 2 (controller+node expandable):](#case-2-controllernode-expandable)
- [Case 3](#case-3)
- [Risks and Mitigations](#risks-and-mitigations)
- [Graduation Criteria](#graduation-criteria)
- [Test Plan](#test-plan)
@@ -56,11 +57,22 @@ to be tracked accurately if user reduces size of their PVCs.

## Motivation

Sometimes a user may expand a PVC to a size that is not supported by the
underlying storage provider. Say the PVC was 10GB and the user expanded it
to 500GB, but the underlying storage provider cannot support a 500GB volume
for some reason. We want to make it easier for the user to retry resizing by
requesting a more reasonable value, say 100GB.

The problem with allowing users to reduce the size is that a malicious user
can use this feature to abuse the quota system. They can do so by rapidly
expanding and then shrinking the PVC. In general, both CSI and in-tree
volume plugins have been designed to never actually shrink the underlying
volume, but rapid expand-and-shrink cycles can fool the quota system into
believing the user is using less storage than they actually are. To solve
this problem, we propose that although users are allowed to reduce the size
of their PVC (as long as the requested size >= `pvc.Status.Capacity`), quota
calculation will only use the previously requested, higher value. This is
covered in more detail in [Implementation](#implementation).
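
The calculation referred to above can be sketched as below. This is a minimal, self-contained illustration of the proposed "charge the maximum" rule, not actual Kubernetes quota-controller code; the `claimSizes` struct and `storageChargedToQuota` function are invented for the example, and the `allocatedResources` field is the new field this proposal assumes.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// claimSizes is a stand-in for the two PVC fields the quota calculation
// cares about; a real evaluator would read them off the PVC object.
type claimSizes struct {
	requested resource.Quantity // pvc.spec.resources.requests["storage"]
	allocated resource.Quantity // proposed pvc.spec.allocatedResources["storage"]
}

// storageChargedToQuota returns max(allocated, requested): the quota system
// keeps charging the highest size the user ever asked for, even if the
// current request is lower.
func storageChargedToQuota(c claimSizes) resource.Quantity {
	if c.allocated.Cmp(c.requested) > 0 {
		return c.allocated
	}
	return c.requested
}

func main() {
	// The user expanded 10Gi -> 100Gi, the expansion failed, and they then
	// retried with 20Gi; quota still charges 100Gi.
	c := claimSizes{
		requested: resource.MustParse("20Gi"),
		allocated: resource.MustParse("100Gi"),
	}
	charged := storageChargedToQuota(c)
	fmt.Println(charged.String()) // 100Gi
}
```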


### Goals
@@ -95,8 +107,8 @@ In practice above limitation should be fine. We will add events to PV and PVC wh

###### Case 1 (controller+node expandable):

- User increases the 10Gi PVC to 100Gi by changing `pvc.spec.resources.requests["storage"] = "100Gi"`.
- API server sets `pvc.spec.allocatedResources` to 100Gi via the PVC storage strategy.
- Quota controller uses `max(pvc.Spec.AllocatedResources, pvc.Spec.Resources)` and adds `90Gi` to used quota.
- Expansion to 100Gi fails and hence `pv.Spec.Capacity` and `pvc.Status.Capacity` stay at 10Gi.
- User changes the requested size to 20Gi.
@@ -107,8 +119,8 @@ In practice above limitation should be fine. We will add events to PV and PVC wh

###### Case 2 (controller+node expandable):

- User increases the 10Gi PVC to 100Gi by changing `pvc.spec.resources.requests["storage"] = "100Gi"`.
- API server sets `pvc.spec.allocatedResources` to 100Gi via the PVC storage strategy (see the sketch after this case).
- Quota controller uses `max(pvc.Spec.AllocatedResources, pvc.Spec.Resources)` and adds `90Gi` to used quota.
- Expansion to 100Gi fails and hence `pv.Spec.Capacity` and `pvc.Status.Capacity` stay at 10Gi.
- User changes the requested size to 20Gi.
@@ -120,6 +132,16 @@ In practice above limitation should be fine. We will add events to PV and PVC wh
- Quota controller uses `max(pvc.Spec.AllocatedResources, pvc.Spec.Resources)` and adds `20Gi` to used quota.
- Expansion to 120Gi fails and hence `pv.Spec.Capacity` and `pvc.Status.Capacity` stay at 20Gi.
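
The two mechanisms Cases 1 and 2 rely on, the API server ratcheting `pvc.spec.allocatedResources` upward on every expansion request and validation rejecting any request below the currently provisioned capacity, can be sketched as follows. This is only an illustration under the assumptions of this proposal, not the actual PVC storage-strategy or validation code; `pvcView` and `applyResizeRequest` are invented names.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// pvcView holds just the fields the update path needs; real code would
// operate on *v1.PersistentVolumeClaim inside the PVC storage strategy
// and validation.
type pvcView struct {
	requested resource.Quantity // pvc.spec.resources.requests["storage"]
	allocated resource.Quantity // proposed pvc.spec.allocatedResources["storage"]
	capacity  resource.Quantity // pvc.status.capacity["storage"]
}

// applyResizeRequest models the proposed update behaviour: the request may
// shrink, but never below the currently provisioned capacity, while
// allocatedResources only ever ratchets upward (the value quota charges).
func applyResizeRequest(pvc *pvcView, newRequest resource.Quantity) error {
	if newRequest.Cmp(pvc.capacity) < 0 {
		return fmt.Errorf("requested size %s is below current capacity %s",
			newRequest.String(), pvc.capacity.String())
	}
	pvc.requested = newRequest
	if newRequest.Cmp(pvc.allocated) > 0 {
		pvc.allocated = newRequest
	}
	return nil
}

func main() {
	// Case 2 prefix: 10Gi volume, failed expansion to 100Gi, retry with 20Gi.
	pvc := &pvcView{
		requested: resource.MustParse("100Gi"),
		allocated: resource.MustParse("100Gi"),
		capacity:  resource.MustParse("10Gi"),
	}
	if err := applyResizeRequest(pvc, resource.MustParse("20Gi")); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	// The request drops to 20Gi, but the quota-relevant value stays at 100Gi.
	fmt.Println(pvc.requested.String(), pvc.allocated.String()) // 20Gi 100Gi
}
```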

###### Case 3

- User increases the 10Gi PVC to 100Gi by changing `pvc.spec.resources.requests["storage"] = "100Gi"`.
- API server sets `pvc.spec.allocatedResources` to 100Gi via the PVC storage strategy.
- Quota controller uses `max(pvc.Spec.AllocatedResources, pvc.Spec.Resources)` and adds 90Gi to used quota (100Gi is the total space taken by the PVC).
- Volume plugin starts slowly expanding to 100Gi. `pv.Spec.Capacity` and `pvc.Status.Capacity` stay at 10Gi until the resize is finished.
- While the storage backend is resizing the volume, user requests size 20Gi by changing `pvc.spec.resources.requests["storage"] = "20Gi"`.
- Quota controller sees no change in storage usage by the PVC because `pvc.Spec.AllocatedResources` is 100Gi (see the timeline sketch after this case).
- Expansion succeeds and `pvc.Status.Capacity` and `pv.Spec.Capacity` report the new size as 100Gi, as that is what the volume plugin expanded to.
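
As a concrete check of the race described above, the same "charge the maximum" rule can be evaluated at the two interesting points of the Case 3 timeline: while the 100Gi expansion is in flight, and after the user lowers the request to 20Gi. The helper below repeats the proposed calculation purely for illustration; it is not existing Kubernetes code.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// chargedStorage repeats the proposed max(allocatedResources, requests) rule
// so the Case 3 timeline can be checked in isolation.
func chargedStorage(requested, allocated resource.Quantity) resource.Quantity {
	if allocated.Cmp(requested) > 0 {
		return allocated
	}
	return requested
}

func main() {
	// The user asks for 100Gi and allocatedResources becomes 100Gi.
	requested := resource.MustParse("100Gi")
	allocated := resource.MustParse("100Gi")
	during := chargedStorage(requested, allocated)
	fmt.Println("while expansion is in flight:", during.String()) // 100Gi

	// The backend is still resizing when the user lowers the request to
	// 20Gi; allocatedResources stays at 100Gi, so the charge is unchanged.
	requested = resource.MustParse("20Gi")
	after := chargedStorage(requested, allocated)
	fmt.Println("after the shrink request:", after.String()) // 100Gi
}
```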

### Risks and Mitigations

- One risk, as mentioned above, is that if expansion failed and the user retried expansion (successfully) with a smaller value, the quota code will keep reporting the higher value. In practice this should be acceptable, since such expansion failures should be rare, and an admin can unblock the user by increasing the quota or rebuilding the PVC if needed. We will emit events on the PV and PVC to alert admins and users (an illustrative sketch follows below).
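
The events mentioned above could be emitted through the standard client-go event recorder, roughly as sketched below. The helper name, event reason, and message are placeholders chosen for this example and are not names defined by this KEP.

```go
package events

import (
	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/tools/record"
)

// warnQuotaUsesAllocatedSize is an illustrative helper showing how a warning
// event could be attached to the PVC when quota keeps charging the higher,
// previously requested size.
func warnQuotaUsesAllocatedSize(recorder record.EventRecorder, pvc *v1.PersistentVolumeClaim, allocated, requested string) {
	recorder.Eventf(pvc, v1.EventTypeWarning, "QuotaUsesAllocatedSize",
		"quota is charged for %s (highest requested size) even though the claim now requests %s",
		allocated, requested)
}
```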
