Add new sections for CRD storageversion updates #4551
Conversation

richabanker commented Mar 15, 2024
- One-line PR description: Add an 'Agreements' section for CRD storageversion updates
- Issue link: StorageVersion API for HA API servers #2339
cc @roycaihw
Force-pushed from 7043b7a to 1b1ade7
cc @jpbetz
@@ -347,6 +349,30 @@ correct order.
[enables]:https://github.com/kubernetes/kubernetes/blob/220498b83af8b5cbf8c1c1a012b64c956d3ebf9b/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go#L703
[filter]:#updating-storageversion

#### Agreements

1. Storageversion updates will be triggered by CRD creates and updates
IIUC you mean SV updates are triggered at CRD create/update watch events.
[Edited on 3/22]: To clarify, I assumed that the proposal here was also "Storageversion updates will not be triggered by CR write requests"-- which is something the old PR does. Please let me know if I misunderstood.
How do we clean up SV if a CRD is deleted?
I agree that we want to reconcile SV with CRDs.
We should be careful how we word this. I'd like to make sure we always reconcile SVs. If we take a purely edge-triggered approach, then if the apiserver (in a single control plane node configuration) crashes, it could miss the event. So we should perform reconciliation fully (i.e. take a level-triggered approach). It's quite possible that during a relist we'll find that an SV is not reconciled with its CRD; we should update it at that time.
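For illustration, here is a minimal sketch of such a level-triggered pass. The `GetSV`/`UpdateSV` callbacks and the overall shape are assumptions made for the sketch, not the actual apiextensions-apiserver code:

```go
package svreconcile

import (
	"context"
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
)

// GetSV and UpdateSV stand in for whatever client the real controller would use
// to read and write StorageVersion objects (hypothetical, for this sketch only).
type GetSV func(ctx context.Context, crdName string) (string, error)
type UpdateSV func(ctx context.Context, crdName, version string) error

// ReconcileAll walks every known CRD (e.g. after an informer relist) and repairs
// any StorageVersion that drifted from the CRD's storage version, so a missed
// watch event cannot leave a stale SV behind.
func ReconcileAll(ctx context.Context, crds []*apiextensionsv1.CustomResourceDefinition, get GetSV, update UpdateSV) error {
	for _, crd := range crds {
		want := storedVersion(crd)
		got, err := get(ctx, crd.Name)
		if err != nil {
			return fmt.Errorf("reading StorageVersion for %s: %w", crd.Name, err)
		}
		if got != want {
			// Level-triggered: fix drift whenever it is observed, not only when
			// a CRD create/update event arrives.
			if err := update(ctx, crd.Name, want); err != nil {
				return fmt.Errorf("updating StorageVersion for %s: %w", crd.Name, err)
			}
		}
	}
	return nil
}

// storedVersion returns the version marked as the storage version in the CRD spec.
func storedVersion(crd *apiextensionsv1.CustomResourceDefinition) string {
	for _, v := range crd.Spec.Versions {
		if v.Storage {
			return v.Name
		}
	}
	return ""
}
```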
IIUC you mean SV updates are triggered at CRD create/update watch events.
That's right. Updated to reflect the same.
How do we clean up SV if a CRD is deleted?
Yeah, I've been thinking about that. We need to introduce a delete capability. There was probably no need for a delete-SV behavior for built-in resources, so I guess we don't have a deleteSV() in staging/src/k8s.io/apiserver/pkg/storageversion as of now?
We need to introduce a delete capability
Agreed. Let's make sure we have a cleanup strategy
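As a rough sketch of what that cleanup could look like, assuming the StorageVersion object for a CRD is named `<group>.<plural>` (mirroring the built-in convention); no such helper exists in k8s.io/apiserver/pkg/storageversion today:

```go
package svcleanup

import (
	"context"
	"fmt"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// DeleteStorageVersionForCRD removes the StorageVersion object tracking a
// deleted CRD. NotFound errors are ignored so the call is idempotent.
// The object naming scheme is an assumption for this sketch.
func DeleteStorageVersionForCRD(ctx context.Context, client kubernetes.Interface, crd *apiextensionsv1.CustomResourceDefinition) error {
	name := fmt.Sprintf("%s.%s", crd.Spec.Group, crd.Spec.Names.Plural)
	err := client.InternalV1alpha1().StorageVersions().Delete(ctx, name, metav1.DeleteOptions{})
	if apierrors.IsNotFound(err) {
		return nil
	}
	return err
}
```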
How do we clean up SV if a CRD is deleted?
Why do we need to clean up SVs when a CRD is deleted?
If a CRD is deleted, it should trigger garbage collection, which means individual CRs (of that CRD) get deleted, which means we have no corresponding entry in etcd for individual objects (of that CRD).
That's a fair point. We could set each SV object's owner reference to its respective CRD, so once the CRDs go away, so should the SV objects. @roycaihw WDYT?
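A sketch of that owner-reference idea, stamping the StorageVersion with its CRD as owner so the garbage collector removes it when the CRD is deleted (illustrative only, not text from the PR):

```go
package svowner

import (
	apiserverinternalv1alpha1 "k8s.io/api/apiserverinternal/v1alpha1"
	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// withCRDOwner returns a copy of sv owned by crd; deleting the CRD then
// triggers deletion of the StorageVersion via the garbage collector
// (both objects are cluster-scoped, so the reference is legal).
func withCRDOwner(sv *apiserverinternalv1alpha1.StorageVersion, crd *apiextensionsv1.CustomResourceDefinition) *apiserverinternalv1alpha1.StorageVersion {
	out := sv.DeepCopy()
	out.OwnerReferences = []metav1.OwnerReference{{
		APIVersion: apiextensionsv1.SchemeGroupVersion.String(),
		Kind:       "CustomResourceDefinition",
		Name:       crd.Name,
		UID:        crd.UID,
	}}
	return out
}
```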
[Edited on 3/22]: To clarify, I assumed that the proposal here was also "Storageversion updates will not be triggered by CR write requests"-- which is something the old PR does. Please let me know if I misunderstood.
Yes, this is correct. I have now clearly laid out the 2 scenarios where we will trigger SV updates: 1. when a CRD create/update watch event is received, and 2. when the apiserver is started.
I'd like to make sure we always reconcile SVs. If we take a purely edge triggered approach then if the apiserver (in a single control plane node configuration) crashes, it could miss the event. So we should perform reconciliation fully (i.e. take a level triggered approach)
@jpbetz fully agreed, thanks for raising this. I have updated the KEP to reflect that we will perform SV updates in 2 scenarios:
- edge triggered scenario: when a CRD is created/updated
- level triggered scenario: when the apiserver starts
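A minimal wiring sketch of those two trigger points, assuming a hypothetical reconciler interface and client-go 1.26+ (where AddEventHandler returns a registration and an error):

```go
package svtrigger

import (
	"context"

	apiextensionsv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
	"k8s.io/client-go/tools/cache"
)

// reconciler is a stand-in for whatever component updates StorageVersions.
type reconciler interface {
	ReconcileOne(ctx context.Context, crd *apiextensionsv1.CustomResourceDefinition) error
	ReconcileAll(ctx context.Context) error
}

// start performs the level-triggered pass first, then registers edge-triggered handlers.
func start(ctx context.Context, crdInformer cache.SharedIndexInformer, r reconciler) error {
	// Level-triggered: on apiserver startup, reconcile every CRD before serving CR writes.
	if err := r.ReconcileAll(ctx); err != nil {
		return err
	}
	// Edge-triggered: react to CRD create/update watch events.
	_, err := crdInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			if crd, ok := obj.(*apiextensionsv1.CustomResourceDefinition); ok {
				_ = r.ReconcileOne(ctx, crd)
			}
		},
		UpdateFunc: func(_, newObj interface{}) {
			if crd, ok := newObj.(*apiextensionsv1.CustomResourceDefinition); ok {
				_ = r.ReconcileOne(ctx, crd)
			}
		},
	})
	return err
}
```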
#### Limitations

When a storageversion of a CRD is updated, we will ensure that all new CR writes:
1. wait for the latest storageversion to be published
If a CRD is being updated to store v2 instead of v1, and two requests come in at the same time:
- Create a new CR
- Update the CRD to store v3 instead of v2
Which one is the latest storageversion here? v2 or v3?
For "at the same time": without a happens-before constraint, I don't think we should define the behavior.
Our guarantee should be that once a response to the CRD Update is sent, a happens-before is established, and all future CR creates store v3 (because clients can observe this ordering). After the CRD update is received and before the response is sent, storage at either version is possible.
Depends on which request is processed by the apiserver first. The following scenarios are possible:

- CR request is received first: it will wait for the v2 update to finish.
  - 1a. If no new CRD update has been received yet, the storageversion is successfully updated to v2, and the CR is written in the latest storageversion at this point, which is v2. The CRD update to v3 is then received, and the storageversion is updated to v3.
  - 1b. If a new CRD update to v3 is received at this time, our async SV update loop will switch to processing the latest storageversion, v3, and will never process the v2 update. CR writes waiting for the v2 update to finish will time out.
- CRD request is received first: our async SV update loop switches to processing the v3 update. If a CR request is now received, it will wait for the v3 update to finish and will be served in the latest storageversion at this point, v3.
#### Limitations

When a storageversion of a CRD is updated, we will ensure that all new CR writes:
1. wait for the latest storageversion to be published
This implies that we will take a write outage for the duration of the storageversion update. We will allow this for the following reasons:
1. blocking CR writes till the SV update is finished is in line with how we handle requests for built-in resources
2. if we update the CRD handler to the new handler **before** a storageversion update completes, we risk a server crash, and objects may be written in the new version while the storageversion API does not reflect that
I don't think it's possible to always avoid this situation. I do think it's possible to always update the CRD handler before SV updates, but I don't think we can guarantee atomicity of the CRD handler update and the SV update (consider network partitions or process crashes happening at inconvenient times). So while it seems nice to try to get the storageversion upgraded in the happy path, I think we also need to tolerate the skew.
If we want to hold all writes indefinitely until the SV is updated, we should spell out the requirements carefully. What happens on apiserver startup when a CRD has been updated to a new storage version but the SV hasn't been updated yet? What happens when the SV update request times out?
In both cases, my understanding is that we lose some form of availability of the CRD until the SV update completes. We should call that out in the limitations and say exactly what is unavailable (CR writes only?).
What happens on apiserver startup when a CRD has been updated to a new storage version but the SV hasn't been updated yet? What happens when the SV update request times out?

As discussed on chat, we will ensure that SVs are always reconciled when the server is started/restarted, before we serve any CR writes.
I have updated this part to explicitly state the 2 scenarios where we will take an outage on CR write requests.
About the SV update timing out, I have included a line saying we will reattempt the update a certain number of times before failing the SV update and, consequently, the dependent CR writes. Does this look ok?
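A small sketch of that bounded retry, using client-go's retry helper; the attempt count and the update callback are assumptions, not values taken from the KEP:

```go
package svretry

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/util/retry"
)

// publishWithRetry reattempts the StorageVersion update a few times before
// giving up; callers would then fail the dependent CR writes with the error.
func publishWithRetry(ctx context.Context, update func(ctx context.Context) error) error {
	backoff := wait.Backoff{Steps: 3, Duration: 50 * time.Millisecond, Factor: 2.0, Jitter: 0.1}
	return retry.OnError(backoff, func(error) bool { return true }, func() error {
		return update(ctx)
	})
}
```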
1. wait for the CRD handler to be replaced

This implies that we will take a write outage for the duration of the storageversion update. We will allow this for the following reasons:
1. blocking CR writes till the SV update is finished is in line with how we handle requests for built-in resources
This is slightly different from built-ins, right? Built-in blocking can only occur once, on apiserver startup, while CR writes can be blocked every time a CRD is updated. Nothing wrong with maintaining the same approach, just something to consider.
That's right: for CRDs, we will always block new CR writes until the latest SV update is finished. Allowing CR writes while the SV update is pending can cause problems, like the old handler not being able to understand the new CRD version that may be persisted in etcd by another server.
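As a rough illustration of that blocking behaviour (the wrapper shape is assumed; it is not the actual custom resource handler): CR writes wait on a signal that the SV updater closes once the latest update is published, and are rejected after a timeout so clients can retry.

```go
package svblock

import (
	"net/http"
	"time"
)

// blockUntilPublished holds CR write requests until svPublished is closed
// (i.e. the latest StorageVersion update completed), or rejects them with
// 503 after the timeout.
func blockUntilPublished(next http.Handler, svPublished <-chan struct{}, timeout time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-svPublished:
			next.ServeHTTP(w, r)
		case <-time.After(timeout):
			http.Error(w, "storage version update in progress", http.StatusServiceUnavailable)
		}
	})
}
```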
…ers/README.md Co-authored-by: Joe Betz <[email protected]>
/lgtm Looks like all my concerns are addressed. I think this reflects my understanding of how it should work.
1. old storage of the same CRD is deleted
2. pending CR writes using old storageversion of the same CRD are completed
2. We will do this to prevent publishing a newer storageversion while a pending CR write is still in progress. Otherwise, if the pending CR write finishes writing the CR using an old version, the storageversion API would not reflect that and the object will remain in etcd, encoded in an old version forever
4. We will block new CR writes for a CRD until we have published its latest storageversion. This is discussed more in the [limitations] section
Curious about the meaning of "published" in this context. Does this imply that any future reads of the StorageVersion resource are guaranteed to reflect the update? (asking as I am not sure myself about what guarantee the apiserver itself provides when it responds to the UPDATE request)
Does this imply that any future reads of the StorageVersion resource are guaranteed to reflect the update?
Yes, this was the intention of this statement: once a CRD update happens and its storageversion has been updated (or published to the SV API), only then will we unblock the CR writes.
@@ -347,6 +349,39 @@ correct order.
[enables]:https://github.com/kubernetes/kubernetes/blob/220498b83af8b5cbf8c1c1a012b64c956d3ebf9b/staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go#L703
[filter]:#updating-storageversion

#### Agreements
What's an “agreement” in this context?
Like a contract / set of principles that we propose to follow while processing CR/CRD requests that may need to interact with possible storageversion updates (for the same CRD) happening in parallel.
New changes are detected. LGTM label has been removed.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: richabanker. The full list of commands accepted by this bot can be found here.
It's pretty detailed and I'm open to change as we go, but it seems like a reasonable place to start.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs. Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community.

/close
@k8s-triage-robot: Closed this PR in response to the /close above.