-
Notifications
You must be signed in to change notification settings - Fork 382
Delete field for instance and binding #2037
Changes from 9 commits
bf96b33
63f3c31
0859979
d3fae37
a5f126f
93dad71
fd8ca77
4821928
fe2f92a
283522c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
# Declarative Delete Field | ||
|
||
Concept committed to [during 2018 April 12F2F](https://docs.google.com/document/d/1O7_fws7hwZ6qV3okAjbV5qdFaEq4bt1LYJ9bTu-IwvU) | ||
|
||
This document 2018-05. | ||
|
||
## Abstract | ||
|
||
Add a field to instance and binding to indicate that they have been asked to | ||
send a DELETE to the backend. | ||
|
||
## Motivation | ||
|
||
When the Originating-Identity header is used, the broker and refuse a delete | ||
call when the user info in the Originating-Identity header does not have | ||
permissions to delete the instance or binding. As this happens after kubernetes | ||
has accepted the DELETE as an true indication to do a delete, it has set the | ||
DeletionTimestamp, which is an irreversible step on the way to carrying out the | ||
necessary actions to delete a kubernetes resource. | ||
|
||
## Proposed Design | ||
|
||
- Add a field to indicate deletion | ||
- Add a condition to indicate permanent end state of deletion success. | ||
|
||
Once the field is activated the controller deprovisions the broker | ||
resource. When successfully deleted the controller then deletes the | ||
kubernetes resource. | ||
|
||
A feature flag to enable/disable this behavior. | ||
|
||
RBAC changes to allow the controller to issue deletes. | ||
|
||
Helm chart changes to enable the flag and the rbac rules. | ||
|
||
### API Resource Changes | ||
|
||
Add a condition for ServiceInstanceConditionType, ServiceBindingConditionType. | ||
```go | ||
ServiceInstanceConditionDeleted ServiceBindingConditionType = "Deleted" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. copy/pasta error here |
||
ServiceBindingConditionDeleted ServiceBindingConditionType = "Deleted" | ||
``` | ||
|
||
Add a field to spec of ServiceInstance, ServiceBinding. | ||
```go | ||
ExistsExternally bool `json:"existsExternally,omitempty"` | ||
``` | ||
|
||
This field is set to true by default. The user indicates that it wants | ||
to delete the backing resource by setting this field to false. The | ||
controller will see this update and attempt to delete the backing | ||
broker resource. If the broker delete succeeds, it sets the deleted | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The happy path is obvious, it works without this new flag (via normal Kubernetes deletion). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The happy-path is the currently broken path. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @MHBauer for example? Currently, if resource is successfully deprovisioned, we just remove the finalizer and proceed with deletion. Do I miss something? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This isn't about successful vs uncsuccessful. This is about the delete not being allowed by the broker, yet kube will allow it. Once kube has allowed it, it cannot be unallowed. This denies the appropriate user the ability to do updates or deletes at all. The broker resource is stuck existing. |
||
status condition. | ||
|
||
If the field is set to false, no updates are allowed besides | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why this limitation? Perhaps you want to stage a bunch of updates to a non-existent resource. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is a simpler case. What would it mean to update something that is going away? If there is some update required for the deletion to go through, I would make the update first and make sure it is committed before deleting. What would staging updates to a non-existent resource mean? What would it do? Once the resource is gone, it is not going to come back. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ExistsExternally=false means the broker holds no resource for you and you are staging changes or you shutdown the external resource. flipping to true means provision it again.
I am staging changes, or I am getting a whole set of resources preped for a deploy. Or I want to shut the services down for a bit and then intend to bring them back later.
Stop thinking of it as deleting, it is deprovisiong, and it can be provisioned again in the same way.
Let it come back if ExistsExternally = true. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. https://github.com/kubernetes-incubator/service-catalog/pull/2037/files#r189993692 |
||
setting the field back to true. | ||
|
||
Once the status is changed to deleted the controller should issue the | ||
DELETE to the apiserver removing the resource. | ||
|
||
### API Server Changes | ||
|
||
Additional field results in generated code changes. | ||
|
||
Validation/Strategy to prevent changes after a deletion is started and is successful or | ||
until the deletion is rolled back. | ||
|
||
### Controller-Manager Changes | ||
|
||
Additional condition to determine delete flow is added. This condition is | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What safe-gaurds need to be added to prevent Also this will change how There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't understand the case. If someone calls delete directly after setting that, it's not an update. We can't block deletes anyway, so it'll add the finalizer as usual and go through the existing flow. Part of this work should be coming up with rbac rules to prevent access to DELETE. I think this is an extension of rbac that does not currently exist. Created #2061 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think you are making the assumption that marking ExistsExternally=false means service catalog will garbage collect that resource at some point in the future, acting as a delayed delete. This is not the usecase we are suggesting at all. We are saying the resource continues to exist in k8s indefinitely until a real DELETE is called, as normal. RBAC could be applied to prevent normal people from removing resources and an elevated robot account could garbage collect, but I think this should happen externally to service catalog. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yea, that's not my assumption. A workflow of:
is not a very good UX. I'd prefer to not force them to do steps 2 and 3 when we can do it for them. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand the feature you want, and I do not believe it is a feature anyone else wants. A feature that would do what you want and still be useful to the community would be the described ExistsExternally functionality + a robot script on your side gives you the workflow you want. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, its a svc-cat feature but its being done in an attempt to be OSBAPI compliant. From past discussions no one else seems to care about this aspect of the spec and are comfortable with the idea of a broker saying "no" after the Kube delete process has started, and can't be stopped. So, I'm looking for a nice UX solution for people who will actually use this feature, not for people who won't. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. #2037 (comment) The summary seems to be that we agree that we need a field to indicate that we want this process to start. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think we agree at all yet. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@n3wscott it is a confusing feature for Kubernetes users, who are the target users of Service Catalog, not the ones who know OSB API. Normal users won't set the flag, they will just So I would argue that this delete flag is only useful for emulating OSBAPI deprovisioning semantics to address a corner case of rejected deleted request. If there was no such corner case, users would never ask for deprovisioning without Kubernetes delete. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exactly, a corner case caused by the interaction between the differing behaviors of kubernetes and osbapi. |
||
dependent on the state of the new field. Reuse all applicable status | ||
objects as appropriate. | ||
|
||
Proceed through the normal deletion flow and if the broker allows the delete, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. and if not? |
||
set a new condition indicating the delete has occurred. If the delete has | ||
occurred, when running the finalizer, allow the delete to finish and occur | ||
without any additional actions involving broker communication. | ||
|
||
|
||
|
||
### SVCat CLI Changes | ||
|
||
New types result in test input/output changes to golden files. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -774,6 +774,12 @@ type ServiceInstanceSpec struct { | |
// allows for parameters to be updated with any out-of-band changes that have | ||
// been made to the secrets from which the parameters are sourced. | ||
UpdateRequests int64 | ||
|
||
// DeletionRequested provides a mechanism for requesting a delete without | ||
// performing a direct delete (which cannot be undone in cases where the | ||
// delete is rejected by the broker). This is the recommended way to | ||
// delete, to avoid resources becoming stuck in the deletion phase. | ||
DeletionRequested metav1.Time | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I still think this should be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. with this being a timestamp, do we expect people to actually set the time? That seems odd to me. And what does its value mean? What if its Jan 1, 1970? What is its Dec 31, 2020? Are we really supporting scheduled deletes? Is that even a thing in kube? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. people should not set the time, svcat would, and if we can hijack the k8s delete command, that would. time before now means do a delete. So if Jan 1, 1970 and the resource is not deleted, try to delete it. k8s does not support a future date, see deletionTimestamp There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Did we agree to hijack delete? And we need it to work with vanilla kubectl, so I do not think asking people to set a time is a good UX, when 99+% of the time I bet they mean "do it now". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think it is bad UX to not have a time tagged to when you attempted that action. How do you know if the status changed before of after a delete was requested? How do you know when service catalog should use a bigger hammer to unbind or deprovision the thing? the validator could hijack the setting of these field to be set to now. so user sets it to anything they want, including true or now, and then the validator replaces that with time.Now(). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not against there being a timestmap someplace. What I'm against is making the user set a timestamp as their UX for delete(). That's a weird design. The normal path for Kube is the user does an HTTP DELETE and as a side effect a timestamp is set. That's very different. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We are outside the area of calling DELETE directly on any of our objects. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Personally I am fine with keeping the name as it, with a single field that's just a timestamp and relying on svcat to handle calling it appropriately. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If we stay with an api server, the clear choice is to make a sub-resource for each type to do the deprovision or unbind, and those methods modify this field. then the user flow is |
||
} | ||
|
||
// ServiceInstanceStatus represents the current status of an Instance. | ||
|
@@ -1025,6 +1031,12 @@ type ServiceBindingSpec struct { | |
// settable by the end-user. User-provided values for this field are not saved. | ||
// +optional | ||
UserInfo *UserInfo | ||
|
||
// DeletionRequested provides a mechanism for requesting a delete without | ||
// performing a direct delete (which cannot be undone in cases where the | ||
// delete is rejected by the broker). This is the recommended way to | ||
// delete, to avoid resources becoming stuck in the deletion phase. | ||
DeletionRequested metav1.Time | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I still think this should be |
||
} | ||
|
||
// ServiceBindingStatus represents the current status of a ServiceBinding. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -865,6 +865,12 @@ type ServiceInstanceSpec struct { | |
// been made to the secrets from which the parameters are sourced. | ||
// +optional | ||
UpdateRequests int64 `json:"updateRequests"` | ||
|
||
// DeletionRequested provides a mechanism for requesting a delete without | ||
// performing a direct delete (which cannot be undone in cases where the | ||
// delete is rejected by the broker). This is the recommended way to | ||
// delete, to avoid resources becoming stuck in the deletion phase. | ||
DeletionRequested metav1.Time `json:"deletionRequested,omitempty"` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this still the right name now that it's not a bool? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nope |
||
} | ||
|
||
// ServiceInstanceStatus represents the current status of an Instance. | ||
|
@@ -1127,6 +1133,12 @@ type ServiceBindingSpec struct { | |
// settable by the end-user. User-provided values for this field are not saved. | ||
// +optional | ||
UserInfo *UserInfo `json:"userInfo,omitempty"` | ||
|
||
// DeletionRequested provides a mechanism for requesting a delete without | ||
// performing a direct delete (which cannot be undone in cases where the | ||
// delete is rejected by the broker). This is the recommended way to | ||
// delete, to avoid resources becoming stuck in the deletion phase. | ||
DeletionRequested metav1.Time `json:"deletionRequested,omitempty"` | ||
} | ||
|
||
// ServiceBindingStatus represents the current status of a ServiceBinding. | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if deprovision via OSB fails? There are various cases, e.g.:
400 BadRequest
,403 Forbidden
or some other 4xx client-side error that should have no side effects5xx
server errorlast_operation
returns failureThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this the normal cases that are already handled? We're doing a deprovision outside of calling delete. Everything else about the broker operation and responses is the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand. What's the point then? What do you achieve by doing exactly the same as with normal delete?
As far as I understand, the only reason for adding new flag is because you can't "undelete" Kubernetes resource: once "deletionTimestamp" is added, you can't remove it and can't change the spec. And the only reason for wanting to "undelete" is when broker rejects a deprovisioning request.
Do you unset the flag or allow updating the spec in some cases? If not then I don't see the point of having it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, in case you don't know. Currently when user marks
ServiceInstance
for deletion, it doesn't get deleted by GC immediately, because we have a custom "Service Catalog finalizer" added to it. Once deprovisioning succeeds, svc-cat controller removes the finalizer and GC proceeds with deletion. If not, we retry until exceed limits, and then get stuck in the "failed to deprovision" state (and the only way to delete this ServiceInstance resource is to manually delete the finalizer).It looks like you try to emulate exactly the same mechanics in your design. Why???
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because, once the deletion timestamp is added, it cannot be removed. The bool can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if you remove it in the middle of deprovisioning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on when you mean in the middle. If it's after the deprovision call, that is synchronous and is a yes or no. If we catch it before the deprovision, it wouldn't do it. And if we catch it after, it's already done. We should prevent it from being unset, but the key thing is that the status changes to some form of "gone, permanently". I never proposed and do not intend to implement or support flipping back and causing a (re)provision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so then you'll need to add some extra flag like
permanentDeletion bool
to the status.