RHOAIENG-2974: Restrict DSCI deletion before DSC #1094
This needs test case updates in webhook_suite_test.go.
7016ad6 to b83b9cb
b83b9cb to c6e8f45
Two small comments; otherwise I am fine with the change.
controllers/webhook/webhook.go
Outdated
@@ -101,12 +103,24 @@ func (w *OpenDataHubWebhook) checkDupCreation(ctx context.Context, req admission
		fmt.Sprintf("Only one instance of %s object is allowed", req.Kind.Kind))
}

func (w *OpenDataHubWebhook) checkDeletion(ctx context.Context, req admission.Request) admission.Response {
	if req.Kind.Kind != "DSCInitialization" {
If we want to have a check here, we could write it as "if kind == DSC", which would be clearer.
done
/retest
There are some failures in the "linter" check. @Sara4994, I will approve once that is fixed.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: zdtsw
2439104 into opendatahub-io:incubation
I might be missing something fundamental, but what I instantly thought when seeing this PR is: why not cascade deletion using owner ref and finalizers? DSCI would be an owner of DSC, so deleting the former (DSCI) would always result in deleting the latter (DSC). With the approach brought by this PR, we force the user to trigger those two deletions manually, in order.
I have the same question as @bartoszmajsak in #1094 (comment) -- why are we not using owner ref and finalizers?
I think one reason for this approach is that if a user mistakenly deleted the DSCI (as the original issue described), cascading would automatically delete the DSC as well, which in turn would delete all the components' resources, etc.
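To make the trade-off in this exchange concrete, here is a minimal sketch of the cascading idea in plain Go. It does not use the Kubernetes garbage collector or any real API; the `cascadeDelete` helper, the name-to-owner map, and the object names are all hypothetical and only model "deleting an owner deletes everything it owns".

```go
package main

import (
	"fmt"
	"sort"
)

// cascadeDelete is a hypothetical model of owner-reference cascading.
// store maps an object name to the name of its owner ("" means no owner).
// It removes the named object and, transitively, everything it owns,
// and returns the removed names in sorted order.
func cascadeDelete(store map[string]string, name string) []string {
	if _, ok := store[name]; !ok {
		return nil
	}
	delete(store, name)
	removed := []string{name}

	// Collect direct dependents first, then recurse into each of them.
	var dependents []string
	for obj, owner := range store {
		if owner == name {
			dependents = append(dependents, obj)
		}
	}
	for _, dep := range dependents {
		removed = append(removed, cascadeDelete(store, dep)...)
	}
	sort.Strings(removed)
	return removed
}

func main() {
	// Assumed ownership chain for illustration: DSCI owns DSC, DSC owns a component.
	store := map[string]string{
		"dsci":      "",
		"dsc":       "dsci",
		"component": "dsc",
	}
	fmt.Println("removed:", cascadeDelete(store, "dsci"))
	fmt.Println("remaining objects:", len(store))
}
```

In this model a single mistaken deletion of the DSCI removes the DSC and the component with it, which is the risk raised above; the webhook approach instead rejects the DSCI deletion and leaves everything in place.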
Conceptually, what is the reason for DSC and DSCI being separate entities? If deleting the DSCI CR produces a broken cluster, aren't the two actually linked so tightly together that it makes sense to treat them as very tightly connected? If the user/customer has the Operator nonfunctional but the components (seem to) work, is that a supported configuration? And yes, I understand that implications of this would need to be figured out / investigated. So the webhook approach might be a reasonable stop gap.
The very old idea (I was not involved) was to have one DSCI per cluster while allowing multiple DSCs in one cluster (I can only guess it was preparation for the multi-tenant case). Later on this was changed, so that only one DSC and one DSCI are supported per cluster.
* RHOAIENG-2974: Restrict DSCI deletion before DSC
* Added tests and minor enhancements
* Admission allow DSC
* lint fix
(cherry picked from commit 2439104)
Add DELETE operation for webhook in CSV, upstream 60a44c2 ("update: cleanup remove kfdef during uninstallation (opendatahub-io#1100)")

* RHOAIENG-2974: Restrict DSCI deletion before DSC
* Added tests and minor enhancements
* Admission allow DSC
* lint fix
(cherry picked from commit 2439104)
Add DELETE operation for webhook in CSV, upstream 60a44c2 ("update: cleanup remove kfdef during uninstallation (opendatahub-io#1100)")
Amend client -> Client due to already applied 03c1abc ("upgrade: controller-runtime and code change accordingly (opendatahub-io#1189)")
Description
This PR fixes a race condition that occurs while uninstalling the odh-operator. During uninstallation, the DSCI instance is sometimes removed first, and the DSC instance is left hanging with an error status, expecting a DSCI instance to be created. Manual deletion does not remove the hanging DSC instance either. This PR addresses the issue by using a webhook to block deletion of the DSCI until the DSC has been deleted.
JIRA issue: https://issues.redhat.com/browse/RHOAIENG-2974
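The rule described above can be sketched in plain Go as follows. This is not the code from the PR: `deleteRequest` and `decision` are hypothetical stand-ins for the controller-runtime admission types, and `dscCount` is assumed to come from listing the existing DSC instances, which is not shown here.

```go
package main

import "fmt"

// deleteRequest is a hypothetical, simplified stand-in for an admission request.
type deleteRequest struct {
	Kind      string // kind of the object the request targets
	Operation string // e.g. "CREATE", "UPDATE", "DELETE"
}

// decision is a hypothetical, simplified stand-in for an admission response.
type decision struct {
	Allowed bool
	Reason  string
}

// checkDeletion denies deleting a DSCInitialization while any DSC instance
// still exists. dscCount is assumed to be obtained by listing DSC instances.
func checkDeletion(req deleteRequest, dscCount int) decision {
	// Only DSCI deletions are restricted; everything else is allowed.
	if req.Operation != "DELETE" || req.Kind != "DSCInitialization" {
		return decision{Allowed: true, Reason: "not a DSCI deletion"}
	}
	if dscCount > 0 {
		return decision{
			Allowed: false,
			Reason: fmt.Sprintf("cannot delete %s while %d DSC instance(s) exist; delete the DSC first",
				req.Kind, dscCount),
		}
	}
	return decision{Allowed: true, Reason: "no DSC instances remain"}
}

func main() {
	req := deleteRequest{Kind: "DSCInitialization", Operation: "DELETE"}
	fmt.Printf("%+v\n", checkDeletion(req, 1)) // a DSC still exists
	fmt.Printf("%+v\n", checkDeletion(req, 0)) // the DSC is already gone
}
```

With this ordering enforced at admission time, the DSCI can no longer disappear ahead of the DSC, so the hanging-DSC state described above should not be reachable through a plain delete request.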
How Has This Been Tested?
The issue was reported as a race condition. To reproduce it manually, the steps below were followed:
Screenshot or short clip
Merge criteria