Don't wait for delete operations to be completed by default #770
Conversation
In its current state it now takes me about 25 seconds for the `kn delete` to complete. Before knative#682 it used to be almost immediate. This is because we now pass in the `DeletePropagationBackground` flag. I believe this is a mistake, not only because of the 20+ seconds of additional time to delete things, but IMO the CLI should talk to the server in the same way regardless of the --wait flag. That flag should just be a CLI thing to indicate whether the user wants the CLI to wait for the server to complete, but not HOW the server should do the delete. Signed-off-by: Doug Davis <[email protected]>
@duglin: 0 warnings.
In response to this:
In its current state it now takes me about 25 seconds for the `kn delete` to complete. Before #682 it used to be almost immediate. This is because we now pass in the `DeletePropagationBackground` flag. I believe this is a mistake, not only because of the 20+ seconds of additional time to delete things, but IMO the CLI should talk to the server in the same way regardless of the --wait flag. That flag should just be a CLI thing to indicate whether the user wants the CLI to wait for the server to complete, but not HOW the server should do the delete.

Signed-off-by: Doug Davis [email protected]
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
It was introduced as part of the synchronous delete operation, to ensure all dependent resources of the service are deleted and that we wait synchronously for the service resource to be gone.
From my local test
I think the biggest issue is that However, I still wonder if But I actually think that's a bit too low level for our CLI. I think the biggest concern for people is "when can I create a new ksvc with the same name w/o getting an error about that name being used?". And the default/DeletePropagationForeground does that. How long it takes to delete other resources that are hidden from the user, and would not influence any subsequent cmd they execute, isn't really that interesting except to a pretty advanced user.
The motivation to have
The overall goal of having a synchronous delete was to ensure there aren't unexpected race conditions between calling CRUD commands in rapid succession, e.g. scripts, E2E tests etc. The case of "service name collision" is on point here as well. I understand. Wrt the delete policy flag, I lean strongly toward the "too low level" approach. The policy should be an opinionated default of synchronous delete. In this case it's part of a delete-everything strategy.
Totally agree, but in this case I think we only need to wait for the ksvc to vanish, not the hidden resources.
Net: 20+ seconds for an out-of-the-box
Kubectl uses cascading delete by default, and we should do that, too. Because if you don't, who is going to delete the revisions of a service afterwards? (afaik, if you don't do a cascading delete, those revisions will just stay around forever, even when they have an ownerReference. tbv) However, I think for
So my suggestion is:
My fear is that if we don't do a cascade delete we are left with orphaned objects.
to be clear... I don't think cascading or not is the issue; either way we'll delete stuff. The question is how we ask the server to do the delete and whether we wait on the CLI side or not. Switching to
I'll repeat what I said above:
I'd recommend that we keep
Wouldn't it make more sense to have a `--cascade-delete` or similar flag that does not default to true? Seems like removing that feature completely is not a step forward...

Also @duglin consider adding tests :)
For me, whether we wait for all objects to be deleted (foreground) or only for the main object to be deleted (background) is also a client concern; there is no different background operation. Server-side, the eventual result will be the same, whether it's foreground or background deletion (with the foreground delete policy relying on the Kubernetes garbage collector, thanks @dsimansk for the link). So the end result will be the same: the object and all its dependent objects are deleted (the option here is actually not related to whether to cascade or not, I misunderstood that). It's just the way the client instructs the deletion, so I'm totally fine that this way differs for
I would always do a cascade delete, as I don't see the use case for a non-cascade delete at the abstraction level of
I'm not thrilled with the idea that
If you believe that the client might care about how the delete happens, then linking it with whether the CLI returns immediately or not would be bad, because this means that someone can't choose 2 of the 4 possible combinations of semantics. In the end, I strongly believe that
So I'd like to suggest that we deal with that first and then have a discussion around whether, and how, to change the semantics of the delete on the server. I still like a new flag to control that because I do believe linking it with
The following is the coverage report on the affected files.
Signed-off-by: Doug Davis <[email protected]>
/test pull-knative-client-go-coverage
Updated PR to just set the default to be
```go
// Special-case 'delete' command so it comes back to the user immediately
noWaitDefault := false
if action == "Delete" {
```
I'd rather see the default value provided through a function argument. The current approach will change the default for all delete operations, e.g. also Revisions. What about a new config struct that for now can have two fields: default value and timeout?
oh I didn't realize you could be more specific... I'll fix....
btw, I would be happy to switch to `--no-wait` as the default for all delete operations (this would then also be easy to reason about).
I do like consistency :-) We just need to figure out which way it spans. Making all deletes the same (as long as it doesn't have negative side-effects) seems ok to me. @dsimansk ?
I wasn't sure about all the usage. However, wait flags are used in Service (delete) and Revision (delete); no objections to go with `--no-wait` for all deletes then.
ok - if we do that, then I think the PR is ready for review
Weird - the CLA bot was happy yesterday
/check-cla
@duglin this error: "cla/google — CLAs are signed, but unable to verify author consent" seems odd? You clearly have signed the CLA... not sure what's going on. First time seeing this.
@rgregg any ideas on the CLA issue?
@googlebot rescan
OK CLA check is now happy - ready for review
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: duglin, rhuss The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing