Dont wait for delete operations to be completed by default #770

duglin · 2020-04-01T17:32:54Z

In its current state it now takes me about 25 seconds for the kn delete
to complete. Before #682 it used to be
almost immediate. This is because we now pass in the
DeletePropagationBackground flag. I believe this is a mistake, not only
because of the 20+ seconds of additional time to delete things, but IMO
the CLI should talk to the server in the same way regardless of the --wait
flag. That flag should just be a CLI thing to indicate if the user wants the CLI
to wait for the server to complete but not HOW the server should do the delete.

Signed-off-by: Doug Davis [email protected]

knative-prow-robot

@duglin: 0 warnings.

dsimansk · 2020-04-02T12:57:53Z

It was introduced as part of the synchronous delete operation, to ensure that all dependent resources of the service are deleted and that we wait synchronously for the service resource to be gone.

DeletePropagationBackground is added for --no-wait and shouldn't cause any blocking. AFAIK it should be the cluster default behaviour, but I do agree we could pass an empty value to ensure the server-side default is honoured.
DeletePropagationForeground is the current default and is configured by the --wait flag. It's part of the synchronous delete operation; otherwise there would have to be some other mechanism to ensure removal of everything related to the service.

From my local test, delete --no-wait is immediate and delete --wait takes some time, as expected from a synchronous operation.
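
For illustration, the mapping described in this comment could be sketched in Go roughly as follows. This is not the code from the PR: `propagationFor` is a hypothetical helper, and the two constants only mirror the string values of the Kubernetes `metav1.DeletePropagationForeground` and `metav1.DeletePropagationBackground` constants so that the example runs without external modules.

```go
package main

import "fmt"

// DeletionPropagation mirrors the string values of Kubernetes'
// metav1.DeletionPropagation type. It is redefined here (an assumption
// made for this sketch) so that no external module is needed.
type DeletionPropagation string

const (
	DeletePropagationForeground DeletionPropagation = "Foreground"
	DeletePropagationBackground DeletionPropagation = "Background"
)

// propagationFor is a hypothetical helper. It returns the policy that the
// comment above associates with each setting of the wait flag:
// --wait maps to foreground, --no-wait maps to background.
func propagationFor(wait bool) DeletionPropagation {
	if wait {
		return DeletePropagationForeground
	}
	return DeletePropagationBackground
}

func main() {
	fmt.Println("--wait    ->", propagationFor(true))
	fmt.Println("--no-wait ->", propagationFor(false))
}
```

The point of contention in the rest of the thread is exactly this coupling: a single boolean decides both how long the CLI blocks and what is sent to the server.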

duglin · 2020-04-02T13:12:07Z

I think the biggest issue is that --no-wait is not the default for kn, so the new 20+ second delay I was seeing was very noticeable/unexpected and gave me a bad UX. Perhaps just changing the default would be sufficient.

However, I still wonder if --no-wait should change what we send to the server. I suspect many folks will think of it the way I did... it's just a CLI thingy that only changes how long it takes for my cmd prompt to come back to me - it doesn't change what we tell the server to do - which I believe is true for the --no-wait flags on our other commands. If we want the user to be able to influence the DeletePropagation flag then I think we should introduce a flag that specifically allows the user to control that knob w/o overloading the --no-wait flag.

But I actually think that's a bit too low level for our CLI. I think the biggest concern for people is "when can I create a new ksvc with the same name w/o getting an error about that name being used?". And the default/DeletePropagationForeground does that. How long it takes to delete other resources that are hidden from the user, and would not influence any subsequent cmd they execute, isn't really that interesting except to a pretty advanced user.

dsimansk · 2020-04-02T13:45:24Z

I think the biggest issue is that --no-wait is not the default for kn, so the new 20+ second delay I was seeing was very noticeable/unexpected and gave me a bad UX. Perhaps just changing the default would be sufficient.

The motivation for having --wait as the default is alignment with create and update. Personally, I don't have a strong preference here.

However, I still wonder if --no-wait should change what we send to the server. I suspect many folks will think of it the way I did... it's just a CLI thingy that only changes how long it takes for my cmd prompt to come back to me - it doesn't change what we tell the server to do - which I believe is true for the --no-wait flags on our other commands. If we want the user to be able to influence the DeletePropagation flag then I think we should introduce a flag that specifically allows the user to control that knob w/o overloading the --no-wait flag.

But I actually think that's a bit too low level for our CLI. I think the biggest concern for people is "when can I create a new ksvc with the same name w/o getting an error about that name being used?". And the default/DeletePropagationForeground does that. How long it takes to delete other resources that are hidden from the user, and would not influence any subsequent cmd they execute, isn't really that interesting except to a pretty advanced user.

The overall goal of having a synchronous delete was to ensure there aren't unexpected race conditions when CRUD commands are called in rapid succession, e.g. in scripts, E2E tests etc. The case of "service name collision" is on point here as well. I see the Service resource as an umbrella that should ensure removal of everything related to it, to prevent unwanted dangling resources.

Wrt the delete policy flag, I lean strongly toward the "too low level" view. The policy should be an opinionated default of synchronous delete. In this case it's part of the delete-everything strategy.

duglin · 2020-04-02T14:05:41Z

The motivation to have --wait as default is alignment to create or update. Personally, I don't have a strong preference here.

Totally agree, but in this case I think we only need to wait for the ksvc to vanish, not the hidden resources.

duglin · 2020-04-02T14:06:16Z

Net: 20+ seconds for an out-of-the-box kn service delete is not good :-)

rhuss · 2020-04-02T15:24:31Z

Kubectl uses cascading delete by default, and we should do that, too. Because if you don't do that, who is going to delete the revisions of a service afterwards? (AFAIK, if you don't do a cascading delete, those revisions will just stay around forever, even when they have an ownerReference; to be verified.)

However, I think for delete we should really switch to async by default. When you delete something, you usually don't care about the thing anymore, which is different from creating something that you might want to use immediately.

So my suggestion is:

Do cascading delete all the time
Switch to async delete by default; waiting can be switched on with --wait. We will probably have to adapt the E2E tests, too, to add --wait

rhuss · 2020-04-02T15:32:54Z

My fear is if we don't do a cascade delete we are left with orphaned objects.

duglin · 2020-04-02T15:36:52Z

To be clear... I don't think cascading or not is the issue; either way we'll delete stuff. The question is how we ask the server to do the delete and whether we wait on the CLI side or not.

Switching to --no-wait by default is fine, as long as it will at least wait until the ksvc is gone - and that's always been quick for me and what kn used to do. I think the issue here is whether kn's --wait flag is just a CLI thing or not because we're mixing two concepts into one with the current code base.

I'll repeat what I said above:

However, I still wonder if --no-wait should change what we send to the server. I suspect many folks will think of it the way I did... it's just a CLI thingy that only changes how long it takes for my cmd prompt to come back to me - it doesn't change what we tell the server to do - which I believe is true for the --no-wait flags on our other commands. If we want the user to be able to influence the DeletePropagation flag then I think we should introduce a flag that specifically allows the user to control that knob w/o overloading the --no-wait flag.

I'd recommend that we keep --wait controlling how long the CLI waits for the server to complete its task - strictly a CLI-side flag. We can then discuss a second flag to indicate what that "task" is - meaning whether it is what it used to do (background delete propagation) or a foreground delete.

maximilien

Wouldn't it make more sense to have a --cascade-delete or similar flag that does not default to true? Seems like removing that feature completely is not a step forward...

maximilien · 2020-04-02T15:48:39Z

Also @duglin consider adding tests :)

rhuss · 2020-04-02T16:24:39Z

For me, whether we wait for all objects to be deleted (foreground) or only for the main object to be deleted (background) is also a client concern; there is no different background operation. Server-side, the eventual result will be the same whether it's foreground or background deletion (with the foreground delete policy relying on the Kubernetes garbage collector, thanks @dsimansk for the link).

So the end result will be the same: the object and all its dependent objects are deleted (the option here is actually not related to whether to cascade or not; I misunderstood that).

It's just a matter of how the client instructs the deletion, so I'm totally fine that this differs for --wait vs --no-wait, to make --no-wait as fast as possible and --wait as safe as possible (important for automated use cases like our e2e tests):

Use the options that @dsimansk has selected for --wait (foreground) and --no-wait (background), so keep the original code
However: make --no-wait the default, for the reasons mentioned in #770 (comment), to keep a good interactive UX.

rhuss · 2020-04-02T16:26:31Z

Wouldn't it make more sense to have a --cascade-delete or similar flag that does not default to true? Seems like removing that feature completely is not a step forward...

I would always do a cascade delete, as I don't see the use case for a non-cascade delete at the abstraction level of kn. The question is whether we do the cascade delete explicitly (foreground) or rely on the k8s garbage collector (background).
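
The difference between the two policies can be made concrete with a toy in-memory model. This is a deliberately simplified sketch, not Kubernetes code: the `cluster` type, its methods and the single-pass `gcPass` are assumptions made for illustration, and only one level of ownership is modelled. What it tries to show is the behaviour discussed in this thread: both policies end with everything deleted, but with background deletion the owner disappears at once, while with foreground deletion it stays visible until its dependents are gone.

```go
package main

import "fmt"

// object is a toy stand-in for a Kubernetes resource.
type object struct {
	owner       string // name of the owning object; empty for top-level objects
	terminating bool   // set by a foreground delete until dependents are gone
}

type cluster struct {
	objects map[string]*object
}

func newCluster() *cluster {
	return &cluster{objects: map[string]*object{}}
}

func (c *cluster) create(name, owner string) {
	c.objects[name] = &object{owner: owner}
}

func (c *cluster) exists(name string) bool {
	_, ok := c.objects[name]
	return ok
}

func (c *cluster) hasDependents(name string) bool {
	for _, o := range c.objects {
		if o.owner == name {
			return true
		}
	}
	return false
}

// deleteBackground removes the owner immediately; dependents are left for
// the garbage collector.
func (c *cluster) deleteBackground(name string) {
	delete(c.objects, name)
}

// deleteForeground only marks the owner as terminating; it stays visible
// until the garbage collector has removed its dependents.
func (c *cluster) deleteForeground(name string) {
	if o, ok := c.objects[name]; ok {
		o.terminating = true
	}
}

// gcPass is one pass of a simplified garbage collector.
func (c *cluster) gcPass() {
	// Remove dependents whose owner is gone or terminating.
	for name, o := range c.objects {
		if o.owner == "" {
			continue
		}
		if owner, ok := c.objects[o.owner]; !ok || owner.terminating {
			delete(c.objects, name)
		}
	}
	// Remove terminating owners that have no dependents left.
	for name, o := range c.objects {
		if o.terminating && !c.hasDependents(name) {
			delete(c.objects, name)
		}
	}
}

func main() {
	for _, mode := range []string{"background", "foreground"} {
		c := newCluster()
		c.create("hello", "")
		c.create("hello-00001", "hello")
		if mode == "background" {
			c.deleteBackground("hello")
		} else {
			c.deleteForeground("hello")
		}
		fmt.Printf("%s: service visible before GC: %v\n", mode, c.exists("hello"))
		c.gcPass()
		fmt.Printf("%s: objects left after GC: %d\n", mode, len(c.objects))
	}
}
```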

duglin · 2020-04-02T17:43:57Z

I'm not thrilled with the idea that --wait has different semantics based on the CLI cmd. If it means "CLI returns immediately but the server action is the same" for everything except delete then I think that inconsistency is a problem and not an ideal UX.

If you believe that the client might care about how the delete happens, then linking it with whether the CLI returns immediately or not would be bad, because this means that someone can't choose 2 of the 4 possible combinations of semantics.

In the end, I strongly believe that kn service delete foo needs to return as quickly as possible and that it means I can reuse that Ksvc name immediately, like it used to. Forcing a flag to get those semantics would be bad IMO.

So I'd like to suggest that we deal with that first and then have a discussion around whether, and how, to change the semantics of the delete on the server. I still like a new flag to control that because I do believe linking it with --wait is mixing topics/concerns.
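
To make the "4 possible combinations" concrete, here is a small sketch of the decoupled design using only Go's standard `flag` package. It is an illustration under stated assumptions, not kn's implementation: the `--propagation` flag, the `deleteOptions` type, the `parseDeleteFlags` function and the chosen defaults are all hypothetical. Only `--wait` and the two policy names come from the discussion.

```go
package main

import (
	"flag"
	"fmt"
)

// deleteOptions keeps the two concerns in separate fields.
type deleteOptions struct {
	Wait        bool   // client-side only: block until the resource is gone
	Propagation string // sent to the server: "Foreground" or "Background"
}

// parseDeleteFlags is a hypothetical parser in which neither flag
// influences the other, so all four combinations can be selected.
func parseDeleteFlags(args []string) (deleteOptions, error) {
	var opts deleteOptions
	fs := flag.NewFlagSet("delete", flag.ContinueOnError)
	fs.BoolVar(&opts.Wait, "wait", false, "wait in the CLI until the resource is gone")
	fs.StringVar(&opts.Propagation, "propagation", "Background",
		"hypothetical flag: deletion propagation policy sent to the server")
	if err := fs.Parse(args); err != nil {
		return opts, err
	}
	if opts.Propagation != "Foreground" && opts.Propagation != "Background" {
		return opts, fmt.Errorf("invalid propagation policy %q", opts.Propagation)
	}
	return opts, nil
}

func main() {
	for _, args := range [][]string{
		{},
		{"--wait"},
		{"--propagation=Foreground"},
		{"--wait", "--propagation=Foreground"},
	} {
		opts, err := parseDeleteFlags(args)
		fmt.Printf("%v -> %+v (err: %v)\n", args, opts, err)
	}
}
```

With the coupled design discussed earlier in the thread, only the first and last of these four argument lists would be expressible.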

knative-metrics-robot · 2020-04-02T23:18:08Z

The following is the coverage report on the affected files.
Say /test pull-knative-client-go-coverage to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/kn/commands/wait_flags.go	100.0%	87.5%	-12.5

Signed-off-by: Doug Davis <[email protected]>

duglin · 2020-04-03T00:30:28Z

/test pull-knative-client-go-coverage

duglin · 2020-04-03T01:09:07Z

Updated PR to just set the default to be --no-wait per our slack chat

dsimansk · 2020-04-03T08:25:18Z

pkg/kn/commands/wait_flags.go

+
+	// Special-case 'delete' command so it comes back to the user immediately
+	noWaitDefault := false
+	if action == "Delete" {


I'd rather see the default value provided through a function argument. The current approach will change the default for all delete operations, e.g. also for Revisions. What about a new config struct that for now has two fields: default value and timeout?

duglin · Apr 3, 2020

Oh, I didn't realize you could be more specific... I'll fix...

rhuss · Apr 3, 2020

By the way, I would be happy to switch to no-wait as the default for all delete operations (this would then also be easy to reason about).

duglin · Apr 3, 2020

I do like consistency :-) We just need to figure out which way it spans. Making all deletes the same (as long as it doesn't have negative side-effects) seems ok to me. @dsimansk ?

dsimansk · Apr 3, 2020

I wasn't sure about all the usages. However, wait flags are used in Service (delete) and Revision (delete), so no objections to going with --no-wait for all deletes then.

duglin · Apr 3, 2020

OK - if we do that, then I think the PR is ready for review

duglin · 2020-04-03T13:20:06Z

Weird - the CLA bot was happy yesterday

dsimansk · 2020-04-03T14:55:02Z

/check-cla

maximilien · 2020-04-03T19:14:39Z

@duglin this error: "cla/google — CLAs are signed, but unable to verify author consent" seems odd? You clearly have signed the CLA... not sure what's going on. First time seeing this.

duglin · 2020-04-03T19:53:05Z

@rgregg any ideas on the CLA issue?

duglin · 2020-04-04T18:41:56Z

@googlebot rescan

duglin · 2020-04-04T18:42:44Z

OK CLA check is now happy - ready for review

rhuss · 2020-04-08T10:16:04Z

/lgtm
/approve

knative-prow-robot · 2020-04-08T10:16:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: duglin, rhuss

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [rhuss]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

* (refactor) address the e2e extract / refactor of issue #763 (#765)
  * (refactor) address the e2e extract / refactor of issue #763
  * various updates to address reviewers feedback
  * renamed lib/test/integration to lib/test and package to test

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: CHANGELOG.adoc, test/e2e/service_export_import_apply_test.go, test/e2e/trigger_test.go
* fix(plugin): Fix plugin lookup with file ext on Windows (#774)
  * fix(plugin): Fix plugin lookup with file ext on Windows
  * chore: Update changelog
  * fix: Reflect review feedback
  * fix: Reflect review feedback and add future todo

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: CHANGELOG.adoc
* fix(issue #762): correct error message when updating service (#778)
  * fix(issue #762): correct error message when updating service
  * correct message when updating service and passing many names
  * fix issue with TestServiceUpdateWithMultipleImages running create vs update
  * added TestServiceDescribeWithMultipleNames
  * added TestServiceCreateWithMultipleNames
  * fix error message for service delete since many names can be passed
* Use vendored deps while running e2e locally (#783)
  Also set GO111MODULE=on unconditionally
* Update sink binding create usage string (#785)
* Add "--target-utilization" to manage "autoscaling.knative.dev/targetUtilizationPercentage" annotation (#788)
  * Support setting "autoscaling.knative.dev/targetUtilizationPercentage" annotation.

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: test/e2e/service_options_test.go
* Remove the delete propagation flag (#770)
  * Remove the delete propagation flag

    In it's current state it now takes me about 25 seconds for the `kn delete` to complete. Before #682 it used to be almost immediate. This is because we now pass in the `DeletePropagationBackground` flag. I believe this is a mistake, not only because of the 20+ seconds of additional time to delete things, but IMO the CLI should talk to the server in the same way regardless of the --wait flag. That flag should just be a CLI thing to indicate if the user wants the CLI to wait for the server to complete but not HOW the server should do the delete.

    Signed-off-by: Doug Davis <[email protected]>
  * try just tweaking the --no-wait flag

    Signed-off-by: Doug Davis <[email protected]>
* Fix error when output is set to name (#775)
  * fix error when output is set to name
  * add e2e test
  * change to flags/listprint.go

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: test/e2e/basic_workflow_test.go
* Show all revisions when run `service describe -v` (#790)
  * The `kn service describe -v` command shows repetitive revisions, because the revision would be covered by next one.
* Fix resource listing with -oname flag (#799)
  * Fix resource listing with -oname flag
  * add e2e tests

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: test/e2e/ping_test.go, test/e2e/revision_test.go, test/e2e/route_test.go, test/e2e/source_apiserver_test.go, test/e2e/source_binding_test.go, test/e2e/trigger_test.go
* Make wait, no-wait and async flags per bool var CLI convention (#802)
  * Make wait, no-wait and async flags per bool var CLI convention

    Fixes #800
    - Deprecated bool vars can be supported for CLI convention
    - Bind --async flag value to --no-wait
    - Only one flag among [wait, no-wait, async] can be provided, else raise an error
  * Simplify conditionals
  * Add unit tests for deprecated flag async
  * Fix a typo
* e2e: Foreground delete for revisions and services in e2e (#794)
  * e2e: Foreground delete for revisions and services in e2e to avoid any race conditions and flakes
  * Use --wait instead of --no-wait=false

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: test/e2e/basic_workflow_test.go, test/e2e/revision_test.go
* e2e: Run tekton e2e against pipeline v0.11.1 (#803)
  * Use buildah task from master branch and paramterize FORMAT
  * Configure pipeline v0.11.1
  * DNM: Run tekton e2e in this PR
  * Revert "DNM: Run tekton e2e in this PR"
    This reverts commit 903f5be.
* Update CHANGELOG for v0.13.2 (#804)
* Pin serving to v0.13.2 and update version command (#797)
  * Pin serving v0.13.2 dep to v0.13.2
  * Update version command now points to serving v0.13.2 and eventing v0.13.6
  * Copy go.sum as generated in CI

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: go.mod, go.sum, vendor/modules.txt
* add missing vendored files
* fixed error reporting for traffics tests
* Updated test
* fix formatting
* e2e for service export (#739)
  * e2e for service export
  * e2e for service export
  * e2e for service export
  * e2e for service export
  * e2e for service export

  Signed-off-by: Roland Huß <[email protected]>
  Conflicts: test/e2e/service_export_import_apply_test.go

Co-authored-by: dr.max <[email protected]>
Co-authored-by: David Simansky <[email protected]>
Co-authored-by: Navid Shaikh <[email protected]>
Co-authored-by: Lv Jiawei <[email protected]>
Co-authored-by: Doug Davis <[email protected]>
Co-authored-by: Ying Chun Guo <[email protected]>
Co-authored-by: Murugappan Chetty <[email protected]>

googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Apr 1, 2020

knative-prow-robot reviewed Apr 1, 2020

knative-prow-robot requested review from cppforlife and evankanderson April 1, 2020 17:33

knative-prow-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Apr 1, 2020

maximilien suggested changes Apr 2, 2020

knative-prow-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 2, 2020

try just tweaking the --no-wait flag

0a14e7f

Signed-off-by: Doug Davis <[email protected]>

duglin force-pushed the fixDelTime branch from fd42204 to 0a14e7f Compare April 2, 2020 23:51

dsimansk reviewed Apr 3, 2020

knative-prow-robot assigned rhuss Apr 8, 2020

knative-prow-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 8, 2020

knative-prow-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 8, 2020

knative-prow-robot merged commit de12484 into knative:master Apr 8, 2020

navidshaikh changed the title ~~Remove the delete propagation flag~~ Dont wait for delete operations to be completed by default Apr 15, 2020

navidshaikh added the backport/candidate Consider this PR to be backported to the release branch label Apr 15, 2020

rhuss mentioned this pull request Apr 15, 2020

Pr/release 0.13.2 backports #806

Merged

navidshaikh added backport/pr A backport PR which is target to a release branch. and removed backport/candidate Consider this PR to be backported to the release branch labels Apr 20, 2020

rhuss added backported-to/0.13 and removed backport/pr A backport PR which is target to a release branch. labels Apr 20, 2020

duglin deleted the fixDelTime branch August 31, 2020 02:56

dsimansk added a commit to dsimansk/client that referenced this pull request Aug 9, 2021

Sync latest spec file to main (knative#770)

e0708bd
