-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(service): Improvements for waiting on service readiness #431
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall nice. Definitely better UX. Left some feedback to consider and couple minor nit.
@@ -59,7 +59,7 @@ kn service create NAME --image IMAGE [flags] | |||
--requests-memory string The requested memory (e.g., 64Mi). | |||
--revision-name string The revision name to set. Must start with the service name and a dash as a prefix. Empty revision name will result in the server generating a name for the revision. Accepts golang templates, allowing {{.Service}} for the service name, {{.Generation}} for the generation, and {{.Random [n]}} for n random consonants. (default "{{.Service}}-{{.Random 5}}-{{.Generation}}") | |||
--service-account string Service account name to set. Empty service account name will result to clear the service account. | |||
--wait-timeout int Seconds to wait before giving up on waiting for service to be ready. (default 60) | |||
--wait-timeout int Seconds to wait before giving up on waiting for service to be ready. (default 600) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to also have a --no-wait
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
600 seconds == 10 minutes. Is that too much for a default? I feel like one minute (60 seconds) should be good enough. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had 60s before, but that was often too short, especially when the created Pods still have to pull the image, which is often longer than one minute. In that case a "timeout error" occurs as in #423, which is confusing as the service itself is ready eventually.
As I'm assuming that as long the "Ready" condition is in the state "UNKNOWN" I rely on Knative that this eventually will be reconciled to either "TRUE" or "FALSE". If this is the case, the command will return. It will only run into a timeout when staying in UNKNOWN. So from my experience, we should rely on Knative serving to do that reconciliation and e.g. if it can't get the service up in a certain amount of time, will be put the "Ready" condition to "FALSE". Hence this large timeout, which could maybe even longer (not sure if there are image that large that it will take longer than 10 mins, but then a sync wait on the CLI without a real progress meter is tedious, too).
I added the detailed feedback on the event as a poor-men progress meter replacement, so that the user sees that something is going on even when the creation/update takes a bit longer.
That output is also helpful for scripting use cases as it helps in debugging timing issues.
Still, with a -q
we should certainly not show that messages.
One can argue that one migt move these event to --verbose
, but then one would lose the simple progress indicator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--no-wait
is achieved with --async
(that was chosen over no-wait at the time when the feature was introduced)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I can live with that. Thanks.
pkg/wait/wait_for_ready.go
Outdated
@@ -38,8 +39,8 @@ type waitForReadyConfig struct { | |||
type WaitForReady interface { | |||
|
|||
// Wait on resource the resource with this name until a given timeout | |||
// and write status out on writer | |||
Wait(name string, timeout time.Duration) error | |||
// and write event messages for uknown event to the status writer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*unknown
/ok-to-test |
From e2e, |
fefb84e
to
657e103
Compare
Yes, that makes sense for the sync case as the revision would be available at that point, too (we could just add it to the 'showUrl()' method). Or we could add it to the event list, so that the final entry could look like:
Another idea would be to have a This would us also allow to not think about how to format human-readable output that it can be easily parsed (which I think is dangerous anyway in the long run as we can easily break things if we make the concrete output format part of a kind of "API") |
A related note on this, not all The |
BTW, one reason why I would love to have the URL on a separate line also for human readable output is highlighted in https://youtu.be/9rnrOK0Ifqs?t=1876 : As you can see its not so easy to select an URL in floating text than having a double (or tripple) mouse click on a single line. Not sure whether we should do that for a revision, too. Maye we can rephrase the last URL line as:
|
@rhuss : The URL on a separate (last) line is good, +1 for that. For,
the service might not necessarily be serving from this single revision. WDYT about this: For service create:
For update creating a new revision:
For update not creating a new revision:
|
I thought for |
The revision can be inferred best only after the service is ready, not when the Service CR is created. The output for sync create looks like (with a revision added after the service reconciled):
I don't think we need to repeat the namespace for the revision name. |
3d65623
to
a0cb9d8
Compare
@navidshaikh I added now the revision name (uncoditionally, but for I also updated the asciinema video in the issue comment to reflect this change. |
@navidshaikh updated now with "(unchanged)" as suffix to a revision if that revision has not changed (also did an update of the video) |
* Increased default timeout to 600s. This timeout will be triggered only when the Ready condition stays in UNKNOWN for that long. If its True or False then the command will return anyway sooner. So it makes sense to go for a much longer timeout than 60s * Enhanced output to indicate the progress This change needs some updates to the API and introduces a 'MessageCallback' type which is calle for each intermediate event with the "Ready" condition message.
d650365
to
6017c23
Compare
The following is the coverage report on the affected files.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: navidshaikh, rhuss The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* [release-v1.14] Add kn-plugin-source-kafka v1.14 * [release-v1.14] Add kn-plugin-event v1.14
only when the Ready condition stays in UNKNOWN for that long. If its
True or False then the command will return anyway sooner.
So it makes sense to go for a much longer timeout than 60s
This change needs some updates to the API and introduces a 'MessageCallback'
type which is calle for each intermediate event with the "Ready" condition message.
Example of the UX: