Make the defaults for PodsReadyTimeout backoff more practical #2025
Conversation
Force-pushed: d17b5de to 9eac5c4 (compare)
Force-pushed: 9eac5c4 to f31ab70 (compare)
/assign @tenzen-y
/cc @alculquicondor
Thank you for creating this PR!
Basically, lgtm.
Could you update the API documentation (site)? Also, I'm not sure why our CI didn't detect the outdated API documentation...
For comparison, considering `.waitForPodsReady.timeout=300s` (default), the workload will spend `50min` total waiting for pods ready.
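(For the arithmetic behind the `50min` figure, assuming the backoffLimitCount of 10 that this PR settles on: 10 requeues × the 300s timeout = 3000s = 50min spent waiting for pods ready, not counting the backoff delays between attempts.)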
Could you mention backoffLimitCount, the same as before, since backoffLimitCount has no default value, and the requeueing would otherwise occur forever?
Done, but I set it to 10 now. PTAL.
I meant the backoffLimitCount in the Config API. In this PR, we introduced a fixed value for the Duration in the backoff calculation, right?
// When it is null, the workloads will be repeatedly and endlessly re-queued.

Isn't that enough?
I meant the backoffLimitCount in the Config API. In this PR, we introduced a fixed value for the Duration in the backoff calculation, right?

I discussed this with @mimowo offline. It turned out our understandings had diverged, but we were able to sync our opinions.

// When it is null, the workloads will be repeatedly and endlessly re-queued. Isn't that enough?

I'm OK with removing this sentence, since the example seems to have lost its importance.
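For readers following the thread, a minimal sketch of the exponential backoff under discussion, assuming a 60s base delay and a plain doubling rule; the constants, jitter, and capping in Kueue's actual calculation may differ:

```go
package main

import (
	"fmt"
	"time"
)

// requeueDelay illustrates the backoff: the n-th requeue waits
// baseDelay * 2^(n-1). Jitter and caps are omitted for brevity.
func requeueDelay(baseDelay time.Duration, n int) time.Duration {
	return baseDelay * time.Duration(1<<(n-1))
}

func main() {
	// With a 60s base: 1m, 2m, 4m, 8m, 16m for the first five requeues.
	for n := 1; n <= 5; n++ {
		fmt.Printf("requeue %d: delay %v\n", n, requeueDelay(60*time.Second, n))
	}
}
```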
Force-pushed: 83d64ed to 3aa079e (compare)
```go
type ControllerOption func(*ControllerOptions)

func WithControllerRequeuingBaseDelaySeconds(value int32) ControllerOption {
```
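The diff excerpt above is truncated. For context, a minimal self-contained sketch of how such a functional option plausibly looks; the ControllerOptions field name is an assumption for illustration, not the PR's actual code:

```go
package controller

// ControllerOptions carries tunables for the controllers. The field name
// is assumed for illustration; the real struct may differ.
type ControllerOptions struct {
	RequeuingBaseDelaySeconds int32
}

// ControllerOption is a functional option that mutates ControllerOptions.
type ControllerOption func(*ControllerOptions)

// WithControllerRequeuingBaseDelaySeconds overrides the base delay used in
// the requeuing backoff, e.g. to a small value in integration tests.
func WithControllerRequeuingBaseDelaySeconds(value int32) ControllerOption {
	return func(o *ControllerOptions) {
		o.RequeuingBaseDelaySeconds = value
	}
}
```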
This seems unnecessary for the cherry-pick
I use it to pass a different base delay in production and in integration tests. When I use 10s in integration tests they fail, as the Timeout is only 5s. I considered the following approaches:
- Expose the configuration via the API (deferred to a follow-up, since this shouldn't be cherry-picked).
- Bump the timeout for the subset of integration tests for PodsReady - this would work, but seems wasteful.
- Expose configuration which allows me to pass different values to SetupControllers (chosen; see the sketch below).

Let me know if there is another approach.
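A hypothetical wiring sketch, building on the ControllerOptions types sketched earlier; the 60s default and the helper function are assumptions, not the actual test code:

```go
// testOptions builds options for an integration test: start from the
// assumed 60s production default, then shrink the base delay so the
// backoff fits within the 5s test timeout.
func testOptions() *ControllerOptions {
	opts := &ControllerOptions{RequeuingBaseDelaySeconds: 60} // assumed default
	for _, apply := range []ControllerOption{
		WithControllerRequeuingBaseDelaySeconds(1), // test override
	} {
		apply(opts)
	}
	return opts
}
```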
oh ok, let's keep this approach.
/lgtm
/approve
LGTM label has been added. Git tree hash: 2fa23b305d3603f6365f0b44615630e3d0a1362b
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, mimowo

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Thanks for spotting the issue, I opened: #2032
/lgtm
/cherry-pick release-0.6
@tenzen-y: #2025 failed to apply on top of branch "release-0.6":
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@mimowo Could you submit a cherry-pick PR?
…gs#2025) Change-Id: Icf5937311c40f2a28050d35e1fc3189a855c9aa4
…ff more practical (#2033) * Make the defaults for PodsReady backoff more practical * Fix API reference for PodsReady config
/remove-kind documentation
What type of PR is this?
/kind bug
/kind documentation

What this PR does / why we need it:

Which issue(s) this PR fixes:
Part of #2009

Special notes for your reviewer:
WIP because I'm still testing, and I need to update the estimations in the KEP and the API comments. Early feedback is welcome.

Does this PR introduce a user-facing change?