-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add APIs for configuring fair sharing #2070
Add APIs for configuring fair sharing #2070
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
cdb3926
to
87453cf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment for a nit.
87453cf
to
2dac953
Compare
9cb7292
to
fc94535
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally, lgtm
I left a few comments.
By the way, it seems that the fairSharing
feature overlaps borrowWithinCohort at some points. So, in the v1 API, we might be able to consolidate both APIs into a single one or provide a comprehensive feature somehow.
Did you mean that borrowWithinCohort can be replaced by using If so, I agree. I would like to completely remove |
Change-Id: I74b24ba8715290753c0bd9b966a109b5b01078b4
Change-Id: If347eabdb17af643d46a1e4ad78b79e73c424011
Change-Id: I6938d83399c55fecf952a570e2e431ad4ab479b2
Change-Id: I8b49d21f0b0e7a7d2607d9589ba49a4a914cd2aa
Change-Id: I68ba0c546ce27d9e6565a30584aacd67fd318ede
Change-Id: I4d56c73f4a949fa5c2b2053fd83a7161553755b0
e1b68d8
to
a93e785
Compare
// This strategy doesn't depend on the share usage of the workload being preempted. | ||
// As a result, the strategy chooses to preempt workloads with the lowest priority and | ||
// newest start time first. | ||
// The default strategy is ["LessThanOrEqualToFinalShare"]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per offline discussions we consider the two combinations as most likely in use:
[LessThanOrEqualToFinalShare, LessThanInitialShare]
and[LessThanInitialShare]
So, I would suggest setting the default to one of the configurations.
I would vote for the [LessThanOrEqualToFinalShare, LessThanInitialShare]
since it matches the KEP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
if s[0] == config.LessThanInitialShare { | ||
result[0] = lessThanInitialShare | ||
// This rule is a superset of the other rule, no need to check other strategies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we go for the "mathematical" approach, I would suggest to be precise, and don't make the assumption that Rule B is a superset of Rule A (since unlikely scenarios may happen).
pkg/config/validation.go
Outdated
allErrs = append(allErrs, field.Invalid(fsPreemptionStrategiesPath, fs.PreemptionStrategies, "Must be empty when fair sharing is disabled")) | ||
} | ||
strategies := sets.New[configapi.PreemptionStrategy]() | ||
validStrategies := []configapi.PreemptionStrategy{configapi.LessThanInitialShare, configapi.LessThanOrEqualToFinalShare} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should either validate against non-supported configurations or support them. I'm leaning towards validation, because it will keep the code simpler. IIUC this is the only non-supported option for now: [LessThanInitialShare, LessThanOrEqualToFinalShare]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed validation to have an explicit list of supported combinations.
Yes, the biggest motivation to have borrowWithinCohort was to allow certain ClusterQueues to always be preempted whenever some other CQ needs the resources. And this can be easily achieved when enabling fair sharing, by setting the weight on these "preemptible" CQs to zero. |
Change-Id: I6893b7854656601e80f157b60caea27e27dfe4ea
2f4311c
to
1adb8e0
Compare
Change-Id: I6379c4304cfb45a08940d57a24b79c6b1564ccca
@tenzen-y anything to add? |
@@ -50,3 +50,6 @@ integrations: | |||
# - key: kubernetes.io/metadata.name | |||
# operator: NotIn | |||
# values: [ kube-system, kueue-system ] | |||
# fairSharing: | |||
# enable: true | |||
# preemptionStrategies: [LessThanOrEqualToFinalShare] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe align this with the current default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/config/validation.go
Outdated
if !fs.Enable && len(fs.PreemptionStrategies) != 0 { | ||
allErrs = append(allErrs, field.Invalid(fsPreemptionStrategiesPath, fs.PreemptionStrategies, "Must be empty when fair sharing is disabled")) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need it? I guess it can be handy to keep th configuration when temporarily disabling the feature.
Similarly as for waitForPodsReady.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are no drawbacks (similar to additional computation, and potentially bugs in preemptor), I would prefer to apply the @mimowo proposal here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SG. Done
I did another pass for fair sharing, lgtm, just the two nit-like comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thank you!
pkg/config/validation.go
Outdated
if !fs.Enable && len(fs.PreemptionStrategies) != 0 { | ||
allErrs = append(allErrs, field.Invalid(fsPreemptionStrategiesPath, fs.PreemptionStrategies, "Must be empty when fair sharing is disabled")) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there are no drawbacks (similar to additional computation, and potentially bugs in preemptor), I would prefer to apply the @mimowo proposal here.
Change-Id: I1ec9229cb866d31ece5046741ac7686157639758
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hugely LGTM!
The implementation of the FairSharing feature inspired by Dominant Resource Fairness was not a really easy journey! Awsome 🎉
/lgtm
/approve
LGTM label has been added. Git tree hash: 07fccade5bb30526c1bae6d4b147b6d21d43dcf8
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, tenzen-y The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
* Add APIs for configuring fair sharing Change-Id: I74b24ba8715290753c0bd9b966a109b5b01078b4 * Validation and defaulting for fair sharing Change-Id: If347eabdb17af643d46a1e4ad78b79e73c424011 * Implement multiple fair strategies Change-Id: I6938d83399c55fecf952a570e2e431ad4ab479b2 * Implement fair sharing weight Change-Id: I8b49d21f0b0e7a7d2607d9589ba49a4a914cd2aa * Fix flaky integration Change-Id: I68ba0c546ce27d9e6565a30584aacd67fd318ede * review Change-Id: I375166b8c7fdc300eeb43ede25fbcbe73298c7a2 * Update documentation for strategies Change-Id: I4d56c73f4a949fa5c2b2053fd83a7161553755b0 * Change default configuration to match KEP Change-Id: I6893b7854656601e80f157b60caea27e27dfe4ea * Disable fair sharing by default Change-Id: I6379c4304cfb45a08940d57a24b79c6b1564ccca * Relax validation Change-Id: I1ec9229cb866d31ece5046741ac7686157639758
What type of PR is this?
/kind feature
What this PR does / why we need it:
Add APIs for fair sharing:
KueueConfiguration
, a newFairSharing
struct to enable the feature and configure strategies.ClusterQueueSpec
, a newFairSharing
struct containing the weight.Which issue(s) this PR fixes:
Fixes #1714
Special notes for your reviewer:
At this point, I'm looking for feedback on the types and field documentation.
Does this PR introduce a user-facing change?