Do not consider priorities when sorting workloads from different ClusterQueues #1283
Comments
/assign
Given that StrictFIFO is actually priority queueing, I couldn't reproduce the bug that was described in #1024. Maybe there was no bug, just undesired behavior: a low-priority workload was admitted and then preempted in the very next scheduling cycle, and we wanted to avoid that scenario by scheduling the higher-priority workload in the first place? Anyway, my understanding is that we want priority sorting in the scheduling loop to be optional (enabled by default), so a feature gate could be a solution here.
The important bit here is that job X can fit by borrowing. But borrowing workloads are sorted last (kueue/pkg/scheduler/scheduler.go, lines 496 to 501 in 8431cbd).
But you are right, priorities shouldn't really matter, just the timing. However, I think this was solved by #1039. Before that, the only option was to give jobs in Team-A-Standard a higher priority. If we can prove that a higher priority is no longer necessary (through an integration test that does the above), then we can proceed with optionally disabling priority checks in the cohort.
I've added the test cases that you described in #1283 (comment) to #1399, and it turns out that the pending workload is still blocking borrowing. Note that priorities don't matter here: I gave the borrowing workload the highest priority. The reason is that the pending workload has the "Preempt" FlavorAssignmentMode and does no borrowing, so in sorting it goes before any workload that requires borrowing, and since it is considered as Preempt, the scheduler accounts for its resources in the cohort. It's definitely a bug, but it's not related to the feature for disabling priorities. So I've added the feature gate in a separate PR, #1406, and will look further into how to fix this bug.
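The blocking effect described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions (a single resource, one cohort-wide counter of unused capacity, already sorted candidates); the `mode`, `candidate`, and `admitInOrder` names are hypothetical and do not mirror the real scheduler code.

```go
package main

// mode is a hypothetical, simplified version of the flavor assignment mode.
type mode int

const (
	fit     mode = iota // fits into currently unused capacity
	preempt             // would fit only after preempting other workloads
)

// candidate is a simplified workload that requests a single resource.
type candidate struct {
	Name    string
	Mode    mode
	Request int64
}

// admitInOrder walks already sorted candidates and returns the names
// admitted in this cycle. free is the cohort's unused capacity at the
// start of the cycle (an assumption of this toy model).
func admitInOrder(sorted []candidate, free int64) []string {
	var admitted []string
	for _, c := range sorted {
		switch c.Mode {
		case preempt:
			// Not admitted in this cycle, but its request is still
			// counted against the cohort, as the comment describes.
			free -= c.Request
		case fit:
			if c.Request <= free {
				free -= c.Request
				admitted = append(admitted, c.Name)
			}
		}
	}
	return admitted
}
```

In this model, a pending candidate in `preempt` mode that is sorted first and requests 4 out of 5 free units leaves only 1 unit, so a borrowing candidate that requests 3 units is not admitted, even though it would fit if the pending candidate were not counted.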
For completeness when investigating a solution: in the following case, workload See this comment: kueue/pkg/scheduler/flavorassigner/flavorassigner.go Lines 201 to 204 in e75090a
I think a potential solution could be to change this logic kueue/pkg/scheduler/scheduler.go Line 190 in e75090a
Probably adding minimum from
@yaroslava-serdiuk @alculquicondor I think this is almost done. I guess the remaining task is adding docs for the feature gate.
Yes, we should document this feature.
What would you like to be added:
Use priorities to sort workloads within a ClusterQueue, but ignore the priorities when sorting the heads of multiple cluster queues within a cohort.
The design needs to avoid the race condition presented in #1024.
Why is this needed:
In organizations where teams do not know each other, they might be incentivized to use higher priorities to always be ahead of the rest. Ignoring priorities across ClusterQueues would remove this incentive, while allowing users to use priorities within their ClusterQueue.
Maybe this calls for a setting in a new Cohort object.
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.