Custom percentageOfNodesToScore in a PreFilter plugin #14
Comments
Given that it would affect other workloads, I don't think it's a good idea. One approach could be to make percentageOfNodesToScore part of a scheduling profile, and then have pods that don't care much about scoring use the profile that gives them better performance. But if the whole point is to have faster scheduling, there are better things to do:
|
@alculquicondor Thanks for your comment.
A PreFilter plugin changes the parameter value for the pod being scheduled before the Filter stage. When the scheduler (pkg/scheduler/core/generic_scheduler.go) calls the following function to calculate numFeasibleNodesToFind, it will use the new value of g.percentageOfNodesToScore. We will need to save the default global value and restore it afterwards, so that subsequent pods that do not specify a custom value are scheduled with the default.
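The snippet this comment refers to is not preserved in this copy of the thread. As a rough standalone sketch of the kind of calculation involved, the following is modeled on how the 1.18 scheduler derives the number of feasible nodes to find. It is not copied from the scheduler source: the two constants, the `50 - n/125` adaptive rule, and the free-function signature are assumptions that should be checked against generic_scheduler.go.

```go
package main

import "fmt"

// Assumed values, modeled on the 1.18 scheduler defaults; verify against the source.
const (
	minFeasibleNodesToFind           = 100 // never look for fewer feasible nodes than this
	minFeasibleNodesPercentageToFind = 5   // floor for the adaptive percentage
)

// numFeasibleNodesToFind returns how many feasible nodes the Filter stage
// should collect before it stops. A percentage <= 0 means "not set", in which
// case an adaptive value that shrinks with cluster size is used.
func numFeasibleNodesToFind(numAllNodes, percentageOfNodesToScore int32) int32 {
	if numAllNodes < minFeasibleNodesToFind || percentageOfNodesToScore >= 100 {
		return numAllNodes
	}

	adaptive := percentageOfNodesToScore
	if adaptive <= 0 {
		adaptive = 50 - numAllNodes/125
		if adaptive < minFeasibleNodesPercentageToFind {
			adaptive = minFeasibleNodesPercentageToFind
		}
	}

	numNodes := numAllNodes * adaptive / 100
	if numNodes < minFeasibleNodesToFind {
		return minFeasibleNodesToFind
	}
	return numNodes
}

func main() {
	fmt.Println(numFeasibleNodesToFind(5000, 0))  // adaptive percentage
	fmt.Println(numFeasibleNodesToFind(5000, 30)) // explicit 30%
	fmt.Println(numFeasibleNodesToFind(50, 30))   // small cluster: all nodes
}
```

Under these assumptions, a 5000-node cluster with an explicit 30% stops filtering after 1500 feasible nodes, which is why the value has such a direct effect on the cost of the Score stage.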
|
Below is a possible implementation for a dummy CustomParameter PreFilterPlugin.
|
|
The motivation is to customize the threshold for different workloads running in the same cluster. For example, a long-running service that cares a lot about scheduling quality would use a high threshold to get better placement, while a large batch job that is looking for a quick turnaround may set a lower threshold to be scheduled faster.
We use 1.18. Again, the assumption is scheduling a large number of pods (> 1k pods) in an ultra large cluster (>1k nodes). |
We run such big clusters as well, except that we only target 100 pods/s. You could consider using multiple profiles and disabling some (or all) plugins in one of them. Then, your jobs use the optimized profile. And you can pair that with a percentage of nodes to score that makes sense for both types of workloads, such as 30%. Note that we use the percentage in a windowed fashion, so eventually all nodes are tested for a big enough deployment. Making a pod directly affect scheduler configuration might not be the best API. Scheduling profiles are a much better way. |
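To illustrate the "windowed fashion" mentioned above, here is a minimal sketch of a rotating start index: each scheduling cycle examines a slice of the node list and the next cycle resumes where the previous one stopped. The `window` helper is hypothetical, and it simplifies by assuming every examined node is feasible, so the window always advances by exactly `numToCheck`. It also assumes `numAllNodes > 0` and `numToCheck <= numAllNodes`.

```go
package main

import "fmt"

// window returns the node indices examined in one scheduling cycle and the
// start index for the next cycle. Indices wrap around the end of the list.
func window(start, numAllNodes, numToCheck int) ([]int, int) {
	checked := make([]int, 0, numToCheck)
	for i := 0; i < numToCheck; i++ {
		checked = append(checked, (start+i)%numAllNodes)
	}
	return checked, (start + numToCheck) % numAllNodes
}

func main() {
	const numAllNodes, numToCheck = 10, 3 // i.e. 30% of a 10-node cluster

	seen := make(map[int]bool)
	start := 0
	for pod := 0; pod < 4; pod++ {
		var checked []int
		checked, start = window(start, numAllNodes, numToCheck)
		for _, n := range checked {
			seen[n] = true
		}
		fmt.Printf("pod %d examined nodes %v\n", pod, checked)
	}
	// With 4 pods at 3 nodes each, the windows cover the whole cluster.
	fmt.Println("distinct nodes examined:", len(seen))
}
```

In this toy setup each pod only sees 30% of the nodes, yet a deployment of four pods has collectively examined all ten.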
Thanks, a per-profile threshold would certainly be helpful, but its value is set statically. A PreFilter plugin can support a dynamic and adaptive threshold at the level of a single pod, which would provide greater flexibility. I agree we need to think more about the use cases and tradeoffs. |
Atm Given that |
A side note: Anyway, if we read it as "the eventual percentage of nodes to score", then it's not that misleading. |
It controls the number of feasible nodes that the filter stage must find, and hence how many nodes will be scored and ranked in the score stage. It therefore affects the performance of the scoring stage, which can be the bottleneck. |
Thanks, we will explore this option. |
Opened an issue: support per scheduling profile configuration to kube-scheduler kubernetes/kubernetes#93270 |
/close discussion moved to kubernetes/kubernetes#93270 |
@alculquicondor: Closing this issue. In response to this:
The scheduler option percentageOfNodesToScore controls how many nodes are checked when scheduling a pod. It has an important impact on scheduling performance.
To better balance scheduling performance and quality across the different needs of diverse workloads, one idea is to introduce a PreFilter plugin that overrides the default global value when a custom threshold is specified through a Pod label.
This plugin sets the value of percentageOfNodesToScore according to the value associated with a label. For example,
We’d like to have your input and suggestions, particularly
Is it a valid and useful feature?
Is it possible to implement? A problem we noticed is that the current scheduling framework does not provide a mechanism for plugins to access and update the scheduler options. Would it be possible to change the plugin APIs to take an additional argument, e.g. a pointer to the scheduler options?
Thanks a lot!
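As an illustration of the label-driven override proposed in this issue, here is a minimal sketch of how a per-pod threshold could be resolved. It is not the plugin from the thread and does not use the scheduling framework API: the label key `example.com/percentage-of-nodes-to-score` and the helper `percentageForPod` are hypothetical, and pod labels are modeled as a plain map.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical label key, chosen only for this illustration.
const customPercentageLabel = "example.com/percentage-of-nodes-to-score"

// percentageForPod returns the threshold to use for a single pod: the label
// value if it is present and a valid percentage in [1, 100], otherwise the
// global default. The default itself is never modified.
func percentageForPod(labels map[string]string, defaultPercentage int32) int32 {
	raw, ok := labels[customPercentageLabel]
	if !ok {
		return defaultPercentage
	}
	v, err := strconv.ParseInt(raw, 10, 32)
	if err != nil || v < 1 || v > 100 {
		return defaultPercentage // ignore malformed or out-of-range values
	}
	return int32(v)
}

func main() {
	const globalDefault int32 = 50

	batch := map[string]string{customPercentageLabel: "10"}
	service := map[string]string{}
	broken := map[string]string{customPercentageLabel: "fast"}

	fmt.Println(percentageForPod(batch, globalDefault))
	fmt.Println(percentageForPod(service, globalDefault))
	fmt.Println(percentageForPod(broken, globalDefault))
}
```

Resolving a per-pod value instead of overwriting a shared field is one way to sidestep the save-and-restore step discussed in the comments, though it still presumes the scheduler would consult that value, which is the API gap the question above raises.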