-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CPUManager policy option to distribute CPUs across NUMA nodes instead of packing them #105631
Add CPUManager policy option to distribute CPUs across NUMA nodes instead of packing them #105631
Conversation
/sig node |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: klueska The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/triage accepted |
/cc @fromanirh @swatisehgal The core algorithm hasn't been implemented yet, but the base infrastructure is now in place. Please take note of:
|
@fromanirh @swatisehgal |
357ba1d
to
6f155d0
Compare
This parameter ensures that CPUs are always allocated in groups of size 'cpuGroupSize'. This is important, for example, to ensure that all CPUs (i.e. hyperthreads) from the same core are handed out together. Signed-off-by: Kevin Klues <[email protected]>
Signed-off-by: Kevin Klues <[email protected]>
002a080
to
d9725e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again a batch of minor comments, mostly for future cleanups (when we move to beta or so). The algorithm itself is nicely written and very nicely commented. I want to review the tests carefully today/tomorrow, and if everything goes as I expect, I'll add my LGTM.
@@ -264,6 +308,25 @@ func (a *cpuAccumulator) isFailed() bool { | |||
return a.numCPUsNeeded > a.details.CPUs().Size() | |||
} | |||
|
|||
func (a *cpuAccumulator) iterateCombinations(n []int, k int, f func([]int)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this can (and probably should) be a free function as well, because it doesn't use cpuAccumulator
fields at all. Not worth changing now, let's just not it down for the future refactoring.
type LoopControl bool | ||
|
||
const ( | ||
Continue LoopControl = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: using bool here raise a flag. Could we use a naked bool maybe? or should we just move to a enum using integer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with changing this to an enum. It's only a bool, because my first iteration of this didn't use constants at all -- I had the functions directly returning a bool, with true
for continue and false
for break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can totally wait for the refactoring, but it seems to suggest a plain bool could be simpler and equally readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@fromanirh
Yes, as I pointed out here, I struggled with where to put this because I wanted it namespaced somehow. It ill be pulled out to a top-level function once we move to a |
Signed-off-by: Kevin Klues <[email protected]>
Sure, I was recording them all. I understand the issue at hand and I'm actually pushing (slightly :) ) to wait and address all these nits in the future cleanup work. |
d9725e6
to
86f9c26
Compare
Based on the feedback above, I am removing the hold on this PR. Just need an lgtm from here then. |
/test pull-kubernetes-integration |
/lgtm |
/lgtm |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR implements the logic describe in KEP-2902.
It adds a new
CPUManager
policy option calleddistribute-cpus-across-numa
to the staticCPUManager
policy. When enabled, this will trigger theCPUManager
to evenly distribute CPUs across NUMA nodes in cases where more than one NUMA node is required to satisfy the allocation.Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: