DaemonSet, Add Priority class #42
@maiqueb can you review please?
Thanks, this might be a really good catch.
I think it makes sense, but I find it strange that I found no mention of this at all when I checked the multus / kubevirt / ... projects.
For instance, it seems odd for the macvtap-(dp+cni) pod to be more important than virt-handler.
Do you mind pointing out some examples of where this is used?
Some PRs that are freshly merged (there are more on the way): As part of the effort on https://bugzilla.redhat.com/show_bug.cgi?id=1953482. It's true that the network components will now have higher priority than KubeVirt, but that is by design. I will update the PR with more info, as the second comment says as well.
Note that KubeVirt should switch to native classes too: https://bugzilla.redhat.com/show_bug.cgi?id=1953479. The relative priority between this CNI and KubeVirt is not really that important. To give you the backstory: in environments where a single node is used for both user workloads and the control plane, control-plane pods should never be starved by user workloads. If we preempt all user workloads and there is still not enough room for both the CNI and KubeVirt, the cluster is simply flawed and can never function.
Thanks.
You have to rebuild manifests: https://github.com/kubevirt/macvtap-cni/blob/main/manifests/macvtap.yaml
Add system-node-critical to macvtap-cni DaemonSet.

Since the macvtap-cni pod should run on each node, assign the system-node-critical priority class to it. This makes the control plane less sensitive to preemption than user workloads.

Signed-off-by: Or Shoval <[email protected]>
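For readers unfamiliar with the field, the change described in this commit message amounts to setting `priorityClassName` in the DaemonSet's pod template. The fragment below is a minimal sketch, not the project's actual manifest: the labels, container name, and image are hypothetical placeholders, and only `priorityClassName: system-node-critical` reflects what the commit describes.

```yaml
# Sketch only: metadata, labels, and image are hypothetical placeholders,
# not copied from manifests/macvtap.yaml.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: macvtap-cni
spec:
  selector:
    matchLabels:
      name: macvtap-cni            # hypothetical label
  template:
    metadata:
      labels:
        name: macvtap-cni          # must match the selector above
    spec:
      # The setting this PR adds: use the built-in node-critical class so the
      # pod is preempted only after lower-priority (user) workloads.
      priorityClassName: system-node-critical
      containers:
        - name: macvtap-cni        # hypothetical container name
          image: example.invalid/macvtap-cni:latest   # placeholder image
```

The class is set on the pod template rather than on the DaemonSet object itself, because priority is a property of the scheduled pods.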
Nicely done, thanks (using we should add a check for that, I think?
I guess we should, at least when the templates are updated. You're welcome to implement that, if you want.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: maiqueb, oshoval, phoracek
As part of https://bugzilla.redhat.com/show_bug.cgi?id=1953482, we are adding a Priority Class [1] to each network component.
The motivation is to make the control plane pods less sensitive to preemption than user workloads.

Pods that are node specific will have the higher built-in priority, since preempting them from a specific node makes them unavailable until they are rescheduled on that specific node.

Pods that are part of the network control plane but are not node specific will have system-cluster-critical, which still makes them more important than user and custom workloads, but less important than system-node-critical.

Add system-node-critical to the macvtap-cni DaemonSet.
Since the macvtap-cni pod should run on each node, assign the system-node-critical priority class to it.

[1] https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/
Signed-off-by: Or Shoval [email protected]
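
To make the ordering in the description concrete, the sketch below shows the two built-in classes with the values Kubernetes assigns them. It is for illustration only: both classes ship with the cluster and should not be created by hand, and this PR adds no such objects. User-defined classes are capped at 1000000000, which is why both built-in classes outrank any user or custom workload.

```yaml
# Illustration only: these classes are created by Kubernetes itself.
# Shown here just to compare their relative values.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: system-node-critical
value: 2000001000      # higher of the two: for pods tied to a specific node
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: system-cluster-critical
value: 2000000000      # still above any user-defined class (max 1000000000)
```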