What would you like to be added:
Add an option to compute the fingerprint of the current pod set,
and report it as a top-level attribute in NRT objects. This is meant to
enable better caching on the scheduler side.
To use such an attribute we first need to update the NRT API version to v1alpha2.
Why is this needed:
To minimize, or even eliminate, the occurrence of TopologyAffinityErrors, TopologyManager and the scheduler should have a synchronized view of node resources.
However, it is hard for the NUMA-aware scheduler plugins to track resource allocation at NUMA-zone granularity.
A new reserve plugin implementing a scheduler-side cache will improve this situation. More info here
A prerequisite for this plugin to work is that topology updater agents, like NFD, should compute and provide a "node state" in NRT objects along with the resource information.
A pod set fingerprint is a compact representation of this "node state", so this enhancement will allow NFD to work with the new NUMA-aware scheduler-side cache and improve the NUMA-aware scheduling process.
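As a rough illustration of what a pod set fingerprint could look like, the sketch below hashes the sorted namespace/name pairs of the pods running on a node. The function name `pod_set_fingerprint`, the choice of SHA-256 and the input format are assumptions made for this example; they are not the algorithm NFD implements.

```python
import hashlib
from typing import Iterable, Tuple


def pod_set_fingerprint(pods: Iterable[Tuple[str, str]]) -> str:
    """Return an opaque string identifying a set of (namespace, name) pods.

    Hypothetical sketch: the hash function and encoding are illustrative
    assumptions, not the NFD implementation.
    """
    digest = hashlib.sha256()
    # Sort and deduplicate so the result does not depend on listing order.
    for namespace, name in sorted(set(pods)):
        # NUL separators keep ("ab", "c") distinct from ("a", "bc").
        digest.update(namespace.encode("utf-8") + b"\x00")
        digest.update(name.encode("utf-8") + b"\x00")
    return digest.hexdigest()


if __name__ == "__main__":
    before = pod_set_fingerprint([("default", "web-0"), ("kube-system", "dns")])
    same = pod_set_fingerprint([("kube-system", "dns"), ("default", "web-0")])
    after = pod_set_fingerprint([("default", "web-0")])
    print(before == same)   # same pod set listed in a different order
    print(before == after)  # one pod removed from the set
```

Sorting before hashing is what makes the value a property of the set rather than of the order in which pods happen to be listed.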
thanks for filing this issue!
the "fingerprint" is an opaque string which can only be checked for equality - no other operation is defined.
We are prepping a new version of the NRT API, the fingerprint should be represented as a top-level attribute.
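Since only equality is defined, a consumer such as the scheduler-side cache can treat the fingerprint as a plain change token. A minimal sketch of that idea follows; `NodeCache`, `expect` and `is_in_sync` are hypothetical names used for illustration and are not part of the NRT API or of the scheduler plugins.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class NodeCache:
    """Hypothetical scheduler-side record of the fingerprint expected per node."""

    expected: Dict[str, str] = field(default_factory=dict)

    def expect(self, node: str, fingerprint: str) -> None:
        # Remember the fingerprint of the pod set the scheduler believes
        # is running on this node.
        self.expected[node] = fingerprint

    def is_in_sync(self, node: str, reported: str) -> bool:
        # Equality is the only operation used on the opaque string:
        # no parsing, ordering, or partial matching.
        return self.expected.get(node) == reported


if __name__ == "__main__":
    cache = NodeCache()
    cache.expect("node-a", "fp-1234")
    print(cache.is_in_sync("node-a", "fp-1234"))  # reported state matches
    print(cache.is_in_sync("node-a", "fp-9999"))  # reported state differs
```

A node with no recorded fingerprint is reported as out of sync here, which is a design choice of this sketch.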
As the new version of the NRT API has been released, the PR is going to be changed to use the top-level attribute instead of the annotation.