Add option to compute the pod set fingerprint to improve NUMA aware scheduling #1048

Closed
jlojosnegros opened this issue Feb 3, 2023 · 4 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@jlojosnegros
Contributor

jlojosnegros commented Feb 3, 2023

What would you like to be added:
Add an option to compute the fingerprint of the current pod set
and report it as a top-level attribute in NRT objects. This is meant to
enable better caching on the scheduler side.
To use such an attribute, we first need to update the NRT API version to v1alpha2.

Why is this needed:
To minimize, or even eliminate, the occurrence of TopologyAffinityErrors, the TopologyManager and the scheduler should have a synchronized view of node resources.
However, it is hard for the NUMA-aware scheduler plugins to track resource allocation at NUMA-zone granularity.
A new reserve plugin implementing a scheduler-side cache will improve this situation. More info here
A prerequisite for this plugin to work is that topology updater agents, like NFD, compute and provide a "node state" in NRT objects along with the resource information.
A pod set fingerprint is a compact representation of this "node state", so this enhancement will allow NFD to work nicely with the new NUMA-aware scheduler-side cache and improve the NUMA-aware scheduling process.
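
To make the idea of a "compact representation" concrete, here is a minimal sketch of how a pod set fingerprint could be computed. Everything in it is an assumption for illustration: the `PodID` type, the `PodSetFingerprint` function, and the choice of SHA-256 over sorted `namespace/name` keys are hypothetical and are not taken from NFD or from the NRT API. The only property it tries to show is that the same set of pods always yields the same string, regardless of listing order.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// PodID identifies a pod by namespace and name.
// Hypothetical type, introduced only for this sketch.
type PodID struct {
	Namespace string
	Name      string
}

// PodSetFingerprint returns a string that summarizes the given pod set.
// Assumption: SHA-256 over the sorted "namespace/name" keys; the real
// algorithm is not specified in this issue.
func PodSetFingerprint(pods []PodID) string {
	keys := make([]string, 0, len(pods))
	for _, p := range pods {
		keys = append(keys, p.Namespace+"/"+p.Name)
	}
	// Sort so the result does not depend on the order pods were listed in.
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		// NUL separator keeps adjacent keys from running together.
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// SameNodeState reports whether two fingerprints describe the same pod set.
// Callers are expected to compare fingerprints for equality only.
func SameNodeState(a, b string) bool {
	return a == b
}
```

In this sketch, an agent would publish the result of `PodSetFingerprint` for the pods it sees on the node, and the scheduler-side cache would compute the same value from its own view and compare the two with `SameNodeState`. A mismatch would signal that the two views have diverged.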

Additional Info and Links

@jlojosnegros jlojosnegros added the kind/feature Categorizes issue or PR as related to a new feature. label Feb 3, 2023
@ffromani
Contributor

ffromani commented Feb 3, 2023

Thanks for filing this issue!
The "fingerprint" is an opaque string which can only be checked for equality; no other operation is defined.
We are prepping a new version of the NRT API, and the fingerprint should be represented as a top-level attribute.

@ffromani
Contributor

ffromani commented Feb 3, 2023

/cc

@swatisehgal
Contributor

/cc

@jlojosnegros
Contributor Author

> Thanks for filing this issue! The "fingerprint" is an opaque string which can only be checked for equality; no other operation is defined. We are prepping a new version of the NRT API, and the fingerprint should be represented as a top-level attribute.

Now that the new version of the NRT API has been released, the PR will be changed to use the top-level attribute instead of the annotation.

I'll edit the description.
