Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create extended resources with NodeFeatureRule #1081

Closed
marquiz opened this issue Mar 9, 2023 · 5 comments · Fixed by #1099
Closed

Create extended resources with NodeFeatureRule #1081

marquiz opened this issue Mar 9, 2023 · 5 comments · Fixed by #1099
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature.
Milestone

Comments

@marquiz
Copy link
Contributor

marquiz commented Mar 9, 2023

What would you like to be added:

In addition to labels, taints (#540) and annotations (#863), support also management of extended resources with the NodeFeatureRule CRD API.

Why is this needed:

There are usage scenarios where people want to advertise features as extended resources instead of labels (or annotations), e.g. #1079. There has also been use cases like accounting number of network bridge interfaces (or ports, I don't remember) and GPU memory for example. We have long had experimental support for extended resources via the -resource-labels command line flag in nfd-master but that is very unwieldy for users (but obviously has worked as we haven't basically got any bug reports or complaints about that).

@marquiz marquiz added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 9, 2023
@marquiz
Copy link
Contributor Author

marquiz commented Mar 9, 2023

ping @uniemimu @mythi @fidencio

@marquiz
Copy link
Contributor Author

marquiz commented Mar 9, 2023

/help

On top of my head an outline/list of sub-tasks related to the implementation:

  1. Extend the NodeFeatureRule CRD API and helpers
    1. Add new extendedResources field to NodeFeatureRuleSpec in pkg/apis/nfd/v1alpha1/types.go
    2. Run make generate to update yamls and auto-generated code
    3. simple update of the RuleOutput type (and related methods/funcs) in pkg/apis/nfd/v1alpha1/rule.go to return also extended resources (in addition to labels and taints)
  2. Implement support in nfd-master (not super complex as the basic infrastructure for handling extended resources is already in place)
    1. update processNodeFeatureRule() to return extended resources
    2. update refreshNodeFeatures() to correspondingly update the node status with these extended resources
  3. Documentation: at least introduction.md and customization-guide.md
  4. Extend e2e-tests to cover extended resources

Few extra considerations regarding "namespacing" i.e. prefixing of names of extended resources:

  1. We probably want to specify a "default namespace" that gets applied if no namespace (i.e. <namespace>/ prefix is give). Basically add a new const like ExtendedResourceNs in pkg/apis/nfd/v1alpha1/annotations_labels.go. Could be e.g. the same as with labels i.e. feature.node.kubernetes.io or nfd.kubernetes.io (dunno what would make most sense).
  2. It might be best to just allow any vendor specific namespaces, but might want to deny kubernetes.io/ and its sub-namespaces (*.kubernetes.io/) (not sure about this either)

EDIT: fixed misspelling NodeFeatureSpec -> NodeFeatureRuleSpec

@k8s-ci-robot
Copy link
Contributor

@marquiz:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

On top of my head an outline/list of sub-tasks related to the implementation:

  1. Extend the NodeFeatureRule CRD API and helpers
    1. Add new extendedResources field to NodeFeatureSpec in pkg/apis/nfd/v1alpha1/types.go
    2. Run make generate to update yamls and auto-generated code
    3. simple update of the RuleOutput type (and related methods/funcs) in pkg/apis/nfd/v1alpha1/rule.go to return also extended resources (in addition to labels and taints)
  2. Implement support in nfd-master (not super complex as the basic infrastructure for handling extended resources is already in place)
    1. update processNodeFeatureRule() to return extended resources
    2. update refreshNodeFeatures() to correspondingly update the node status with these extended resources
  3. Documentation: at least introduction.md and customization-guide.md
  4. Extend e2e-tests to cover extended resources

Few extra considerations regarding "namespacing" i.e. prefixing of names of extended resources:

  1. We probably want to specify a "default namespace" that gets applied if no namespace (i.e. <namespace>/ prefix is give). Basically add a new const like ExtendedResourceNs in pkg/apis/nfd/v1alpha1/annotations_labels.go. Could be e.g. the same as with labels i.e. feature.node.kubernetes.io or nfd.kubernetes.io (dunno what would make most sense).
  2. It might be best to just allow any vendor specific namespaces, but might want to deny kubernetes.io/ and its sub-namespaces (*.kubernetes.io/) (not sure about this either)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Mar 9, 2023
@marquiz marquiz added this to the v0.13.0 milestone Mar 9, 2023
@ArangoGutierrez
Copy link
Contributor

/assign

@PiotrProkop
Copy link
Contributor

@marquiz @ArangoGutierrez Another example of how it can be useful(in other words we would definitely use it!) is to expose Intel RDT num_closid the max amount of RDT control groups in the system (prolly -1 for default group).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants