Add NVIDIA GPU node labeller to scheduling-gpus.md #16090
Signed-off-by: Jordan Jacobelli <[email protected]>
Welcome @Ethyling!
Deploy preview for kubernetes-io-master-staging ready! Built with commit 646cbfd https://deploy-preview-16090--kubernetes-io-master-staging.netlify.com
Hey @vishh @Rajakavitha1, can we have some review on this? :)
ping on this :) !
/hold @Ethyling We're in the middle of a larger conversation about how to handle third party content. I'm holding this discussion pending the outcome of a KEP from #15748.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/sig node
@Ethyling - SIG Docs hasn't yet got a clear and agreed policy on 3rd party content.
My own opinion: documentation should cover what's needed to run Kubernetes based on the main kubernetes/kubernetes repository and its dependencies, with exceptions where warranted.
Also my opinion: this is on the borderline of whether or not it's acceptable. An alternative might be to pop this table into a repo or other site that NVIDIA control, and hyperlink there.
If you're willing to revise this, here's some feedback. I hope it's useful.
| nvidia.com/cuda.runtime.major | Integer | Major of the version of CUDA |
| nvidia.com/cuda.runtime.minor | Integer | Minor of the version of CUDA |
| nvidia.com/cuda.driver.major | Integer | Major of the version of NVIDIA driver |
| nvidia.com/cuda.driver.minor | Integer | Minor of the version of NVIDIA driver |
| nvidia.com/cuda.driver.rev | Integer | Revision of the version of NVIDIA driver |
| nvidia.com/gpu.family | String | Architecture family of the GPU |
| nvidia.com/gpu.machine | String | Machine type |
| nvidia.com/gpu.product | String | Model of the GPU |
| nvidia.com/gpu.memory | Integer | Memory of the GPU in Mb |
| nvidia.com/gpu.compute.major | Integer | Major of the compute capabilities |
| nvidia.com/gpu.compute.minor | Integer | Minor of the compute capabilities |
| nvidia.com/gfd.timestamp | Integer | Timestamp of the generated labels |
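For context on how labels like the ones in this table might be consumed, here is a minimal Python sketch that builds a Pod manifest pinned to matching nodes through `nodeSelector`. The helper name `gpu_pod_manifest`, the image, and the label values (`pascal`, `10`) are assumptions for illustration and are not taken from the PR; only the label keys come from the table.

```python
def gpu_pod_manifest(name, image, node_selector, gpus=1):
    """Build a Pod manifest (as a plain dict) that only schedules onto
    nodes whose labels exactly match node_selector."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [{
                "name": name,
                "image": image,
                # Request GPUs through the extended resource name.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
            # Node label values are strings, even for labels the table
            # lists as "Integer", so every value is converted with str().
            "nodeSelector": {k: str(v) for k, v in node_selector.items()},
        },
    }


manifest = gpu_pod_manifest(
    "cuda-job",
    "example.invalid/cuda-app:latest",  # hypothetical image
    {
        "nvidia.com/gpu.family": "pascal",     # assumed label value
        "nvidia.com/cuda.runtime.major": 10,   # assumed label value
    },
)
```

Note that `nodeSelector` only supports exact matches; a constraint such as "CUDA major version at least 10" would need node affinity expressions instead.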
I would put the label names in backticks, and sort them.
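As a rough illustration of that suggestion, the following Python sketch wraps the first cell of each Markdown table row in backticks and sorts the rows by label name. The helper `tidy_rows` is hypothetical and not part of the PR; the sample rows are copied from the table under review.

```python
def tidy_rows(rows):
    """Wrap the first cell of each Markdown table row in backticks,
    then sort the rows alphabetically by that cell."""
    parsed = []
    for row in rows:
        # Drop the outer pipes, then split into trimmed cells.
        cells = [c.strip() for c in row.strip().strip("|").split("|")]
        # Strip any existing backticks first so the step is idempotent.
        cells[0] = "`" + cells[0].strip("`") + "`"
        parsed.append(cells)
    parsed.sort(key=lambda cells: cells[0])
    return ["| " + " | ".join(cells) + " |" for cells in parsed]


rows = [
    "| nvidia.com/gpu.memory | Integer | Memory of the GPU in Mb |",
    "| nvidia.com/cuda.runtime.major | Integer | Major of the version of CUDA |",
    "| nvidia.com/gfd.timestamp | Integer | Timestamp of the generated labels |",
]
tidied = tidy_rows(rows)
```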
Addressed!
For the alternative, this is surfaced here: https://github.com/NVIDIA/gpu-feature-discovery#labels
https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#node-labeller
An alternative path would be to follow the example set by kubeadm with network plugins: a content area with multiple panels (aka tabs).
@Ethyling,
@kbhawkey: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@RenaudWasTaken: Reopened this PR. In response to this:
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Does this fit with doc-policies-for-third-party-content?
@sftim reading through the document, it seems that we fall under story #5.
If my understanding is correct, then this PR fits with the policies-for-third-party-content.
Signed-off-by: Renaud Gaubert <[email protected]>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/assign @steveperry-53
/remove-lifecycle rotten
To help with review, @Ethyling, can you amend the PR description with an explanation of why this fits in with the current content guide?
/close Does not seem to fit in with current content guide
@sftim: Closed this PR. In response to this:
Improving the scheduling-gpus.md documentation by adding the NVIDIA GPU node labeller