Add NVIDIA GPU node labeller to scheduling-gpus.md #16090

jjacobelli · 2019-08-26T20:37:37Z

Improving the scheduling-gpus.md documentation by adding the NVIDIA GPU node labeller

Signed-off-by: Jordan Jacobelli <[email protected]>

k8s-ci-robot · 2019-08-26T20:37:37Z

Welcome @Ethyling!

It looks like this is your first PR to kubernetes/website 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/website has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

netlify · 2019-08-26T20:40:12Z

Deploy preview for kubernetes-io-master-staging ready!

Built with commit 646cbfd

https://deploy-preview-16090--kubernetes-io-master-staging.netlify.com

jjacobelli · 2019-09-06T22:52:21Z

Hey @vishh @Rajakavitha1, can we have some review on this? :)

RenaudWasTaken · 2019-09-18T19:20:48Z

ping on this :) !
@vishh @Rajakavitha1

zacharysarah · 2019-10-09T23:23:31Z

/hold

@Ethyling We're in the middle of a larger conversation about how to handle third party content. I'm holding this discussion pending the outcome of a KEP from #15748.

fejta-bot · 2020-01-08T00:00:55Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

sftim · 2020-01-14T23:54:21Z

/sig node

sftim

@Ethyling - SIG Docs hasn't yet got a clear and agreed policy on 3rd party content.

My own opinion: documentation should cover what's needed to run Kubernetes based on the main kubernetes/kubernetes repository and its dependencies. With exceptions where warranted.

Also my opinion, this is on the borderline of whether or not it's acceptable. An alternative might be to pop this table into a repo or other site that NVIDIA control, and hyperlink there.

If you're willing to revise this, here's some feedback. I hope it's useful.

sftim · 2020-01-16T23:14:32Z

content/en/docs/tasks/manage-gpus/scheduling-gpus.md

+| nvidia.com/cuda.runtime.major  | Integer    | Major of the version of CUDA             |
+| nvidia.com/cuda.runtime.minor  | Integer    | Minor of the version of CUDA             |
+| nvidia.com/cuda.driver.major   | Integer    | Major of the version of NVIDIA driver    |
+| nvidia.com/cuda.driver.minor   | Integer    | Minor of the version of NVIDIA driver    |
+| nvidia.com/cuda.driver.rev     | Integer    | Revision of the version of NVIDIA driver |
+| nvidia.com/gpu.family          | String     | Architecture family of the GPU           |
+| nvidia.com/gpu.machine         | String     | Machine type                             |
+| nvidia.com/gpu.product         | String     | Model of the GPU                         |
+| nvidia.com/gpu.memory          | Integer    | Memory of the GPU in Mb                  |
+| nvidia.com/gpu.compute.major   | Integer    | Major of the compute capabilities        |
+| nvidia.com/gpu.compute.minor   | Integer    | Minor of the compute capabilities        |
+| nvidia.com/gfd.timestamp       | Integer    | Timestamp of the generated labels        |


I would put the label names in backticks, and sort them.

RenaudWasTaken · 2020-01-26T00:12:58Z

> Also my opinion, this is on the borderline of whether or not it's acceptable. An alternative might be to pop this table into a repo or other site that NVIDIA control, and hyperlink there.

For the alternative, this is surfaced here: https://github.com/NVIDIA/gpu-feature-discovery#labels
I'm happy to update this pull request with your suggestion. However, it is not clear how I should treat the existing documentation that describes the same feature (node labelling), but for AMD GPUs.

https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/#node-labeller

An alternative path would be to follow the example set by kubeadm with network plugins: a content area with multiple panels (aka tabs), with the default tab either empty or describing the feature.

kbhawkey · 2020-02-11T13:40:45Z

@Ethyling,
I plan to close this PR as it has been open for some time.
As there are outstanding changes, feel free to reopen the pull request.
/close

k8s-ci-robot · 2020-02-11T13:40:48Z

@kbhawkey: Closed this PR.

RenaudWasTaken · 2020-02-16T22:28:47Z

/reopen

Hello @kbhawkey!
We are waiting for some feedback on how to structure the PR.
If you have some feedback, I'm happy to update the PR.

cc @sftim: did you have time to look at my previous comment?

Thanks!

k8s-ci-robot · 2020-02-16T22:28:50Z

@RenaudWasTaken: Reopened this PR.

fejta-bot · 2020-03-17T23:18:29Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

sftim · 2020-03-17T23:54:41Z

Does this fit with doc-policies-for-third-party-content?

RenaudWasTaken · 2020-03-20T06:51:02Z

@sftim reading through the document, it seems that we fall under story #5:

> In PR #16766 @pouledodue proposed adding Hetzner Cloud Controller to the list of vendors that have implemented a cloud controller manager. That PR was held pending the outcome of this KEP, then later merged.

If my understanding is correct, then this PR fits with the policies-for-third-party-content.

Signed-off-by: Renaud Gaubert <[email protected]>

k8s-ci-robot · 2020-03-20T06:56:40Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign steveperry-53
You can assign the PR to them by writing /assign @steveperry-53 in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

content/en/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

RenaudWasTaken · 2020-03-20T06:57:01Z

/assign @steveperry-53

RenaudWasTaken · 2020-03-20T06:57:33Z

Actually sorry I should have assigned @sftim
/assign @sftim

RenaudWasTaken · 2020-03-20T06:57:57Z

/remove-lifecycle rotten

sftim · 2020-03-20T10:47:50Z

To help with review, @Ethyling, can you amend the PR description with an explanation of why this fits in with the current content guide?

sftim · 2020-04-13T10:56:29Z

/close

Does not seem to fit in with current content guide

k8s-ci-robot · 2020-04-13T10:56:49Z

@sftim: Closed this PR.

Add NVIDIA GPU node labeller to scheduling-gpus.md

e0b37a6

Signed-off-by: Jordan Jacobelli <[email protected]>

k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 26, 2019

k8s-ci-robot requested review from Rajakavitha1 and vishh August 26, 2019 20:38

k8s-ci-robot added language/en Issues or PRs related to English language sig/docs Categorizes an issue or PR as relevant to SIG Docs. labels Aug 26, 2019

jjacobelli marked this pull request as ready for review September 3, 2019 22:01

k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 3, 2019

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 9, 2019

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 8, 2020

k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Jan 14, 2020

sftim reviewed Jan 16, 2020

k8s-ci-robot closed this Feb 11, 2020

k8s-ci-robot reopened this Feb 16, 2020

k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 17, 2020

k8s-ci-robot added the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 17, 2020

Reorder NVIDIA GPU labels

646cbfd

Signed-off-by: Renaud Gaubert <[email protected]>

k8s-ci-robot assigned steveperry-53 Mar 20, 2020

k8s-ci-robot assigned sftim Mar 20, 2020

k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 20, 2020

k8s-ci-robot closed this Apr 13, 2020
