The nvidia-device-plugin label GPU node automatically? #70

Closed
pytimer opened this issue Aug 30, 2018 · 6 comments

pytimer commented Aug 30, 2018

Hi,

I deployed the nvidia-device-plugin on my Kubernetes cluster. Different nodes in the cluster have different GPU types, and I want my pods to be scheduled onto nodes with a specific GPU type.

Right now I have to label each node with its GPU type manually. I hope the nvidia-device-plugin can label the GPU type automatically.

Is this a reasonable request?

@flx42
Member

flx42 commented Aug 30, 2018

There is a plan in the Kubernetes community: kubernetes/community#2265
The problem with a per-node label is that it wouldn't work for heterogeneous nodes (different GPU types on the same node). In addition, injecting those labels from the device plugin today would require RBAC and the deployment would be more complex.
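
To make the heterogeneous-node problem concrete, here is a minimal Python sketch. The function name `node_gpu_label` and the GPU model strings are hypothetical and not part of the device plugin. It only shows that a single per-node label has no well-defined value once a node carries more than one GPU type.

```python
from collections import Counter


def node_gpu_label(gpu_models):
    """Return one GPU-type label value for a node, or None if no single value fits.

    gpu_models is a list of model names, one entry per GPU on the node
    (hypothetical input; how the models are discovered is out of scope here).
    """
    kinds = Counter(gpu_models)
    if len(kinds) == 1:
        # Homogeneous node: one label value describes every GPU.
        return next(iter(kinds))
    # Empty or mixed node: a single label would be missing or misleading.
    return None


homogeneous = ["tesla-v100", "tesla-v100"]
mixed = ["tesla-v100", "tesla-k80"]

print(node_gpu_label(homogeneous))  # tesla-v100
print(node_gpu_label(mixed))        # None
```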

@pytimer
Author

pytimer commented Sep 2, 2018

Thanks for your reply.

I understand. But for now, to implement my use case, do I have to label the GPU type on each node myself?

@cliffburdick

@flx42 we're interested in this too. Can the NVIDIA agent on each node report to the kubelet or API server how many GPUs there are and of what type? A node label seems restrictive, since it is static and the count of each type isn't tracked.

@Bharathkumarraju

I did this with a Terraform variable and this simple command, which adds the node label to the kubelet service file:

sed -i '/\/usr\/bin\/kubelet/ a \ \ --node-labels nodetype=${var.instance-label} \\' /etc/systemd/system/kubelet.service

@pytimer
Author

pytimer commented Sep 13, 2018

@Bharathkumarraju which program adds the --node-labels value to the kubelet? Is it your own program, or is it done when you deploy the node?

For now I have written a DaemonSet to do the labeling. When I deploy a GPU pod, I have to choose a node through my own program, which adds a nodeSelector to the YAML.

I don't think this is a good approach, but I don't know of another way to do it right now.
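
As a rough illustration of the workaround described above (a program that adds a nodeSelector to the pod YAML), here is a minimal Python sketch that works on a manifest already loaded into a dict. The helper name `add_node_selector`, the label key `gpu-type`, the label value, and the image name are all hypothetical. `nodeSelector` is the standard Pod spec field and `nvidia.com/gpu` is the resource name exposed by the device plugin.

```python
import copy


def add_node_selector(pod, key, value):
    """Return a copy of a Pod manifest with nodeSelector[key] set to value."""
    patched = copy.deepcopy(pod)  # leave the caller's manifest untouched
    spec = patched.setdefault("spec", {})
    spec.setdefault("nodeSelector", {})[key] = value
    return patched


pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "gpu-job"},
    "spec": {
        "containers": [{
            "name": "cuda",
            "image": "example/cuda-app",  # hypothetical image
            "resources": {"limits": {"nvidia.com/gpu": 1}},
        }]
    },
}

# "gpu-type" is an assumed label key; it must match whatever label
# was actually applied to the nodes.
patched = add_node_selector(pod, "gpu-type", "tesla-v100")
print(patched["spec"]["nodeSelector"])  # {'gpu-type': 'tesla-v100'}
```

The selector only has an effect if the target nodes actually carry the same label, whether it was set manually, through the kubelet's --node-labels flag, or by a DaemonSet.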

@RenaudWasTaken
Contributor

This is now possible with GPU Feature Discovery: https://github.com/NVIDIA/gpu-feature-discovery
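
Once nodes carry GPU labels, a pod's nodeSelector matches a node only if every key/value pair in the selector appears in the node's labels. The sketch below imitates that matching rule in plain Python. The label key `nvidia.com/gpu.product` is assumed from the GPU Feature Discovery README and should be checked against the installed version. The node names and product values are made up for illustration.

```python
def matching_nodes(nodes, node_selector):
    """Return the sorted names of nodes whose labels satisfy node_selector.

    nodes maps node name -> dict of labels. A node matches when every
    key/value pair in node_selector is present in its labels.
    """
    return sorted(
        name
        for name, labels in nodes.items()
        if all(labels.get(k) == v for k, v in node_selector.items())
    )


# Hypothetical cluster state; label values are illustrative only.
nodes = {
    "node-a": {"nvidia.com/gpu.product": "Tesla-V100"},
    "node-b": {"nvidia.com/gpu.product": "Tesla-K80"},
    "node-c": {},  # no GPU labels
}

print(matching_nodes(nodes, {"nvidia.com/gpu.product": "Tesla-V100"}))  # ['node-a']
```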
