Document NFD for GPU Labeling
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
ArangoGutierrez committed Jan 27, 2024
1 parent 54ab2e8 commit 1a1b976
Showing 1 changed file with 16 additions and 6 deletions.
22 changes: 16 additions & 6 deletions content/en/docs/tasks/manage-gpus/scheduling-gpus.md
@@ -81,12 +81,22 @@ kubectl label nodes node2 accelerator=other-gpu-k915
That label key `accelerator` is just an example; you can use
a different label key if you prefer.

## Automatic node labelling {#node-labeller}
## Automatic node feature discovery {#node-feature-discovery}

If you're using AMD GPU devices, you can deploy
As an administrator, you can automatically discover and label all your GPU-enabled nodes
by deploying [Node Feature Discovery](https://github.com/kubernetes-sigs/node-feature-discovery) (NFD), a Kubernetes SIG project.
NFD detects the hardware features available on each node in a Kubernetes cluster and advertises them using node labels and, optionally, node extended resources, annotations, and node taints.
NFD is compatible with Kubernetes v1.21 and later.
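
To illustrate how the labels that NFD publishes can be consumed, here is a minimal sketch of a Pod that is scheduled only onto nodes carrying a PCI device label. The label key assumes NFD's default PCI label format (`pci-<class>_<vendor>.present`) with device class `0302` and NVIDIA's vendor ID `10de`; the Pod name and image are hypothetical. Check which labels are actually present on your nodes (for example, with `kubectl get nodes --show-labels`) before relying on a specific key.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-gpu-pod            # hypothetical name
spec:
  restartPolicy: OnFailure
  containers:
    - name: example-gpu-container
      image: "registry.example/example-gpu-app:latest"   # hypothetical image
  nodeSelector:
    # Assumed NFD label: present on nodes that expose a PCI device of
    # class 0302 from vendor 10de. Adjust to the labels on your nodes.
    feature.node.kubernetes.io/pci-0302_10de.present: "true"
```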

Administrators can also use NFD to taint nodes that have specific features, so that only Pods that request those features can be scheduled on those nodes.
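
As a rough sketch of the tainting described above, the following `NodeFeatureRule` asks NFD to taint every node on which it detects a PCI device from vendor `10de` (NVIDIA). It assumes that the NFD `NodeFeatureRule` custom resource is installed and that tainting, which typically has to be enabled explicitly on the NFD master, is turned on. The rule name and the taint key `example.com/gpu` are hypothetical.

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: example-gpu-taint          # hypothetical name
spec:
  rules:
    - name: "taint nodes that have an NVIDIA PCI device"
      taints:
        - key: "example.com/gpu"   # hypothetical taint key
          value: "true"
          effect: NoSchedule
      matchFeatures:
        - feature: pci.device
          matchExpressions:
            # Match any PCI device whose vendor ID is 10de.
            vendor: {op: In, value: ["10de"]}
```

With a taint like this in place, a Pod also needs a matching entry under `tolerations` in its spec before the scheduler will place it on those nodes.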

NFD exposes an API that vendors can use to hook into its automatic labeling functionality.
NVIDIA uses this API in [GPU feature discovery](https://github.com/NVIDIA/gpu-feature-discovery/blob/main/README.md).

### Using custom labellers

For AMD GPUs, you can use the
[Node Labeller](https://github.com/RadeonOpenCompute/k8s-device-plugin/tree/master/cmd/k8s-node-labeller).
Node Labeller is a {{< glossary_tooltip text="controller" term_id="controller" >}} that automatically
labels your nodes with GPU device properties.

Similar functionality for NVIDIA is provided by
[GPU feature discovery](https://github.com/NVIDIA/gpu-feature-discovery/blob/main/README.md).
labels nodes in a Kubernetes cluster with AMD GPU device properties.
