Update CUDA docs to use k3s suggested method #1430
Merged
What
This PR updates the existing documentation for how to run CUDA workloads on k3d.
Why
It's mentioned in this issue that the documentation is outdated and that following it runs into errors without some edits to the config files. The issue creator suggested edits to the config files to work around the errors. Some of those are implemented in this PR, but there is also a way to get k3s to automatically detect the NVIDIA container runtime without the need for a custom `config.toml` file. This should result in a more robust solution that handles upgrades smoothly, and it is also pointed out in a comment on the same issue. The k3s documentation outlines how this can be set up, and this PR essentially implements those suggestions and cleans up the sections that became obsolete. The NVIDIA container runtime is detected automatically as long as you add a RuntimeClass definition to your cluster and deploy Pods that explicitly request that runtime by setting `runtimeClassName: nvidia` in the Pod spec, as sketched below.
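For reference, a minimal sketch of that setup, roughly following the k3s docs. The Pod name, image tag, and GPU resource request are illustrative, and the `nvidia.com/gpu` limit assumes the NVIDIA device plugin is deployed in the cluster:

```yaml
# RuntimeClass that maps Pods to the "nvidia" containerd runtime handler
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
---
# Example Pod that explicitly requests the NVIDIA runtime
apiVersion: v1
kind: Pod
metadata:
  name: cuda-smoke-test              # illustrative name
spec:
  runtimeClassName: nvidia           # the explicit opt-in this PR documents
  restartPolicy: Never
  containers:
    - name: cuda
      image: nvidia/cuda:12.4.1-base-ubuntu22.04   # placeholder CUDA image
      command: ["nvidia-smi"]
      resources:
        limits:
          nvidia.com/gpu: "1"        # assumes the NVIDIA device plugin is installed
```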
Implications
No internal changes here.
With this new method you have to explicitly set `runtimeClassName: nvidia` in the Pod spec of GPU workloads, which you did not have to do before. This is the method suggested in the k3s docs, however. It also appears that this method works in WSL2. I have run basic tests on my local setup (Debian and WSL2) and did not run into any issues.
Fixes #658
Fixes #1108