
NixOS: Add support for CDI #284507

Merged
merged 3 commits into NixOS:master from containers-cdi on Feb 22, 2024

Conversation

@ereslibre ereslibre commented Jan 28, 2024

Description of changes

This builds on top of #278969. It provides two main features to NixOS related to the Container Device Interface (CDI):

  • virtualisation.containers.cdi.static: an attrset of JSON specifications. This can be used to provide valid static CDI JSON files that will be placed under /etc/cdi.

  • virtualisation.containers.cdi.dynamic.nvidia.enable: a boolean. When enabled, a systemd unit uses the nvidia-container-toolkit to generate a CDI spec for the current machine. The generated JSON file is patched to accommodate the NixOS host system and placed under /run/cdi/nvidia-container-toolkit.json. (See the configuration sketch after this list.)
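
A hedged configuration sketch of how these two options are intended to be used (the static entry's name and spec contents below are illustrative, not taken from this PR):

```nix
{
  # Dynamic: a systemd unit runs the nvidia-container-toolkit CDI generator
  # and writes the patched spec to /run/cdi/nvidia-container-toolkit.json.
  virtualisation.containers.cdi.dynamic.nvidia.enable = true;

  # Static: each attribute is written as a JSON CDI spec under /etc/cdi.
  # The "example" name and the fields below are purely illustrative.
  virtualisation.containers.cdi.static."example" = {
    cdiVersion = "0.5.0";
    kind = "example.com/device";
    devices = [
      {
        name = "dev0";
        containerEdits.deviceNodes = [ { path = "/dev/example0"; } ];
      }
    ];
  };
}
```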

As per the CDI spec, CDI-compliant runtimes need to inspect both /etc/cdi (static configuration) and /var/run/cdi (dynamic configuration), prioritizing the dynamic configuration over the static one in case of a clash.

This implementation is therefore compliant with that specification.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@ereslibre ereslibre force-pushed the containers-cdi branch 4 times, most recently from 0d7dd00 to 95e9d84 on January 28, 2024 at 20:22
@ereslibre
Member Author

The last commit contains more cleanup than is needed: since we support Docker < 25, we'll still need the container runtime approach, given that CDI support was only included in Docker 25.

When the final PR is submitted I'll get rid of that cleanup commit.

Contributor

@aaronmondal aaronmondal left a comment

I like how this cleanup makes it easier to keep track of all the nvidia dependencies.

One thing I'd like to add is that some sort of integration test would probably be a good idea. I'm not sure whether this is possible, though, as they'd probably have to run in a NixOS VM.

@SomeoneSerge might have some comments on the new layout of the package.nix/default.nix files.

Contributor

@SomeoneSerge SomeoneSerge left a comment

Overall, this looks much simpler and cleaner than what we had.

@ereslibre ereslibre force-pushed the containers-cdi branch 2 times, most recently from 4ad8cfa to a252928 on February 2, 2024 at 14:48
Contributor

@SomeoneSerge SomeoneSerge left a comment

I feel a bit disoriented, but I think all of my remaining comments are mere nitpicks. We also outlined a few TODOs for the immediate future (correct handling of closures, decisions about /usr/bin, libc priorities, deciding whether to delete nvidia-container-toolkit-cdi-generate as a separate NixOS module).

I think I'm convinced enough by the public-facing side of the PR. I also think this should just work for most of the basic use-cases (the overlay experience may need more looking into; and yes, I haven't yet tested this branch myself). In the (hopefully unlikely) case this PR does break somebody's workflow, I am online and respond to github pings, and I expect @ereslibre to be available as well, so we can always discuss and fix stuff.

...Is it time?

@ereslibre
Member Author

I am online and respond to github pings, and I expect @ereslibre to be available as well, so we can always discuss and fix stuff.

Likewise here, I'm happy to take ownership of this bit.

@ereslibre ereslibre marked this pull request as draft February 20, 2024 15:50
@SomeoneSerge SomeoneSerge mentioned this pull request Feb 21, 2024
Contributor

@aaronmondal aaronmondal left a comment

So I just tested this on an instance with a T4 GPU on GCP. Seems to "just work" out of the box.

One thing I'm noticing is that the generated cdi file is not symlinked to /etc/cdi/nvidia.yaml as described in the nvidia docs. This doesn't seem to be an issue but higher level tooling (like the nvidia-operator for k8s) might have issues with this. I'd expect this to be solvable on the side of said tooling though, so this seems fine to me (and we can always change this later).

Fantastic work!

@ereslibre
Member Author

I just want to double-check a couple of things, answer all remaining conversations on this PR, and mark it as ready for review. Sorry for this last wait. Final stretch :)

Thanks a ton for your confirmation @aaronmondal :)

@ereslibre
Member Author

ereslibre commented Feb 21, 2024

One thing I'm noticing is that the generated cdi file is not symlinked to /etc/cdi/nvidia.yaml as described in the nvidia docs. This doesn't seem to be an issue but higher level tooling (like the nvidia-operator for k8s) might have issues with this. I'd expect this to be solvable on the side of said tooling though, so this seems fine to me (and we can always change this later).

We'll have to check with the nvidia-operator and other components, but in principle both /etc/cdi and /var/run/cdi should be inspected by the runtimes (according to the CDI spec, prioritizing /var/run/cdi over /etc/cdi).

With this implementation, users can provide their static configuration, which is placed under /etc/cdi, while the "dynamic configuration" (only implemented for NVIDIA as of today) is placed under /run/cdi (to which /var/run is symlinked on NixOS).

@ereslibre
Member Author

ereslibre commented Feb 21, 2024

Everything seems to be fine; this is the only thing I cannot explain:

❯ podman run --rm --device=nvidia.com/gpu=all -it ubuntu:latest ldd /usr/bin/nvidia-smi
	linux-vdso.so.1 (0x00007fff0c181000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f94ec666000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f94ec57f000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f94ec578000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f94ec34f000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f94ec34a000)
	/nix/store/cyrrf49i2hm1w7vn2j945ic3rrzgxbqs-glibc-2.38-44/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f94ec671000)

What exactly does => mean in this instance, and why is /lib64 there?

/nix/store/cyrrf49i2hm1w7vn2j945ic3rrzgxbqs-glibc-2.38-44/lib/ld-linux-x86-64.so.2 => /lib64/ld-linux-x86-64.so.2 (0x00007f94ec671000)

Anyhow, I think we can move on, and fix any issues that might appear due to this change. As I have stated before, I take ownership of this logic and I'm happy to address those issues.

@ereslibre ereslibre marked this pull request as ready for review February 21, 2024 21:22
@SomeoneSerge SomeoneSerge merged commit ee3923e into NixOS:master Feb 22, 2024
24 checks passed
@ereslibre ereslibre deleted the containers-cdi branch February 22, 2024 13:09
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/docker-rootless-with-nvidia-support/37069/21

@MathieuMoalic

I just want to say thanks for the work. I was watching this issue closely as I would benefit a lot from this. Thanks!

@stuzenz

stuzenz commented Aug 8, 2024

Likewise, I have been looking into this issue and just found this pull request. Thank you for all the work on this issue - greatly appreciated.

I have linked this pull request to some reddit questions about GPU use with docker/podman on NixOS where others are trying to find their way through it.

Many thanks!

@ereslibre
Member Author

I have linked this pull request to some reddit questions about GPU use with docker/podman on NixOS where others are trying to find their way through it.

Thanks a lot for this @stuzenz. Just a reminder that this feature was updated in subsequent PRs: virtualisation.containers.cdi.static and virtualisation.containers.cdi.dynamic are no longer a thing; hardware.nvidia-container-toolkit.enable should be used instead.
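
For reference, a minimal sketch of the current replacement option on an up-to-date nixpkgs (assuming the NVIDIA driver is already configured; this snippet is illustrative and not part of this PR):

```nix
{
  # Replaces the former virtualisation.containers.cdi.dynamic.nvidia.enable;
  # generates the CDI spec for the locally available NVIDIA GPUs at boot.
  hardware.nvidia-container-toolkit.enable = true;
}
```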
