-
-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rPackages.torch: Fix torch #273536
base: r-updates
Are you sure you want to change the base?
rPackages.torch: Fix torch #273536
Conversation
Torch requires a number of CUDA related packages Using the binary supplied by R torch. Precedent is in libtorch-bin in nixpkgs.
I picked the wrong time to put this PR up for review. When I tested this PR rebased onto master, it doesn't run, because Looks like CUDA is getting a major overhaul in nixpkgs right now as well. https://discourse.nixos.org/t/cuda-team-roadmap-and-call-for-sponsors/29495. Building from source may be easier once CUDA has been reworked. |
I had a go at compiling liblantern but encountered torch version compatibility issues. I guess we could use binary torch and compile lantern, but this doesn't seem to improve things much. Not a cuda expert so unsure how to improve this. |
Thanks for taking a look. Ideally, compiling from source would be an option, but I couldn't figure it out, and I spent a good amount of time trying to debug it. Maybe someone from the CUDA team could help though. Worst case scenario, we just keep downloading the binaries. If #328980 just works with the GPU, it would make the nix code a lot cleaner as we just download one binary. |
So I don't have cuda on my system, and tried the binary of the gpu version. Trying the same binary on an non-nix installation of R (so using opensuse's package manager) of the same binary file (and still no cuda installed), I do get a list of memory statistics using Or we just provide the cpu version, and let users deal with the gpu version, because it requires quite a lot of deps and is also hardware specific. If you don't have an nvidia gpu, you can't use the gpu version anyway, but can still take advantage of the cpu version... or we make two packages, a torch-cpu and a torch-gpu, and users install whatever they need. This way, we can have a smaller version of the package (torch-cpu) and still provide torch-gpu for nvidia users. I will also close my PR, so we can keep the discussion in one place. |
@b-rodrigues, thanks for testing the GPU version, I had hoped that your simple fix would also work for GPU, it would have made maintenance much easier. Which Nvidia GPU did you test with? I had access to a GTX 3070, and ran it on a range of GPUs in a HPC cluster, A100, H100 and maybe an L45 if I recall correctly. Could you try out one more thing before abandoning your approach? It sounds like you were using Nix inside Opensuse, and not NixOS. Is that correct? If so, please try again, but with nix-gl-host, eg Torch bundled by this PR only worked on a HPC running CentOS when I added |
Regarding two packages versus one package, I think the way to approach it is to check for CUDA. If the user has
then provide the GPU accelerated version, otherwise provide the CPU-only version. We may need some documentation or a warning if A wiki is probably important, as issues like requiring |
Another thought on having two packages: the GPU binaries fall back to CPU operation anyway. The only advantage I see for a CPU-only binary is saving space. |
Amazing, that seems to work! I wrote the following shell.nix:
and then used
trying to run
that's pretty neat. Should we go with the binary route then?
but that config is only for NixOS right? How could we do it for non-NixOS Nix package managers users? |
That's promising! This PR doesn't actually build the C code anyway, it just downloads binaries too. Most of the code in this PR is for getting the precompiled Getting the correct version of CUDA bundled along with the R package will simplify the Nix code. |
Regarding the option, it is a nixpkgs option, and I used it in a nix shell on CentOS, so it isn't exclusive to NixOS. However, the code pattern I often see in nixpkgs is to do something like this: r-packages.nix
Then by default, CUDA is off, but users can override it for the R package set, and NixOS can override it according to If |
Description of changes
The latest commit in NixOS/r-updates allows the R package
torch
to install. However, callinglibrary(torch)
causes the package to attempt to download and install binaries if it does not find them in$TORCH_PATH
. This PR allowstorch
to actually run. I have been usingtorch
with an RTX 3070 laptop card and it is working welltorch
requires a range of CUDA related packages and libtorch. Python is not needed, liblantern is a wrapper for libtorch that interfaces with R.Ideally, libtorch would be built from source. However, I struggled to get libtorch and liblantern to compile, but I did succeed in getting the binaries to download, install, and run correctly. I used libtorch-bin from nixpkgs as an example.
The PR is a draft. I don't like that it pulls binaries, or how the code is organised.
r-modules/default.nix
doesn't seem like the right place to put all thetorch
overrides. I also have very little free time for the next few months, so I don't think I can figure out how to make it fit nixpkgs better without assistance.@jbedo
Things done
nix.conf
? (See Nix manual)sandbox = relaxed
sandbox = true
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)Add a 👍 reaction to pull requests you find important.