-
-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rPackages.torch: use binary package for installation #328980
base: master
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -94,6 +94,54 @@ Executing `nix-shell` will then drop you into an environment equivalent to the | |
one above. If you need additional packages just add them to the list and | ||
re-enter the shell. | ||
|
||
## torch with GPU acceleration {#torch-gpu-accel} | ||
|
||
The `torch` package included in `nixpkgs` is based on the binary packages by the | ||
mlverse team. By default, the CPU version of the package is installed. On Linux | ||
distributions other than NixOS, it is possible to enable GPU acceleration by | ||
setting `config.cudaSupport` to true when importing `nixpkgs` in a `shell.nix`. | ||
You will also need to use `nixglhost`. Here is an example of such a shell: | ||
|
||
```nix | ||
{ pkgs ? import <nixpkgs> {config.cudaSupport = true; }, lib ? pkgs.lib }: | ||
|
||
let | ||
nixglhost-sources = pkgs.fetchFromGitHub { | ||
owner = "numtide"; | ||
repo = "nix-gl-host"; | ||
rev = "main"; | ||
# Replace this with the hash Nix will complain about, TOFU style. | ||
hash = ""; | ||
}; | ||
|
||
nixglhost = pkgs.callPackage "${nixglhost-sources}/default.nix" { }; | ||
|
||
rpkgs = builtins.attrValues { | ||
inherit (pkgs.rPackages) | ||
torch; | ||
}; | ||
|
||
wrapped_pkgs = pkgs.rWrapper.override { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. h'm, personally, I read
|
||
packages = [ rpkgs ]; | ||
}; | ||
Comment on lines
+119
to
+126
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not just |
||
|
||
in pkgs.mkShell { | ||
buildInputs = [ | ||
nixglhost | ||
wrapped_pkgs | ||
]; | ||
} | ||
|
||
``` | ||
|
||
After executing `nix-shell` to drop in that environment, call `nixglhost R` to | ||
b-rodrigues marked this conversation as resolved.
Show resolved
Hide resolved
|
||
start a GPU accelerated R session. Test that everything worked well by calling | ||
`torch::cuda_memory_summary()` which should return a list of summary statistics | ||
regarding the available GPU memory. | ||
|
||
We are looking for contributors that could help us package the darwin binaries | ||
as well as fix `torch` for NixOS as well. | ||
|
||
## Updating the package set {#updating-the-package-set} | ||
|
||
There is a script and associated environment for regenerating the package | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there much value to keeping a binary-only package in Nixpkgs rather than out-of-tree, is it a common dependency?
Our There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll have to study There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
It would be good to get everything to compile from source. I would support that effort, but I would need a Nix CUDA expert to help because I kept getting issues with CUDA version compatibility. I was not able to successfully compile This PR bypasses all the complexity of trying to compile I feel the value of having this binary-only package in nixpkgs is for R users who are not nix users first. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Compiling liblantern dosen't seem that difficult in itself, the main issue is it needs and old version of torch and doesn't work with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Would a wrapper do the trick? Wrap the binary to use nixglhost ootb? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If wrapping You could have a look at how I patched the binaries over at #273536. The important parts are the various There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
We can try working together
There shouldn't be any need, you just autoPatchelf and/or addDriverRunpath to link the relevant libraries directly
Correct, we shouldn't
Note it's There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have amended the docs to explain this is only for non-NixOS linux and that we are looking for people to fix torch for NixOS and darwin. For fixing this for NixOS I would need help, as I'm not running NixOS and don't have experience with patching binaries. Should we merge this one and then continue with another PR, or try to fix it as part of this PR? |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1803,10 +1803,33 @@ let | |
''; | ||
}); | ||
|
||
torch = old.torch.overrideAttrs (attrs: { | ||
preConfigure = '' | ||
patchShebangs configure | ||
''; | ||
Comment on lines
-1806
to
-1809
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Am I to understand that the original package was source-built? If so, I think we should keep a source-built There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The original R torch package was indeed source built but didn't provide the torch library, it suggested users to download and install it on first run There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
oh |
||
torch = let | ||
# Sets the correct string to download either the binary for | ||
# the cpu version of the torch package, or the | ||
# the cuda-enabled version. | ||
# To use gpu acceleration, set `config.cudaSupport = true;` | ||
# when importing nixpkgs in your shell | ||
accel = if pkgs.config.cudaSupport then "cu118" else "cpu"; | ||
|
||
binary_sha = if pkgs.config.cudaSupport then | ||
"sha256-a80sG89C0svZzkjNRpY0rTR2P1JdvKAbWDGIIghsv2Y=" else | ||
"sha256-qUn8Rot6ME7iTvtNd52iw3ebqMnpLz7kwl/9GoPHD+I="; | ||
in | ||
old.torch.overrideAttrs (attrs: { | ||
nativeBuildInputs = attrs.nativeBuildInputs or [ ] ++ [ | ||
pkgs.autoPatchelfHook | ||
pkgs.autoAddDriverRunpath | ||
]; | ||
buildInputs = attrs.buildInputs or [ ] ++ [ | ||
"${lib.getLib R}/lib/R" # libR | ||
pkgs.zlib | ||
] ++ lib.optionals pkgs.config.cudaSupport [ | ||
pkgs.cudaPackages.cuda_cudart | ||
]; | ||
Comment on lines
+1819
to
+1828
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These changes are sufficient to do something like
However, normally libtorch_cuda links e.g. libcublas, etc, which I didn't have to explicitly mention. I suppose that means they're not listed in DT_NEEDED and are loaded by dlopen instead. Either way I expect all related functionality to be broken and I can't allocate the time to do more tests unfortunately I also see there's a lot of garbage in the outputs, e.g. a vendored copy of |
||
src = pkgs.fetchzip { | ||
url = "https://torch-cdn.mlverse.org/packages/${accel}/0.13.0/src/contrib/torch_0.13.0_R_x86_64-pc-linux-gnu.tar.gz"; | ||
sha256 = binary_sha; | ||
}; | ||
}); | ||
|
||
pak = old.pak.overrideAttrs (attrs: { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be documented in its own chapter rather than in the R section: #264433 (has been stale for a while, didn't get the chance to follow up) but maybe it's ok to temporarily merge this section and then replace it with a short reference when the chapter is added