Issue installing GPU-enabled torch #1151
I'm on it; there has been an accidental policy change on that storage bucket and it no longer has public access. I'll work as fast as possible to make it public again.
Same issue here =)
Should be fixed now!
Hi @dfalbel thank you for the response! The required files are now downloading. However, I am getting an issue at the next step of the installation process:
I looked into the installed torch folder and there seems to be a file called 'libcudart-e409450e.so.11.0' as opposed to the 'libcudart.so.11.0' mentioned in the error message. Any idea how to resolve this? I tried reinstalling torch and restarting R and get the same issue.
Do you have CUDA 11.7 and cuDNN installed? My guess is that that installation is missing.
Hi @dfalbel, yes, I have both installed. I'm running this on an HPC and loaded the corresponding modules. Still seeing the same issue.
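One way to narrow this kind of problem down is to check what shared libraries the installed package actually bundles, and whether the dynamic loader can see a system-wide CUDA runtime. A rough diagnostic sketch (the `lib` subdirectory is an assumption about the package layout and may differ between installs):

```r
# Rough diagnostic sketch: inspect what the installed torch package
# bundles and whether a system-wide CUDA runtime is visible.
# The "lib" subdirectory name is an assumption about the package layout.
lib_dir <- system.file("lib", package = "torch")
list.files(lib_dir, pattern = "\\.so")            # bundled shared objects

# Does the dynamic loader know about any libcudart on this machine?
system("ldconfig -p | grep libcudart", intern = TRUE)

# On HPC module systems, also confirm the loaded module exported the path:
strsplit(Sys.getenv("LD_LIBRARY_PATH"), ":")[[1]]
```

On a module-based HPC setup, an empty `LD_LIBRARY_PATH` or a missing `libcudart` entry usually means the CUDA module was not loaded in the session that launched R.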
Have you tried installing using the pre-built binaries?
@dfalbel Just tried that as well. Same issue.
Sorry @aosakwe, I believe there's an error in the docs, as we are no longer building 11.7 binaries.
Hi, this ended up working. Thanks a lot!
I noticed the install vignette says that if you install the pre-built binaries then you don't need to install CUDA or cuDNN.
Does that mean that on a machine with a GPU but no CUDA or cuDNN (perhaps in a lightweight Docker container), torch can still use the GPU? If so, that could really cut down on the size of the Docker image.
Actually, you need to install it using the instructions here. These are pre-built binary packages, similar to the pre-built binaries that CRAN serves, but they bundle all the CUDA and cuDNN dependencies. Unfortunately, it probably won't reduce the size of the Docker image much, as you still need the NVIDIA GPU drivers, and these package binaries are quite large (~2 GB compressed). It does make the installation much easier to maintain, though, if you are not using an NVIDIA container such as
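For reference, a minimal sketch of that installation route, assuming the CDN repository layout described in the install vignette (the repo URL pattern and the `cu118` tag are assumptions here; confirm both against the current vignette before use):

```r
# Sketch: install torch from the pre-built binaries that bundle
# CUDA and cuDNN, instead of building against a system CUDA install.
# The repo URL pattern and the "cu118" tag are assumptions; check the
# current install vignette for the exact values.
options(timeout = 600)  # the bundled binaries are large (~2 GB)
kind    <- "cu118"      # or "cpu" for a CPU-only build
version <- available.packages()["torch", "Version"]
options(repos = c(
  torch = sprintf("https://torch-cdn.mlverse.org/packages/%s/%s/", kind, version),
  CRAN  = "https://cloud.r-project.org"
))
install.packages("torch")
```

Because the CUDA runtime is bundled inside the package, this route only requires a working NVIDIA driver on the host; no system CUDA toolkit or cuDNN install is needed.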
I did some experimenting with the CUDA devel image and a regular, CPU-only rocker image. Installing that way enabled GPU access in both cases, and even let me use xgboost on the GPU. Relatedly, xgboost runs on the CUDA runtime image, and it looks like there are also PyTorch images based on the runtime images. I wonder if it's possible to have smaller binaries, similar to what's in the image linked above?
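A quick way to confirm GPU access from inside such a container is a smoke test like the following sketch (assumes torch was installed with the bundled-CUDA binaries and the container was started with GPU access, e.g. `--gpus all`):

```r
# Sketch: verify that torch can see the GPU from inside the container.
library(torch)

cuda_is_available()  # TRUE if the bundled CUDA runtime found a GPU

if (cuda_is_available()) {
  x <- torch_randn(2, 2, device = "cuda")
  print(x$device)    # should report a CUDA device
}
```

If `cuda_is_available()` returns `FALSE` despite a GPU being present, the usual suspects are a missing `--gpus` flag on the container or a host driver too old for the bundled CUDA runtime.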
Hello,
I am trying to install the GPU-enabled version of torch onto an HPC server. However, I have run into difficulties when running the install_torch() command. It seems that the zip file stored at this link (https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip) is not accessible. I encountered the same issue when installing with CUDA 11.7. The download link also doesn't work when I try to download it manually with wget, either through the server or locally.
The exact error message following the command's execution is below. Any help with this matter would be greatly appreciated.