Issue installing GPU-enabled torch #1151

Closed
aosakwe opened this issue Apr 30, 2024 · 13 comments

@aosakwe

aosakwe commented Apr 30, 2024

Hello,

I am trying to install the GPU-enabled version of torch on an HPC server, but the install_torch() command fails. It seems that the zip file stored at this link (https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip) is not accessible. I encountered the same issue when installing with CUDA 11.7. The download link also fails when I try to fetch it manually with wget, both from the server and locally.

The exact error message following the command's execution is below. Any help with this matter would be greatly appreciated.

> torch::install_torch()
ℹ Additional software needs to be downloaded and installed for torch to work correctly.
Do you want to continue? (Yes/no/cancel) Yes
trying URL 'https://download.pytorch.org/libtorch/cu118/libtorch-cxx11-abi-shared-with-deps-2.0.1%2Bcu118.zip'
Content type 'application/zip' length 2395587423 bytes (2284.6 MB)
==================================================
downloaded 2284.6 MB

trying URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip'
trying URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip'
Error in `download_file()`:
✖ Unable to download from
  <https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip>
ℹ Please verify that the URL is not blocked by your firewall. See also
  <https://torch.mlverse.org/docs/articles/installation.html#file-based-download>
Caused by error in `utils::download.file()`:
! cannot open URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip'
Run `rlang::last_trace()` to see where the error occurred.
Warning messages:
1: In utils::download.file(url = url, destfile = destfile) :
  downloaded length 0 != reported length 298
2: In utils::download.file(url = url, destfile = destfile) :
  cannot open URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip': HTTP status was '403 Forbidden'
3: In file.remove(tmp) :
  cannot remove file '/tmp/RtmpSI8pRP/file2e3012571aeed7.zip', reason 'No such file or directory'
4: ℹ Failed to install torch, manually run `install_torch()`
Unable to download from
<https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip>
Caused by error in `download_file()`:
✖ Unable to download from
  <https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip>
ℹ Please verify that the URL is not blocked by your firewall. See also
  <https://torch.mlverse.org/docs/articles/installation.html#file-based-download>
Caused by error in `utils::download.file()`:
! cannot open URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip'
5: In utils::download.file(url = url, destfile = destfile) :
  downloaded length 0 != reported length 298
6: In utils::download.file(url = url, destfile = destfile) :
  cannot open URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu118+x86_64-Linux.zip': HTTP status was '403 Forbidden'
Warning message:
In file.remove(tmp) :
  cannot remove file '/tmp/RtmpSI8pRP/file2e301225061fb0.zip', reason 'No such file or directory'
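
A note on reading the log above: the error text suggests a firewall, but the warnings report `HTTP status was '403 Forbidden'`, which means the server was reached and refused the request. Below is a minimal shell sketch for separating those two cases. The function names `explain_status` and `probe_url` and the diagnosis wording are hypothetical, not part of torch; `probe_url` assumes `curl` is installed.

```shell
#!/bin/sh
# Map an HTTP status code to a rough diagnosis for a failed download.
# The wording of each diagnosis is illustrative, not taken from torch.
explain_status() {
  case "$1" in
    200) echo "ok: the file is publicly reachable" ;;
    403) echo "forbidden: the server answered but refused access" ;;
    404) echo "not found: the URL or version is probably wrong" ;;
    000) echo "no response: check proxy, firewall, or DNS" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Print only the HTTP status code for a URL (needs curl and network access).
# curl reports 000 when no HTTP response was received at all.
probe_url() {
  curl -s -o /dev/null -I -w '%{http_code}' "$1"
}

# Example (not run automatically):
#   explain_status "$(probe_url "$URL")"
```

A 403 from both the HPC server and a local machine points at the hosting side rather than at a local network block.
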
@dfalbel
Member

dfalbel commented Apr 30, 2024

I'm on it. There was an accidental policy change on that storage bucket, and it no longer has public access. I'll work as fast as possible to make it public again.

@BrunoDaleffi

BrunoDaleffi commented May 1, 2024

Same issue here =)
I'm on Ubuntu 20.04 and CUDA 11.8

@dfalbel
Member

dfalbel commented May 1, 2024

Should be fixed now!

@dfalbel dfalbel closed this as completed May 1, 2024
@aosakwe
Author

aosakwe commented May 2, 2024

Hi @dfalbel, thank you for the response! The required files are now downloading. However, I am running into a problem at the next step of the installation process:

> library(torch)
ℹ Additional software needs to be downloaded and installed for torch to work correctly.
Do you want to continue? (Yes/no/cancel) Yes
trying URL 'https://download.pytorch.org/libtorch/cu117/libtorch-cxx11-abi-shared-with-deps-2.0.1%2Bcu117.zip'
Content type 'application/zip' length 1957956986 bytes (1867.3 MB)
==================================================
downloaded 1867.3 MB

trying URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu117+x86_64-Linux.zip'
downloaded 5.5 MB

Warning message:
ℹ torch failed to start, restart your R session to try again.
ℹ You might need to reinstall torch using `install_torch()`
✖ /home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/torch/lib/liblantern.so - libcudart.so.11.0: cannot open shared object
  file: No such file or directory
Caused by error in `cpp_lantern_init()`:
! /home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/torch/lib/liblantern.so - libcudart.so.11.0: cannot open shared object file: No such file or directory
>

I looked into the installed torch folder and there is a file called 'libcudart-e409450e.so.11.0', as opposed to the 'libcudart.so.11.0' mentioned in the error message. Any idea how to resolve this? I tried reinstalling torch and restarting R, but I get the same error.
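
The check described here can be sketched in shell: list every CUDA runtime library in the torch `lib` folder and test whether the exact name from the error message is among them. The helper names `list_cudart` and `has_exact_soname` are hypothetical. The hash-suffixed file is presumably a copy bundled with the libtorch download, which is an assumption rather than something this thread confirms.

```shell
#!/bin/sh
# List CUDA runtime libraries found in a directory, one per line.
# Matches both the plain name (libcudart.so.11.0) and hash-suffixed
# copies such as libcudart-e409450e.so.11.0.
list_cudart() {
  dir=$1
  for f in "$dir"/libcudart*.so*; do
    # When nothing matches, the pattern stays literal and -e fails.
    [ -e "$f" ] && basename "$f"
  done
  return 0
}

# Succeed only if the exact file name the loader asked for is present.
has_exact_soname() {
  [ -e "$1/$2" ]
}

# Example (not run automatically), using the path from the error message:
#   list_cudart "$HOME/R/x86_64-pc-linux-gnu-library/4.2/torch/lib"
#   has_exact_soname "$HOME/R/x86_64-pc-linux-gnu-library/4.2/torch/lib" libcudart.so.11.0
```

If only the hash-suffixed name shows up, the plain `libcudart.so.11.0` has to come from somewhere else on the system, such as a loaded CUDA module.
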

@dfalbel
Member

dfalbel commented May 2, 2024

Do you have CUDA 11.7 and cuDNN installed? My guess is that one of those installations is missing.
I recommend using the pre-built binaries from https://torch.mlverse.org/docs/articles/installation#pre-built

@aosakwe
Author

aosakwe commented May 2, 2024

Hi @dfalbel , Yes I have both installed. I'm running this on an HPC and loaded the corresponding modules. Still seeing the same issue.

@dfalbel
Member

dfalbel commented May 2, 2024

Have you tried installing using the pre-built binaries?
I think this is caused by libcudart.so not being in your runtime search path. This can be fixed by using the pre-built binaries, as the package will then bundle all necessary binaries.
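
One way to test the search-path theory is to look for the library in each directory of `LD_LIBRARY_PATH`. The sketch below uses a hypothetical helper, `find_in_path_list`, and covers only that one variable. The dynamic loader also consults RPATH/RUNPATH entries and the system cache, so a miss here is a hint and not proof.

```shell
#!/bin/sh
# Print the first directory in a colon-separated path list that contains
# the given file name; return 1 if no directory contains it.
find_in_path_list() {
  name=$1
  list=$2
  old_ifs=$IFS
  IFS=:
  for d in $list; do
    # Skip empty entries produced by leading, trailing, or doubled colons.
    if [ -n "$d" ] && [ -e "$d/$name" ]; then
      IFS=$old_ifs
      printf '%s\n' "$d"
      return 0
    fi
  done
  IFS=$old_ifs
  return 1
}

# Example (not run automatically):
#   find_in_path_list libcudart.so.11.0 "$LD_LIBRARY_PATH"
#   ldd "$HOME/R/x86_64-pc-linux-gnu-library/4.2/torch/lib/liblantern.so" | grep 'not found'
```

Running this inside the same session that starts R, after loading the CUDA modules, shows what the R process will actually inherit.
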

@aosakwe
Author

aosakwe commented May 2, 2024

@dfalbel Just tried that as well. Same issue

R version 4.2.2 (2022-10-31) -- "Innocent and Trusting"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> options(timeout = 600)
> kind <- "cu117"
> version <- available.packages()["torch","Version"]
--- Please select a CRAN mirror for use in this session ---
Secure CRAN mirrors

 1: 0-Cloud [https]
 2: Australia (Canberra) [https]
 3: Australia (Melbourne 1) [https]
 4: Australia (Melbourne 2) [https]
 5: Austria [https]
 6: Belgium (Brussels) [https]
 7: Brazil (PR) [https]
 8: Brazil (RJ) [https]
 9: Brazil (SP 1) [https]
10: Brazil (SP 2) [https]
11: Bulgaria [https]
12: Canada (MB) [https]
13: Canada (ON 1) [https]
14: Canada (ON 2) [https]
15: Chile (Santiago) [https]
16: China (Beijing 2) [https]
17: China (Beijing 3) [https]
18: China (Hefei) [https]
19: China (Hong Kong) [https]
20: China (Guangzhou) [https]
21: China (Jinan) [https]
22: China (Lanzhou) [https]
23: China (Nanjing) [https]
24: China (Shanghai 2) [https]
25: China (Shenzhen) [https]
26: Colombia (Cali) [https]
27: Costa Rica [https]
28: Cyprus [https]
29: Czech Republic [https]
30: Denmark [https]
31: East Asia [https]
32: Ecuador (Cuenca) [https]
33: France (Lyon 1) [https]
34: France (Lyon 2) [https]
35: France (Marseille) [https]
36: France (Paris 1) [https]
37: Germany (Erlangen) [https]
38: Germany (Göttingen) [https]
39: Germany (Leipzig) [https]
40: Germany (Münster) [https]
41: Greece [https]
42: Iceland [https]
43: India (Bengaluru) [https]
44: India (Bhubaneswar) [https]
45: Indonesia (Banda Aceh) [https]
46: Iran (Mashhad) [https]
47: Italy (Milano) [https]
48: Italy (Padua) [https]
49: Japan (Tokyo) [https]
50: Japan (Yonezawa) [https]
51: Korea (Gyeongsan-si) [https]
52: Mexico (Mexico City) [https]
53: Mexico (Texcoco) [https]
54: Morocco [https]
55: Netherlands (Dronten) [https]
56: New Zealand [https]
57: Norway [https]
58: South Africa (Johannesburg) [https]
59: Spain (A Coruña) [https]
60: Spain (Madrid) [https]
61: Sweden (Umeå) [https]
62: Switzerland (Zurich 1) [https]
63: Taiwan (Taipei) [https]
64: Turkey (Denizli) [https]
65: Turkey (Istanbul) [https]
66: UK (Bristol) [https]
67: UK (London 1) [https]
68: USA (IA) [https]
69: USA (MI) [https]
70: USA (MO) [https]
71: USA (OH) [https]
72: USA (OR) [https]
73: USA (TN) [https]
74: United Arab Emirates [https]
75: Uruguay [https]
76: (other mirrors)

Selection: 14
>
> options(repos = c(
+   torch = sprintf("https://storage.googleapis.com/torch-lantern-builds/packages/%s/%s/", kind, version),
+   CRAN = "https://cloud.r-project.org" # or any other from which you want to install the other R dependencies.
+ ))
> install.packages("torch")
Installing package into ‘/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2’
(as ‘lib’ is unspecified)
Warning: unable to access index for repository https://storage.googleapis.com/torch-lantern-builds/packages/cu117/0.12.0/src/contrib:
  cannot open URL 'https://storage.googleapis.com/torch-lantern-builds/packages/cu117/0.12.0/src/contrib/PACKAGES'
trying URL 'https://cloud.r-project.org/src/contrib/torch_0.12.0.tar.gz'
Content type 'application/x-gzip' length 1821835 bytes (1.7 MB)
==================================================
downloaded 1.7 MB

* installing *source* package ‘torch’ ...
** package ‘torch’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
*** Skip building lantern.
*** Renaming init
"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/bin/Rscript" "../tools/renameinit.R"
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c RcppExports.cpp -o RcppExports.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c amp.cpp -o amp.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c autograd.cpp -o autograd.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c backends.cpp -o backends.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c codegen.cpp -o codegen.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c contrib.cpp -o contrib.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c cuda.cpp -o cuda.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c device.cpp -o device.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c dimname_list.cpp -o dimname_list.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c dtype.cpp -o dtype.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c gen-namespace.cpp -o gen-namespace.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c generator.cpp -o generator.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c indexing.cpp -o indexing.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c ivalue.cpp -o ivalue.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c jit-compile.cpp -o jit-compile.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c jit-execute.cpp -o jit-execute.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c lantern.cpp -o lantern.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c layout.cpp -o layout.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c memory_format.cpp -o memory_format.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c nn_utils_rnn.cpp -o nn_utils_rnn.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c qscheme.cpp -o qscheme.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c quantization.cpp -o quantization.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c reduction.cpp -o reduction.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c save.cpp -o save.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c scalar.cpp -o scalar.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c script_module.cpp -o script_module.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c stack.cpp -o stack.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c storage.cpp -o storage.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c tensor.cpp -o tensor.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c tensor_list.cpp -o tensor_list.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c torch_api.cpp -o torch_api.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c torch_exports.cpp -o torch_exports.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c trace.cpp -o trace.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c utils.cpp -o utils.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c variable_list.cpp -o variable_list.o
g++ -std=gnu++14 -I"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/include" -DNDEBUG -I../inst/include/ -DRCPP_NO_UNWIND_PROTECT -I'/home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/Rcpp/include' -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/java/13.0.2/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include -I/cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/flexiblas/3.0.4/include/flexiblas   -fpic  -O2 -ftree-vectorize -march=core-avx2 -fno-math-errno  -c xptr.cpp -o xptr.o
g++ -std=gnu++14 -shared -L/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/lib -o torch.so RcppExports.o amp.o autograd.o backends.o codegen.o contrib.o cuda.o device.o dimname_list.o dtype.o gen-namespace.o generator.o indexing.o ivalue.o jit-compile.o jit-execute.o lantern.o layout.o memory_format.o nn_utils_rnn.o qscheme.o quantization.o reduction.o save.o scalar.o script_module.o stack.o storage.o tensor.o tensor_list.o torch_api.o torch_exports.o trace.o utils.o variable_list.o xptr.o -L/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/lib -lR
*** Renaming torch lib to torchpkg
"/cvmfs/soft.computecanada.ca/easybuild/software/2020/avx2/Core/r/4.2.2/lib64/R/bin/Rscript" "../tools/renamelib.R"
installing to /home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/00LOCK-torch/00new/torch/libs
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
*** copying figures
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** checking absolute paths in shared objects and dynamic libraries
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (torch)

The downloaded source packages are in
        ‘/tmp/RtmpHmZOnf/downloaded_packages’
> library(torch)
ℹ Additional software needs to be downloaded and installed for torch to work correctly.
Do you want to continue? (Yes/no/cancel) Yes
trying URL 'https://download.pytorch.org/libtorch/cu117/libtorch-cxx11-abi-shared-with-deps-2.0.1%2Bcu117.zip'
Content type 'application/zip' length 1957956986 bytes (1867.3 MB)
==================================================
downloaded 1867.3 MB

trying URL 'https://storage.googleapis.com/torch-lantern-builds/binaries/refs/heads/cran/v0.12.0/latest/lantern-0.12.0+cu117+x86_64-Linux.zip'
downloaded 5.5 MB

Warning message:
ℹ torch failed to start, restart your R session to try again.
ℹ You might need to reinstall torch using `install_torch()`
✖ /home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/torch/lib/liblantern.so - libcudart.so.11.0: cannot open shared object
  file: No such file or directory
Caused by error in `cpp_lantern_init()`:
! /home/aosakwe/R/x86_64-pc-linux-gnu-library/4.2/torch/lib/liblantern.so - libcudart.so.11.0: cannot open shared object file: No such file or directory

@dfalbel
Member

dfalbel commented May 2, 2024

Sorry @aosakwe
Can you please try `kind <- "cu118"` instead?

I believe there's an error in the docs, as we are no longer building 11.7 binaries.

@aosakwe
Author

aosakwe commented May 2, 2024

Hi, this ended up working, thanks a lot!

@jaredlander

I noticed the install vignette says that if you install the pre-built binaries, you don't need to install CUDA or cuDNN:

When installing from the pre-built binaries, you don’t need to manually install CUDA or cuDNN. If you have CUDA installed, it doesn’t need to match the installation ‘kind’ chosen below.

Does that mean if we have a machine with a GPU but no CUDA or cuDNN (perhaps in a lightweight docker container) then Sys.setenv(CUDA='11.8'); torch::install_torch() will install with everything we need?

If so, that can really cut down on the size of the docker image.

@dfalbel
Member

dfalbel commented Jun 7, 2024

Actually, you need to install it using the instructions here. These are pre-built binary packages, similar to the pre-built binaries that CRAN serves, but they bundle all CUDA and cuDNN dependencies.

Unfortunately, it probably won't reduce the size of the Docker image much, as you still need the NVIDIA GPU drivers, and these package binaries are quite large (~2GB compressed).

It does make installation much easier to maintain, though, if you are not using an NVIDIA container such as nvidia/cuda:11.8.0-cudnn8-devel-ubi8.
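
Whichever install route is used, a quick check along these lines can confirm that the installed build actually sees the GPU. This is only a minimal sketch, not part of the official instructions. It assumes torch is already installed and loads without the `libcudart` error shown earlier, and it uses `cuda_is_available()`, `cuda_device_count()` and `torch_tensor()` from the torch R package.

```r
library(torch)

# TRUE only if the installed build is CUDA-enabled and a usable GPU and driver are present
if (cuda_is_available()) {
  cat("CUDA devices visible to torch:", cuda_device_count(), "\n")

  # Allocate a small tensor on the GPU as a smoke test
  x <- torch_tensor(c(1, 2, 3), device = "cuda")
  print(x$device)
} else {
  message(
    "torch loaded, but no CUDA device is available; ",
    "check the NVIDIA driver and the installed 'kind'."
  )
}
```

If the `else` branch runs inside a container, the usual suspects are a CPU-only build or a container started without GPU access, rather than a missing CUDA toolkit.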

@jaredlander

I did some experimenting on the CUDA devel image and a regular, CPU-only rocker image. Installing that way enabled GPU access in both instances. It even enabled me to use xgboost on the GPU.

Related to that, xgboost runs on the CUDA runtime image, and it looks like there are PyTorch images also based on the runtime images.

I wonder if it's possible to have smaller binaries similar to what's in that image linked above?
