
Cuda compile errors #12

Closed
himat opened this issue Jan 15, 2022 · 4 comments · Fixed by #22

Comments

himat commented Jan 15, 2022

I'm getting errors when compiling with cmake --build build --config RelWithDebInfo -j 16

The build error log is long, so I put it here: https://pastebin.com/6tRsjYfM

Is this due to a CUDA version incompatibility? I'm on CUDA 11.2.

(tensorflow2_p38) ubuntu@ip-172-31-40-250:~/instant-ngp$ python -V
Python 3.8.12

(tensorflow2_p38) ubuntu@ip-172-31-40-250:~/instant-ngp$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0

(tensorflow2_p38) ubuntu@ip-172-31-40-250:~/instant-ngp$ cmake --version
cmake version 3.22.1
(tensorflow2_p38) ubuntu@ip-172-31-40-250:~/instant-ngp$ echo $PATH 
/usr/local/cuda-11.2/bin:/home/ubuntu/anaconda3/envs/tensorflow2_p38/bin:/home/ubuntu/anaconda3/condabin:/opt/amazon/openmpi/bin:/opt/amazon/efa/bin:/home/ubuntu/anaconda3/condabin:/home/ubuntu/.dl_binaries/bin:/usr/local/cuda/bin:/opt/aws/neuron/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
(tensorflow2_p38) ubuntu@ip-172-31-40-250:~/instant-ngp$ echo $LD_LIBRARY_PATH 
/usr/local/cuda-11.2/lib64:/usr/local/cuda-11.2/extras/CUPTI/lib64:/usr/local/cuda-11.2/lib:/usr/local/cuda-11.2/efa/lib:/opt/amazon/efa/lib:/opt/amazon/efa/lib64:/usr/local/cuda/lib:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/targets/x86_64-linux/lib:/opt/amazon/efa/lib:/opt/amazon/openmpi/lib:/usr/local/lib:/usr/lib::/home/ubuntu/anaconda3/envs/tensorflow2_p38/lib/python3.8/site-packages/tensorflow
mmalex (Contributor) commented Jan 15, 2022 via email

himat (Author) commented Jan 15, 2022

Ah, I see. I'm trying to run this on an AWS EC2 instance with a K80 GPU. (There's only one GPU on the machine, so I thought I wouldn't need to specify the architecture.)
There are also instances with a V100.
Do you know what numbers to use for either of these, or which AWS instance you'd recommend?
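For context, the "numbers" in question are CUDA compute capabilities with the dot removed (e.g. 7.0 → 70), as passed to CMake's standard CMAKE_CUDA_ARCHITECTURES variable (available since CMake 3.18). A minimal sketch of the mapping for common AWS GPUs — the helper function is illustrative, not part of the project:

```python
# Compute capabilities of common AWS EC2 GPUs (well-documented NVIDIA values).
COMPUTE_CAPABILITY = {
    "K80": "3.7",   # p2 instances
    "V100": "7.0",  # p3 instances
    "T4": "7.5",    # g4dn instances
    "A100": "8.0",  # p4d instances
}

def cmake_arch_flag(gpu: str) -> str:
    """Return the CMake flag selecting the given GPU's architecture,
    e.g. -DCMAKE_CUDA_ARCHITECTURES=70 for a V100."""
    cap = COMPUTE_CAPABILITY[gpu]
    return f"-DCMAKE_CUDA_ARCHITECTURES={cap.replace('.', '')}"

print(cmake_arch_flag("V100"))  # -DCMAKE_CUDA_ARCHITECTURES=70
```

Passing the flag explicitly only selects the build target; as the next comment explains, it does not make an unsupported architecture work.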

Tom94 (Collaborator) commented Jan 16, 2022

Unfortunately, both K80 and V100 GPUs are unsupported.

This codebase requires compute capability 7.5 or higher, which means RTX 2000-series, RTX 3000-series, or A100 GPUs.

See also #13. I will update the README momentarily to reflect this more explicitly.
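The support rule above boils down to a threshold check on the (major, minor) compute capability pair; a sketch with a hypothetical helper, assuming the 7.5 minimum stated in the comment:

```python
MIN_COMPUTE_CAPABILITY = (7, 5)  # Turing and newer, per the comment above

def is_supported(major: int, minor: int) -> bool:
    """True if a GPU with compute capability major.minor meets the 7.5 minimum."""
    return (major, minor) >= MIN_COMPUTE_CAPABILITY

# K80 is 3.7 and V100 is 7.0, so both fall below the threshold;
# T4 (7.5) and A100 (8.0) clear it.
for cap in [(3, 7), (7, 0), (7, 5), (8, 0)]:
    print(cap, is_supported(*cap))
```

Tuple comparison handles the major/minor ordering correctly (e.g. 8.0 compares above 7.5), which a naive float comparison would also do here but breaks for hypothetical minor versions ≥ 10.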

Tom94 (Collaborator) commented Jan 16, 2022

Apologies, but adding dedicated implementations for older GPUs is currently out of scope.

This is partly because older GPUs would benefit much less from a fully fused implementation: on the small-MLP workload they are more compute-bound than memory-bound compared to newer GPUs.

All that said, I would be more than happy to merge code contributions that improve compatibility.
