Disable exceptions when used in CUDA code #79
CUDA supports neither exceptions nor `std::terminate` in device code, but NVCC silently ignores them. Unfortunately, Clang, which can be used as a drop-in replacement for NVCC to compile CUDA code, does not silently ignore them and raises the following errors:

To observe this issue, use the following minimal piece of sample code:
It can be successfully compiled by NVCC (`/usr/local/cuda-11.0/bin/nvcc -I variant/include -x cu -ccbin $(which g++-9) -std=c++14 -gencode=arch=compute_52,code=[sm_52,compute_52] --expt-relaxed-constexpr -o test_nvcc test.cpp`). It cannot be successfully compiled by Clang (`clang++-10 -I variant/include -x cuda --gcc-toolchain=$(dirname $(dirname $(which gcc-9))) -std=c++14 --cuda-path=/usr/local/cuda-10.0 --cuda-gpu-arch=sm_52 -L /usr/local/cuda-10.1/lib -lcudart -o test_clang test.cpp`) without my patch. Note that the precise versions of CUDA, Clang, or GCC used in the compile commands do not matter; the ones given here are simply whatever compatible versions I had available on my system.

My patch disables exceptions inside CUDA device code and uses the `__trap` intrinsic as a `std::terminate` equivalent to deliver consistently correct behavior for both NVCC and Clang.
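For readers unfamiliar with the technique, here is a minimal sketch of how such a switch might look. It is not the code from this patch: the macro names `SKETCH_THROW` and `SKETCH_HOST_DEVICE` and the function `checked_get` are hypothetical and exist only for illustration. The sketch relies on `__CUDA_ARCH__` being defined only during the device compilation pass and on `__CUDACC__` being defined during CUDA compilation in general.

```cpp
#include <stdexcept>

// Hypothetical macro names, for illustration only (not taken from the patch).
#if defined(__CUDA_ARCH__)
  // Device compilation pass: exceptions and std::terminate are unavailable,
  // so abort the kernel with the __trap() intrinsic instead.
  #define SKETCH_THROW(ex) __trap()
#else
  // Host compilation pass (or a plain C++ compiler): throw as usual.
  #define SKETCH_THROW(ex) throw ex
#endif

#if defined(__CUDACC__)
  // Compiling as CUDA: make the function callable from host and device.
  #define SKETCH_HOST_DEVICE __host__ __device__
#else
  // Compiling as ordinary C++: no CUDA qualifiers.
  #define SKETCH_HOST_DEVICE
#endif

// Hypothetical bounds-checked accessor that reports errors in whichever
// way the current compilation pass supports.
SKETCH_HOST_DEVICE inline int checked_get(const int* data, int size, int index) {
  if (index < 0 || index >= size) {
    SKETCH_THROW(std::out_of_range("index out of range"));
  }
  return data[index];
}
```

On the host, an out-of-range index still throws `std::out_of_range`. In device code the same call traps instead, so no `throw` expression reaches the device compiler.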