Document minimum CUDA version of 11.4 #1385

Merged 2 commits on Nov 21, 2023

@harrism
Member

harrism commented Nov 21, 2023

Description

With the addition of libcudacxx 2.1.0, the minimum CUDA version required to build RMM is now 11.4. This PR updates the README to reflect this.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@harrism added the "doc" (Documentation) and "breaking" (Breaking change) labels Nov 21, 2023
@harrism requested a review from bdice November 21, 2023 21:35
@harrism self-assigned this Nov 21, 2023
@jrhemstad
Contributor

Not that it matters, but libcu++ should support 11.1.

@bdice
Contributor

bdice left a comment

Approving with one question for @raydouglass about whether RAPIDS should document driver support or only CUDA Toolkit (CTK) support.

README.md (review thread: outdated, resolved)
@jakirkham
Member

I think we will want to update these constraints to match whatever minimum we use:

- cuda-version {{ cuda_spec }}

- {{ pin_compatible('cuda-version', max_pin='x', min_pin='x') }}

@bdice
Contributor

bdice commented Nov 21, 2023

@jakirkham This PR is changing the docs for build requirements, not run requirements. Those conda recipe specs are relevant for runtime requirements but not for setting the minimum version to build.

@bdice
Contributor

bdice commented Nov 21, 2023

> Not that it matters, but libcu++ should support 11.1.

@benfred reported that libcudacxx 2.1.0 includes code that requires CUDA 11.4 to compile. @benfred Can you confirm what piece requires CUDA 11.4?

@bdice
Contributor

bdice commented Nov 21, 2023

Looks like the failure was here: https://github.com/benfred/implicit/actions/runs/6947998720/job/18903049451?pr=703#step:4:2193

        FAILED: implicit/gpu/CMakeFiles/_cuda.dir/als.cu.o
        /usr/local/cuda/bin/nvcc -forward-unknown-to-host-compiler -DFMT_HEADER_ONLY=1 -DLIBCUDACXX_ENABLE_EXPERIMENTAL_MEMORY_RESOURCE -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -D_cuda_EXPORTS -I/project/. -I/opt/_internal/cpython-3.8.18/include/python3.8 -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/raft-src/cpp/include -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/rmm-src/include -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/libcudacxx-src/lib/cmake/libcudacxx/../../../include -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/thrust-src -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/thrust-src/dependencies/cub -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/fmt-src/include -I/project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/spdlog-src/include -isystem /usr/local/cuda/include --extended-lambda -Wno-deprecated-gpu-targets -Xfatbin=-compress-all --expt-relaxed-constexpr -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_60,code=[compute_60,sm_60]" "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_86,code=[compute_86,sm_86]" -Xcompiler=-fPIC -MD -MT implicit/gpu/CMakeFiles/_cuda.dir/als.cu.o -MF implicit/gpu/CMakeFiles/_cuda.dir/als.cu.o.d -x cu -c /project/implicit/gpu/als.cu -o implicit/gpu/CMakeFiles/_cuda.dir/als.cu.o
        /project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/libcudacxx-src/lib/cmake/libcudacxx/../../../include/cuda/std/detail/libcxx/include/__concepts/../__concepts/../__concepts/swappable.h(173): error: A __device__ variable cannot be marked constexpr
        
        /project/_skbuild/linux-x86_64-3.8/cmake-build/_deps/libcudacxx-src/lib/cmake/libcudacxx/../../../include/cuda/memory_resource(203): error: A __device__ variable cannot be marked constexpr
        
        2 errors detected in the compilation of "/project/implicit/gpu/als.cu".

The second error comes from memory_resource:203, which uses _LIBCUDACXX_CPO_ACCESSIBILITY to mark the variable __device__ constexpr (or possibly const on CUDA <= 11.2?).

@bdice
Contributor

bdice commented Nov 21, 2023

Support for early CUDA 11 versions has been fixed in libcudacxx: NVIDIA/libcudacxx@015cd67

We will have this fix once we migrate to CCCL 2.2.0, but we only test RAPIDS with CUDA 11.4 so it's safer to require that. The changes in this PR should be fine.

@harrism
Member Author

harrism commented Nov 21, 2023

The constexpr error in libcu++ is fixed in 2.2.0, which RMM does not use yet. In any case we test only on 11.4+, so let's go with documenting that as the minimum.
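
A documented minimum like this can also be enforced at compile time, so that an older toolkit fails with a clear message instead of the `__device__ ... constexpr` errors shown in the log above. The following is only a minimal sketch under stated assumptions: `cuda_version_at_least` is a hypothetical helper and is not part of RMM or libcudacxx, and the guard relies on the `__CUDACC_VER_MAJOR__` and `__CUDACC_VER_MINOR__` macros that nvcc predefines. When those macros are absent (for example in a host-only build), the check is skipped.

```cpp
// Hypothetical helper for illustration only; not part of RMM or libcudacxx.
// Returns true when version (major, minor) is at least (req_major, req_minor).
constexpr bool cuda_version_at_least(int major, int minor, int req_major, int req_minor)
{
  return major > req_major || (major == req_major && minor >= req_minor);
}

// nvcc predefines __CUDACC_VER_MAJOR__ and __CUDACC_VER_MINOR__ with the
// compiler's version. If they are not defined, this guard is a no-op.
#if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
static_assert(cuda_version_at_least(__CUDACC_VER_MAJOR__, __CUDACC_VER_MINOR__, 11, 4),
              "CUDA 11.4 or newer is required to build this project");
#endif
```

The 11.4 threshold mirrors the minimum agreed on in this thread; it reflects what is tested rather than the exact version at which the libcudacxx code starts to compile.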

@raydouglass merged commit 7f776ef into rapidsai:branch-23.12 Nov 21, 2023
39 of 40 checks passed