Partial ROCm 3.1 support #3571
Codecov Report
@@ Coverage Diff @@
## python #3571 +/- ##
=======================================
+ Coverage 88% 88% +<1%
=======================================
Files 524 524
Lines 23598 23598
=======================================
+ Hits 20772 20774 +2
+ Misses 2826 2824 -2
CMakeLists.txt
Outdated
@@ -187,13 +187,13 @@ if(WITH_CUDA)
   message(STATUS "Found HIP compiler: ${HIP_HIPCC_EXECUTABLE}")
   set(CUDA 1)
   set(HIP 1)
-  list(APPEND HIP_HCC_FLAGS "-I${HIP_ROOT_DIR}/include -Wno-c99-designator -Wno-macro-redefined -Wno-duplicate-decl-specifier -std=c++14")
+  list(APPEND HIP_HCC_FLAGS "-I${HIP_ROOT_DIR}/include -I${HIP_ROOT_DIR}/../include -Wno-c99-designator -Wno-macro-redefined -Wno-duplicate-decl-specifier -std=c++14")
`HIP_HCC_FLAGS` expands to `"-I/opt/rocm/include -I/opt/rocm/../include -Wno-c99..."` on ROCm 3.0, but `/opt/rocm/../include` doesn't exist. Is there a way in CMake to find the path of the parent ROCm folder? This would allow us to write `"-I${HIP_ROOT_DIR}/include -I${ROCM_ROOT_DIR}/include -Wno-c99..."` and save us some headache when debugging this CMake logic.
Unfortunately there is not. So maybe we should merge your patch instead of this pull request that contains an explicit version check.
We could do it the way pytorch/pytorch:torch/utils/cpp_extension.py#L53-L73 does, e.g.
string(REGEX REPLACE "^(.+)/bin/hipcc$" "\\1" ROCM_HOME ${HIP_HIPCC_EXECUTABLE})
list(APPEND HIP_HCC_FLAGS "-I${HIP_ROOT_DIR}/include -I${ROCM_HOME}/include -Wno-c99-designator -Wno-macro-redefined -Wno-duplicate-decl-specifier -std=c++14")
But this seems superfluous given that we already hardcode the path to `/opt/rocm`:
Line 182 in a4596e3
list(APPEND CMAKE_MODULE_PATH "/opt/rocm/hip/cmake")
Why not directly do the following?
set(ROCM_HOME "/opt/rocm")
list(APPEND CMAKE_MODULE_PATH "${ROCM_HOME}/hip/cmake")
# ...
list(APPEND HIP_HCC_FLAGS "-I${HIP_ROOT_DIR}/include -I${ROCM_HOME}/include -Wno-c99-designator -Wno-macro-redefined -Wno-duplicate-decl-specifier -std=c++14")
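As a possible alternative to the regular expression suggested above, the ROCm root could also be derived with `get_filename_component()`. This is only a sketch under the assumption that `hipcc` lives in `<root>/bin/hipcc`; the fallback path and the helper variable `HIPCC_BIN_DIR` are illustrative and not part of this PR.

```cmake
# Sketch: derive the ROCm root from the hipcc location.
# Assumption: HIP_HIPCC_EXECUTABLE is an absolute path ending in /bin/hipcc.
if(NOT HIP_HIPCC_EXECUTABLE)
  # Illustrative default so that the snippet can be tried standalone.
  set(HIP_HIPCC_EXECUTABLE "/opt/rocm/bin/hipcc")
endif()
# First strip the file name, then strip the bin/ component.
get_filename_component(HIPCC_BIN_DIR "${HIP_HIPCC_EXECUTABLE}" DIRECTORY)
get_filename_component(ROCM_HOME "${HIPCC_BIN_DIR}" DIRECTORY)
message(STATUS "ROCM_HOME: ${ROCM_HOME}")
```

With the illustrative default path, `ROCM_HOME` evaluates to `/opt/rocm`. If `hipcc` is found in a nested directory instead, the result would point one level too deep, which is one argument for the explicit `set(ROCM_HOME ...)` approach.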
`ROCM_HOME` sounds like a sensible solution.
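A minimal sketch of how a `ROCM_HOME` variable might be made overridable and validated, assuming it is placed inside the HIP branch of the main `CMakeLists.txt`. Turning it into a cache variable and adding the directory check are assumptions for illustration, not something agreed on in this thread.

```cmake
# Sketch: keep /opt/rocm as the default, but let users override it
# with -DROCM_HOME=/path/to/rocm on the command line.
set(ROCM_HOME "/opt/rocm" CACHE PATH "Root directory of the ROCm installation")
list(APPEND CMAKE_MODULE_PATH "${ROCM_HOME}/hip/cmake")
# Stop early with a readable message if the include directory is missing,
# instead of passing a non-existent -I path to the compiler.
if(NOT IS_DIRECTORY "${ROCM_HOME}/include")
  message(FATAL_ERROR "ROCm include directory not found: ${ROCM_HOME}/include")
endif()
```

The check is meant to surface problems like a missing include directory at configure time rather than at compile time.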
I think in general we should try to keep these details out of the main |
@KaiSzuttor, it makes no sense to move this stuff elsewhere because the usual place to do all the library detection is in the main CMakeLists.txt. CUDA detection takes up just as much space.
Can you please explain this statement? Just because there is a mess elsewhere in the file does not mean that we should not clean up, or that we should extend the mess in another place.
I'm currently trying to refactor the CMake logic for detecting CUDA. We are using the |
no, it happens in the |
Setting compiler flags, which is what we are currently discussing, shouldn't happen inside FindHIP.cmake. The choice of compiler flags is Espresso-specific. Find*.cmake files are supposed to be provided either by CMake or by the respective library to deal with things common to all use cases.
Compiler flags should be set on targets, not globally, so they belong neither in the main CMake file nor in FindHIP.cmake.
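To illustrate what "flags on targets" could look like, here is a hedged sketch using standard CMake commands. The target name `example_gpu_core` and the source file `gpu_kernels.cpp` are hypothetical, and `ROCM_HOME` is assumed to be set as proposed earlier in this thread. Whether the FindHIP macros honor target properties is not addressed here.

```cmake
# Sketch: attach include paths and warning flags to a single target
# instead of appending them to a global flag list.
# "example_gpu_core" and "gpu_kernels.cpp" are hypothetical names.
add_library(example_gpu_core STATIC gpu_kernels.cpp)
target_include_directories(example_gpu_core PRIVATE "${ROCM_HOME}/include")
target_compile_options(example_gpu_core PRIVATE
                       -Wno-macro-redefined -Wno-duplicate-decl-specifier)
# Request C++14 through a compile feature rather than a raw -std= flag.
target_compile_features(example_gpu_core PRIVATE cxx_std_14)
```

Using `PRIVATE` keeps the flags from leaking into targets that link against this one.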
An introduction to modern cmake:
|
Please don't invest any more time on this PR, I'm including it in my CMake refactoring PR. It's taking more time than anticipated because the |
Description of changes:
- move logic to import packages from `CMakeLists.txt` to dedicated helper files `cmake/Find<package>.cmake` for `find_package()`
- enforce the Cython version requested in `CMakeLists.txt`
- CMake now fails if `WITH_CUDA` is set to true but no CUDA-capable compiler is found
- CMake now fails if `WITH_CLANG_TIDY` is set to true but Clang-tidy is not found or its version doesn't match the Clang compiler version
- drop deprecated `FindCUDA` in favor of native CUDA support in CMake 3.10 (required for #3445)
- add partial support for ROCm 3.1 (closes #3571, required for espressomd/docker#156)
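The "fail if `WITH_CUDA` is set but no compiler is found" behavior could be sketched with CMake's `CheckLanguage` module as follows. This is an illustration of the idea, not the code from the refactoring PR; it assumes `WITH_CUDA` is defined by an earlier `option()` call and that the snippet runs after `project()`.

```cmake
# Sketch: abort configuration when CUDA is requested but unavailable.
include(CheckLanguage)
if(WITH_CUDA)
  # check_language() sets CMAKE_CUDA_COMPILER only if a compiler is found.
  check_language(CUDA)
  if(NOT CMAKE_CUDA_COMPILER)
    message(FATAL_ERROR
            "WITH_CUDA is set but no CUDA-capable compiler was found")
  endif()
  # Native CUDA language support replaces the deprecated FindCUDA module.
  enable_language(CUDA)
endif()
```

Probing with `check_language()` first allows a project-specific error message instead of the generic failure that `enable_language(CUDA)` would produce on its own.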
Make CMake changes. Needs
ln -s /opt/rocm/bin/hcc* /opt/rocm/hip/bin/
to work around a bug in the hipcc linker wrapper. The build then succeeds, but anything that passes GPU pointers between compilation units (most notably, EK and LB) fails, and all tests deadlock in the AMD HSA shutdown procedure.