Use explicit accelerator types in alpaka device code #44636
enable gpu |
please test with cms-sw/cmsdist#9121 |
cms-bot internal usage |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44636/39823
|
A new Pull Request was created by @fwyzard for master. It involves the following packages:
@fwyzard, @makortel can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
In the original version we had:
This fails only in the CUDA case, due to a problem with the device-side linker and shared libraries:
This happens also with the native CUDA code, and the failing part of the test is disabled in the CUDA case:
I've disabled it like for the CUDA test case. |
-1
Failed Tests: Build
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: Build
I found compilation error when building:
Entering library rule at src/HeterogeneousTest/AlpakaDevice/plugins
>> Compiling alpaka/cuda src/HeterogeneousTest/AlpakaDevice/plugins/alpaka/AlpakaTestDeviceAdditionAlgo.dev.cc
>> Compiling alpaka/cuda edm plugin src/HeterogeneousTest/AlpakaDevice/plugins/alpaka/AlpakaTestDeviceAdditionModule.cc
>> Cuda Device Link tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o
nvlink error : Undefined reference to '_ZN17alpaka_cuda_async4test13add_vectors_fERKN6alpaka22AccGpuUniformCudaHipRtINS1_9ApiCudaRtESt17integral_constantImLm1EEjEEPKfSA_Pfj' in 'tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/alpaka/AlpakaTestDeviceAdditionAlgo.dev.cc.o' (target: sm_60)
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o] Error 255
>> Building alpaka/cuda edm plugin tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/libHeterogeneousTestAlpakaDevicePluginsCudaAsync.so
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: cannot find tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o: No such file or directory
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/libHeterogeneousTestAlpakaDevicePluginsCudaAsync.so] Error 1
Leaving library rule at src/HeterogeneousTest/AlpakaDevice/plugins |
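The undefined symbol in the nvlink error is easier to read once demangled. The following is a minimal sketch, assuming a GCC or Clang toolchain that provides `abi::__cxa_demangle` from `<cxxabi.h>`; the helper name `demangle` and the constant `kMissingSymbol` are introduced here for illustration only and are not part of CMSSW or alpaka.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Mangled name copied verbatim from the nvlink error message above.
const char* const kMissingSymbol =
    "_ZN17alpaka_cuda_async4test13add_vectors_fERKN6alpaka22AccGpuUniformCudaHipRt"
    "INS1_9ApiCudaRtESt17integral_constantImLm1EEjEEPKfSA_Pfj";

// Hypothetical helper: returns the demangled form of an Itanium-ABI symbol,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
  int status = 0;
  // __cxa_demangle allocates the result with malloc; the caller must free it.
  char* raw = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  std::string result = (status == 0 && raw != nullptr) ? raw : mangled;
  std::free(raw);
  return result;
}
```

Calling `demangle(kMissingSymbol)` should give a name in the `alpaka_cuda_async::test` namespace, namely a non-template `add_vectors_f` taking a `const&` to the CUDA 1D accelerator type followed by `float const*`, `float const*`, `float*` and `unsigned int`. This is consistent with the function now being declared with an explicit accelerator type rather than a template parameter, so the device linker has to find its definition in another object.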
Force-pushed: 7f9bc74 to 1c20d3f
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-44636/39827
|
please test with cms-sw/cmsdist#9121 |
-1
Failed Tests: Build
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here: Build
I found compilation error when building:
Entering library rule at src/HeterogeneousTest/AlpakaDevice/plugins
>> Compiling alpaka/cuda src/HeterogeneousTest/AlpakaDevice/plugins/alpaka/AlpakaTestDeviceAdditionAlgo.dev.cc
>> Compiling alpaka/cuda edm plugin src/HeterogeneousTest/AlpakaDevice/plugins/alpaka/AlpakaTestDeviceAdditionModule.cc
>> Cuda Device Link tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o
nvlink error : Undefined reference to '_ZN17alpaka_cuda_async4test13add_vectors_fERKN6alpaka22AccGpuUniformCudaHipRtINS1_9ApiCudaRtESt17integral_constantImLm1EEjEEPKfSA_Pfj' in 'tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/alpaka/AlpakaTestDeviceAdditionAlgo.dev.cc.o' (target: sm_60)
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o] Error 255
>> Building alpaka/cuda edm plugin tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/libHeterogeneousTestAlpakaDevicePluginsCudaAsync.so
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: cannot find tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o: No such file or directory
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/libHeterogeneousTestAlpakaDevicePluginsCudaAsync.so] Error 1
Leaving library rule at src/HeterogeneousTest/AlpakaDevice/plugins |
please test with cms-sw/cmsdist#9121, #44640 |
-1
Failed Tests: Build
Build
I found compilation error when building:
Entering library rule at src/HeterogeneousTest/AlpakaKernel/plugins
>> Compiling alpaka/cuda src/HeterogeneousTest/AlpakaKernel/plugins/alpaka/AlpakaTestKernelAdditionAlgo.dev.cc
>> Compiling alpaka/cuda edm plugin src/HeterogeneousTest/AlpakaKernel/plugins/alpaka/AlpakaTestKernelAdditionModule.cc
>> Cuda Device Link tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaKernel/plugins/HeterogeneousTestAlpakaKernelPluginsCudaAsync/HeterogeneousTestAlpakaKernelPluginsCudaAsync_cudadlink.o
nvlink error : Undefined reference to '_ZNK17alpaka_cuda_async4test17KernelAddVectorsFclERKN6alpaka22AccGpuUniformCudaHipRtINS2_9ApiCudaRtESt17integral_constantImLm1EEjEEPKfSB_Pfj' in 'tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaKernel/plugins/HeterogeneousTestAlpakaKernelPluginsCudaAsync/alpaka/AlpakaTestKernelAdditionAlgo.dev.cc.o' (target: sm_60)
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaKernel/plugins/HeterogeneousTestAlpakaKernelPluginsCudaAsync/HeterogeneousTestAlpakaKernelPluginsCudaAsync_cudadlink.o] Error 255
>> Building alpaka/cuda edm plugin tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaKernel/plugins/HeterogeneousTestAlpakaKernelPluginsCudaAsync/libHeterogeneousTestAlpakaKernelPluginsCudaAsync.so
/data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: cannot find tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaKernel/plugins/HeterogeneousTestAlpakaKernelPluginsCudaAsync/HeterogeneousTestAlpakaKernelPluginsCudaAsync_cudadlink.o: No such file or directory
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaKernel/plugins/HeterogeneousTestAlpakaKernelPluginsCudaAsync/libHeterogeneousTestAlpakaKernelPluginsCudaAsync.so] Error 1
Leaving library rule at src/HeterogeneousTest/AlpakaKernel/plugins |
let's retry the test now that the dependencies have been merged |
enable gpu |
please test |
-1
Failed Tests: Build
Build
I found compilation error when building:
Entering library rule at src/HeterogeneousTest/AlpakaDevice/plugins
>> Compiling alpaka/cuda src/HeterogeneousTest/AlpakaDevice/plugins/alpaka/AlpakaTestDeviceAdditionAlgo.dev.cc
>> Compiling alpaka/cuda edm plugin src/HeterogeneousTest/AlpakaDevice/plugins/alpaka/AlpakaTestDeviceAdditionModule.cc
>> Cuda Device Link tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o
nvlink error : Undefined reference to '_ZN17alpaka_cuda_async4test13add_vectors_fERKN6alpaka22AccGpuUniformCudaHipRtINS1_9ApiCudaRtESt17integral_constantImLm1EEjEEPKfSA_Pfj' in 'tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/alpaka/AlpakaTestDeviceAdditionAlgo.dev.cc.o' (target: sm_60)
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o] Error 255
>> Building alpaka/cuda edm plugin tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/libHeterogeneousTestAlpakaDevicePluginsCudaAsync.so
/cvmfs/cms-ib.cern.ch/sw/x86_64/nweek-02832/el8_amd64_gcc12/external/gcc/12.3.1-40d504be6370b5a30e3947a6e575ca28/bin/../lib/gcc/x86_64-redhat-linux-gnu/12.3.1/../../../../x86_64-redhat-linux-gnu/bin/ld.bfd: cannot find tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/HeterogeneousTestAlpakaDevicePluginsCudaAsync_cudadlink.o: No such file or directory
collect2: error: ld returned 1 exit status
gmake: *** [tmp/el8_amd64_gcc12/src/HeterogeneousTest/AlpakaDevice/plugins/HeterogeneousTestAlpakaDevicePluginsCudaAsync/libHeterogeneousTestAlpakaDevicePluginsCudaAsync.so] Error 1
Leaving library rule at src/HeterogeneousTest/AlpakaDevice/plugins |
Mhm, for me this builds and runs locally on top of the latest IB. Is there a way to reset the PR test parameters? |
IIRC message with empty test parameters along
should do it |
Remove "test parameters" comment (bot doesn't actively save parameters between invocations) |
FYI, now that the new release cycle is open, this PR has been updated for CMSSW 14.2.x in #45887. |
PR description:
Update the HeterogeneousTest alpaka tests to use the explicit accelerator types in alpaka device code:
- use the ALPAKA_ACCELERATOR_NAMESPACE namespace;
- use Acc1D or Acc2D, instead of the generic TAcc template argument.
PR validation:
The new HeterogeneousTest unit tests pass:
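To illustrate the change summarised in the PR description, here is a rough, self-contained sketch contrasting the two styles. It does not use alpaka itself: `FakeSerialAcc`, `generic_style` and `fake_backend` are hypothetical stand-ins, with `fake_backend` playing the role that `ALPAKA_ACCELERATOR_NAMESPACE` plays in CMSSW and `Acc1D` being a plain type alias. The real code would additionally carry alpaka's device-function annotations.

```cpp
#include <cstddef>

// Hypothetical stand-in for an alpaka accelerator type (not the real class).
struct FakeSerialAcc {};

// Generic style: the accelerator type is a template argument (TAcc),
// so the function must be defined wherever it is instantiated.
namespace generic_style {
  template <typename TAcc>
  void add_vectors_f(TAcc const&, float const* a, float const* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
      out[i] = a[i] + b[i];
    }
  }
}  // namespace generic_style

// Explicit style: the backend namespace fixes Acc1D, and the function is an
// ordinary (non-template) function that can be declared in a header and
// defined in a separate translation unit.
namespace fake_backend {  // stands in for ALPAKA_ACCELERATOR_NAMESPACE
  using Acc1D = FakeSerialAcc;

  namespace test {
    inline void add_vectors_f(Acc1D const&, float const* a, float const* b, float* out, std::size_t n) {
      for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
      }
    }
  }  // namespace test
}  // namespace fake_backend
```

Both functions compute the same result. The difference is that the explicit version has one concrete signature per backend, which is also why its symbol shows up as a plain function in a backend-specific namespace at link time.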