Use rapids-cmake parallel testing feature #1183
Things to be done before moving out of draft:
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## branch-23.02 #1183 +/- ##
==============================================
Coverage ? 0.00%
==============================================
Files ? 6
Lines ? 414
Branches ? 0
==============================================
Hits ? 0
Misses ? 414
Partials ? 0
Big fan of this PR! @bdice and I were planning on circling back to the GH Actions test scripts in the future to figure out how we could make some of these scripts less verbose.
Moving to 23.04
Some RMM tests allocate significant memory. If tests are run in parallel on small-memory GPUs this might be a problem.
If we have a list of high-memory-usage tests, we can mark them as requiring the entire GPU, which removes this failure point. (Multiple tests will still be able to run in parallel on multi-GPU systems.)
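To make the effect of "requiring the entire GPU" concrete, here is a minimal toy model in Python. It is not rapids-cmake's or ctest's implementation; the `Gpu` class, the `try_start` helper, and the test names are all hypothetical. It only assumes that each test requests a percentage of one GPU and is admitted when a GPU has that much capacity free.

```python
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Hypothetical bookkeeping record for one GPU."""
    index: int
    free_percent: int = 100
    running: list = field(default_factory=list)


def try_start(gpus, test_name, percent):
    """Place a test on the first GPU with enough free capacity.

    Returns the GPU index, or None if the test has to wait.
    """
    for gpu in gpus:
        if gpu.free_percent >= percent:
            gpu.free_percent -= percent
            gpu.running.append(test_name)
            return gpu.index
    return None


# Two GPUs; the "BIG" tests ask for a whole GPU, the small ones for a quarter.
gpus = [Gpu(0), Gpu(1)]
placements = {
    "BIG_TEST_A": try_start(gpus, "BIG_TEST_A", 100),
    "small_1": try_start(gpus, "small_1", 25),
    "BIG_TEST_B": try_start(gpus, "BIG_TEST_B", 100),
    "small_2": try_start(gpus, "small_2", 25),
}
print(placements)
```

In this model a whole-GPU test never shares its device with anything else, and a second whole-GPU test waits until some GPU is completely free, while smaller tests keep running side by side on the other GPU.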
Tests are failing due to a lack of
I think it has to go in the
Looks to be reducing our C++ test time by 45 s (from 4 min to 3 min 15 s). rmm has a couple of tests (DEVICE_MR) with long execution times (40+ s) and high memory usage, which forces much of the run to serialize.
I'm expecting that we'll get much more mileage out of this feature in cudf (and other downstream RAPIDS repos), where the tests are much more amenable to parallelization (lower GPU utilization).
One minor caveat is that for cudf I plan to use ctest to configure tests with suitable environment variables to manage the preload library and other associated settings, so those tests will run differently when run via ctest than when the executables are invoked directly. Obviously that specific example is only relevant for cudf, but it does illustrate the possibility that we will have instances where ctest will run a test differently from a direct invocation. We may want to document that expectation if it translates to a recommended way to run tests (ctest over direct executable).
Mostly LGTM.
rapids-cmake 23.02 offers parallel testing with load balancing across GPUs. This feature allows multiple tests to run on the same GPU without oversubscription, and handles setting CUDA_VISIBLE_DEVICES so that tests can execute on different GPUs.
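As a rough illustration of the CUDA_VISIBLE_DEVICES part, the sketch below launches child processes with that variable set so each one would see only its assigned GPU. This is not how rapids-cmake does it internally; the helpers `gpu_for_slot` and `run_on_gpu`, the round-robin policy, and the two-GPU machine are assumptions made for the example. The child here is a small Python probe rather than a real test executable, so the sketch runs without a GPU.

```python
import os
import subprocess
import sys


def gpu_for_slot(slot, num_gpus):
    """Hypothetical policy: spread test slots across GPUs round-robin."""
    return slot % num_gpus


def run_on_gpu(command, gpu_index):
    """Run a command with CUDA_VISIBLE_DEVICES restricted to one GPU index."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    return subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )


# Stand-in for a test binary: it just reports which device it was given.
probe = [
    sys.executable,
    "-c",
    "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])",
]

# Four test slots on an assumed two-GPU machine.
outputs = [run_on_gpu(probe, gpu_for_slot(slot, 2)).stdout.strip() for slot in range(4)]
print(outputs)
```

Because the variable is set per child process, a CUDA program started this way enumerates only the listed device, which is what lets several tests target different GPUs at the same time.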
/merge
Description
Converts librmm over to use rapids-cmake's new GPU-aware parallel testing feature, which allows tests to run across all the GPUs on a machine without oversubscription. This will allow developers to run ctest -j<N>, and ctest will work out how many tests it can run in parallel given the machine's current GPU set (currently 4 tests per GPU).
Checklist