Improve build time of libcudf iterator tests #9788

Merged 2 commits on Dec 1, 2021

davidwendt
Contributor

While working on #9641 I noticed that building the iterator gtests takes a lot of time in CI. Here is a link to the individual build times for libcudf, including the gtests:
https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-gpu-test/CUDA=11.5,GPU_LABEL=driver-495,LINUX_VER=ubuntu20.04,PYTHON=3.8/5173/testReport/(root)/BuildTime/
(You can sort by duration by clicking the Duration column header.)

Here is a table of the top 20 compile-time offenders as recorded on my local machine. Note that, as in the CI build output, 6 of the top 20 are ITERATOR_TEST sources.

| rank | time (ms) | file |
|---|---|---|
| 1 | 814334 | /cudf.dir/src/search/search.cu.o |
| 2 | 755375 | /cudf.dir/src/sort/sort_column.cu.o |
| 3 | 686235 | /ITERATOR_TEST.dir/iterator/optional_iterator_test_numeric.cu.o |
| 4 | 670587 | /cudf.dir/src/groupby/sort/group_nunique.cu.o |
| 5 | 585524 | /cudf.dir/src/reductions/scan/scan_inclusive.cu.o |
| 6 | 582677 | /ITERATOR_TEST.dir/iterator/pair_iterator_test_numeric.cu.o |
| 7 | 568418 | /ITERATOR_TEST.dir/iterator/scalar_iterator_test.cu.o |
| 8 | 563196 | /cudf.dir/src/sort/sort.cu.o |
| 9 | 548816 | /ITERATOR_TEST.dir/iterator/value_iterator_test_numeric.cu.o |
| 10 | 535315 | /cudf.dir/src/groupby/sort/sort_helper.cu.o |
| 11 | 531384 | /cudf.dir/src/sort/is_sorted.cu.o |
| 12 | 530382 | /ITERATOR_TEST.dir/iterator/value_iterator_test_chrono.cu.o |
| 13 | 525187 | /cudf.dir/src/join/semi_join.cu.o |
| 14 | 523726 | /cudf.dir/src/rolling/rolling.cu.o |
| 15 | 517909 | /cudf.dir/src/reductions/product.cu.o |
| 16 | 513119 | /cudf.dir/src/stream_compaction/distinct_count.cu.o |
| 17 | 512569 | /ITERATOR_TEST.dir/iterator/optional_iterator_test_chrono.cu.o |
| 18 | 508978 | /cudf.dir/src/reductions/sum_of_squares.cu.o |
| 19 | 508460 | /cudf.dir/src/lists/drop_list_duplicates.cu.o |
| 20 | 505247 | /cudf.dir/src/reductions/sum.cu.o |
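The PR does not say how the local timings were collected. As a rough sketch, assuming the build used ninja and that its `.ninja_log` is available (tab-separated `start_ms`, `end_ms`, `mtime`, `output`, `hash` after a `# ninja log` header), a ranking like the one above could be produced as follows. The function name `rank_compile_times` and the sample log lines are hypothetical, written only for illustration; the two long durations are borrowed from the table.

```python
def rank_compile_times(log_text, top=20):
    """Return the `top` slowest outputs as (output, duration_ms) pairs."""
    durations = {}
    for line in log_text.splitlines():
        # Skip the "# ninja log vN" header and blank lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # malformed line, ignore
        start_ms, end_ms, output = int(fields[0]), int(fields[1]), fields[3]
        # The log is append-only, so a later entry for the same output
        # replaces the earlier one.
        durations[output] = end_ms - start_ms
    ranked = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Made-up sample log; only the durations 814334 and 686235 come from the table.
sample_log = "\n".join([
    "# ninja log v5",
    "0\t814334\t0\tsrc/search/search.cu.o\tabc",
    "100\t686335\t0\titerator/optional_iterator_test_numeric.cu.o\tdef",
    "0\t500\t0\tsmall.cpp.o\t123",
])

if __name__ == "__main__":
    for rank, (output, ms) in enumerate(rank_compile_times(sample_log), start=1):
        print(rank, ms, output)
```

Note that each duration is the wall-clock time of one build step, so with a parallel build (`-j48`) the values do not sum to the total build time.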

I made some simple changes to the iterator test logic to use different thrust functions along with a temporary device vector. This approach improved the compile time of the ITERATOR_TEST sources by 1.4x to 3.7x per file (about 2.5x summed over the six files). Here are the results of compiling the above 6 files with the changes in this PR.

| new rank | new time (ms) | speedup | file |
|---|---|---|---|
| 59 | 232691 | 2.9x | optional_iterator_test_numeric.cu.o |
| 26 | 416951 | 1.4x | pair_iterator_test_numeric.cu.o |
| 92 | 165947 | 3.4x | scalar_iterator_test.cu.o |
| 65 | 216364 | 2.5x | value_iterator_test_numeric.cu.o |
| 77 | 186583 | 2.8x | value_iterator_test_chrono.cu.o |
| 111 | 137789 | 3.7x | optional_iterator_test_chrono.cu.o |
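The speedup factors can be re-derived from the two tables as old time divided by new time. The sketch below does that arithmetic; the names `TIMES`, `speedups`, and `aggregate_speedup` are illustrative, and the numbers are copied from the tables in this description.

```python
# (file, old_ms, new_ms) copied from the before and after tables.
TIMES = [
    ("optional_iterator_test_numeric.cu.o", 686235, 232691),
    ("pair_iterator_test_numeric.cu.o", 582677, 416951),
    ("scalar_iterator_test.cu.o", 568418, 165947),
    ("value_iterator_test_numeric.cu.o", 548816, 216364),
    ("value_iterator_test_chrono.cu.o", 530382, 186583),
    ("optional_iterator_test_chrono.cu.o", 512569, 137789),
]


def speedups(times):
    """Per-file speedup (old / new), rounded to one decimal place."""
    return {name: round(old / new, 1) for name, old, new in times}


def aggregate_speedup(times):
    """Speedup of the summed compile time across all listed files."""
    old_total = sum(old for _, old, _ in times)
    new_total = sum(new for _, _, new in times)
    return old_total / new_total


if __name__ == "__main__":
    for name, factor in speedups(TIMES).items():
        print(f"{factor:.1f}x  {name}")
    print(f"aggregate: {aggregate_speedup(TIMES):.2f}x")
```

The per-file factors match the reported values (2.9x, 1.4x, 3.4x, 2.5x, 2.8x, 3.7x). Summed over the six files the ratio is about 2.5x, pulled down by `pair_iterator_test_numeric.cu.o`, which improved the least.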

Locally, total build time dropped by about 3 minutes (roughly 10%) using `ninja -j48 install` on a Dell 5820.

Here are the build time results of a CI build with these changes.
https://gpuci.gpuopenanalytics.com/job/rapidsai/job/gpuci/job/cudf/job/prb/job/cudf-gpu-test/CUDA=11.5,GPU_LABEL=driver-495,LINUX_VER=ubuntu20.04,PYTHON=3.8/5190/testReport/(root)/BuildTime/

@davidwendt davidwendt added 2 - In Progress Currently a work in progress libcudf Affects libcudf (C++/CUDA) code. improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Nov 29, 2021
@davidwendt davidwendt self-assigned this Nov 29, 2021
codecov bot commented Nov 29, 2021

Codecov Report

Merging #9788 (1055f21) into branch-22.02 (967a333) will decrease coverage by 0.01%.
The diff coverage is 0.00%.

@@               Coverage Diff                @@
##           branch-22.02    #9788      +/-   ##
================================================
- Coverage         10.49%   10.47%   -0.02%     
================================================
  Files               119      119              
  Lines             20305    20341      +36     
================================================
  Hits               2130     2130              
- Misses            18175    18211      +36     
| Impacted Files | Coverage Δ |
|---|---|
| python/cudf/cudf/__init__.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/column/column.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/frame.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/index.py | 0.00% <ø> (ø) |
| python/cudf/cudf/core/indexed_frame.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/core/multiindex.py | 0.00% <0.00%> (ø) |
| python/cudf/cudf/utils/utils.py | 0.00% <0.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4629037...1055f21. Read the comment docs.

@davidwendt davidwendt added 3 - Ready for Review Ready for review by team and removed 2 - In Progress Currently a work in progress labels Nov 30, 2021
@davidwendt davidwendt marked this pull request as ready for review November 30, 2021 15:49
@davidwendt davidwendt requested a review from a team as a code owner November 30, 2021 15:49
@davidwendt
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 1ceb8ab into rapidsai:branch-22.02 Dec 1, 2021
@davidwendt davidwendt deleted the iterator-build-time branch December 1, 2021 15:10