This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
Reduce header bloat #1572
Merged
alliepiper
merged 1 commit into
NVIDIA:main
from
alliepiper:prep-if-target/unused-headers
Dec 14, 2021
alliepiper
added
type: enhancement
New feature or request.
P1: should have
Necessary, but not critical.
release: breaking change
Include in "Breaking Changes" section of release notes.
labels
Nov 30, 2021
run tests
alliepiper
force-pushed
the
prep-if-target/unused-headers
branch
from
November 30, 2021 18:44
0b4910f
to
71279ac
Compare
run tests
gevtushenko
approved these changes
Nov 30, 2021
gpuCI failure is unrelated -- I'll fix that timeout in NVIDIA/cub#403. DVS CL: 30722775
alliepiper
added
testing: gpuCI passed
Passed gpuCI testing.
testing: internal ci in progress
Currently testing on internal NVIDIA CI (DVS).
labels
Dec 1, 2021
alliepiper
force-pushed
the
prep-if-target/unused-headers
branch
from
December 1, 2021 21:24
71279ac
to
94e9909
Compare
DVS CL: 30723811
run tests
alliepiper
added
testing: gpuCI in progress
Started gpuCI testing.
and removed
testing: gpuCI passed
Passed gpuCI testing.
labels
Dec 1, 2021
The CUDA-specific binary search implementation has been `#if 0`'d out for a long time. It didn't perform as well as the sequential implementation, and it is dead code that uses old dispatch mechanisms. Also remove a number of unused headers from `thrust/system/cuda/execution_policy.h`. The comments around these headers don't make sense, and it looks like they were being used for test bookkeeping.
alliepiper
force-pushed
the
prep-if-target/unused-headers
branch
from
December 14, 2021 18:50
94e9909
to
f296ff8
Compare
This was referenced Mar 23, 2022
rapids-bot bot
pushed a commit
to rapidsai/rmm
that referenced
this pull request
Mar 29, 2022
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.16 when that is updated in rapids-cmake.

## Context

I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions:

> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries. It looks like rmm may be able to build with Thrust 1.16 even without these changes, but I think this changeset may help prevent future problems arising from inconsistency and reliance on `detail` headers.

Authors:
- Bradley Dice (https://github.com/bdice)

Approvers:
- Mark Harris (https://github.com/harrism)
- Vyas Ramasubramani (https://github.com/vyasr)
- Conor Hoekstra (https://github.com/codereport)

URL: #1011
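The "include what we use" cleanup described above can be approximated with a small script. The following is a minimal sketch, not the script used in any of the referenced PRs: the function name `missing_thrust_includes`, the `SYMBOL_HEADERS` table, and the `EXAMPLE` source are all hypothetical. The header paths in the table are real Thrust headers, but the table is deliberately tiny, and the regex-based scan would also match `thrust::` names inside comments or strings.

```python
import re
from typing import List

# Hypothetical, deliberately small symbol-to-header table. The header paths
# are real Thrust headers, but a practical table would need many more entries.
SYMBOL_HEADERS = {
    "sort": "thrust/sort.h",
    "reduce": "thrust/reduce.h",
    "copy": "thrust/copy.h",
    "transform": "thrust/transform.h",
    "device_vector": "thrust/device_vector.h",
    "host_vector": "thrust/host_vector.h",
}

# Matches `#include <path>` or `#include "path"` at the start of a line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)
# Matches the first identifier after `thrust::`.
SYMBOL_RE = re.compile(r"\bthrust::(\w+)")


def missing_thrust_includes(source: str) -> List[str]:
    """Return headers for used thrust:: symbols that are not included directly."""
    included = set(INCLUDE_RE.findall(source))
    needed = {
        SYMBOL_HEADERS[name]
        for name in SYMBOL_RE.findall(source)
        if name in SYMBOL_HEADERS  # symbols outside the table are ignored
    }
    return sorted(needed - included)


# Illustrative source text: uses thrust::sort without including <thrust/sort.h>.
EXAMPLE = """
#include <thrust/device_vector.h>

void f(thrust::device_vector<int>& v) {
  thrust::sort(v.begin(), v.end());
}
"""

if __name__ == "__main__":
    for header in missing_thrust_includes(EXAMPLE):
        print(f"#include <{header}>")
```

A file like `EXAMPLE` may have compiled before the header reduction only because some other Thrust header happened to pull in the sort header transitively; listing each used header directly removes that dependency on Thrust's internal include structure.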
rapids-bot bot
pushed a commit
to rapidsai/cudf
that referenced
this pull request
Apr 1, 2022
This PR updates the version of Thrust from 1.15 to 1.16 ([changelog](https://github.com/NVIDIA/thrust/blob/main/CHANGELOG.md#thrust-1160)). This update is needed to fix compilation with GCC 11, because of some warnings-as-errors present in Thrust 1.15 with GCC 11 (such as this one from Thrust's copy of cub: https://github.com/NVIDIA/cub/pull/418). Notably, Thrust reduced the number of internal header inclusions: > [#1572](https://github.com/NVIDIA/thrust/pull/1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior. This change illuminated many missing includes in libcudf, so I added `#include <thrust/...>` for all thrust features used in each file (with help from a Python script). I included raw benchmarks that I recorded below. <details> <summary>Benchmarks:</summary> ``` Benchmark Time CPU Time Old Time New CPU Old CPU New -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CopyIfElse/int16_no_nulls/4096/manual_time +0.0581 +0.0307 0 0 0 0 CopyIfElse/uint32_no_nulls/4096/manual_time +0.1308 +0.0463 0 0 0 0 CopyIfElse/uint32_no_nulls/32768/manual_time +0.1043 +0.0485 0 0 0 0 CopyIfElse/float64_no_nulls/4096/manual_time +0.0894 +0.0422 0 0 0 0 StringDateTime/from_days/32768/manual_time +0.0529 +0.0491 93 98 112 118 StringDateTime/to_days/1024/manual_time +0.0596 +0.0493 35 37 54 57 StringDateTime/to_days/32768/manual_time +0.0547 +0.0460 37 39 55 58 StringToDurations/to_durations_ms/1024/manual_time +0.0516 +0.0426 30 31 49 51 StringToDurations/to_durations_ms/32768/manual_time +0.0542 +0.0506 32 34 52 55 StringToDurations/to_durations_us/32768/manual_time +0.0520 +0.0440 32 34 52 55 StringsFromFixedPoint/strings_from_decimal64/16384/manual_time +0.0530 +0.0508 94 99 113 119 StringsToNumeric/strings_to_float32/1024/manual_time 
+0.0521 +0.0451 31 32 50 52 StringsToNumeric/strings_to_float64/16384/manual_time +0.0517 +0.0437 32 34 51 53 StringsToNumeric/strings_to_float64/65536/manual_time +0.0505 +0.0496 35 36 53 56 StringsToNumeric/strings_to_uint8/4096/manual_time +0.0559 +0.0466 24 25 43 45 StringsToNumeric/strings_to_uint8/65536/manual_time +0.0563 +0.0458 26 27 44 46 StringCopy/gather/4096/32/manual_time +0.0652 +0.0574 0 0 0 0 StringCopy/gather/4096/128/manual_time +0.0706 +0.0615 0 0 0 0 StringCopy/gather/4096/512/manual_time +0.0547 +0.0476 0 0 0 0 StringCopy/gather/32768/32/manual_time +0.0538 +0.0492 0 0 0 0 StringCopy/gather/32768/128/manual_time +0.0540 +0.0477 0 0 0 0 StringCopy/scatter/4096/32/manual_time +0.0571 +0.0526 0 0 0 0 StringCopy/scatter/32768/32/manual_time +0.0541 +0.0509 0 0 0 0 StringFindScalar/find_multi/4096/32/manual_time +0.0525 +0.0460 0 0 0 0 StringFindScalar/find_multi/32768/32/manual_time +0.0538 +0.0489 0 0 0 0 StringFindScalar/contains/4096/32/manual_time +0.0502 +0.0471 0 0 0 0 StringFindScalar/starts_with/4096/32/manual_time +0.0528 +0.0476 0 0 0 0 StringFindScalar/starts_with/4096/2048/manual_time +0.0575 +0.0475 0 0 0 0 StringFindScalar/starts_with/4096/8192/manual_time +0.0606 +0.0515 0 0 0 0 StringFindScalar/starts_with/32768/32/manual_time +0.0690 +0.0592 0 0 0 0 StringFindScalar/starts_with/32768/128/manual_time +0.0589 +0.0499 0 0 0 0 StringFindScalar/starts_with/32768/512/manual_time +0.0567 +0.0521 0 0 0 0 StringFindScalar/starts_with/32768/2048/manual_time +0.0517 +0.0501 0 0 0 0 StringFindScalar/starts_with/262144/32/manual_time +0.0555 +0.0525 0 0 0 0 StringFindScalar/ends_with/4096/2048/manual_time +0.0526 +0.0446 0 0 0 0 StringFindScalar/ends_with/4096/8192/manual_time +0.0568 +0.0485 0 0 0 0 StringFindScalar/ends_with/32768/32/manual_time +0.0654 +0.0567 0 0 0 0 StringFindScalar/ends_with/32768/512/manual_time +0.0546 +0.0502 0 0 0 0 StringFindScalar/ends_with/262144/32/manual_time +0.0523 +0.0517 0 0 0 0 
RepeatStrings/scalar_times/256/16/manual_time +0.0555 +0.0501 0 0 0 0 RepeatStrings/scalar_times/1024/16/manual_time +0.0562 +0.0519 0 0 0 0 RepeatStrings/column_times/256/16/manual_time +0.0645 +0.0579 0 0 0 0 RepeatStrings/column_times/256/64/manual_time +0.0506 +0.0472 0 0 0 0 RepeatStrings/column_times/1024/16/manual_time +0.0643 +0.0578 0 0 0 0 RepeatStrings/column_times/4096/16/manual_time +0.0537 +0.0502 0 0 0 0 RepeatStrings/column_times/16384/16/manual_time +0.0565 +0.0514 0 0 0 0 RepeatStrings/compute_output_strings_sizes/256/16/manual_time +0.0626 +0.0490 0 0 0 0 RepeatStrings/compute_output_strings_sizes/256/64/manual_time +0.0539 +0.0434 0 0 0 0 RepeatStrings/compute_output_strings_sizes/256/256/manual_time +0.0694 +0.0525 0 0 0 0 RepeatStrings/compute_output_strings_sizes/1024/16/manual_time +0.0526 +0.0422 0 0 0 0 RepeatStrings/compute_output_strings_sizes/1024/64/manual_time +0.0630 +0.0493 0 0 0 0 RepeatStrings/compute_output_strings_sizes/1024/256/manual_time +0.0533 +0.0460 0 0 0 0 RepeatStrings/precomputed_sizes/256/16/manual_time +0.0674 +0.0602 0 0 0 0 RepeatStrings/precomputed_sizes/1024/16/manual_time +0.0544 +0.0488 0 0 0 0 RepeatStrings/precomputed_sizes/4096/16/manual_time +0.0531 +0.0492 0 0 0 0 RepeatStrings/precomputed_sizes/16384/16/manual_time +0.0522 +0.0470 0 0 0 0 StringReplace/slice/4096/32/manual_time +0.0559 +0.0534 0 0 0 0 StringReplace/slice/32768/32/manual_time +0.0509 +0.0472 0 0 0 0 StringSplit/split_ws/4096/32/manual_time +0.0507 +0.0493 0 0 0 0 StringSubstring/multi_position/4096/32/manual_time +0.0560 +0.0515 0 0 0 0 StringSubstring/delimiter/4096/32/manual_time +0.0532 +0.0504 0 0 0 0 StringSubstring/delimiter/32768/128/manual_time +0.0531 +0.0535 0 0 0 0 StringSubstring/multi_delimiter/4096/32/manual_time +0.0544 +0.0522 0 0 0 0 CsvWrite/string_file_output/23/0/manual_time -0.3111 -0.0110 1421 979 842 833 Shift/shift_ten_percent_nullable_out/32768/manual_time -0.0786 -0.0650 0 0 0 0 
Shift/shift_full_nullable_out/1073741824/manual_time +0.0511 +0.0510 11 11 11 11 TypeDispatcher/fp64_bandwidth_host/8/1024/1/manual_time +0.1281 +0.0638 18970 21400 37938 40357 TypeDispatcher/fp64_bandwidth_host/4/2048/1/manual_time +0.0928 +0.0345 11556 12629 30463 31513 TypeDispatcher/fp64_bandwidth_host/2/4096/1/manual_time +0.0768 +0.0270 7421 7991 26234 26943 TypeDispatcher/fp64_bandwidth_host/1/8192/1/manual_time +0.0729 +0.0209 5029 5396 24111 24615 TypeDispatcher/fp64_bandwidth_device/8/1024/1/manual_time +0.1176 +0.0632 16518 18460 35703 37961 TypeDispatcher/fp64_bandwidth_device/4/2048/1/manual_time +0.0787 +0.0457 14424 15559 33546 35079 TypeDispatcher/fp64_bandwidth_device/2/4096/1/manual_time +0.0500 +0.0327 13594 14274 32740 33811 TypeDispatcher/fp64_bandwidth_no/2/1024/1/manual_time +0.0590 +0.0131 5065 5364 23966 24281 TypeDispatcher/fp64_bandwidth_no/8/1024/1/manual_time +0.2305 +0.0699 6912 8506 25803 27607 TypeDispatcher/fp64_bandwidth_no/1/2048/1/manual_time +0.0574 +0.0120 4854 5133 23782 24067 TypeDispatcher/fp64_bandwidth_no/4/2048/1/manual_time +0.1602 +0.0461 6010 6973 24906 26054 TypeDispatcher/fp64_bandwidth_no/2/4096/1/manual_time +0.0949 +0.0330 5583 6113 24469 25275 TypeDispatcher/fp64_bandwidth_no/4/4096/1/manual_time +0.0623 +0.0175 6991 7427 26088 26545 TypeDispatcher/fp64_bandwidth_no/8/4096/1/manual_time +0.0521 +0.0173 8953 9419 28000 28484 TypeDispatcher/fp64_bandwidth_no/1/8192/1/manual_time +0.0607 +0.0257 5225 5542 24107 24727 TypeDispatcher/fp64_bandwidth_no/2/8192/1/manual_time +0.0588 +0.0115 5964 6315 25052 25341 TypeDispatcher/fp64_bandwidth_no/1/16384/1/manual_time +0.0541 +0.0119 5443 5737 24515 24806 TextTokenize/ngrams/2097152/128/manual_time +0.0624 +0.0623 10 10 10 10 MultibyteSplitBenchmark/multibyte_split_simple/1/1/1/32768/manual_time +0.4019 +0.4024 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/2/1/1/32768/manual_time +0.4099 +0.4073 8 12 8 12 
MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/32768/manual_time +0.3999 +0.3961 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/32768/manual_time +0.3969 +0.3980 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/1/7/1/32768/manual_time +0.4107 +0.3971 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/2/7/1/32768/manual_time +0.3833 +0.3948 8 12 8 12 MultibyteSplitBenchmark/multibyte_split_simple/1/1/25/32768/manual_time +0.3807 +0.3772 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/2/1/25/32768/manual_time +0.3834 +0.3702 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/1/4/25/32768/manual_time +0.3646 +0.3661 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/2/4/25/32768/manual_time +0.3722 +0.3743 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/1/7/25/32768/manual_time +0.3575 +0.3664 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/2/7/25/32768/manual_time +0.3761 +0.3744 9 12 9 12 MultibyteSplitBenchmark/multibyte_split_simple/1/4/1/1073741824/manual_time -0.1017 -0.1040 1681 1510 1681 1506 MultibyteSplitBenchmark/multibyte_split_simple/2/4/1/1073741824/manual_time -0.1817 -0.1817 4102 3357 4101 3356 MultibyteSplitBenchmark/multibyte_split_simple/0/7/25/1073741824/manual_time -0.0704 -0.0704 345 320 345 320 OVERALL_GEOMEAN +0.0974 +0.0970 0 0 0 0 Groupby/BasicSumScan/100000000/manual_time +0.2947 +0.2947 135 175 135 175 CsvRead/decimal_file_input/35/0/manual_time +0.0508 +0.0511 151 159 151 159 ReductionScan/double_nulls/100000/manual_time +0.0721 +0.0609 22874 24524 40726 43206 OrcWrite/integral_file_output/30/0/32/1/0/manual_time -0.1923 -0.0371 913 738 763 735 OrcWrite/integral_file_output/30/0/1/0/0/manual_time +0.2668 -0.0297 754 955 722 701 OrcWrite/integral_file_output/30/1000/1/0/0/manual_time -0.1090 -0.0510 986 878 725 688 OrcWrite/integral_file_output/30/0/32/0/0/manual_time +0.0594 -0.0575 981 1039 738 696 OrcWrite/integral_buffer_output/30/1000/32/1/1/manual_time +0.0882 
+0.0885 85 92 85 92 OrcWrite/integral_buffer_output/30/1000/32/0/1/manual_time -0.0966 -0.0955 98 89 98 89 OrcWrite/floats_file_output/31/0/1/1/0/manual_time +0.0600 -0.0538 737 781 737 697 OrcWrite/floats_file_output/31/0/32/1/0/manual_time +0.0670 +0.0021 1203 1284 715 717 OrcWrite/floats_file_output/31/0/1/0/0/manual_time -0.2406 -0.0605 865 657 698 656 OrcWrite/floats_file_output/31/1000/1/0/0/manual_time -0.2006 -0.0642 1122 897 706 660 OrcWrite/floats_file_output/31/0/32/0/0/manual_time -0.1759 -0.0563 1131 932 708 668 OrcWrite/floats_file_output/31/1000/32/0/0/manual_time -0.1600 -0.0640 1095 919 702 657 OrcWrite/decimal_file_output/35/1000/1/0/0/manual_time +0.1622 -0.0865 1110 1290 588 537 OrcWrite/timestamps_file_output/33/0/1/0/0/manual_time +0.1884 -0.0494 552 657 552 524 OrcWrite/timestamps_file_output/33/1000/1/0/0/manual_time +0.1409 +0.0064 650 742 541 544 OrcWrite/list_file_output/24/0/1/0/0/manual_time -0.0723 -0.0788 713 661 711 655 OrcWrite/list_file_output/24/1000/1/0/0/manual_time +0.0935 -0.0468 696 761 689 657 Concatenate/BM_concatenate_nullable_false/4096/2/manual_time +0.1055 +0.0672 0 0 0 0 Concatenate/BM_concatenate_nullable_false/512/8/manual_time +0.0548 +0.0379 0 0 0 0 Concatenate/BM_concatenate_nullable_true/32768/8/manual_time +0.0501 +0.0415 0 0 0 0 Concatenate/BM_concatenate_nullable_true/64/64/manual_time +0.0570 +0.0400 0 0 0 0 Concatenate/BM_concatenate_nullable_true/512/64/manual_time +0.0894 +0.0606 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/4096/2/2/manual_time +0.1086 +0.0771 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/512/8/2/manual_time +0.0920 +0.0828 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/4096/8/2/manual_time +0.0549 +0.0502 0 0 0 0 Concatenate/BM_concatenate_tables_nullable_false/256/32/2/manual_time +0.1036 +0.1009 1 1 1 1 Concatenate/BM_concatenate_tables_nullable_false/512/32/2/manual_time +0.0827 +0.0813 1 1 1 1 
Concatenate/BM_concatenate_tables_nullable_false/4096/32/2/manual_time +0.0788 +0.0768 1 1 1 1 Concatenate/BM_concatenate_tables_nullable_false/256/8/64/manual_time +0.0525 +0.0490 0 0 0 0 ParquetRead/integral_buffer_input/29/1000/1/0/1/manual_time +0.0929 +0.0928 46 50 46 50 ParquetRead/timestamps_file_input/33/0/32/0/0/manual_time -0.0896 -0.0897 127 116 128 116 OrcRead/integral_buffer_input/30/1000/1/0/1/manual_time +0.1087 +0.1087 88 97 88 97 OrcRead/floats_file_input/31/0/1/1/0/manual_time +0.1528 +0.1526 134 155 134 155 OrcRead/floats_buffer_input/31/1000/1/0/1/manual_time +0.1349 +0.1350 75 85 75 85 OrcRead/decimal_buffer_input/35/0/1/0/1/manual_time -0.1137 -0.1137 264 234 264 234 OrcRead/string_file_input/23/0/1/0/0/manual_time -0.0750 -0.0750 162 150 162 150 OrcRead/string_file_input/23/0/32/0/0/manual_time -0.0963 -0.0963 163 147 163 147 OrcRead/string_buffer_input/23/0/32/0/1/manual_time -0.1586 -0.0139 114 96 97 96 OrcRead/list_file_input/24/1000/1/0/0/manual_time +0.0515 +0.0517 176 185 176 185 OrcRead/list_file_input/24/0/32/0/0/manual_time +0.0925 +0.0922 173 189 173 189 OrcRead/list_buffer_input/24/0/1/1/1/manual_time -0.1288 -0.1291 139 121 139 121 BINARYOP<int32_t, TreeType::IMBALANCED_LEFT, true>/binaryop_int32_imbalanced_reuse/100000/2/manual_time +0.0533 +0.0381 0 0 0 0 COMPILED_BINARYOP/NULL_MAX_decimal32_decimal32_decimal32/100000/manual_time +0.0509 +0.0320 13 14 32 33 COMPILED_BINARYOP/NULL_MIN_timestamp_D_timestamp_s_timestamp_s/10000/manual_time +0.0509 +0.0374 11 12 30 31 ParquetWrite/integral_file_output/29/0/1/1/0/manual_time +0.3011 +0.0605 726 945 726 770 ParquetWrite/integral_file_output/29/1000/1/1/0/manual_time +0.0812 +0.0804 311 336 310 335 ParquetWrite/integral_file_output/29/0/32/1/0/manual_time +0.3497 +0.0714 948 1279 734 786 ParquetWrite/integral_file_output/29/1000/32/1/0/manual_time +0.0559 +0.0558 62 65 62 65 ParquetWrite/integral_file_output/29/0/1/0/0/manual_time +0.1829 +0.0679 702 830 700 748 
ParquetWrite/integral_file_output/29/1000/1/0/0/manual_time +0.0829 +0.0852 284 307 283 307 ParquetWrite/integral_file_output/29/0/32/0/0/manual_time -0.3273 +0.0451 1063 715 683 714 ParquetWrite/integral_file_output/29/1000/32/0/0/manual_time +0.0835 +0.0834 58 63 58 63 ParquetWrite/integral_buffer_output/29/0/1/1/1/manual_time +0.0608 +0.0609 874 927 874 927 ParquetWrite/floats_file_output/31/0/1/1/0/manual_time +0.1916 +0.0634 694 827 693 737 ParquetWrite/floats_file_output/31/1000/1/1/0/manual_time +0.0560 +0.0553 217 229 217 229 ParquetWrite/floats_file_output/31/0/32/1/0/manual_time +0.0517 +0.0546 1020 1073 721 760 ParquetWrite/floats_file_output/31/1000/32/1/0/manual_time +0.1149 +0.0631 45 50 39 42 ParquetWrite/floats_file_output/31/0/1/0/0/manual_time +0.1165 +0.0471 880 983 664 695 ParquetWrite/floats_file_output/31/1000/1/0/0/manual_time +0.3996 +0.0038 237 331 219 219 ParquetWrite/floats_file_output/31/0/32/0/0/manual_time +0.3109 +0.0673 666 873 666 710 ParquetWrite/floats_file_output/31/1000/32/0/0/manual_time +0.0798 +0.0790 38 41 38 41 ParquetWrite/floats_buffer_output/31/1000/1/1/1/manual_time +0.0710 +0.0709 208 223 208 223 ParquetWrite/floats_buffer_output/31/0/32/1/1/manual_time +0.0677 +0.0673 732 782 732 782 ParquetWrite/floats_buffer_output/31/0/1/0/1/manual_time +0.0663 +0.0659 682 728 682 727 ParquetWrite/floats_buffer_output/31/1000/1/0/1/manual_time +0.0785 +0.0780 188 203 188 203 ParquetWrite/decimal_file_output/35/0/1/1/0/manual_time +0.0655 +0.0636 277 296 277 295 ParquetWrite/decimal_file_output/35/1000/1/1/0/manual_time +0.0657 +0.0634 242 258 242 257 ParquetWrite/decimal_file_output/35/0/32/1/0/manual_time +0.1194 +0.0577 291 325 290 307 ParquetWrite/decimal_file_output/35/1000/32/1/0/manual_time +0.0852 +0.0836 170 185 170 184 ParquetWrite/decimal_file_output/35/0/1/0/0/manual_time +0.3802 +0.0372 346 477 325 337 ParquetWrite/decimal_file_output/35/1000/1/0/0/manual_time +0.8101 +0.1543 374 677 373 431 
ParquetWrite/decimal_file_output/35/0/32/0/0/manual_time +1.4742 +0.0541 328 812 327 344 ParquetWrite/decimal_file_output/35/1000/32/0/0/manual_time +0.5398 +0.0463 391 603 390 409 ParquetWrite/decimal_buffer_output/35/0/1/1/1/manual_time +0.0571 +0.0570 301 318 301 318 ParquetWrite/decimal_buffer_output/35/1000/1/1/1/manual_time +0.1955 +0.1953 253 302 253 302 ParquetWrite/decimal_buffer_output/35/0/32/1/1/manual_time +0.0655 +0.0641 306 326 306 325 ParquetWrite/decimal_buffer_output/35/0/1/0/1/manual_time +0.0595 +0.0591 381 404 381 404 ParquetWrite/decimal_buffer_output/35/1000/1/0/1/manual_time +0.0650 +0.0643 515 548 515 548 ParquetWrite/decimal_buffer_output/35/0/32/0/1/manual_time +0.0595 +0.0591 386 409 386 409 ParquetWrite/decimal_buffer_output/35/1000/32/0/1/manual_time +0.0595 +0.0590 517 547 516 547 ParquetWrite/timestamps_file_output/33/0/1/1/0/manual_time +0.0566 +0.0580 724 765 721 762 ParquetWrite/timestamps_file_output/33/1000/1/1/0/manual_time -0.6229 -0.0258 526 198 203 198 ParquetWrite/timestamps_file_output/33/0/32/1/0/manual_time -0.0955 +0.0444 928 840 733 766 ParquetWrite/timestamps_file_output/33/1000/32/1/0/manual_time +0.0794 +0.0725 36 39 36 39 ParquetWrite/timestamps_file_output/33/0/1/0/0/manual_time +0.2140 +0.0788 626 760 626 676 ParquetWrite/timestamps_file_output/33/1000/1/0/0/manual_time +0.0778 +0.0760 174 188 174 187 ParquetWrite/timestamps_file_output/33/0/32/0/0/manual_time +0.4682 +0.0758 636 934 636 684 ParquetWrite/timestamps_file_output/33/1000/32/0/0/manual_time +0.0938 +0.0929 34 38 34 38 ParquetWrite/timestamps_buffer_output/33/0/1/1/1/manual_time +0.0559 +0.0559 837 884 837 884 ParquetWrite/timestamps_buffer_output/33/0/1/0/1/manual_time +0.0612 +0.0612 714 758 714 758 ParquetWrite/timestamps_buffer_output/33/1000/1/0/1/manual_time -0.2022 -0.2021 229 183 229 183 ParquetWrite/timestamps_buffer_output/33/0/32/0/1/manual_time +0.0609 +0.0596 721 765 721 764 ParquetWrite/string_file_output/23/0/1/1/0/manual_time +0.1674 
+0.1004 1231 1437 869 956 ParquetWrite/string_file_output/23/1000/1/1/0/manual_time +0.0748 +0.0675 124 133 107 114 ParquetWrite/string_file_output/23/0/32/1/0/manual_time +0.0497 +0.0541 1197 1256 893 942 ParquetWrite/string_file_output/23/1000/32/1/0/manual_time +0.0822 +0.0551 38 41 34 35 ParquetWrite/string_file_output/23/0/1/0/0/manual_time +0.3477 +0.0668 892 1202 828 883 ParquetWrite/string_file_output/23/1000/1/0/0/manual_time +0.1446 +0.1474 98 113 98 113 ParquetWrite/string_file_output/23/1000/32/0/0/manual_time +0.0596 +0.0590 33 35 33 35 ParquetWrite/string_buffer_output/23/1000/1/0/1/manual_time +0.0598 +0.0594 104 110 104 110 ParquetWrite/string_void_output/23/1000/32/0/2/manual_time -0.3901 +0.0015 34 21 21 21 ParquetWrite/list_file_output/24/0/1/0/0/manual_time -0.1313 +0.0831 1033 897 828 897 ParquetWrite/list_file_output/24/1000/1/0/0/manual_time +0.0559 +0.0537 521 550 521 549 ParquetWrite/list_file_output/24/0/32/0/0/manual_time -0.1942 -0.0129 1183 954 888 877 ContiguousSplit/1Gb512ColsValidity/1073741824/512/256/1/iterations:8/manual_time +0.0660 +0.0659 30 32 30 32 AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/1000000/1/manual_time +0.0540 +0.0453 0 0 0 0 AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/10000000/1/manual_time +0.0657 +0.0642 1 1 1 1 AST<int32_t, TreeType::IMBALANCED_LEFT, false, true>/ast_int32_imbalanced_unique_nulls/100000000/1/manual_time +0.0704 +0.0702 8 9 8 9 AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/1000000/1/manual_time +0.0549 +0.0473 0 0 0 0 AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/10000000/1/manual_time +0.0745 +0.0723 1 1 1 1 AST<int32_t, TreeType::IMBALANCED_LEFT, true, true>/ast_int32_imbalanced_reuse_nulls/100000000/1/manual_time +0.0758 +0.0755 7 8 7 8 AST<double, TreeType::IMBALANCED_LEFT, false, 
true>/ast_double_imbalanced_unique_nulls/10000000/1/manual_time +0.0534 +0.0522 1 1 1 1 AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/10000000/10/manual_time +0.0610 +0.0606 3 3 3 3 AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/1/manual_time +0.0538 +0.0537 9 10 9 10 AST<double, TreeType::IMBALANCED_LEFT, false, true>/ast_double_imbalanced_unique_nulls/100000000/10/manual_time +0.0579 +0.0579 26 27 26 27 Rank/nulls/1024/manual_time +0.7608 +0.6280 0 0 0 0 Rank/nulls/4096/manual_time +0.2739 +0.2437 0 0 0 0 Rank/nulls/32768/manual_time +0.1599 +0.1469 0 0 0 0 Rank/nulls/262144/manual_time +0.0813 +0.0793 0 0 0 0 Rank/nulls/2097152/manual_time -0.4178 -0.4162 5 3 5 3 Rank/nulls/16777216/manual_time -0.3688 -0.3686 45 28 45 28 Rank/nulls/67108864/manual_time -0.3576 -0.3576 181 117 181 117 Sort<false>/unstable_no_nulls/1024/8/manual_time +0.2655 +0.2554 1 1 1 1 Sort<false>/unstable_no_nulls/4096/8/manual_time +0.3212 +0.3081 0 1 1 1 Sort<false>/unstable_no_nulls/32768/8/manual_time +0.1430 +0.1395 1 1 1 1 Sort<false>/unstable_no_nulls/262144/8/manual_time +0.1080 +0.1064 1 1 1 2 Sort<false>/unstable_no_nulls/2097152/8/manual_time -0.0740 -0.0740 15 14 15 14 Sort<false>/unstable_no_nulls/16777216/8/manual_time -0.0882 -0.0882 215 196 215 196 Sort<false>/unstable_no_nulls/67108864/8/manual_time -0.0848 -0.0848 1170 1071 1170 1071 Sort<true>/stable_no_nulls/1024/8/manual_time +0.2656 +0.2553 1 1 1 1 Sort<true>/stable_no_nulls/4096/8/manual_time +0.3215 +0.3081 0 1 1 1 Sort<true>/stable_no_nulls/32768/8/manual_time +0.1427 +0.1392 1 1 1 1 Sort<true>/stable_no_nulls/262144/8/manual_time +0.1082 +0.1066 1 1 1 2 Sort<true>/stable_no_nulls/2097152/8/manual_time -0.0737 -0.0735 15 14 15 14 Sort<true>/stable_no_nulls/16777216/8/manual_time -0.0889 -0.0887 215 196 215 196 Sort<true>/stable_no_nulls/67108864/8/manual_time -0.0848 -0.0846 1170 1071 1170 1071 
Sort<false>/unstable/1024/1/manual_time +0.8698 +0.7017 0 0 0 0 Sort<false>/unstable/4096/1/manual_time +0.2846 +0.2506 0 0 0 0 Sort<false>/unstable/32768/1/manual_time +0.1640 +0.1492 0 0 0 0 Sort<false>/unstable/262144/1/manual_time +0.0818 +0.0794 0 0 0 0 Sort<false>/unstable/2097152/1/manual_time -0.4431 -0.4414 5 3 5 3 Sort<false>/unstable/16777216/1/manual_time -0.4282 -0.4280 38 22 38 22 Sort<false>/unstable/67108864/1/manual_time -0.4168 -0.4168 155 90 155 90 Sort<false>/unstable/1024/8/manual_time +0.2213 +0.2142 1 1 1 1 Sort<false>/unstable/4096/8/manual_time +0.2784 +0.2687 1 1 1 1 Sort<false>/unstable/32768/8/manual_time +0.1115 +0.1094 1 1 1 1 Sort<false>/unstable/262144/8/manual_time +0.1030 +0.1016 2 2 2 2 Sort<true>/stable/1024/1/manual_time +0.8684 +0.7016 0 0 0 0 Sort<true>/stable/4096/1/manual_time +0.2860 +0.2517 0 0 0 0 Sort<true>/stable/32768/1/manual_time +0.1638 +0.1497 0 0 0 0 Sort<true>/stable/262144/1/manual_time +0.0817 +0.0798 0 0 0 0 Sort<true>/stable/2097152/1/manual_time -0.4431 -0.4415 5 3 5 3 Sort<true>/stable/16777216/1/manual_time -0.4279 -0.4277 38 22 38 22 Sort<true>/stable/67108864/1/manual_time -0.4176 -0.4176 155 90 155 90 Sort<true>/stable/1024/8/manual_time +0.2211 +0.2138 1 1 1 1 Sort<true>/stable/4096/8/manual_time +0.2808 +0.2706 1 1 1 1 Sort<true>/stable/32768/8/manual_time +0.1117 +0.1096 1 1 1 1 Sort<true>/stable/262144/8/manual_time +0.1029 +0.1013 2 2 2 2 Sort/strings/262144/manual_time -0.0781 -0.0777 4 4 4 4 Scatter/double_coalesce_x/2048/2/manual_time +0.0614 +0.0472 27988 29705 46846 49057 Scatter/double_coalesce_x/32768/2/manual_time +0.0637 +0.0522 30209 32133 47991 50496 Scatter/double_coalesce_x/131072/2/manual_time +0.0558 +0.0444 37821 39932 54883 57321 Scatter/double_coalesce_x/1024/4/manual_time +0.0811 +0.0663 53699 58053 72617 77434 Scatter/double_coalesce_x/2048/4/manual_time +0.0535 +0.0468 56040 59038 74848 78348 Scatter/double_coalesce_x/4096/4/manual_time +0.0514 +0.0449 56187 59073 74930 78291 
Scatter/double_coalesce_x/8192/4/manual_time +0.0516 +0.0452 56747 59674 75140 78533 Scatter/double_coalesce_x/16384/4/manual_time +0.0520 +0.0479 57412 60400 75292 78895 Scatter/double_coalesce_x/32768/4/manual_time +0.0610 +0.0544 58151 61699 75398 79499 Scatter/double_coalesce_x/1024/8/manual_time +0.0526 +0.0486 110089 115882 129032 135301 Scatter/double_coalesce_x/2048/8/manual_time +0.0546 +0.0506 110864 116921 129784 136352 Scatter/double_coalesce_x/4096/8/manual_time +0.0612 +0.0554 110733 117506 129306 136465 Scatter/double_coalesce_x/8192/8/manual_time +0.0635 +0.0579 111614 118703 129727 137233 Scatter/double_coalesce_x/16384/8/manual_time +0.0665 +0.0604 111918 119366 129458 137275 Scatter/double_coalesce_x/32768/8/manual_time +0.0545 +0.0543 114993 121260 131951 139113 Scatter/double_coalesce_x/65536/8/manual_time +0.0619 +0.0560 119167 126540 136092 143717 Scatter/double_coalesce_o/2048/2/manual_time +0.0542 +0.0418 29300 30889 48197 50211 Scatter/double_coalesce_o/32768/2/manual_time +0.0556 +0.0464 32069 33851 49914 52229 Scatter/double_coalesce_o/1024/4/manual_time +0.0684 +0.0569 56480 60346 75468 79761 Scatter/double_coalesce_o/8192/4/manual_time +0.0572 +0.0497 59554 62960 77958 81834 Scatter/double_coalesce_o/16384/4/manual_time +0.0572 +0.0525 59839 63260 77704 81781 Scatter/double_coalesce_o/32768/4/manual_time +0.0564 +0.0514 62493 66015 79779 83883 Scatter/double_coalesce_o/1024/8/manual_time +0.0566 +0.0515 112968 119360 131925 138723 Scatter/double_coalesce_o/2048/8/manual_time +0.0565 +0.0518 113151 119548 132028 138870 Scatter/double_coalesce_o/4096/8/manual_time +0.0594 +0.0545 114566 121374 133078 140333 Scatter/double_coalesce_o/8192/8/manual_time +0.0587 +0.0534 116146 122963 134282 141449 Scatter/double_coalesce_o/16384/8/manual_time +0.0663 +0.0597 116445 124161 134038 142046 Scatter/double_coalesce_o/32768/8/manual_time +0.0555 +0.0566 122258 129043 139016 146891 Scatter/double_coalesce_o/65536/8/manual_time +0.0553 +0.0498 
133373 140749 150403 157896 Quantiles/no_nulls/65536/4/1/manual_time +0.1394 +0.1370 1 1 1 1 Quantiles/no_nulls/262144/4/1/manual_time +0.1372 +0.1348 1 1 1 1 Quantiles/no_nulls/1048576/4/1/manual_time -0.0944 -0.0943 6 5 6 5 Quantiles/no_nulls/4194304/4/1/manual_time -0.1068 -0.1070 35 32 35 32 Quantiles/no_nulls/16777216/4/1/manual_time -0.0882 -0.0884 210 191 210 191 Quantiles/no_nulls/67108864/4/1/manual_time -0.0855 -0.0858 1148 1050 1148 1050 Quantiles/no_nulls/65536/8/1/manual_time +0.1312 +0.1290 1 1 1 1 Quantiles/no_nulls/262144/8/1/manual_time +0.1058 +0.1044 1 2 1 2 Quantiles/no_nulls/4194304/8/1/manual_time -0.0982 -0.0984 37 33 37 33 Quantiles/no_nulls/16777216/8/1/manual_time -0.0886 -0.0888 215 196 215 196 Quantiles/no_nulls/67108864/8/1/manual_time -0.0866 -0.0868 1173 1071 1173 1071 Quantiles/no_nulls/65536/4/4/manual_time +0.1413 +0.1385 1 1 1 1 Quantiles/no_nulls/262144/4/4/manual_time +0.1355 +0.1332 1 1 1 1 Quantiles/no_nulls/1048576/4/4/manual_time -0.0944 -0.0943 6 5 6 5 Quantiles/no_nulls/4194304/4/4/manual_time -0.1061 -0.1063 35 32 35 32 Quantiles/no_nulls/16777216/4/4/manual_time -0.0877 -0.0879 210 191 210 191 Quantiles/no_nulls/67108864/4/4/manual_time -0.0863 -0.0865 1149 1050 1149 1049 Quantiles/no_nulls/65536/8/4/manual_time +0.1328…
This was referenced May 25, 2022
rapids-bot bot
pushed a commit
to rapidsai/raft
that referenced
this pull request
Jun 17, 2022
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context

Version 1.16 of Thrust reduced the number of internal header inclusions:

> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I am making similar changes across all RAPIDS libraries to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust.

Authors:
- Bradley Dice (https://github.com/bdice)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Robert Maynard (https://github.com/robertmaynard)
- Corey J. Nolet (https://github.com/cjnolet)

URL: #678
rapids-bot bot
pushed a commit
to rapidsai/cuspatial
that referenced
this pull request
Jun 17, 2022
## Description

This PR cleans up some `#include`s for Thrust. This is meant to help with the transition to Thrust 1.17 when that is updated in rapids-cmake (rapidsai/rapids-cmake#199).

## Context

Version 1.16 of Thrust reduced the number of internal header inclusions:

> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I am making similar changes across all RAPIDS libraries to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust.

Authors:
- Bradley Dice (https://github.com/bdice)

Approvers:
- Mark Harris (https://github.com/harrism)
- Vyas Ramasubramani (https://github.com/vyasr)

URL: #539
isVoid pushed a commit to isVoid/cuspatial that referenced this pull request on Jun 23, 2022
rapids-bot bot pushed a commit to rapidsai/cugraph that referenced this pull request on Jun 29, 2022:

## Description
This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context
I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, version 1.16 of Thrust reduced the number of internal header inclusions:

> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

This changeset also makes it more obvious where cugraph depends on `thrust/detail` headers.

Authors:
- Bradley Dice (https://github.com/bdice)

Approvers:
- Brad Rees (https://github.com/BradReesWork)
- Seunghwa Kang (https://github.com/seunghwak)

URL: #2310
rapids-bot bot pushed a commit to rapidsai/cuml that referenced this pull request on Jul 7, 2022:

## Description
This PR cleans up some `#include`s for Thrust. This is meant to help ease the transition to Thrust 1.17 when that is updated in rapids-cmake.

## Context
I opened a PR rapidsai/cudf#10489 that updates cuDF to Thrust 1.16. Notably, Thrust reduced the number of internal header inclusions:

> [#1572](NVIDIA/thrust#1572) Removed several unnecessary header includes. Downstream projects may need to update their includes if they were relying on this behavior.

I spoke with @robertmaynard and he recommended making similar changes to clean up includes ("include what we use," in essence) to make sure we have compatibility with future versions of Thrust across all RAPIDS libraries.

This changeset also removes dependence on `thrust/detail` headers.

Authors:
- Bradley Dice (https://github.com/bdice)

Approvers:
- William Hicks (https://github.com/wphicks)

URL: #4675
jakirkham pushed a commit to jakirkham/cuml that referenced this pull request on Feb 27, 2023
Labels
P1: should have
Necessary, but not critical.
release: breaking change
Include in "Breaking Changes" section of release notes.
testing: gpuCI in progress
Started gpuCI testing.
testing: internal ci in progress
Currently testing on internal NVIDIA CI (DVS).
type: enhancement
New feature or request.
This PR is part of the if-target split.
The CUDA-specific binary search implementation has been `#ifdef 0`d for a long time. It didn't perform as well as the sequential implementation and is dead code that uses old dispatch mechanisms.
This also removes a load of unused headers from `thrust/system/cuda/execution_policy.h`. The comments around these headers don't make sense, and it looks like they were being used for test bookkeeping.
Breaking Change
This patch may introduce compile errors in user code when a `thrust::` algorithm is used but its header is not included. Such usage of Thrust is not supported, and the required headers should be added to resolve any issues.
NVIDIA/cub#407 fixes some related issues in CUB uncovered by this change.
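For downstream projects checking whether they are affected, one rough way to spot a `thrust::` algorithm that is used without a direct include of its header is a textual scan of each source file. The sketch below is a hypothetical helper script, not part of Thrust or of this PR. The `ALGORITHM_HEADERS` table is a small hand-written, deliberately partial mapping, and the function name `missing_thrust_includes` is an assumption made for illustration. It only matches plain calls of the form `thrust::name(`, so it will miss calls with explicit template arguments and may report false positives from comments or string literals.

```python
import re

# Partial, hand-written map from Thrust algorithm names to the public header
# that declares them. This is an illustrative assumption, not an official
# list; extend it for the algorithms your project actually uses.
ALGORITHM_HEADERS = {
    "sort": "thrust/sort.h",
    "reduce": "thrust/reduce.h",
    "transform": "thrust/transform.h",
    "copy": "thrust/copy.h",
    "fill": "thrust/fill.h",
    "for_each": "thrust/for_each.h",
    "inclusive_scan": "thrust/scan.h",
    "exclusive_scan": "thrust/scan.h",
    "lower_bound": "thrust/binary_search.h",
    "upper_bound": "thrust/binary_search.h",
    "binary_search": "thrust/binary_search.h",
}

# Matches `#include <path>` or `#include "path"` at the start of a line.
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)
# Matches plain calls such as `thrust::sort(`.
CALL_RE = re.compile(r"\bthrust::(\w+)\s*\(")


def missing_thrust_includes(source: str) -> list[str]:
    """Return headers of used thrust:: algorithms that are not directly included."""
    included = set(INCLUDE_RE.findall(source))
    needed = {
        ALGORITHM_HEADERS[name]
        for name in CALL_RE.findall(source)
        if name in ALGORITHM_HEADERS
    }
    return sorted(needed - included)


# Hypothetical user code: thrust::reduce is called, but <thrust/reduce.h> is
# only reachable (if at all) through some other header.
example = """
#include <thrust/device_vector.h>
#include <thrust/sort.h>

void f(thrust::device_vector<int>& v) {
  thrust::sort(v.begin(), v.end());
  int total = thrust::reduce(v.begin(), v.end());
  (void)total;
}
"""

print(missing_thrust_includes(example))  # ['thrust/reduce.h']
```

In this example the fix on the C++ side would be to add `#include <thrust/reduce.h>` next to the existing includes, so the file no longer depends on whichever headers Thrust happens to pull in internally.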