Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* refactor: Use rocPRIM overloads for warp_scan::exclusive_scan without initial value
  Exclusive scans without an initial value are now present on the rocPRIM public API. Use those instead of relying on "hidden" APIs added as a workaround between the two libraries.
* Implemented WARP_EXCHANGE_SHUFFLE
* Testing WARP_EXCHANGE_SHUFFLE and refactored warp_exchange test suite
* Updated benchmarks with WARP_EXCHANGE_SHUFFLE
* Updated changelog
* NVCC build fixes and warning fixes
* Put host_warp_size_wrapper in ::hipcub::detail namespace
  Previously it was in the global ::detail namespace.
* Preprocessor definitions needed for debug_synchronous deprecation
* adjacent_difference: deprecated debug_synchronous
* device histogram: deprecated debug_synchronous
* Removed optional debug_synchronous argument in DeviceMemcpy
  This argument was never on the CUB interface and shouldn't have been added in the first place.
* device merge sort: deprecated debug_synchronous
* device partition: deprecated debug_synchronous
* device radix sort: deprecated debug_synchronous
* device reduce: deprecated debug_synchronous
* device run length encode: deprecated debug_synchronous
* device scan: deprecated debug_synchronous
* device segmented radix sort: deprecated debug_synchronous
* device segmented reduce: deprecated debug_synchronous
* Build warning free & enabled -Werror in CI
* device segmented sort: deprecated debug_synchronous
* device select: deprecated debug_synchronous
* Refactored HIPCUB_DETAIL_HIP_SYNC_AND_RETURN_ON_ERROR
* device SPMV: deprecated debug_synchronous and added missing test to CMakeLists
* Improved HIPCUB_DETAIL_RUNTIME_LOG_DEBUG_SYNCHRONOUS
* Fixed formatting
* ci: pass-failed warning does not imply failure
* Fixed documentation
* Updated changelog
* Removed DeviceSelectWarpSize from tests
* Removed DeviceSelectWarpSize from benchmarks
* Use hipcub's DiscardOutputIterator instead of custom one
* add device_copy
  add test for device_copy
  add benchmark for device_copy
* update docs
* fix format
* fix copyright date
* add device_copy to cub backend
* update changelog
* fix review comments
* fix format
* clarify warp scan interface
* Added hipcub::tuple
* Added decomposer overloads to BlockRadixSort
* Testing BlockRadixSort decomposer overloads
* Benchmarking BlockRadixSort decomposer overloads
* Added tuple_element_t to cub/tuple.hpp
* Fixed formatting
* Tidied and updated changelog
* DeviceRadixSort decomposer overloads (CUB backend)
* DeviceRadixSort decomposer overloads (rocprim backend)
* Testing DeviceRadixSort decomposer overloads
* Benchmarking DeviceRadixSort decomposer overloads
* Updated changelog
* Added select_plus_operator_host for calculating on host with double precision
* Host reference calculations are done in double precision in the tests
* Clang format on test_hipcub_block_scan.cpp
* Added precision for different types to test_utils and updated host scan using double precision during calculation
* Added more specific precision checks to the tests
* Changed precision for nvcc support
* Changed rocprim types to test_utils type in device_scan test
* Remove unused variable in device_scan test
* Change transform for nvcc compiler in device_scan test
* Added more precise precision checks for warp_scan and warp_reduce
* Remove cast_type from test_utils
* Templatize on overloaded operator instead of struct
* hipcub test device_reduce_by_key updated to assert with precision
* Added assert near with better precision to tests device_reduce, device spmv and thread_operators
* Changed all asserts in block_reduce and block_scan to assert_near
* Replaced is_plus by is_add in test_utils
* specify architecture for rocprim build

---------

Co-authored-by: Gergely Meszaros
Co-authored-by: Lőrinc Serfőző
Co-authored-by: Beatriz Navidad Vilches
Co-authored-by: Nol Moonen
Co-authored-by: Nick Breed
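Several of the test changes listed above move host reference calculations to double precision and compare results with a tolerance instead of exact equality. The following is only a minimal host-side sketch of that idea, not the code from this commit: `host_inclusive_scan_double` and `all_near` are hypothetical names, and the tolerance rule is an assumption rather than the one used in hipCUB's test_utils.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the commit): inclusive prefix sum whose
// running total is kept in double, so rounding error from the narrower
// type T does not accumulate in the reference result.
template <typename T>
std::vector<T> host_inclusive_scan_double(const std::vector<T>& input)
{
    std::vector<T> output(input.size());
    double acc = 0.0;
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        acc += static_cast<double>(input[i]);
        output[i] = static_cast<T>(acc); // round only when storing
    }
    return output;
}

// Hypothetical helper (not from the commit): element-wise comparison with
// a relative tolerance, falling back to an absolute one near zero.
template <typename T>
bool all_near(const std::vector<T>& actual,
              const std::vector<T>& expected,
              double rel_tol)
{
    if (actual.size() != expected.size())
    {
        return false;
    }
    for (std::size_t i = 0; i < actual.size(); ++i)
    {
        const double e    = static_cast<double>(expected[i]);
        const double diff = std::fabs(static_cast<double>(actual[i]) - e);
        if (diff > rel_tol * std::max(1.0, std::fabs(e)))
        {
            return false;
        }
    }
    return true;
}
```

In a test, the device output would be passed as `actual` and the double-accumulated host result as `expected`, with a tolerance chosen per value type.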
1 parent 72d340c, commit a9b8970
Showing 101 changed files with 14,186 additions and 6,886 deletions.