Mergeback 6.3 fixes (ROCm#420)
* Revert Bit Twiddle change from PR ROCm#377 (ROCm#397)

An update to the TwiddleIn/Out functions from PR ROCm#377 seems to be causing
a build failure in onnxruntime. This change reverts the single commit
(0721c2c) that made those changes.
We can re-apply the change with an appropriate fix in the future.
Note: the commits in the PR were squashed, so that commit will not
show up in the log.

* Remove website URL from comments (ROCm#398)

Referencing or using code from some websites is prohibited in this repository.
This change removes an informational reference in the comments.

* Add gfx1151 target (ROCm#399) (ROCm#401)

Co-authored-by: Stanley Tsang <[email protected]>

* Spolifroni amd/624 changelogcleanup upcoming (ROCm#411)

* edited to conform to standards

* edited to conform to standards

* updated the changelog for 6.3 (ROCm#418)

---------

Co-authored-by: amd-garydeng <[email protected]>
Co-authored-by: Stanley Tsang <[email protected]>
Co-authored-by: spolifroni-amd <[email protected]>
4 people authored and NB4444 committed Nov 20, 2024
1 parent dd0ad2d commit 41dce3a
Showing 3 changed files with 20 additions and 16 deletions.
13 changes: 8 additions & 5 deletions CHANGELOG.md
@@ -1,7 +1,6 @@
 # Changelog for hipCUB

-Documentation for hipCUB is available at
-[https://rocm.docs.amd.com/projects/hipCUB/en/latest/](https://rocm.docs.amd.com/projects/hipCUB/en/latest/).
+Full documentation for hipCUB is available at [https://rocm.docs.amd.com/projects/hipCUB/en/latest/](https://rocm.docs.amd.com/projects/hipCUB/en/latest/).

 ## (Unreleased) hipCUB-x.x.x for ROCm 6.4.0

@@ -14,14 +13,14 @@ Documentation for hipCUB is available at
 * The NVIDIA backend now requires CUB, Thrust and libcu++ 2.5.0. If it is not found it will be downloaded from the NVIDIA CCCL repository.
 * Changed the C++ version from 14 to 17. C++14 will be deprecated in the next major release.

-## (Unreleased) hipCUB-3.3.0 for ROCm 6.3.0
+## hipCUB-3.3.0 for ROCm 6.3.0

 ### Fixed

 * Not all headers in hipCUB included `config.hpp` which could have resulted in build errors.

 ### Added
-* Add support for large indices in `hipcub::DeviceSegmentedReduce::*`. rocPRIM's backend provides support for all reduce variants, but CUB's does not have support yet for `DeviceSegmentedReduce::Arg*`, so large indices support has been excluded for these as well in hipCUB.
+* Support for large indices in `hipcub::DeviceSegmentedReduce::*` has been added, with the exception of `DeviceSegmentedReduce::Arg*`. Although rocPRIM's backend provides support for all reduce variants, CUB does not support large indices in `DeviceSegmentedReduce::Arg*`. For this reason, large index support is not available for `hipcub::DeviceSegmentedReduce::Arg*`.
 * Add -t smoke option in rtest.py. It will run a subset of tests such that the total test time is in 5 minutes. Use python3 ./rtest.py --test smoke or python3 ./rtest.py -t smoke to execute smoke test.
 * Add inplace overloads of `DeviceScan` functions.
 * Add inplace overloads of `DeviceSelect::Flagged` and `DeviceSelect::If`.
@@ -32,7 +31,11 @@ Documentation for hipCUB is available at
 ### Changed
 * The NVIDIA backend now requires CUB, Thrust and libcu++ 2.4.0. If it is not found it will be downloaded from the NVIDIA CCCL repository.

-## (Unreleased) hipCUB-3.2.0 for ROCm 6.2.0
+### Resolved issues
+
+* Fixed an issue where `config.hpp` was not included in all hipCUB headers, resulting in build errors.
+
+## hipCUB-3.2.0 for ROCm 6.2.0

 ### Added
 * Add `DeviceCopy` function to have parity with CUB.
22 changes: 12 additions & 10 deletions hipcub/include/hipcub/backend/rocprim/util_type.hpp
@@ -547,12 +547,12 @@ struct BaseTraits<UNSIGNED_INTEGER, true, false, _UnsignedBits, T>

     static HIPCUB_HOST_DEVICE __forceinline__ UnsignedBits TwiddleIn(UnsignedBits key)
     {
-        return key_codec::encode(rocprim::detail::bit_cast<T>(key));
+        return key;
     }

     static HIPCUB_HOST_DEVICE __forceinline__ UnsignedBits TwiddleOut(UnsignedBits key)
     {
-        return key_codec::decode(rocprim::detail::bit_cast<T>(key));
+        return key;
     }

     static HIPCUB_HOST_DEVICE __forceinline__ T Max()
@@ -596,12 +596,12 @@ struct BaseTraits<SIGNED_INTEGER, true, false, _UnsignedBits, T>

     static HIPCUB_HOST_DEVICE __forceinline__ UnsignedBits TwiddleIn(UnsignedBits key)
     {
-        return key_codec::encode(rocprim::detail::bit_cast<T>(key));
+        return key ^ HIGH_BIT;
     };

     static HIPCUB_HOST_DEVICE __forceinline__ UnsignedBits TwiddleOut(UnsignedBits key)
     {
-        return key_codec::decode(rocprim::detail::bit_cast<T>(key));
+        return key ^ HIGH_BIT;
     };

     static HIPCUB_HOST_DEVICE __forceinline__ T Max()
@@ -695,12 +695,14 @@ struct BaseTraits<FLOATING_POINT, true, false, _UnsignedBits, T>

     static HIPCUB_HOST_DEVICE __forceinline__ UnsignedBits TwiddleIn(UnsignedBits key)
     {
-        return key_codec::encode(rocprim::detail::bit_cast<T>(key));
+        UnsignedBits mask = (key & HIGH_BIT) ? UnsignedBits(-1) : HIGH_BIT;
+        return key ^ mask;
     };

     static HIPCUB_HOST_DEVICE __forceinline__ UnsignedBits TwiddleOut(UnsignedBits key)
     {
-        return key_codec::decode(rocprim::detail::bit_cast<T>(key));
+        UnsignedBits mask = (key & HIGH_BIT) ? HIGH_BIT : UnsignedBits(-1);
+        return key ^ mask;
     };

     static HIPCUB_HOST_DEVICE __forceinline__ T Max() {
@@ -751,12 +753,12 @@ struct NumericTraits<__uint128_t>

     static __host__ __device__ __forceinline__ UnsignedBits TwiddleIn(UnsignedBits key)
     {
-        return key_codec::encode(rocprim::detail::bit_cast<T>(key));
+        return key;
     }

     static __host__ __device__ __forceinline__ UnsignedBits TwiddleOut(UnsignedBits key)
     {
-        return key_codec::decode(rocprim::detail::bit_cast<T>(key));
+        return key;
     }

     static __host__ __device__ __forceinline__ T Max()
@@ -788,12 +790,12 @@ struct NumericTraits<__int128_t>

     static __host__ __device__ __forceinline__ UnsignedBits TwiddleIn(UnsignedBits key)
     {
-        return key_codec::encode(rocprim::detail::bit_cast<T>(key));
+        return key ^ HIGH_BIT;
     };

     static __host__ __device__ __forceinline__ UnsignedBits TwiddleOut(UnsignedBits key)
     {
-        return key_codec::decode(rocprim::detail::bit_cast<T>(key));
+        return key ^ HIGH_BIT;
     };

     static __host__ __device__ __forceinline__ T Max()
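The restored `TwiddleIn`/`TwiddleOut` bodies map a key's raw bits to an unsigned value whose unsigned ordering matches the ordering of the original type, which is the property a radix sort relies on. The following is a minimal standalone sketch of that idea for 32-bit keys, not hipCUB's code: the names `kHighBit`, `FloatBits`, `TwiddleInSigned`, `TwiddleOutSigned`, `TwiddleInFloat`, and `TwiddleOutFloat` are hypothetical, and only the XOR expressions are taken from the diff above.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for HIGH_BIT with a 32-bit UnsignedBits type.
constexpr std::uint32_t kHighBit = 0x80000000u;

// Reinterpret a float as its raw 32-bit pattern (assumes IEEE-754 binary32).
inline std::uint32_t FloatBits(float value)
{
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// Signed integers: flipping the sign bit moves negative values below
// non-negative ones when the result is compared as unsigned.
inline std::uint32_t TwiddleInSigned(std::uint32_t key)
{
    return key ^ kHighBit;
}

// The same XOR undoes the transformation.
inline std::uint32_t TwiddleOutSigned(std::uint32_t key)
{
    return key ^ kHighBit;
}

// Floating point: negative values have all bits flipped (reversing their
// order), non-negative values only have the sign bit flipped.
inline std::uint32_t TwiddleInFloat(std::uint32_t key)
{
    const std::uint32_t mask = (key & kHighBit) ? ~std::uint32_t{0} : kHighBit;
    return key ^ mask;
}

// Inverse: a set high bit now marks an originally non-negative value.
inline std::uint32_t TwiddleOutFloat(std::uint32_t key)
{
    const std::uint32_t mask = (key & kHighBit) ? kHighBit : ~std::uint32_t{0};
    return key ^ mask;
}
```

Under these assumptions, comparing `TwiddleInFloat(FloatBits(a))` and `TwiddleInFloat(FloatBits(b))` as unsigned integers gives the same result as `a < b` for ordinary finite values, and each `TwiddleOut*` function returns the original bit pattern. Note that `-0.0f` encodes to a smaller key than `+0.0f`, so the two zeros are not treated as equal.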
1 change: 0 additions & 1 deletion scripts/copyright-date/check-copyright.sh
@@ -61,7 +61,6 @@ if $forkdiff; then
     source_commit="remotes/$remote/HEAD"

     # don't use fork-point for finding fork point (lol)
-    # see: https://stackoverflow.com/a/53981615
     diff_hash="$(git merge-base "$source_commit" "$branch")"
 fi
