Develop Stream 2024-07-01 (#377)
* Using .lint:clang-format

* Added the hipcub thread test to the cmakelist

* fix: config.hpp is included in every header

* add half and bfloat16 cases to warp reduce and scan tests

* fix format

* Enabled half and bfloat16 tests for nvidia gpus

* Fixed that bfloat16 and half thread tests work and added them

* Moved thrust headers to different file and added iterator_facade_category

* Added IteratorCategory for arg_index, constant, counting and transform

* Added iterator_wrapper to reduce amount of duplicate code and use rocprim as much as possible

* Use keyword Using in iterators and put helper function in detail space

* Make use of iterator_wrapper for arg_index_input

* Changes names and adds some documentation to iterator_wrapper

* Removed unnecessary constructor and added explicit tag

* Moved more to code to IteratorWrapper based on coe Lorinc

* Revert "Moved more to code to IteratorWrapper based on coe Lorinc"

This reverts commit 894f60cae177096bcf1000e5dd321174faf2476b.

* Some iterators should have the device_system_tag as is the case in cub

* Implemented minor comments nara

* Add large indices test for device segmented reduce

* Fix hipCUB's device segmented reduce for large indices

* Update CHANGELOG

* Bumped CCCL version

* Moved get_large_sizes to common test utils file

* Changed TwiddleIn/Out implementation to make use of rocprim::radix_key_codec

* clang-format: trick clang-format into always breaking after c-style function attributes

* Fixed formatting

* Bumped project version to 3.3.0

---------

Co-authored-by: Lőrinc Serfőző
Co-authored-by: Nick Breed
Co-authored-by: Bence Parajdi
Co-authored-by: Robin Voetter
5 people authored Aug 3, 2024
1 parent 186c5f9 commit 3b4f714
Showing 60 changed files with 963 additions and 397 deletions.
34 changes: 33 additions & 1 deletion .clang-format
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,6 @@ AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterReturnType: None
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: Yes
AttributeMacros: ['HIPCUB_DEVICE', 'HIPCUB_HOST', 'HIPCUB_HOST_DEVICE', 'HIPCUB_SHARED_MEMORY', 'HIPCUB_RUNTIME_FUNCTION']
BinPackArguments: false
BinPackParameters: false
BitFieldColonSpacing: Both
Expand Down Expand Up @@ -133,4 +132,37 @@ SpacesInConditionalStatement: false
SpacesInContainerLiterals: true
SpacesInParentheses: false
SpacesInSquareBrackets: false

AttributeMacros:
- __host__
- __device__
- __global__
- __forceinline__
- __shared__
- __launch_bounds__
- HIPCUB_DEVICE
- HIPCUB_HOST
- HIPCUB_HOST_DEVICE
- HIPCUB_SHARED_MEMORY
- HIPCUB_RUNTIME_FUNCTION
- HIPCUB_DETAIL_DEPRECATED_DEBUG_SYNCHRONOUS

# Trick clang into thinking that our C-style attributes are C++-style attributes
# Make sure that the sizes line up for linebreaks etc
Macros:
- __host__=[[host]]
- __device__=[[device]]
- __global__=[[global]]
- __forceinline__=[[forceinline]]
- __shared__=[[shared]]
- __launch_bounds__(x)=[[launch_bounds(x)]]
- __attribute__(x)=[[attribute(x)]]
- HIPCUB_DEVICE=[[DEVICE___]]
- HIPCUB_HOST=[[HOST___]]
- HIPCUB_HOST_DEVICE=[[HOST_DEVICE___]]
- HIPCUB_SHARED_MEMORY=[[SHARED_MEMORY___]]
- HIPCUB_RUNTIME_FUNCTION=[[RUNTIME_FUNCTION___]]
- HIPCUB_DETAIL_DEPRECATED_DEBUG_SYNCHRONOUS=[[DETAIL_DEPRECATED_DEBUG_SYNCHRONOUS___]]
BreakAfterAttributes: Always

---
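The comment in the `Macros:` block above asks that "the sizes line up": each stand-in attribute appears to be padded (for example with trailing underscores) so that it is exactly as wide as the macro it replaces, which presumably keeps clang-format's column and line-break decisions the same as for the real text. The following is a minimal sketch that checks this property for the mappings listed in this diff. The names `macro_mapping`, `mappings` and `widths_line_up` are made up for illustration and are not part of hipCUB or clang-format.

```cpp
#include <string_view>

// One entry of the `Macros:` list: the macro as written in the source and the
// C++11-style attribute that clang-format is told to see instead.
// (Hypothetical helper type, used only for this check.)
struct macro_mapping
{
    std::string_view macro;
    std::string_view replacement;
};

// Copied from the `.clang-format` hunk above.
constexpr macro_mapping mappings[] = {
    {"__host__", "[[host]]"},
    {"__device__", "[[device]]"},
    {"__global__", "[[global]]"},
    {"__forceinline__", "[[forceinline]]"},
    {"__shared__", "[[shared]]"},
    {"__launch_bounds__(x)", "[[launch_bounds(x)]]"},
    {"__attribute__(x)", "[[attribute(x)]]"},
    {"HIPCUB_DEVICE", "[[DEVICE___]]"},
    {"HIPCUB_HOST", "[[HOST___]]"},
    {"HIPCUB_HOST_DEVICE", "[[HOST_DEVICE___]]"},
    {"HIPCUB_SHARED_MEMORY", "[[SHARED_MEMORY___]]"},
    {"HIPCUB_RUNTIME_FUNCTION", "[[RUNTIME_FUNCTION___]]"},
    {"HIPCUB_DETAIL_DEPRECATED_DEBUG_SYNCHRONOUS",
     "[[DETAIL_DEPRECATED_DEBUG_SYNCHRONOUS___]]"},
};

// Returns true when every replacement has the same number of characters as
// the macro it stands in for.
constexpr bool widths_line_up()
{
    for(const macro_mapping& m : mappings)
    {
        if(m.macro.size() != m.replacement.size())
        {
            return false;
        }
    }
    return true;
}

// Evaluated at compile time (C++17 or later): compilation fails if a new
// mapping is added whose replacement has a different width.
static_assert(widths_line_up(), "a replacement differs in width from its macro");
```

A check like this could live next to the configuration so that a mismatched entry is caught when the list is extended.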
16 changes: 2 additions & 14 deletions .gitlab-ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ include:
- /defaults.yaml
- /deps-cmake.yaml
- /deps-docs.yaml
- /deps-format.yaml
- /deps-rocm.yaml
- /deps-nvcc.yaml
- /gpus-rocm.yaml
Expand All @@ -41,20 +42,7 @@ stages:

clang-format:
extends:
- .deps:rocm
stage: lint
needs: []
tags:
- build
variables:
CLANG_FORMAT: "/opt/rocm/llvm/bin/clang-format"
GIT_CLANG_FORMAT: "/opt/rocm/llvm/bin/git-clang-format"
rules:
- if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
script:
- cd $CI_PROJECT_DIR
- git config --global --add safe.directory $CI_PROJECT_DIR
- scripts/code-format/check-format.sh $CI_MERGE_REQUEST_DIFF_BASE_SHA --binary "$CLANG_FORMAT"
- .lint:clang-format

copyright-date:
extends:
Expand Down
14 changes: 13 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,18 @@
Documentation for hipCUB is available at
[https://rocm.docs.amd.com/projects/hipCUB/en/latest/](https://rocm.docs.amd.com/projects/hipCUB/en/latest/).

## (Unreleased) hipCUB-3.3.0 for ROCm 6.3.0

### Fixed

* Some hipCUB headers did not include `config.hpp`, which could result in build errors.

### Added
* Add support for large indices in `hipcub::DeviceSegmentedReduce::*`. rocPRIM's backend provides support for all reduce variants, but CUB's does not have support yet for `DeviceSegmentedReduce::Arg*`, so large indices support has been excluded for these as well in hipCUB.

### Changed
* The NVIDIA backend now requires CUB, Thrust and libcu++ version 2.3.2 or later. If they are not found, they will be downloaded from the NVIDIA CCCL repository.

## (Unreleased) hipCUB-3.2.0 for ROCm 6.2.0

### Added
Expand Down Expand Up @@ -38,7 +50,7 @@ Documentation for hipCUB is available at
by setting the `CUB_DEBUG_SYNC` (or higher debug level) or the `HIPCUB_DEBUG_SYNC` preprocessor definition.
* The compile time deprecation warnings can be disabled by defining the `HIPCUB_IGNORE_DEPRECATED_API` preprocessor definition.

## (Unreleased) hipCUB-3.1.0 for ROCm 6.1.0
## hipCUB-3.1.0 for ROCm 6.1.0

### Changes

Expand Down
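The "large indices" entry in the changelog hunk above can be illustrated with a host-only sketch. It assumes that "large indices" refers to segment offsets that no longer fit in a 32-bit signed integer; the changelog does not spell this out. Nothing here is the hipCUB or rocPRIM API: `counting_input`, `segment_sum` and `fits_in_int32` are hypothetical names, and the input is generated on the fly so that no large buffer is needed.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical input source: yields the value stored at index i without
// keeping an array in memory (similar in spirit to a counting iterator).
struct counting_input
{
    std::uint64_t operator()(std::int64_t i) const
    {
        return static_cast<std::uint64_t>(i);
    }
};

// Sums the half-open segment [begin, end). OffsetT is the type used for the
// segment offsets; it must be wide enough to hold the largest index.
template<class Input, class OffsetT>
std::uint64_t segment_sum(Input in, OffsetT begin, OffsetT end)
{
    std::uint64_t sum = 0;
    for(OffsetT i = begin; i < end; ++i)
    {
        sum += in(i);
    }
    return sum;
}

// True when an offset can be represented as a 32-bit signed integer.
constexpr bool fits_in_int32(std::int64_t offset)
{
    return offset <= std::numeric_limits<std::int32_t>::max()
           && offset >= std::numeric_limits<std::int32_t>::min();
}
```

With 64-bit offsets, a segment that starts at index 2^32 can be reduced as usual, for example `segment_sum(counting_input{}, std::int64_t{1} << 32, (std::int64_t{1} << 32) + 4)`, while `fits_in_int32` reports that the same offset cannot be stored in an `int`.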
2 changes: 1 addition & 1 deletion CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ if(BUILD_ADDRESS_SANITIZER)
endif()

# Setup VERSION
set(VERSION_STRING "3.2.0")
set(VERSION_STRING "3.3.0")
rocm_setup_version(VERSION ${VERSION_STRING})

# Print configuration summary
Expand Down
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,7 @@ python3 -m http.server
* Requires CMake 3.16.9 or later
* For NVIDIA GPUs:
* CUDA Toolkit
* CUB library
* CCCL library (>= 2.3.2)
* Automatically downloaded and built by the CMake script
* Requires CMake 3.15.0 or later
* Python 3.6 or higher (for HIP on Windows only; this is only required for install scripts)
Expand Down
2 changes: 1 addition & 1 deletion cmake/Dependencies.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -139,7 +139,7 @@ endif(USER_BUILD_BENCHMARK)

# CUB (only for CUDA platform)
if(HIP_COMPILER STREQUAL "nvcc")
set(CCCL_MINIMUM_VERSION 2.2.0)
set(CCCL_MINIMUM_VERSION 2.3.2)
if(NOT DOWNLOAD_CUB)
find_package(CUB ${CCCL_MINIMUM_VERSION} CONFIG)
find_package(Thrust ${CCCL_MINIMUM_VERSION} CONFIG)
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2010-2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2017-2022, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2017-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -30,6 +30,8 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_HISTOGRAM_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_HISTOGRAM_HPP_

#include "../../../config.hpp"

#include "../util_ptx.hpp"

#include <type_traits>
Expand Down
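The header hunks in this commit add `#include "../../../config.hpp"` ahead of the other includes. As a rough sketch of why that matters, assume (as the diff suggests) that macros such as `BEGIN_HIPCUB_NAMESPACE` come from `config.hpp`: a header that uses them without including that file only compiles if some other header happened to pull it in first. The example below mimics the pattern in a single file with stand-in names (`BEGIN_SKETCH_NAMESPACE`, `END_SKETCH_NAMESPACE`, `bins_per_thread`), all of which are hypothetical.

```cpp
// --- stand-in for config.hpp (hypothetical, heavily simplified) -------------
#ifndef SKETCH_CONFIG_HPP_
#define SKETCH_CONFIG_HPP_

#define BEGIN_SKETCH_NAMESPACE \
    namespace sketch           \
    {
#define END_SKETCH_NAMESPACE }

#endif // SKETCH_CONFIG_HPP_

// --- stand-in for a header such as block_histogram.hpp ----------------------
// Without the definitions above, the next line would be an unknown identifier
// and the header would fail to compile when included on its own.
BEGIN_SKETCH_NAMESPACE

// Number of histogram bins each thread handles, rounded up.
inline int bins_per_thread(int bins, int threads)
{
    return (bins + threads - 1) / threads;
}

END_SKETCH_NAMESPACE
```

Including the configuration header first in every header makes each of them usable on its own, independent of include order in user code.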
6 changes: 3 additions & 3 deletions hipcub/include/hipcub/backend/rocprim/block/block_load.hpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2010-2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2017-2020, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2017-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -30,14 +30,14 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_LOAD_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_LOAD_HPP_

#include <type_traits>

#include "../../../config.hpp"

#include <rocprim/block/block_load.hpp>

#include "block_load_func.hpp"

#include <type_traits>

BEGIN_HIPCUB_NAMESPACE

namespace detail
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/******************************************************************************
* Copyright (c) 2011-2021, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2021, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2021-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -29,6 +29,8 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_MERGE_SORT_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_MERGE_SORT_HPP_

#include "../../../config.hpp"

#include "../thread/thread_sort.hpp"
#include "../util_math.hpp"
#include "../util_type.hpp"
Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2021-2022, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2021-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -35,8 +35,6 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_RADIX_RANK_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_RADIX_RANK_HPP_

#include <stdint.h>

#include "../../../config.hpp"
#include "../../../util_type.hpp"
#include "../../../util_ptx.hpp"
Expand All @@ -48,6 +46,8 @@

#include <rocprim/block/block_radix_rank.hpp>

#include <stdint.h>

BEGIN_HIPCUB_NAMESPACE

namespace detail
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -35,13 +35,13 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_RAKING_LAYOUT_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_RAKING_LAYOUT_HPP_

#include <type_traits>

#include "../../../config.hpp"

#include <rocprim/config.hpp>
#include <rocprim/detail/various.hpp>

#include <type_traits>

BEGIN_HIPCUB_NAMESPACE

/**
Expand Down
4 changes: 3 additions & 1 deletion hipcub/include/hipcub/backend/rocprim/block/block_reduce.hpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2010-2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2017-2020, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2017-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -30,6 +30,8 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_REDUCE_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_REDUCE_HPP_

#include "../../../config.hpp"

#include <type_traits>

#include <rocprim/block/block_reduce.hpp>
Expand Down
6 changes: 3 additions & 3 deletions hipcub/include/hipcub/backend/rocprim/block/block_scan.hpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2010-2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2017-2020, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2017-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -30,14 +30,14 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_SCAN_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_SCAN_HPP_

#include <type_traits>

#include "../../../config.hpp"

#include "../thread/thread_operators.hpp"

#include <rocprim/block/block_scan.hpp>

#include <type_traits>

BEGIN_HIPCUB_NAMESPACE

namespace detail
Expand Down
6 changes: 3 additions & 3 deletions hipcub/include/hipcub/backend/rocprim/block/block_shuffle.hpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2010-2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2017-2020, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2017-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -30,14 +30,14 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_SHUFFLE_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_SHUFFLE_HPP_

#include <type_traits>

#include "../../../config.hpp"

#include "../thread/thread_operators.hpp"

#include <rocprim/block/block_shuffle.hpp>

#include <type_traits>

BEGIN_HIPCUB_NAMESPACE


Expand Down
6 changes: 3 additions & 3 deletions hipcub/include/hipcub/backend/rocprim/block/block_store.hpp
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
/******************************************************************************
* Copyright (c) 2010-2011, Duane Merrill. All rights reserved.
* Copyright (c) 2011-2018, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2017-2020, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2017-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -30,14 +30,14 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_BLOCK_STORE_HPP_
#define HIPCUB_ROCPRIM_BLOCK_BLOCK_STORE_HPP_

#include <type_traits>

#include "../../../config.hpp"

#include "block_store_func.hpp"

#include <rocprim/block/block_store.hpp>

#include <type_traits>

BEGIN_HIPCUB_NAMESPACE

namespace detail
Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
/******************************************************************************
* Copyright (c) 2011-2020, NVIDIA CORPORATION. All rights reserved.
* Modifications Copyright (c) 2021, Advanced Micro Devices, Inc. All rights reserved.
* Modifications Copyright (c) 2021-2024, Advanced Micro Devices, Inc. All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
Expand Down Expand Up @@ -35,13 +35,13 @@
#ifndef HIPCUB_ROCPRIM_BLOCK_RADIX_RANK_SORT_OPERATIONS_HPP_
#define HIPCUB_ROCPRIM_BLOCK_RADIX_RANK_SORT_OPERATIONS_HPP_

#include <type_traits>

#include "../../../config.hpp"

#include <rocprim/config.hpp>
#include <rocprim/type_traits.hpp>
#include <rocprim/detail/various.hpp>
#include <rocprim/config.hpp>
#include <rocprim/detail/various.hpp>
#include <rocprim/type_traits.hpp>

#include <type_traits>

BEGIN_HIPCUB_NAMESPACE

Expand Down
13 changes: 7 additions & 6 deletions hipcub/include/hipcub/backend/rocprim/device/device_reduce.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,20 +30,21 @@
#ifndef HIPCUB_ROCPRIM_DEVICE_DEVICE_REDUCE_HPP_
#define HIPCUB_ROCPRIM_DEVICE_DEVICE_REDUCE_HPP_

#include <iterator>
#include <limits>

#include <hip/hip_bfloat16.h> // hip_bfloat16
#include <hip/hip_fp16.h> // __half

#include "../../../config.hpp"

#include "../../../util_deprecated.hpp"
#include "../iterator/arg_index_input_iterator.hpp"
#include "../thread/thread_operators.hpp"

#include <rocprim/device/device_reduce.hpp>
#include <rocprim/device/device_reduce_by_key.hpp>

#include <hip/hip_bfloat16.h> // hip_bfloat16
#include <hip/hip_fp16.h> // __half

#include <iterator>
#include <limits>

BEGIN_HIPCUB_NAMESPACE
namespace detail
{
Expand Down