Add distinct key inner join #14990

Merged
merged 78 commits into from
Feb 23, 2024
Changes from 47 commits
Commits
5676a7e
Add public and detail unique_hash_join classes
PointKernel Feb 6, 2024
f732c9c
Add implementation for public APIs
PointKernel Feb 6, 2024
f08fbee
Add HasNested template to the public unique hash join
PointKernel Feb 6, 2024
17dc00f
Move detail unique hash join declaration to a separate header
PointKernel Feb 7, 2024
03f3bb7
Update copyright year
PointKernel Feb 7, 2024
3fe41ef
Add has_nested template back for now
PointKernel Feb 7, 2024
4667bfa
Add comments
PointKernel Feb 7, 2024
156a2d5
Create the hash table based on the build table
PointKernel Feb 7, 2024
8526494
Update comments
PointKernel Feb 7, 2024
35f561c
Add some probe details
PointKernel Feb 7, 2024
4cd9a6c
Add probe kernel setups
PointKernel Feb 7, 2024
cf99ef0
Add probe kernel
PointKernel Feb 7, 2024
e893061
Add first test and fix several bugs
PointKernel Feb 8, 2024
08eeecb
Update tests
PointKernel Feb 8, 2024
5aa1a97
Fix a probe bug
PointKernel Feb 8, 2024
83c3d2f
Update tests
PointKernel Feb 8, 2024
30a9dd1
Update cpp/include/cudf/detail/unique_hash_join.cuh
PointKernel Feb 8, 2024
a049b8e
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 8, 2024
c55c42b
CMake formatting
PointKernel Feb 8, 2024
9e3a708
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 8, 2024
0f2c854
Fix docs
PointKernel Feb 8, 2024
c2d572d
Update cpp/include/cudf/join.hpp
PointKernel Feb 8, 2024
f59234d
Fix a typo
PointKernel Feb 8, 2024
ce9b546
Remove inner_join_size APIs
PointKernel Feb 8, 2024
4c9a391
Early return for empty join
PointKernel Feb 8, 2024
5e7f2a5
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 8, 2024
67c683b
Merge remote-tracking branch 'origin/unique-hash-join' into unique-ha…
PointKernel Feb 8, 2024
b3cd8c7
Doc cleanups
PointKernel Feb 8, 2024
a451cb4
Cleanups
PointKernel Feb 8, 2024
1854315
Add unique_inner_join benchmark
PointKernel Feb 8, 2024
ddd4eaf
Test flushing tile algo
PointKernel Feb 9, 2024
5d6821f
Add scalar-probing kernel and block-wise buffer + performance tuning
PointKernel Feb 9, 2024
a9c5529
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 9, 2024
41876be
Tune block size
PointKernel Feb 9, 2024
c1b583f
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 12, 2024
52f8c7e
Remove the unused option join size parameter
PointKernel Feb 13, 2024
a218567
Remove Hasher template parameter since no customization needed
PointKernel Feb 13, 2024
0ad3bdb
Move table comparison to a public member function
PointKernel Feb 13, 2024
5a435af
Update comment
PointKernel Feb 13, 2024
cd9dca3
Add more tests
PointKernel Feb 13, 2024
ee2844c
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 13, 2024
e210b9e
Remove _is_empty data member
PointKernel Feb 13, 2024
5bebd9e
Update cpp/include/cudf/detail/unique_hash_join.cuh
PointKernel Feb 13, 2024
51a8c26
Update docs
PointKernel Feb 13, 2024
40b4b4c
Clean up headers
PointKernel Feb 13, 2024
6cb38c7
Minor doc fix
PointKernel Feb 13, 2024
cc5a88e
Apply suggestions from code review
PointKernel Feb 13, 2024
9b9600a
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 13, 2024
99b3255
Rename unique_hash_join as distinct_hash_join
PointKernel Feb 13, 2024
2ee3c40
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 13, 2024
5aebff1
Custom nvtx range name for distinct_hash_join::inner_join
PointKernel Feb 13, 2024
1ba7015
Update cpp/src/join/distinct_hash_join.cu
PointKernel Feb 14, 2024
119a531
File reordering in CMake
PointKernel Feb 14, 2024
26f867a
Merge remote-tracking branch 'origin/unique-hash-join' into unique-ha…
PointKernel Feb 14, 2024
876ed60
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 14, 2024
4201de6
Minor add
PointKernel Feb 15, 2024
e6c1271
Fix a typo
PointKernel Feb 15, 2024
de2a20d
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 15, 2024
ae5c73a
Update to use the latest cuco
PointKernel Feb 15, 2024
7864c3d
Get rid of redundant syncs
PointKernel Feb 15, 2024
3e8b8fb
Use load factor ctor + update size type
PointKernel Feb 15, 2024
30724e9
Add doc
PointKernel Feb 15, 2024
6cc98f5
Merge branch 'branch-24.04' into unique-hash-join
PointKernel Feb 16, 2024
a56ff0c
Update cpp/src/join/distinct_hash_join.cu
PointKernel Feb 16, 2024
715a101
Remove default template parameter for distinct_hash_join
PointKernel Feb 16, 2024
73700d9
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 16, 2024
35253e8
Merge remote-tracking branch 'origin/unique-hash-join' into unique-ha…
PointKernel Feb 16, 2024
35f2d9d
Minor template parameter cleanups
PointKernel Feb 17, 2024
f6f66cf
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 21, 2024
800bee7
Add new symbol into doxyen group
PointKernel Feb 22, 2024
1259b2d
Minor improvement
PointKernel Feb 22, 2024
7a169e2
Merge remote-tracking branch 'upstream/branch-24.04' into unique-hash…
PointKernel Feb 22, 2024
2c6b439
Unconst
PointKernel Feb 22, 2024
a801cd3
Fix style
PointKernel Feb 22, 2024
bc8ca49
Merge branch 'branch-24.04' into unique-hash-join
PointKernel Feb 22, 2024
ce7d8c0
Merge branch 'branch-24.04' into unique-hash-join
PointKernel Feb 23, 2024
505a6ab
Update cpp/tests/join/distinct_join_tests.cpp
PointKernel Feb 23, 2024
4b376de
Fix style
PointKernel Feb 23, 2024
1 change: 1 addition & 0 deletions cpp/CMakeLists.txt
@@ -446,6 +446,7 @@ add_library(
src/join/mixed_join_size_kernel_nulls.cu
src/join/mixed_join_size_kernels_semi.cu
src/join/semi_join.cu
src/join/unique_hash_join.cu
src/json/json_path.cu
src/lists/contains.cu
src/lists/combine/concatenate_list_elements.cu
2 changes: 1 addition & 1 deletion cpp/benchmarks/CMakeLists.txt
@@ -164,7 +164,7 @@ ConfigureNVBench(
# ##################################################################################################
# * join benchmark --------------------------------------------------------------------------------
ConfigureBench(JOIN_BENCH join/left_join.cu join/conditional_join.cu)
-ConfigureNVBench(JOIN_NVBENCH join/join.cu join/mixed_join.cu)
+ConfigureNVBench(JOIN_NVBENCH join/join.cu join/mixed_join.cu join/unique_join.cu)

# ##################################################################################################
# * iterator benchmark ----------------------------------------------------------------------------
77 changes: 77 additions & 0 deletions cpp/benchmarks/join/unique_join.cu
@@ -0,0 +1,77 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include "join_common.hpp"

template <typename key_type, typename payload_type, bool Nullable>
void unique_inner_join(nvbench::state& state,
nvbench::type_list<key_type, payload_type, nvbench::enum_type<Nullable>>)
{
skip_helper(state);

auto join = [](cudf::table_view const& left_input,
cudf::table_view const& right_input,
cudf::null_equality compare_nulls,
rmm::cuda_stream_view stream) {
auto const has_nulls = cudf::has_nested_nulls(left_input) || cudf::has_nested_nulls(right_input)
? cudf::nullable_join::YES
: cudf::nullable_join::NO;
auto hj_obj = cudf::unique_hash_join<cudf::has_nested::NO>{
left_input, right_input, has_nulls, compare_nulls, stream};
return hj_obj.inner_join(stream);
};

BM_join<key_type, payload_type, Nullable>(state, join);
}

// inner join -----------------------------------------------------------------------
NVBENCH_BENCH_TYPES(unique_inner_join,
NVBENCH_TYPE_AXES(nvbench::type_list<nvbench::int32_t>,
nvbench::type_list<nvbench::int32_t>,
nvbench::enum_type_list<false>))
.set_name("unique_inner_join_32bit")
.set_type_axes_names({"Key Type", "Payload Type", "Nullable"})
.add_int64_axis("Build Table Size", {100'000, 10'000'000, 80'000'000, 100'000'000})
.add_int64_axis("Probe Table Size",
{100'000, 400'000, 10'000'000, 40'000'000, 100'000'000, 240'000'000});

NVBENCH_BENCH_TYPES(unique_inner_join,
NVBENCH_TYPE_AXES(nvbench::type_list<nvbench::int64_t>,
nvbench::type_list<nvbench::int64_t>,
nvbench::enum_type_list<false>))
.set_name("unique_inner_join_64bit")
.set_type_axes_names({"Key Type", "Payload Type", "Nullable"})
.add_int64_axis("Build Table Size", {40'000'000, 50'000'000})
.add_int64_axis("Probe Table Size", {50'000'000, 120'000'000});

NVBENCH_BENCH_TYPES(unique_inner_join,
NVBENCH_TYPE_AXES(nvbench::type_list<nvbench::int32_t>,
nvbench::type_list<nvbench::int32_t>,
nvbench::enum_type_list<true>))
.set_name("unique_inner_join_32bit_nulls")
.set_type_axes_names({"Key Type", "Payload Type", "Nullable"})
.add_int64_axis("Build Table Size", {100'000, 10'000'000, 80'000'000, 100'000'000})
.add_int64_axis("Probe Table Size",
{100'000, 400'000, 10'000'000, 40'000'000, 100'000'000, 240'000'000});

NVBENCH_BENCH_TYPES(unique_inner_join,
NVBENCH_TYPE_AXES(nvbench::type_list<nvbench::int64_t>,
nvbench::type_list<nvbench::int64_t>,
nvbench::enum_type_list<true>))
.set_name("unique_inner_join_64bit_nulls")
.set_type_axes_names({"Key Type", "Payload Type", "Nullable"})
.add_int64_axis("Build Table Size", {40'000'000, 50'000'000})
.add_int64_axis("Probe Table Size", {50'000'000, 120'000'000});
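For a sense of how many configurations each registration above covers: nvbench benchmarks the Cartesian product of the declared axes, so the 32-bit registrations span 4 build sizes x 6 probe sizes and the 64-bit ones span 2 x 2. The host-only sketch below enumerates the 32-bit combinations. `axis_product` and `size_pair` are hypothetical names used only for illustration, and `skip_helper` (whose rules are not shown in this diff) may prune some of these combinations at run time.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using size_pair = std::pair<std::int64_t, std::int64_t>;  // {build rows, probe rows}

// Enumerate every (build size, probe size) combination of two int64 axes.
// Hypothetical helper; nvbench performs the equivalent expansion internally.
std::vector<size_pair> axis_product(std::vector<std::int64_t> const& build_sizes,
                                    std::vector<std::int64_t> const& probe_sizes)
{
  std::vector<size_pair> configs;
  configs.reserve(build_sizes.size() * probe_sizes.size());
  for (auto const build : build_sizes) {
    for (auto const probe : probe_sizes) {
      configs.emplace_back(build, probe);
    }
  }
  assert(configs.size() == build_sizes.size() * probe_sizes.size());
  return configs;
}

// Axis values copied from the "unique_inner_join_32bit" registration.
std::vector<std::int64_t> const build_sizes_32bit{100'000, 10'000'000, 80'000'000, 100'000'000};
std::vector<std::int64_t> const probe_sizes_32bit{
  100'000, 400'000, 10'000'000, 40'000'000, 100'000'000, 240'000'000};
```

Calling `axis_product(build_sizes_32bit, probe_sizes_32bit)` yields 24 pairs, ordered with the build size as the outer loop; the order nvbench itself uses is not claimed here.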
154 changes: 154 additions & 0 deletions cpp/include/cudf/detail/unique_hash_join.cuh
@@ -0,0 +1,154 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include <cudf/hashing/detail/helper_functions.cuh>
#include <cudf/table/experimental/row_operators.cuh>
#include <cudf/types.hpp>

#include <rmm/cuda_stream_view.hpp>
#include <rmm/device_uvector.hpp>

#include <cuco/static_set.cuh>

#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>

namespace cudf::detail {

using cudf::experimental::row::lhs_index_type;
using cudf::experimental::row::rhs_index_type;

/**
* @brief A comparator adapter wrapping both the self comparator and the two-table comparator
*/
template <typename Equal>
struct comparator_adapter {
comparator_adapter(Equal const& d_equal) : _d_equal{d_equal} {}

__device__ constexpr auto operator()(
cuco::pair<hash_value_type, lhs_index_type> const&,
cuco::pair<hash_value_type, lhs_index_type> const&) const noexcept
{
// All build table keys are distinct, so two build keys never compare equal
return false;
}

__device__ constexpr auto operator()(
cuco::pair<hash_value_type, lhs_index_type> const& lhs,
cuco::pair<hash_value_type, rhs_index_type> const& rhs) const noexcept
{
if (lhs.first != rhs.first) { return false; }
return _d_equal(lhs.second, rhs.second);
}

private:
Equal _d_equal;
};

template <typename Hasher>
struct hasher_adapter {
hasher_adapter(Hasher const& d_hasher = {}) : _d_hasher{d_hasher} {}

template <typename T>
__device__ constexpr auto operator()(cuco::pair<hash_value_type, T> const& key) const noexcept
{
return _d_hasher(key.first);
}

private:
Hasher _d_hasher;
};

/**
* @brief Unique hash join that builds the hash table on construction and probes it in subsequent
* `*_join` member functions.
*
* @tparam HasNested Flag indicating whether there are nested columns in build/probe table
*/
template <has_nested HasNested = has_nested::NO>
struct unique_hash_join {
private:
/// Row equality type for nested columns
using nested_row_equal = cudf::experimental::row::equality::strong_index_comparator_adapter<
cudf::experimental::row::equality::device_row_comparator<true, cudf::nullate::DYNAMIC>>;
/// Row equality type for flat columns
using flat_row_equal = cudf::experimental::row::equality::strong_index_comparator_adapter<
cudf::experimental::row::equality::device_row_comparator<false, cudf::nullate::DYNAMIC>>;

/// Device row equal type
using d_equal_type =
std::conditional_t<HasNested == has_nested::YES, nested_row_equal, flat_row_equal>;
using hasher = hasher_adapter<thrust::identity<hash_value_type>>;
using probing_scheme_type = cuco::experimental::linear_probing<1, hasher>;
using cuco_storge_type = cuco::experimental::storage<1>;

/// Hash table type
using hash_table_type =
cuco::experimental::static_set<cuco::pair<hash_value_type, lhs_index_type>,
cuco::experimental::extent<std::size_t>,
cuda::thread_scope_device,
comparator_adapter<d_equal_type>,
probing_scheme_type,
cudf::detail::cuco_allocator,
cuco_storge_type>;

bool _has_nulls; ///< true if nulls are present in either build table or probe table
cudf::null_equality _nulls_equal; ///< whether to consider nulls as equal
cudf::table_view _build; ///< input table to build the hash map
cudf::table_view _probe; ///< input table to probe the hash map
std::shared_ptr<cudf::experimental::row::equality::preprocessed_table>
_preprocessed_build; ///< input table preprocessed for row operators
std::shared_ptr<cudf::experimental::row::equality::preprocessed_table>
_preprocessed_probe; ///< input table preprocessed for row operators
hash_table_type _hash_table; ///< hash table built on `_build`

public:
unique_hash_join() = delete;
~unique_hash_join() = default;
unique_hash_join(unique_hash_join const&) = delete;
unique_hash_join(unique_hash_join&&) = delete;
unique_hash_join& operator=(unique_hash_join const&) = delete;
unique_hash_join& operator=(unique_hash_join&&) = delete;

/**
* @brief Constructor that internally builds the hash table based on the given `build` table.
*
* @throw cudf::logic_error if the number of columns in `build` table is 0.
*
* @param build The build table, from which the hash table is built
* @param probe The probe table
* @param has_nulls Flag to indicate whether any nulls exist in the `build` table or
* the `probe` table.
* @param compare_nulls Controls whether null join-key values should match or not.
* @param stream CUDA stream used for device memory operations and kernel launches.
*/
unique_hash_join(cudf::table_view const& build,
cudf::table_view const& probe,
bool has_nulls,
cudf::null_equality compare_nulls,
rmm::cuda_stream_view stream);

/**
* @copydoc cudf::unique_hash_join::inner_join
*/
std::pair<std::unique_ptr<rmm::device_uvector<size_type>>,
std::unique_ptr<rmm::device_uvector<size_type>>>
inner_join(rmm::cuda_stream_view stream, rmm::mr::device_memory_resource* mr) const;
};
} // namespace cudf::detail
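To make the roles of `comparator_adapter` and `hasher_adapter` in this header concrete, here is a host-only sketch of the same idea: keys are (precomputed hash, row index) pairs, insertion never needs a real equality check because build keys are assumed distinct, and lookup compares hashes before falling back to a row comparison. Everything here is an assumption for illustration: `build_index`, `probe_index`, `comparator_sketch`, and `distinct_set_sketch` are hypothetical names, the container is a plain sequential `std::vector`, and none of it reflects the cuco `static_set` API or its device-side behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

using hash_value = std::uint32_t;

// Hypothetical strong index types, standing in for lhs_index_type / rhs_index_type.
struct build_index { std::size_t value; };
struct probe_index { std::size_t value; };

using build_key = std::pair<hash_value, build_index>;
using probe_key = std::pair<hash_value, probe_index>;

// Mirrors the two overloads of comparator_adapter.
template <typename RowEqual>
struct comparator_sketch {
  RowEqual row_equal;

  // Build vs. build (insertion): keys are assumed distinct, so never equal.
  bool operator()(build_key const&, build_key const&) const { return false; }

  // Stored build key vs. probe key (lookup): cheap hash check first, then rows.
  bool operator()(build_key const& slot, probe_key const& key) const
  {
    if (slot.first != key.first) { return false; }
    return row_equal(slot.second, key.second);
  }
};

// Open-addressing set with linear probing. The slot is derived directly from
// the precomputed hash, which is the role hasher_adapter plays in the header.
template <typename RowEqual>
class distinct_set_sketch {
 public:
  distinct_set_sketch(std::size_t capacity, RowEqual row_equal)
    : slots_(capacity), equal_{row_equal}
  {
    assert(capacity > 0);
  }

  // Returns false only if the table is full (duplicates are never detected).
  bool insert(build_key const& key)
  {
    auto pos = key.first % slots_.size();
    for (std::size_t step = 0; step < slots_.size(); ++step) {
      if (!slots_[pos].has_value()) {
        slots_[pos] = key;
        return true;
      }
      if (equal_(*slots_[pos], key)) { return false; }  // build-vs-build: always false
      pos = (pos + 1) % slots_.size();
    }
    return false;
  }

  // Returns the matching build row index, or std::nullopt if there is none.
  std::optional<build_index> find(probe_key const& key) const
  {
    auto pos = key.first % slots_.size();
    for (std::size_t step = 0; step < slots_.size(); ++step) {
      if (!slots_[pos].has_value()) { return std::nullopt; }
      if (equal_(*slots_[pos], key)) { return slots_[pos]->second; }
      pos = (pos + 1) % slots_.size();
    }
    return std::nullopt;
  }

 private:
  std::vector<std::optional<build_key>> slots_;
  comparator_sketch<RowEqual> equal_;
};
```

Because the build-vs-build overload always returns false, inserting a duplicate key in this sketch silently stores it twice, which is one way to see why the public documentation declares behavior undefined when the build table contains duplicates.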
68 changes: 67 additions & 1 deletion cpp/include/cudf/join.hpp
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2019-2023, NVIDIA CORPORATION.
+ * Copyright (c) 2019-2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
@@ -33,6 +33,11 @@

namespace cudf {

/**
* @brief Enum to indicate whether the unique join table has nested columns or not
*/
enum class has_nested : bool { YES, NO };

// forward declaration
namespace hashing::detail {
template <typename T>
@@ -41,6 +46,9 @@ class MurmurHash3_x86_32;
namespace detail {
template <typename T>
class hash_join;

template <has_nested>
class unique_hash_join;
} // namespace detail

/**
@@ -438,6 +446,64 @@ class hash_join {
const std::unique_ptr<impl_type const> _impl;
};

/**
* @brief Unique hash join that builds the hash table on construction and probes it in subsequent
* `*_join` member functions
*
* @note Behavior is undefined if the build table contains duplicates.
* @note All NaNs are considered equal.
*
* @tparam HasNested Flag indicating whether there are nested columns in build/probe table
*/
// TODO: `HasNested` to be removed via dispatching
Review comment (Contributor): Does this TODO need to be addressed in this PR or a later PR?
Reply (Member Author): Probably a separate PR. None of my current solutions is good enough thus need more time to think about this.

template <has_nested HasNested = has_nested::NO>
class unique_hash_join {
public:
unique_hash_join() = delete;
~unique_hash_join();
unique_hash_join(unique_hash_join const&) = delete;
unique_hash_join(unique_hash_join&&) = delete;
unique_hash_join& operator=(unique_hash_join const&) = delete;
unique_hash_join& operator=(unique_hash_join&&) = delete;

/**
* @brief Constructs a unique hash join object for subsequent probe calls
*
* @param build The build table that contains unique elements
* @param probe The probe table, from which the keys are probed
* @param has_nulls Flag to indicate whether any nulls exist in the `build` table or
* the `probe` table
* @param compare_nulls Controls whether null join-key values should match or not
* @param stream CUDA stream used for device memory operations and kernel launches
*/
unique_hash_join(cudf::table_view const& build,
cudf::table_view const& probe,
nullable_join has_nulls = nullable_join::YES,
null_equality compare_nulls = null_equality::EQUAL,
rmm::cuda_stream_view stream = cudf::get_default_stream());

/**
* Returns the row indices that can be used to construct the result of performing
* an inner join between two tables. @see cudf::inner_join().
*
* @param stream CUDA stream used for device memory operations and kernel launches
* @param mr Device memory resource used to allocate the returned indices' device memory.
*
* @return A pair of columns [`build_indices`, `probe_indices`] that can be used to construct
* the result of performing an inner join between two tables with `build` and `probe`
* as the join keys.
*/
std::pair<std::unique_ptr<rmm::device_uvector<size_type>>,
std::unique_ptr<rmm::device_uvector<size_type>>>
inner_join(rmm::cuda_stream_view stream = cudf::get_default_stream(),
rmm::mr::device_memory_resource* mr = rmm::mr::get_current_device_resource()) const;

private:
using impl_type = typename cudf::detail::unique_hash_join<HasNested>; ///< Implementation type

std::unique_ptr<impl_type> _impl; ///< Unique hash join implementation
};

/**
* @brief Returns a pair of row index vectors corresponding to all pairs
* of rows between the specified tables where the predicate evaluates to true.
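As a reading aid for the `inner_join` contract documented above (a pair [`build_indices`, `probe_indices`]), the following host-only sketch shows the intended semantics when every build key is distinct: each probe row matches at most one build row, so the output has at most one entry per probe row. This is an illustration under stated assumptions, not the libcudf implementation: `distinct_inner_join_sketch` and `index_vector` are hypothetical names, keys are plain `int` values rather than table rows, nulls and NaNs are ignored, and the output order here (probe order) is not something the GPU API is claimed to guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

using index_vector = std::vector<std::size_t>;

// Host-side sketch of a distinct-key inner join.
// Returns {build_indices, probe_indices}: entry k pairs a build row with a probe row.
std::pair<index_vector, index_vector> distinct_inner_join_sketch(
  std::vector<int> const& build, std::vector<int> const& probe)
{
  // Build phase: map each (assumed distinct) key to its row index.
  std::unordered_map<int, std::size_t> table;
  table.reserve(build.size());
  for (std::size_t i = 0; i < build.size(); ++i) {
    // emplace keeps the first occurrence of a duplicate; the real API
    // documents duplicates in the build table as undefined behavior.
    table.emplace(build[i], i);
  }

  // Probe phase: at most one match per probe row, so there is no need for a
  // separate pass that sizes the output before writing it.
  index_vector build_indices;
  index_vector probe_indices;
  for (std::size_t j = 0; j < probe.size(); ++j) {
    auto const it = table.find(probe[j]);
    if (it != table.end()) {
      build_indices.push_back(it->second);
      probe_indices.push_back(j);
    }
  }
  assert(build_indices.size() == probe_indices.size());
  return {std::move(build_indices), std::move(probe_indices)};
}
```

The probe table may still contain repeated keys; only the build side is required to be distinct, and a repeated probe key simply maps to the same build row more than once.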