reduce_by_key (CUDA) should not assume its output iterator is default constructible #1812

harrism · 2022-10-11T03:55:10Z

As discussed in #1804, thrust::reduce_by_key's CUDA implementation assumes that the output iterator it is passed can be default constructed.

thrust/thrust/system/cuda/detail/reduce_by_key.h

Line 1087 in d3e6fa1

pair<KeysOutputIt, ValuesOutputIt> result{};

This is made necessary by the implementation of the dispatch macro:

thrust/thrust/system/cuda/detail/dispatch.h

Lines 29 to 37 in d3e6fa1

    
    #define THRUST_INDEX_TYPE_DISPATCH(status, call, count, arguments) \
        if (count <= thrust::detail::integer_traits<thrust::detail::int32_t>::const_max) { \
            auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int32_t>(count); \
            status = call arguments; \
        } \
        else { \
            auto THRUST_PP_CAT2(count, _fixed) = static_cast<thrust::detail::int64_t>(count); \
            status = call arguments; \
        }

The dispatch mechanism should be modified so that its return value can be used to initialize a local directly, so that a temporary does not need to be default constructed and then assigned to.
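As a rough illustration of that idea, a value-returning dispatch could look something like the sketch below. This is not Thrust code: dispatch_index_type, OutIt, reduce_by_key_stub, and run_stub are hypothetical names, and plain standard C++ (C++14 or later, for the generic lambda) stands in for the CUDA backend.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>
#include <utility>

// Hypothetical iterator-like type that cannot be default constructed.
struct OutIt {
    explicit OutIt(std::int64_t p) : pos(p) {}
    std::int64_t pos;
};
static_assert(!std::is_default_constructible<OutIt>::value,
              "OutIt intentionally has no default constructor");

// Hypothetical value-returning dispatch: invokes f with the count narrowed to
// 32 bits when it fits, otherwise with a 64-bit count, and returns f's result.
// Both calls must return the same type.
template <class F>
auto dispatch_index_type(std::int64_t count, F f) -> decltype(f(std::int64_t{}))
{
    assert(count >= 0);
    if (count <= std::numeric_limits<std::int32_t>::max()) {
        return f(static_cast<std::int32_t>(count));
    }
    return f(static_cast<std::int64_t>(count));
}

// Stand-in for the real algorithm: returns both outputs advanced by n.
template <class Index>
std::pair<OutIt, OutIt> reduce_by_key_stub(OutIt keys_out, OutIt vals_out, Index n)
{
    return {OutIt{keys_out.pos + n}, OutIt{vals_out.pos + n}};
}

inline std::pair<OutIt, OutIt> run_stub(OutIt keys_out, OutIt vals_out,
                                        std::int64_t num_items)
{
    // The result is initialized directly from the dispatch call; no
    // default-constructed pair is needed.
    auto result = dispatch_index_type(num_items, [&](auto n) {
        return reduce_by_key_stub(keys_out, vals_out, n);
    });
    return result;
}
```

Because the index type reaches the callable as a parameter, a design like this would also avoid pasting a _fixed suffix onto a variable name.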


jrhemstad · 2022-10-11T21:11:35Z

I wonder if one sneaky but convenient solution to this would be to use a thrust::optional for the result object that is passed to the dispatch macro, so:

    thrust::optional<pair<KeysOutputIt, ValuesOutputIt>> result{};
    THRUST_INDEX_TYPE_DISPATCH(result,
                               reduce_by_key_dispatch,
                               num_items,
                               (policy,
                                keys_first,
                                num_items_fixed,
                                values_first,
                                keys_output,
                                values_output,
                                equality_op,
                                reduction_op));

    return *result;

This would obviate the need to default construct an instance of the output iterators, and it doesn't require any changes to the dispatch macro: the status = call arguments statement will invoke optional::operator=, which fills the optional with a value.

Then we can extract the value before returning.
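The same pattern can be tried outside of Thrust with a small self-contained sketch. The assumptions here are loud ones: std::optional (C++17) stands in for thrust::optional, INDEX_TYPE_DISPATCH_SKETCH is a simplified imitation of the real macro, and NoDefaultIt, reduce_by_key_sketch, and run_sketch are hypothetical names. The stand-in iterator is copy assignable, which the assignment into the optional relies on.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical iterator-like type that cannot be default constructed.
struct NoDefaultIt {
    explicit NoDefaultIt(std::int64_t p) : pos(p) {}
    std::int64_t pos;
};
static_assert(!std::is_default_constructible<NoDefaultIt>::value,
              "NoDefaultIt intentionally has no default constructor");

using Result = std::pair<NoDefaultIt, NoDefaultIt>;

// Simplified imitation of the dispatch macro: it assigns into `status` and
// declares `<count>_fixed` with a 32-bit or 64-bit type.
#define INDEX_TYPE_DISPATCH_SKETCH(status, call, count, arguments) \
    if (count <= std::numeric_limits<std::int32_t>::max()) {       \
        auto count##_fixed = static_cast<std::int32_t>(count);     \
        status = call arguments;                                   \
    } else {                                                       \
        auto count##_fixed = static_cast<std::int64_t>(count);     \
        status = call arguments;                                   \
    }

// Stand-in for the real algorithm: returns both outputs advanced by n.
template <class Index>
Result reduce_by_key_sketch(NoDefaultIt keys_out, NoDefaultIt vals_out, Index n)
{
    return Result{NoDefaultIt{keys_out.pos + n}, NoDefaultIt{vals_out.pos + n}};
}

inline Result run_sketch(NoDefaultIt keys_out, NoDefaultIt vals_out,
                         std::int64_t num_items)
{
    assert(num_items >= 0);
    // An empty optional holds no Result, so no NoDefaultIt is constructed here.
    std::optional<Result> result{};
    INDEX_TYPE_DISPATCH_SKETCH(result,
                               reduce_by_key_sketch,
                               num_items,
                               (keys_out, vals_out, num_items_fixed));
    // Both branches of the macro assign, so the optional is engaged here.
    return *result;
}
```

Dereferencing is safe only because every path through the macro assigns a value; an implementation that wanted a runtime check could use result.value() instead.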

harrism · 2022-10-11T23:53:48Z

I was thinking of trying a new dispatch macro and using it at first only in this API. If it works, it can be gradually adopted elsewhere...

I still hate the "num_items_fixed" hack...

harrism · 2022-12-05T21:00:25Z

@senior-zero Perhaps you didn't see this issue when I filed it, but I think this should now be closed along with #1826?

Please also look at NVIDIA/cccl#821.

gevtushenko · 2022-12-06T03:58:48Z

Hello, @harrism! I've missed this issue indeed. I don't think it should be closed, though. I'd be glad to have an index dispatch version that returns a value. I think this would simplify the code and make future issues less probable.

jrhemstad · 2023-03-07T15:31:36Z

Closed by #1827.

We can open a new issue later as necessary when we update the dispatch mechanism across all algorithms.

cccl-authenticator-app bot added this to CCCL Oct 11, 2022

harrism mentioned this issue Oct 11, 2022

transform_output_iterator (and transform_input_output_iterator) unusable with reduce_by_key due to missing default constructor #1804

Closed

jrhemstad added the thrust label Feb 22, 2023

jrhemstad closed this as completed Mar 7, 2023

github-project-automation bot moved this to Done in CCCL Mar 7, 2023
