Switch to `cuda::std::tuple`/`<cuda/std/tuple>` #742

brycelelbach · 2020-06-30T18:28:44Z

Original Comment

Note that this requires us to sort out fetching the libcu++ headers in Thrust/CUB internal builds.

2022-04-26: @allisonvacanti Update

The libcu++ dependency will be merged soon for 2.0 (PR: NVIDIA/thrust#1605). Afterwards, we'll be able to start looking into this:

Run metabench study to test the difference in compile times between equivalent uses of cuda::std::tuple and the current thrust::tuple.
Scope out feasibility. Replace the current tuple with template <typename ...T> using tuple = cuda::std::tuple<T...>. Take note of what breaks and estimate how much effort it will be to fix. See the discussion of tuple_of_iterator_references below for a known trouble spot.

This information will let us make a better informed prioritization.

The text was updated successfully, but these errors were encountered:

andrewcorrigan · 2020-12-06T20:41:57Z

I'm hoping to try this out. Is there a branch anywhere implementing any of this? As a warm-up for cuda::std::tuple, I tried swapping in std::tuple and std::pair:

In thrust/pair.h:

namespace thrust
{
using std::pair;
using std::make_pair;
}

In thrust/tuple.h:

namespace thrust
{
using std::tuple;
using std::make_tuple;
using std::get;
using std::tuple_size;
using std::tuple_size_v;
using std::tuple_element;
using std::tuple_element_t;
using std::tie;
using std::tuple_cat;
using std::index_sequence;
using std::make_index_sequence;
}

However, I keep getting stuck on specializations related to detail::tuple_of_iterator_references, like the following:

/thrust/thrust/iterator/detail/tuple_of_iterator_references.h:124:8: error: class template partial specialization of 'tuple_size' not in a namespace enclosing '__1'
struct tuple_size<detail::tuple_of_iterator_references<Ts...>>
       ^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/__tuple:24:50: note: explicitly specialized declaration is here
template <class _Tp> struct _LIBCPP_TEMPLATE_VIS tuple_size;
                                                 ^

I'm not a C++ expert and am not sure what the right way to specialize things from std is. Therefore, I'm wondering: is there a branch already handling any of this for cuda::std::tuple? Alternatively, if either of you @brycelelbach or @allisonvacanti might have any advice, I would appreciate it greatly.

brycelelbach · 2020-12-07T01:00:34Z

So this should be doable now that libcu++ can be used from Thrust.

I knew the tuple of references thing was going to be a challenge. I'll defer to @allisonvacanti and @griwes to sort out what we should do here.

andrewcorrigan · 2020-12-07T01:11:28Z

It's exciting to hear this should now be doable. I hope I can be of assistance making it happen. Since I posted a few hours ago, I tried moving the specialization themselves into std:

namespace std
{
// define tuple_size, tuple_element, etc.
template<class... Ts>
struct tuple_size<thrust::detail::tuple_of_iterator_references<Ts...>>
  : std::integral_constant<size_t, sizeof...(Ts)>
{};

template<size_t i>
struct tuple_element<i, thrust::detail::tuple_of_iterator_references<>> {};


template<class T, class... Ts>
struct tuple_element<0, thrust::detail::tuple_of_iterator_references<T,Ts...>>
{
  using type = T;
};


template<size_t i, class T, class... Ts>
struct tuple_element<i, thrust::detail::tuple_of_iterator_references<T,Ts...>>
{
  using type = typename tuple_element<i - 1, thrust::detail::tuple_of_iterator_references<Ts...>>::type;
};
}

That seemed to get past that initial error. I guess we would have to define the specialization in whatever namespace the tuple implementation is in. Next, I hit another batch of errors also related to tuple_of_iterator_references:

thrust/thrust/system/detail/sequential/reduce_by_key.h:61:19: error: no viable conversion from
      'thrust::iterator_facade<thrust::zip_iterator<std::__1::tuple<thrust::detail::normal_iterator<thrust::device_ptr<int> >,
      thrust::detail::normal_iterator<thrust::device_ptr<int> > > >, std::__1::tuple<int, int>, thrust::system::cpp::detail::tag,
      thrust::random_access_traversal_tag, thrust::detail::tuple_of_iterator_references<thrust::device_reference<int>, thrust::device_reference<int> >,
      long>::reference' (aka 'thrust::detail::tuple_of_iterator_references<thrust::device_reference<int>, thrust::device_reference<int> >') to 'InputKeyType'
      (aka 'std::__1::tuple<int, int>')
    InputKeyType  temp_key   = *keys_first;

It's not clear to me why that conversion would fail.

alliepiper · 2020-12-07T16:38:11Z

I've been holding off on this because there are some issues with libcu++'s tuple, especially on MSVC (#955). Also, since Thrust has to build with the nvc++ compiler, we'll need to wait for #946 before we can actually make the switch.

I'm not familiar with tuple_of_iterator_references, but the name is horrifying. Sounds like a workaround that exists to make <something> work in C++98. We should dig into how it's used and see if it still needs a specialization, or if there's a better way to implement its functionality in the C++14 world.

alliepiper · 2020-12-07T16:44:14Z

Accidentally closed -- reopening.

If you'd like to start working on this while the libcu++ folks iron out those two issues, feel free! Just a heads up that it might be a while before we can actually merge anything from it.

I'd suggest trying out the current version of libcu++ cuda::std::tuple and see how it goes, rather than using the host-only std::tuple. They've done quite a bit of work already making the cuda version more portable, and may have fixed some of these issue already.

andrewcorrigan · 2020-12-07T16:46:54Z

Since this issue is closed, what would be the appropriate venue to discuss tuple_of_iterator_references? Should I open a new issue?

EDIT: just refreshed and saw you re-opened it.

andrewcorrigan · 2020-12-07T16:59:25Z

I very sincerely appreciate the transparency on your openness to merging this. No rush to merge anything. I just want to be able to test it myself, and have it ready for when those other issues are resolved.

In terms tuple_of_iterator_references, my understanding is that it implements tuple<device_references>, which looks like a tuple of r-values, but is in fact a tuple of l-values. I can look into finding a workaround that eliminates tuple_of_iterator_references entirely. Alternatively, I see code like the below, which might provide a lead.

// use this enable_if to avoid assigning to temporaries in the transform functors below
// XXX revisit this problem with c++11 perfect forwarding
template<typename T>
  struct enable_if_non_const_reference_or_tuple_of_iterator_references
    : thrust::detail::enable_if<
        is_non_const_reference<T>::value || is_tuple_of_iterator_references<T>::value
      >
{};

Alas, I still have negligible understanding of r-value references, perfect forwarding, etc. I guess it's time to learn!

Finally, I'll also try out cuda::std::tuple.

alliepiper · 2020-12-07T17:17:50Z

No problem -- you've contributed quite a bit to Thrust, I want to make sure you're looped into our plans when they overlap :)

That sounds familiar. I ran into a similar issue with tuples of reference proxies in the placeholder code. Maybe a similar solution will work here? What I did was:

Define a type that switches between T and T& based on whether T is a thrust::reference subclass: https://github.com/NVIDIA/thrust/blame/main/thrust/detail/functional/actor.h#L43-L49
Use this type to hold thrust::references with correct lifetimes and reference semantics: https://github.com/NVIDIA/thrust/blob/main/thrust/detail/functional/actor.inl#L87-L99

Not sure if this exactly fits the tuple_of_iterator_references usecase, but it sounds relevant?

Let me know if you get stuck, this is definitely a feature we'd like to have implemented quickly as soon as the prereqs are met. Dealing with reference proxies is a headache that I've had a lot of experience with, so I'll be happy to share what I can.

andrewcorrigan · 2021-05-29T20:17:43Z

I've made a few attempts over the last few months, but I'm out of my depth on this one.

alliepiper · 2021-06-01T20:03:06Z

No worries, thanks for giving it a shot.

alliepiper · 2022-04-27T01:44:57Z

Tentatively putting this in for 2.0. I'm expecting that this will take a significant amount of effort and may end up getting delayed again, but once we have the libcu++ dependency in place we should investigate this. Updated description with some subtasks.

jrhemstad · 2023-11-08T19:19:42Z

closed by #262

brycelelbach added the blocked This PR cannot be merged due to various reasons label Jun 30, 2020

brycelelbach added blocked This PR cannot be merged due to various reasons and removed blocked This PR cannot be merged due to various reasons labels Aug 31, 2020

bjude mentioned this issue Sep 29, 2020

variadic overload of make_zip_iterator NVIDIA/thrust#663

Closed

alliepiper mentioned this issue Oct 12, 2020

Preparing zip_iterator and tuple_of_iterator_references for variadic tuple NVIDIA/thrust#1311

Closed

andrewcorrigan mentioned this issue Oct 13, 2020

defines thrust::tuple_element consistently with std::tuple_element NVIDIA/thrust#1314

Merged

alliepiper closed this as completed Dec 7, 2020

alliepiper reopened this Dec 7, 2020

bdice mentioned this issue Nov 29, 2021

Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. rapidsai/cudf#9588

Merged

andrewcorrigan mentioned this issue Apr 18, 2022

variadic template implementation of thrust::tuple #695

Closed

alliepiper removed the blocked This PR cannot be merged due to various reasons label Apr 27, 2022

jrhemstad mentioned this issue Sep 14, 2022

Replace thrust::complex implementation with cuda::std::complex #819

Closed

jrhemstad added the thrust For all items related to Thrust. label Feb 22, 2023

jrhemstad mentioned this issue Mar 7, 2023

Replace thrust::tuple implementation with cuda::std::tuple NVIDIA/thrust#1798

Closed

github-project-automation bot added this to CCCL Nov 8, 2023

github-project-automation bot moved this to Todo in CCCL Nov 8, 2023

jarmak-nv transferred this issue from NVIDIA/thrust Nov 8, 2023

jrhemstad closed this as completed Nov 8, 2023

github-project-automation bot moved this from Todo to Done in CCCL Nov 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch to `cuda::std::tuple`/`<cuda/std/tuple>` #742

Switch to `cuda::std::tuple`/`<cuda/std/tuple>` #742

brycelelbach commented Jun 30, 2020 •

edited by alliepiper

Loading

andrewcorrigan commented Dec 6, 2020 •

edited

Loading

brycelelbach commented Dec 7, 2020

andrewcorrigan commented Dec 7, 2020

alliepiper commented Dec 7, 2020

alliepiper commented Dec 7, 2020

andrewcorrigan commented Dec 7, 2020 •

edited

Loading

andrewcorrigan commented Dec 7, 2020 •

edited

Loading

alliepiper commented Dec 7, 2020

andrewcorrigan commented May 29, 2021

alliepiper commented Jun 1, 2021

alliepiper commented Apr 27, 2022

jrhemstad commented Nov 8, 2023

Switch to cuda::std::tuple/<cuda/std/tuple> #742

Switch to cuda::std::tuple/<cuda/std/tuple> #742

Comments

brycelelbach commented Jun 30, 2020 • edited by alliepiper Loading

Original Comment

2022-04-26: @allisonvacanti Update

andrewcorrigan commented Dec 6, 2020 • edited Loading

brycelelbach commented Dec 7, 2020

andrewcorrigan commented Dec 7, 2020

alliepiper commented Dec 7, 2020

alliepiper commented Dec 7, 2020

andrewcorrigan commented Dec 7, 2020 • edited Loading

andrewcorrigan commented Dec 7, 2020 • edited Loading

alliepiper commented Dec 7, 2020

andrewcorrigan commented May 29, 2021

alliepiper commented Jun 1, 2021

alliepiper commented Apr 27, 2022

jrhemstad commented Nov 8, 2023

Switch to `cuda::std::tuple`/`<cuda/std/tuple>` #742

Switch to `cuda::std::tuple`/`<cuda/std/tuple>` #742

brycelelbach commented Jun 30, 2020 •

edited by alliepiper

Loading

andrewcorrigan commented Dec 6, 2020 •

edited

Loading

andrewcorrigan commented Dec 7, 2020 •

edited

Loading

andrewcorrigan commented Dec 7, 2020 •

edited

Loading