Enable user-provided lock table for `atomic_ref<T>` #990

jrhemstad · 2021-11-18T22:59:41Z

I would like to be able to use atomic_ref<T> for sizeof(T) > 8B, i.e., atomic_ref<T>::is_always_lock_free == false.

Obviously this can't rely on built-in atomic operations. Typical implementations use a lock table for this: the address of the referenced object serves as the lookup key into a table of mutexes.
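As a rough host-only illustration of the lock-table technique (not libcu++ code), the sketch below hashes an object's address into a fixed array of `std::mutex` and holds the selected mutex around a plain read-modify-write. The names `host_lock_table`, `index_of`, `lock_for`, `locked_exchange`, and `wide` are all hypothetical, and the shift-and-modulo hash is just one simple assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical host-only lock table: N mutexes selected by object address.
template <std::size_t N>
class host_lock_table {
    static_assert(N > 0, "the table needs at least one mutex");
public:
    // Map an address to a slot. Dropping the low 4 bits is an assumed,
    // simplistic hash so that neighbouring 16-byte objects spread out.
    static std::size_t index_of(const void* p) noexcept {
        const auto addr = reinterpret_cast<std::uintptr_t>(p);
        return static_cast<std::size_t>((addr >> 4) % N);
    }

    // The same address always yields the same mutex; distinct addresses
    // may collide on one mutex, which is safe but adds contention.
    std::mutex& lock_for(const void* p) noexcept { return locks_[index_of(p)]; }

private:
    std::array<std::mutex, N> locks_;
};

// A 16-byte type, too wide for the 8-byte atomics discussed in this issue.
struct wide { int x, y, z, w; };

// Exchange emulated by holding the object's mutex for the whole operation.
template <typename T, std::size_t N>
T locked_exchange(T& obj, T desired, host_lock_table<N>& table) {
    std::lock_guard<std::mutex> guard(table.lock_for(&obj));
    T old = obj;
    obj = desired;
    return old;
}
```

Every access to the object has to go through the same table for this to be atomic with respect to other accesses, which is part of why a heterogeneous, generic version is hard.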

Generic support for a lock table underneath cuda::std::atomic_ref would be extremely non-trivial to implement on all platforms, and harder still to make work heterogeneously.

However, a less generic solution that would still be very useful would be to allow a user to provide their own lock table.

@ogiroux's idea is to partially specialize cuda::atomic_ref for is_always_lock_free == false so that it holds a pointer to a lock table, which the user supplies through the atomic_ref constructor each time a reference is constructed. It would likely be useful to provide an opaque cuda::atomic_lock_table<N, Scope> type that a user could instantiate and manage however they like.

A rough sketch of what this could look like:

__managed__ cuda::atomic_lock_table<1024, cuda::thread_scope_device> table;

__global__ void kernel(int4 * i){
   cuda::atomic_ref<int4, cuda::thread_scope_device> ref{i[0], table};
   ref.exchange( int4{1, 2, 3, 4} ); // internally locks a mutex in `table`
}

This would depend on #949 for implementing the atomic_lock_table.
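To make the partial-specialization idea concrete, here is a minimal host-only C++ sketch, not the proposed libcu++ implementation. It assumes a size-based stand-in trait (`fits_native_atomic`, true for `sizeof(T) <= 8`) in place of `is_always_lock_free`, and every name in it (`sketch_lock_table`, `sketch_atomic_ref`, `int4_like`) is hypothetical. Only the locked specialization is defined; the lock-free case is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the proposed lock table type.
template <std::size_t N>
struct sketch_lock_table {
    std::array<std::mutex, N> locks;
    std::mutex& lock_for(const void* p) noexcept {
        return locks[(reinterpret_cast<std::uintptr_t>(p) >> 4) % N];
    }
};

// Assumption: types of at most 8 bytes are treated as natively atomic.
template <typename T>
constexpr bool fits_native_atomic = sizeof(T) <= 8;

// Primary template. The lock-free case is intentionally not defined here;
// it would forward to hardware atomics and take no table.
template <typename T, bool LockFree = fits_native_atomic<T>>
class sketch_atomic_ref;

// Specialization for wide types: the constructor requires a lock table.
template <typename T>
class sketch_atomic_ref<T, false> {
public:
    template <std::size_t N>
    sketch_atomic_ref(T& obj, sketch_lock_table<N>& table) noexcept
        : ptr_(&obj), lock_(&table.lock_for(&obj)) {}

    T load() const {
        std::lock_guard<std::mutex> guard(*lock_);
        return *ptr_;
    }

    T exchange(T desired) const {
        std::lock_guard<std::mutex> guard(*lock_);
        T old = *ptr_;
        *ptr_ = desired;
        return old;
    }

private:
    T* ptr_;            // referenced object
    std::mutex* lock_;  // mutex chosen once, at construction
};

// 16-byte stand-in for CUDA's int4.
struct int4_like { int x, y, z, w; };
```

Because the table is a constructor argument only in the `false` specialization, code that uses small types would keep the existing single-argument constructor, while wide types fail to compile unless a table is passed.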


jrhemstad · 2021-11-18T23:02:04Z

For reference, this functionality would be extremely useful in cuCollections, where we want to use atomic_ref but need to support Key/Value types larger than 8B.

jrhemstad · 2022-02-03T01:26:25Z

@ogiroux

I've been thinking about what the lock_table object would look like.

I was thinking something along the lines of this:

// Opaque type for lock table to pass to atomic_ref
template <size_t N, thread_scope Scope, typename AccessProperty>
struct lock_table{
   static void init(lock_table* t){...}
private:
   cuda::std::array<cuda::mutex<Scope>, N> arr_;

   // Need to let atomic_ref access the storage.
   // The friend's parameters are renamed so they don't shadow the outer `Scope`.
   template <typename T, thread_scope S>
   friend class atomic_ref;
};

I want to put the access property in there so I can expose control over how accesses to the mutexes are cached. My guess is that it'll be pretty important to perf to ensure the locks aren't getting thrashed out of L2.

That said, I'm not sure this will work with the current limitations of the compiler unless mutex exposes an access_property as well.

jrhemstad mentioned this issue Jan 27, 2022

Added support for most of <mutex> NVIDIA/libcudacxx#113

Closed

sleeepyjack mentioned this issue Aug 9, 2022

[FEA] Migrate from cuda::atomic to cuda::atomic_ref NVIDIA/cuCollections#183

Closed

jrhemstad added the thrust and libcu++ labels and removed the thrust label Feb 22, 2023

miscco self-assigned this Feb 23, 2023

github-project-automation bot added this to CCCL Nov 8, 2023

github-project-automation bot moved this to Todo in CCCL Nov 8, 2023

jarmak-nv transferred this issue from NVIDIA/libcudacxx Nov 8, 2023

miscco removed their assignment Dec 6, 2023
