This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Add requirement for memcpy_async cooperative variants
c0riolis committed Nov 20, 2020
1 parent 84e6493 commit eb79d53
Showing 1 changed file with 6 additions and 4 deletions.
@@ -28,6 +28,11 @@ void memcpy_async(Group const & group, void * destination, void const * source,
Asynchronously copies `size` bytes from the memory location pointed to by `source` to the memory location pointed to by `destination`.
Both objects are reinterpreted as arrays of `unsigned char`.

1. Binds the asynchronous copy completion to `barrier` and issues the copy in the current thread.
2. Binds the asynchronous copy completion to `barrier` and cooperatively issues the copy across all threads in `group`.
3. Binds the asynchronous copy completion to `pipeline` and issues the copy in the current thread.
4. Binds the asynchronous copy completion to `pipeline` and cooperatively issues the copy across all threads in `group`.

`cuda::memcpy_async` has similar constraints to [`std::memcpy`](https://en.cppreference.com/w/cpp/string/byte/memcpy), namely:
* If the objects overlap, the behavior is undefined.
* If either `destination` or `source` is an invalid or null pointer, the behavior is undefined (even if `size` is zero).
@@ -38,10 +43,7 @@ If _Shape_ is [`cuda::aligned_size_t`](./aligned_size_t.md)), `source` and `dest

If `pipeline` is in a _quitted state_ (see [`pipeline::quit`](../synchronization_library/pipeline/quit.md)), the behavior is undefined.

1. Binds the asynchronous copy completion to `barrier` and issues the copy in the current thread.
2. Binds the asynchronous copy completion to `barrier` and cooperatively issues the copy across all threads in `group`.
3. Binds the asynchronous copy completion to `pipeline` and issues the copy in the current thread
4. Binds the asynchronous copy completion to `pipeline` and cooperatively issues the copy across all threads in `group`.
For cooperative variants, if the parameters are not the same across all threads in `group`, the behavior is undefined.
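
The requirement on cooperative variants can be illustrated with a minimal sketch. It assumes a block-scoped `cuda::barrier` and a `cooperative_groups::thread_block` as the `group`; the kernel name `stage_tile` and the buffers `global_in` and `global_out` are hypothetical and exist only for this example. Every thread in the block calls `cuda::memcpy_async` with the same group, destination, source, size, and barrier, so the offsets are derived from `blockIdx` and never from `threadIdx`.

```cuda
#include <cooperative_groups.h>
#include <cuda/barrier>

namespace cg = cooperative_groups;

// Hypothetical kernel: each block stages one tile of `global_in` into
// shared memory with the cooperative, barrier-bound variant (overload 2).
__global__ void stage_tile(int const* global_in, int* global_out) {
  extern __shared__ int tile[];
  __shared__ cuda::barrier<cuda::thread_scope_block> bar;

  auto block = cg::this_thread_block();
  if (block.thread_rank() == 0) {
    init(&bar, block.size());  // one expected arrival per thread in the block
  }
  block.sync();

  // Identical arguments in every thread of `block`: the source offset
  // depends only on blockIdx, so it is uniform across the group.
  int const* src = global_in + blockIdx.x * block.size();
  cuda::memcpy_async(block, tile, src, sizeof(int) * block.size(), bar);

  // Wait for the copy bound to `bar` to complete before reading `tile`.
  bar.arrive_and_wait();

  // Per-thread indexing is fine here, after the cooperative call.
  global_out[blockIdx.x * block.size() + block.thread_rank()] =
      tile[block.thread_rank()];
}
```

A launch such as `stage_tile<<<blocks, threads, threads * sizeof(int)>>>(in, out)` would supply the dynamic shared memory for `tile`. Passing a per-thread pointer, for example `src + threadIdx.x`, to the cooperative call would make the parameters differ across the group and therefore fall under the undefined behavior described above.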

## Template parameters
