Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REVIEW] impl and refactor device_scalar::set_value and value, respectively. #167

Merged
merged 19 commits into from
Oct 31, 2019

Conversation

cwharris
Copy link
Contributor

@cwharris cwharris commented Oct 24, 2019

Addresses #166

Depends on #169

@cwharris cwharris changed the title [DISCUSSION] Add value setter to device_scalar [REVIEW] Add value setter to device_scalar Oct 24, 2019
@cwharris
Copy link
Contributor Author

This change breaks cudf, so I've creating a sister pr here: rapidsai/cudf#3211

include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accidentally approved last time.

@cwharris
Copy link
Contributor Author

I added async versions of T value() and void set_value(T), as well as overloads for specifying the stream on which to perform the copy.

@cwharris cwharris requested a review from harrism October 25, 2019 18:22
@cwharris cwharris changed the title [REVIEW] Add value setter to device_scalar [REVIEW] impl and refactor device_scalar::set_value and value, respectively. Oct 25, 2019
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully we can simplify this a lot based on #169.

include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
@cwharris cwharris requested a review from harrism October 29, 2019 20:28
@kkraus14 kkraus14 added the 3 - Ready for review Ready for review by team label Oct 29, 2019
@cwharris cwharris requested a review from harrism October 30, 2019 18:34
Copy link
Contributor

@devavret devavret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this up.

include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
include/rmm/device_scalar.hpp Outdated Show resolved Hide resolved
Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to make synchronization explicit.

@@ -92,6 +92,14 @@ class device_scalar {

private:
rmm::device_buffer buff{sizeof(T)};

inline void _memcpy(void *dst, const void *src, cudaStream_t stream) const {
auto status = cudaMemcpyAsync(dst, src, sizeof(T), cudaMemcpyDefault,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please either use cudaMemcpy or add a cudaStreamSynchronize() after this. Otherwise it appears to be asynchronous and is therefore ambiguous with the documentation.

I know I said that cudaMemcpyAsync is not async when either src or dst is un-pinned host memory, but even we should make the synchronization explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot specify the stream with cudaMemcpy, but Jake mentioned specifying the stream may still be beneficial. I'm not sure how to proceed.

Copy link
Member

@harrism harrism Oct 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it's still better to use cudaMemcpyAsync because it let's us sync a stream that's NOT the null stream (synchronizing the NULL stream is the same as cudaDeviceSynchronize(), which is overkill). To proceed, just add in cudaStreamSynchronize(stream) after the cudaMemcpyAsync like you had before.

Copy link
Member

@harrism harrism left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what you need, I believe.

include/rmm/device_scalar.hpp Show resolved Hide resolved
@cwharris cwharris requested a review from harrism October 31, 2019 00:56
@harrism harrism merged commit 723f4e6 into rapidsai:branch-0.11 Oct 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for review Ready for review by team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants