-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] impl and refactor device_scalar::set_value and value, respectively. #167
Merged
Merged
Changes from 16 commits
Commits
Show all changes
19 commits
Select commit
Hold shift + click to select a range
9a3ef24
add value setter to device_scalar
cwharris e57f4d7
changelog
cwharris 91b2d7a
fix doc
cwharris da98387
for device_scalar, rename `T get()` to `T data()`, rename `void value…
cwharris c27e290
Update include/rmm/device_scalar.hpp
cwharris aff0a4a
add value_async and set_value_async to device_scalar
cwharris 49592ca
Merge branch 'rmm-166' of github.com:cwharris/rmm into rmm-166
cwharris 772fb67
make device_scalar constructor use _set_value.
cwharris b03e7f1
use RMM_CHECK_CUDA in rmm::device_scalar
cwharris a16f5a6
revert calls to RMM_CHECK_CUDA
cwharris c80baa4
removing stored-stream overloads of rmm:device_scalar value/set_value
cwharris 5356a9c
remove trailing underscores on device_scalar constructor args
cwharris 6ca8265
Merge branch 'branch-0.11' of github.com:rapidsai/rmm into rmm-166
cwharris 525b4ca
remove async versions of device_scalar value/set_value
cwharris 6c1aa61
changelog
cwharris 53ebb7e
device_scalar doc adjustments
cwharris 3268c07
device_scalar explicit synchronization in value/set_value
cwharris e4e07cc
doc updates for device_scalar value/set_value
cwharris c5b8a99
fix build error
cwharris File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please either use cudaMemcpy or add a cudaStreamSynchronize() after this. Otherwise it appears to be asynchronous and is therefore ambiguous with the documentation.
I know I said that cudaMemcpyAsync is not async when either src or dst is un-pinned host memory, but even we should make the synchronization explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot specify the stream with
cudaMemcpy
, but Jake mentioned specifying the stream may still be beneficial. I'm not sure how to proceed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it's still better to use cudaMemcpyAsync because it let's us sync a stream that's NOT the null stream (synchronizing the NULL stream is the same as cudaDeviceSynchronize(), which is overkill). To proceed, just add in cudaStreamSynchronize(stream) after the cudaMemcpyAsync like you had before.