Add general purpose host memory allocator reference to cuIO with a demo of pooled-pinned allocation. #15079
Conversation
…th benchmarks for pooled-pinned memory.
Nice work @nvdbaranec!!
I like this ability. My only question is whether we should follow the current pattern of an optional memory resource passed into functions, or whether we should add this as a set/get.
Maybe this becomes:
I don't know everywhere this applies, or the trouble of passing it through.
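To make the two options concrete, a hypothetical sketch; neither signature is from the PR, and `do_host_work`, `set_resource`, and `get_resource` are made-up names:

```cpp
#include <rmm/resource_ref.hpp>

// Option A: thread the resource through call sites, mirroring how device
// memory resources are passed today. `do_host_work` is a made-up stand-in
// for a cuIO function that allocates host memory.
void do_host_work(rmm::host_async_resource_ref mr);

// Option B: a process-wide set/get pair, so existing signatures stay unchanged.
void set_resource(rmm::host_async_resource_ref mr);
rmm::host_async_resource_ref get_resource();
```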
I'm really surprised the old `host_memory_resource` works with the pool. I added `rmm::mr::pinned_host_memory_resource` (which implements the `cuda::mr::async_memory_resource` and `cuda::mr::memory_resource` concepts instead of deriving from `host_memory_resource`) specifically to enable use with `pool_memory_resource`. Please use it instead of the old one.
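A minimal sketch of the suggested combination; the include paths and the pool size are assumptions, not taken from this thread:

```cpp
#include <rmm/mr/device/pool_memory_resource.hpp>
#include <rmm/mr/pinned_host_memory_resource.hpp>

#include <cstddef>

int main()
{
  // Upstream resource: pinned (page-locked) host allocations. Because it
  // implements the cuda::mr::{async_}memory_resource concepts, it can serve
  // as the upstream of pool_memory_resource.
  rmm::mr::pinned_host_memory_resource pinned_mr{};

  // Pool that suballocates from the pinned upstream; 256 MiB initial size is
  // an arbitrary choice for this sketch.
  std::size_t const initial_size = 256UL * 1024 * 1024;
  rmm::mr::pool_memory_resource<rmm::mr::pinned_host_memory_resource> pooled_pinned{
    &pinned_mr, initial_size};

  void* p = pooled_pinned.allocate(1024);
  pooled_pinned.deallocate(p, 1024);
  return 0;
}
```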
…ref instead of host_resource_ref. Removed the pageable-memory path entirely.
comment (non-blocking): In an ideal (and not too distant) world, this entire file will be unnecessary. One shouldn't need to define their own allocator or vector type. We should have a `cuda::mr::allocator` that can be constructed from a `cuda::mr::resource_ref`. I understand not wanting to wait for that, but I just want to give you a heads up on what is coming.
Sounds good. This is definitely worth replacing.
flush
…eallocate functions so that we can pass the correct stream.
… define for default host alloc alignment instead of thrust.
```cpp
/**
 * @brief Copy constructor
 */
rmm_host_allocator(rmm_host_allocator const& other) = default;
```
In `rmm::device_buffer` and `device_uvector` we delete the copy constructor and copy-assignment operator because they don't allow specifying a stream. YMMV, just suggesting it may be good practice.
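A sketch of the suggested pattern; the type name is hypothetical, but `rmm::device_buffer` follows a similar design:

```cpp
#include <rmm/cuda_stream_view.hpp>

// Hypothetical container illustrating the suggestion: delete the stream-less
// copy operations so callers must pick a stream explicitly.
class stream_ordered_buffer {
 public:
  stream_ordered_buffer()                                        = default;
  stream_ordered_buffer(stream_ordered_buffer const&)            = delete;
  stream_ordered_buffer& operator=(stream_ordered_buffer const&) = delete;

  // Copying is still possible, but only with an explicit stream.
  stream_ordered_buffer(stream_ordered_buffer const& other, rmm::cuda_stream_view stream)
  {
    // ... allocate storage and copy `other`'s contents on `stream` ...
  }
};
```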
I think this would be better done as a followup. There are a number of places in cudf code using the assignment operator, and thrust itself hits the copy constructor for mysterious reasons. For example, just calling `reserve` on the wrapping `thrust::host_vector` causes it to happen (`h_data.reserve(max_size);`). Something is happening internally in `thrust::detail::contiguous_storage`.
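The behavior is easy to reproduce outside cudf with any stateful allocator; a minimal standalone sketch (not cudf code):

```cpp
#include <thrust/host_vector.h>

#include <cstddef>
#include <iostream>
#include <memory>

// Stateful allocator whose copy constructor logs, so we can observe thrust
// copying the allocator internally.
template <typename T>
struct logging_allocator {
  using value_type = T;

  logging_allocator() = default;
  logging_allocator(logging_allocator const&) { std::cout << "allocator copied\n"; }
  template <typename U>
  logging_allocator(logging_allocator<U> const&)
  {
  }

  T* allocate(std::size_t n) { return std::allocator<T>{}.allocate(n); }
  void deallocate(T* p, std::size_t n) { std::allocator<T>{}.deallocate(p, n); }

  friend bool operator==(logging_allocator const&, logging_allocator const&) { return true; }
  friend bool operator!=(logging_allocator const&, logging_allocator const&) { return false; }
};

int main()
{
  thrust::host_vector<char, logging_allocator<char>> h_data;
  std::size_t const max_size = 1024;
  // The copy constructor fires (possibly more than once) inside
  // thrust::detail::contiguous_storage, even though we only reserve capacity.
  h_data.reserve(max_size);
  return 0;
}
```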
Great to see this getting close. All of my remaining comments are non-blocking, so approving.
…ost_data field in hostdevice_vector.
…ties for lifetime management of the passed resource ref.
Thank you for addressing the feedback! Looks very clean now 🔥
/merge
## Description

Following #15079, we add a way to share the pinned pool in JNI with cuIO via the new method added by @nvdbaranec, `set_host_memory_resource`.

## Checklist

- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [ ] The documentation is up to date with these changes.

Signed-off-by: Alessandro Bellina <[email protected]>
This PR adds a new interface to cuIO which controls where host memory allocations come from. It adds two core functions (sketched below).
Addresses #14314
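A sketch of the new pair and of wiring in a pooled-pinned resource. The setter name matches the follow-up PR quoted above; the header path, the getter, and the exact signatures are assumptions, and this also assumes the pool forwards its upstream's host-accessible property so it can bind to a `host_async_resource_ref`:

```cpp
#include <cudf/io/memory_resource.hpp>  // assumed location of the set/get pair

#include <rmm/mr/device/pool_memory_resource.hpp>
#include <rmm/mr/pinned_host_memory_resource.hpp>

// Assumed shape of the two core functions:
//   rmm::host_async_resource_ref set_host_memory_resource(rmm::host_async_resource_ref mr);
//   rmm::host_async_resource_ref get_host_memory_resource();

int main()
{
  // Pooled-pinned resource, as in the earlier sketch; static so it outlives
  // any cuIO calls that allocate through it.
  static rmm::mr::pinned_host_memory_resource pinned_mr{};
  static rmm::mr::pool_memory_resource<rmm::mr::pinned_host_memory_resource> pooled_pinned{
    &pinned_mr, 256UL * 1024 * 1024};

  // Route cuIO's host-side buffers (e.g. the Parquet reader's staging memory)
  // through the pool.
  cudf::io::set_host_memory_resource(pooled_pinned);

  // ... cudf::io reads/writes here now draw host memory from the pool ...
  return 0;
}
```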
`cudf::io::hostdevice_vector` was previously implemented in terms of a `thrust::host_vector<>` that explicitly uses an allocator called `pinned_host_vector`. I copied that and made a new class called `rmm_host_vector` which takes any `host_resource_ref`. This probably makes `pinned_host_vector` obsolete.

Parquet benchmarks have a new command-line option which lets you toggle between 3 modes: pageable (since removed; see the edit below), pinned, and pooled-pinned.
The ultimate intent here is to reduce the CPU-side overhead of the setup code that comes before the decode kernels in the Parquet reader. The wins are pretty significant for our faster kernels (that is, where we are less dominated by GPU time).
Edit: Updated to use newly minted resource ref types from rmm itself. I also switched the type to be `host_async_resource_ref` even though in this case the user (`thrust::host_vector`) doesn't explicitly go through the async path. In addition, the pageable memory path (an experimental feature) has been removed.

Benchmark results: Pinned and Pooled/pinned configurations.