-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARSE] Add support for rocSPARSE backend #544
base: develop
Are you sure you want to change the base?
Conversation
Tests log on W6800: amd_w6800_log.txt |
Yes, this is in progress! I expect I will need a few days at least. I'm aiming to merge the PR by the end of the year. |
af87004
to
c4b952c
Compare
@gajanan-choudhary I have updated the PR with recent changes from cuSPARSE. Note that it moved almost all the content from |
- The same sparse matrix handle cannot be reused for multiple operations | ||
``spmm``, ``spmv``, or ``spsv``. Doing so will throw a | ||
``oneapi::mkl::unimplemented`` exception. See `#332 | ||
<https://github.com/ROCm/rocSPARSE/issues/332>`_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this is quite severe, but seems to be a legitimate issue on rocSPARSE
side right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If and when they fix this issue, though, will it be easy for us to make changes (with a version check of course) that correctly performs the operations rather than throwing an unimplemented
exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's easy to fix on oneMath side. The issue is also referenced in this comment: https://github.com/oneapi-src/oneMKL/pull/544/files#diff-3b8c1c2c71abd54f8f90f43415c2f17b2a7fdb81c2b882c210f3cba56b4679adR63
One would just need to remove the used
member, its 2 usages below as well as the mark_used
method.
if (this->format == detail::sparse_format::COO && | ||
!this->has_matrix_property(matrix_property::sorted)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the documentation, docs/domains/sparse_linear_algebra.rst
, you've written:
- The CSR format requires the column indices to be sorted within each row.
So shouldn't we be handling both CSR and COO formats here instead of just COO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I added this check in b6518c9
I've also made sure we also have tests on AMD and Nvidia which are skipped when a property is not set. The runtime example also need to set this property now.
handle_helper.rocsparse_handle_container_mapper_.insert( | ||
std::make_pair(piPlacedContext_, atomic_ptr)); | ||
|
||
sycl::detail::pi::contextSetExtendedDeleter(*placedContext_, ContextCallback, atomic_ptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it unchanged between the removed PI versus new UR APIs? Isn't there a sycl::detail::ur::contextSetExtendedDeleter
somewhere that you need to place under #ifdef ONEAPI_ONEMKL_PI_INTERFACE_REMOVED #else #endif...
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The contextSetExtendedDeleter
is still in the pi
namespace see https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/detail/ur.hpp#L100. As the comment says there are plans to deprecate it. The oneAPI core team at Codeplay started to change the scopeContextHandle mechanism for BLAS in #609. They will make sure that oneMKL Interface will keep working if/when contextSetExtendedDeleter
is deprecated.
If we ever need to change the scopeContextHandle mechanism, I would rather that we do it for both cuSPARSE and rocSPARSE backends in a separate PR.
auto event = queue.submit([&](sycl::handler& cgh) { | ||
auto acc = val.template get_access<sycl::access::mode::read_write>(cgh); | ||
detail::submit_host_task(cgh, queue, [=](sycl::interop_handle ih) { | ||
if (dvhandle->size != size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should throw an exception if dvhandle->size < size
instead of !=
. E.g., users may want to pad thesycl::buffer
for some reason with zeros in the end, which wouldn't work with the current if
condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, this is set_dense_vector_data
used for replacing/switching out arrays in an existing handle. The condition is fine then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although maybe it doesn't hurt to add the check both here and in init_xxx_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check to verify that the buffer size is compatible with the handle size is done elsewhere, see https://github.com/oneapi-src/oneMKL/blob/develop/src/sparse_blas/generic_container.hpp#L164
This is a common place that the other backends also use. That check will run for init_*_data
functions.
template <typename fpType> | ||
void init_dense_vector(sycl::queue& queue, dense_vector_handle_t* p_dvhandle, std::int64_t size, | ||
sycl::buffer<fpType, 1> val) { | ||
auto event = queue.submit([&](sycl::handler& cgh) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a check here for sycl::buffer
case that checks buffer->size() >= size
and throws an exception otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already done, see my comment above: #544 (comment)
CHECK_DESCR_MATCH(spmv_descr, alg, "spmv_optimize"); | ||
|
||
A_handle->mark_used(); | ||
auto& buffer_size = spmv_descr->temp_buffer_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why this variable specifically is kept and captured by reference outside compute_functor
while other members of spmv_descr
are created inside the functor? If it is because buffer_size
is used later in an if
condition, I'd prefer it if we change this approach and capture everything by [=]
in compute_functor
, even though it would mean replacing the last buffer_size > 0
with spmv_descr->temp_buffer_size > 0
. I know that what you have right now is expected to work in this particular case (because of the condition that spmv_descr
must live as long as spmv
is being called), it is normally a bad idea to capture variables by reference in SYCL functors that are going to be running asynchronously unless you have an immediate event.wait()
(e.g., like what we are doing in spmv_buffer_size()
function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure that's fine, done in c53ef8d
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM other than some minor change requests. This PR was a lot easier to review having reviewed #527. Thanks for the fantastic work!
Thanks for the review Gajanan! For completeness, logs on MI210: log_mi210.txt FYI we will get a second approval and I aim to merge this after the oneMKL Interface renaming PR. |
Our AMD HW is unavailable for a while unfortunately. I will work on fixing the conflicts as soon as I can test the changes. |
Description
Add support for the rocSPARSE backend.
Depends on #527 and #532.
Rendered docs: docs.zip
Checklist
All Submissions
Do all unit tests pass locally? amd_log.txt nvidia_log.txt intel_log.txt
Have you formatted the code using clang-format?