Add .bcast() #192
Conversation
Force-pushed from 0f6cd27 to 5ec8bb6:
…ing .get_ptr()
* Rework the Span<> class to be more similar to C++20's std::span.
* Add a SingleElementModifyableBuffer using the new Span<>'s interface.
* Update the existing Buffers' interface to use the new Span<>.
* Split the existing Buffers' .get_ptr() into .resize() and .data().
* Adjust allreduce and alltoall to use the new Span<> and Buffers.
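For context, a minimal sketch of what a std::span-like Span could look like; the member names here are assumptions for illustration, not necessarily kamping's actual Span<>:

#include <cstddef>

// Hypothetical non-owning view over contiguous memory, modeled on C++20's
// std::span; kamping's real Span<> may differ in members and semantics.
template <typename T>
class Span {
public:
    Span(T* ptr, std::size_t size) : _ptr(ptr), _size(size) {}

    T*          data() const { return _ptr; }
    std::size_t size() const { return _size; }
    T*          begin() const { return _ptr; }
    T*          end() const { return _ptr + _size; }

private:
    T*          _ptr;
    std::size_t _size;
};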
Force-pushed from 5ec8bb6 to 69f7a9c.
KASSERT(
    recv_buf_large_enough_on_all_processes(send_recv_buf, root.rank()),
    "The receive buffer is too small on at least one rank.", assert::light_communication);
Don't we want an execution path where kamping resizes the buffer so that it is large enough?
What should the behavior be in the following cases? (A sketch of the size-exchange option follows the list.)
- send_recv_buf.size() == 1 on root and == 0 on all other ranks, send_counts not specified: Assume a default size of 1? Broadcast the size? Doubling the number of broadcasts might be unexpected here; should it depend on whether the ominous SingleElementBuffer is specified?
- send_counts explicitly specified: Resize the buffer on all but the root process? On all ranks including the root? What if the send_recv_buf() on root is const?
- send_counts not specified and not case 1: Broadcast the size of the send buffer, resize the receive buffers, then broadcast the data?
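As referenced above, the size-exchange option boils down to two broadcasts instead of one. A minimal raw-MPI sketch (the resize-on-receivers behavior is an assumed policy, not settled API):

#include <mpi.h>

#include <cstdint>
#include <vector>

// Sketch: the root announces how many elements follow, receivers resize, and
// only then is the payload broadcast, i.e., two MPI_Bcast calls per operation.
void bcast_with_size_exchange(std::vector<int>& send_recv_buf, int root, MPI_Comm comm) {
    std::uint64_t count = send_recv_buf.size(); // meaningful on the root only
    MPI_Bcast(&count, 1, MPI_UINT64_T, root, comm);
    send_recv_buf.resize(count); // no-op on the root, allocates on receivers
    MPI_Bcast(send_recv_buf.data(), static_cast<int>(count), MPI_INT, root, comm);
}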
I think @DanielSeemaier does something like this in #170. Bcast and Scatter should probably use the same mechanisms.
Postponed, see #170.
We could do that in a future PR once we see what works best (in particular, this might change with our ongoing discussion on reducing syntax overhead anyway).
Hm, I still don't really like the way it's done here. We never expected the size of the receive buffers to be set by the user, and I would like to keep it that way.
This would force the user to write different code for root and non-root ranks. Probably not ideal.
What would you think about send_recv_buf being optional?
Yeah, I don't like splitting up the send_recv_buf into a send_buf and a recv_buf either. But IIRC someone argued sternly for it, as it'll unify the interface.
How about this then? :-)
if I'm the root:
    assert that I have a send_recv_buf
    if I have no recv_count specified:
        // assume that the other ranks don't know how much data to receive
        broadcast the amount of data to receive
    else:
        // assume that all other ranks also know their recv_count
        check that the recv_counts are the same on all ranks
    broadcast the data
else: // if I'm not the root:
    if I do not have a send_recv_buf, allocate one
    if I have a recv_count:
        resize my send_recv_buf
        assert that all ranks have the same recv_count (equal to the one provided to the root)
    else:
        receive the broadcast of the amount of data to receive
        resize my send_recv_buf
    receive the broadcast of the data
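A hedged sketch of that protocol in raw MPI, for int payloads; the std::optional encoding of "no send_recv_buf / no recv_count given" and all names are assumptions for illustration, not kamping's interface:

#include <mpi.h>

#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

std::vector<int> bcast_sketch(
    std::optional<std::vector<int>> send_recv_buf,
    std::optional<std::uint64_t>    recv_count,
    int                             root,
    MPI_Comm                        comm) {
    int rank = 0;
    MPI_Comm_rank(comm, &rank);

    if (rank == root) {
        assert(send_recv_buf.has_value()); // the root must provide the data
        std::uint64_t count = send_recv_buf->size();
        if (!recv_count.has_value()) {
            // Assume the other ranks don't know how much data to receive.
            MPI_Bcast(&count, 1, MPI_UINT64_T, root, comm);
        }
        // else: assume all ranks know their recv_count; the "all counts are
        // equal" check from the pseudocode is omitted in this sketch.
        MPI_Bcast(send_recv_buf->data(), static_cast<int>(count), MPI_INT, root, comm);
        return std::move(*send_recv_buf);
    }

    // Non-root: allocate a buffer if none was given.
    std::vector<int> buf   = send_recv_buf ? std::move(*send_recv_buf) : std::vector<int>{};
    std::uint64_t    count = 0;
    if (recv_count.has_value()) {
        count = *recv_count; // must equal the count provided on the root
    } else {
        // Receive the broadcast of the amount of data to receive.
        MPI_Bcast(&count, 1, MPI_UINT64_T, root, comm);
    }
    buf.resize(count); // resize only after the count is known
    MPI_Bcast(buf.data(), static_cast<int>(count), MPI_INT, root, comm);
    return buf;
}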
Apart from the resizing of the send_recv_buf having to happen after receiving the recv_count, I think that should work fine. I'm not sure I remember the part of the discussion about a unified interface. @kurpicz ?
Oh, I looked at the old protocols and we already voted in favor of "(Bcast) SendRecv buffer on all ranks; or, if the SendRecv buffer is not given, you are not the root and you get the recv result returned."
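In call-site terms, reusing the hypothetical bcast_sketch from above, that voted rule would look roughly like this:

#include <mpi.h>

#include <optional>
#include <vector>

// Usage under the voted rule: every rank either passes a send_recv_buf, or
// passes none, in which case it must not be the root and gets the received
// data back as the return value. (bcast_sketch is the hypothetical helper
// sketched above, not kamping's real API.)
void example(int root, MPI_Comm comm) {
    int rank = 0;
    MPI_Comm_rank(comm, &rank);
    std::vector<int> result = (rank == root)
        ? bcast_sketch(std::vector<int>{1, 2, 3}, std::nullopt, root, comm)
        : bcast_sketch(std::nullopt, std::nullopt, root, comm);
    (void)result; // holds {1, 2, 3} on every rank after the call
}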
On hold until we have decided whether we want all ranks to know about failed assertions or only some.
…s send_recv_buf to bcast
Quite a few comments but I think they're all pretty minor.
Force-pushed from 0152a01 to afd4166.
tests/collectives/mpi_bcast_test.cpp (outdated)
std::iota(values.begin(), values.end(), comm.rank() * 10);
kamping::Span<int> transfer_view(values.data(), asserting_cast<size_t>(num_transferred_values));

/// @todo Once we can check assertions, check that providing a recv_count != transfer_view.size() fails.
You added tests for this elsewhere, so this comment can be removed
Probably uncomment the assertion tests in #329
@DanielSeemaier was quicker merging #329 ;) Hopefully just uncommenting your tests makes them pass ;)
Nope ... my guess is that his approach does not work when only some ranks fail the KASSERT.
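One possible explanation (a guess, not verified against #329's implementation): a KASSERT failure terminates or diverts only the ranks on which it fires, while the surviving ranks block forever in the collective. A minimal illustration of that hang in raw MPI (run with at least two ranks):

#include <mpi.h>

#include <iostream>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    bool const assertion_failed_here = (rank == 0); // simulate a rank-local KASSERT failure

    if (!assertion_failed_here) {
        // Surviving ranks still enter the collective and wait for the root
        // (rank 0), which never posts its matching MPI_Bcast: a deadlock.
        int value = 0;
        MPI_Bcast(&value, 1, MPI_INT, /*root=*/0, MPI_COMM_WORLD);
    } else {
        std::cerr << "rank 0: assertion failed, skipping the collective\n";
    }

    MPI_Finalize(); // never completes: the other ranks are stuck above
    return 0;
}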
There's one KASSERT test where I don't understand why it should fail.
Other than that, saying EXPECT_KASSERT_FAILS is broken (needs a fix) seems a bit aggressive. You want an added feature ;)
Closes #173