-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1983 Runtime optimizations 2 -- objgroup dispatch/locman/contexts #1984
1983 Runtime optimizations 2 -- objgroup dispatch/locman/contexts #1984
Conversation
Pipelines resultsPR tests (gcc-12, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-5, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-6, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-3.9, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-9, ubuntu, mpich, zoltan) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-10, ubuntu, mpich) Build for a501e7d (2023-01-30 20:12:35 UTC)
PR tests (clang-9, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-7, ubuntu, mpich, trace runtime, LB) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-5.0, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-10, ubuntu, openmpi, no LB) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-13, alpine, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-8, ubuntu, mpich, address sanitizer) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (nvidia cuda 11.0, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-12, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-11, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (nvidia cuda 10.1, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (intel icpx, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-13, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (clang-14, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (gcc-11, ubuntu, mpich, json schema test) Build for 5b48207 (2022-11-16 20:54:24 UTC)
PR tests (intel icpc, ubuntu, mpich) Build for 5b48207 (2022-11-16 20:54:24 UTC)
|
d22a1c4
to
f8d4a00
Compare
) { | ||
runnable::makeRunnable(msg, true, msg->getHandler(), theContext()->getNode()) | ||
.withTDEpochFromMsg() | ||
.run(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be .run()
or .enqueue()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a difficult one. .run()
is correct for all the use cases that currently exist. It runs a system handler on the collection manager that generates the next runnable that is enqueued.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems kinda wasteful. Could we add a method in the collection manager to do that more directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what the next PR will do. It's already implemented on my other branch.
if (local_registered_.find(id) == local_registered_.end()) { | ||
return routeMsg<MessageT>(id,home_node,msg); | ||
} else { | ||
return routeMsgHandlerLocal(msg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we somehow take advantage of having already done the lookup in local_registered_
in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know it's local so we create the runnable directly for it to execute. local_registered_
does not hold the pointer to the actual object (it does not manage objects), just an entry that corresponds to its existence.
305d00b
to
f8d4a00
Compare
Codecov Report
@@ Coverage Diff @@
## develop #1984 +/- ##
===========================================
- Coverage 84.87% 84.86% -0.01%
===========================================
Files 723 720 -3
Lines 25549 25632 +83
===========================================
+ Hits 21685 21753 +68
- Misses 3864 3879 +15
|
8d9f1e3
to
69221fb
Compare
69221fb
to
6e0814e
Compare
17ab108
to
425c4de
Compare
5b48207
to
79fd8d5
Compare
Could you squash down the small patch-up commits that correct mistakes in earlier commits in the PR into those earlier commits? |
79fd8d5
to
5c9ad2b
Compare
87a52cc
to
9fbe4b6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to get at least one other review on this, since it's pretty big
Any chance this could be split up into separate optimizations/PRs with maybe a short description of what's going on in it? It's a bit difficult to follow what all of the changes are doing or how they are more efficient. For example there are a few changes (the |
Basically, the objgroup dispatch wasn't going through RunnableNew the same way a collection goes through it causing extra overhead. The old dispatch was creating a runnable that didn't actually know about the object it was dispatching to, so the Runnable had to call back into the objgroup manager to get the object and actually run the member function. The new code gets the object and associates it with the RunnableNew like the collection manager does. |
|
||
if (not is_void) { | ||
if (is_obj) { | ||
task_ = [=] { objgroup::dispatchObjGroup(msg_, handler); }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the old code, where we had to create a std::function that calls back into the objgroup manager code (because the object isn't known here).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor suggestions -- about the auto
thing it's more up to personal preference. Just one thing to flag is Phil's comment, you mentioned it changed but it didn't look like the diff changed.
@@ -334,7 +339,12 @@ struct RunnableNew { | |||
private: | |||
detail::Contexts contexts_; /**< The contexts */ | |||
MsgSharedPtr<BaseMsgType> msg_ = nullptr; /**< The associated message */ | |||
ActionType task_ = nullptr; /**< The runnable's task */ | |||
void* obj_ = nullptr; /**< Object pointer */ | |||
union { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should hold this up because of it, since C++17 is still WIP, but this would be much more elegant with C++17 variant
and std::visit
60b7c95
to
0701e75
Compare
0701e75
to
a501e7d
Compare
Fixes #1983