Improve makeRunnable allocation performance #1899

Closed
3 tasks
lifflander opened this issue Aug 4, 2022 · 0 comments · Fixed by #1963
@lifflander
Collaborator

What Needs to be Done?

  • First do this
  • Then do that.
  • Also this other thing.

Is your feature request related to a problem? Please describe.

Describe potential solution outcome

Describe alternatives you've considered

Additional context

@lifflander lifflander self-assigned this Aug 4, 2022
lifflander added a commit that referenced this issue Aug 5, 2022
@lifflander lifflander linked a pull request Aug 5, 2022 that will close this issue
thearusable pushed a commit that referenced this issue Aug 24, 2022
thearusable pushed a commit that referenced this issue Aug 25, 2022
cz4rs pushed a commit that referenced this issue Sep 28, 2022
cz4rs pushed a commit that referenced this issue Sep 28, 2022
cz4rs pushed a commit that referenced this issue Sep 28, 2022
cz4rs pushed a commit that referenced this issue Sep 28, 2022
cz4rs pushed a commit that referenced this issue Sep 28, 2022
cz4rs pushed a commit that referenced this issue Sep 28, 2022
This is a long-standing bug that surfaced with the changes in #1899,
which made the runnable more efficient.
TestGroup.test_group_range_construct_2 was consistently failing on the
gcc-8 build with address sanitizer. After extensive debugging, I
tracked the bug down to the epoch not being propagated on the remote
group construction message. This happens when a rooted group is
constructed and the constructing node is not included in the group.
Additionally, the group must be specified by a list (not a range).
Under these conditions, the group manager sends the information about
the group to a remote node using a packed put, which uses the
`PayloadMessage`: namely, `struct GroupListMsg :
GroupInfoMsg<GroupMsg<::vt::PayloadMessage>>`. The `PayloadMessage`,
through its default template parameter, uses a basic envelope that is
not large enough to store the epoch, so the epoch is dropped. As a
result, the group construction and the following broadcast escape the
`runInEpochCollective`, and the test condition sometimes fails because
it races with the delivery of the group message being broadcast to the
group spanning tree.
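
The failure mode described above can be illustrated with a minimal,
self-contained sketch. It does not use the real vt types: every name
below (`SketchBasicEnvelope`, `SketchEpochEnvelope`, `SketchPayloadMsg`,
`epochSeenByReceiver`) is hypothetical and exists only to show how a
message template that defaults to an envelope without an epoch slot can
silently lose the sender's epoch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT the real vt API.
using SketchEpoch = std::uint64_t;
constexpr SketchEpoch sketch_no_epoch = 0;

// A "basic" envelope: routing data only, with no room for an epoch.
struct SketchBasicEnvelope {
  std::uint32_t dest = 0;
};

// An envelope that additionally carries an epoch.
struct SketchEpochEnvelope {
  std::uint32_t dest = 0;
  SketchEpoch epoch = sketch_no_epoch;
};

// Setting an epoch on the basic envelope is a silent no-op,
// which models the epoch being dropped.
inline void setEpoch(SketchBasicEnvelope&, SketchEpoch) {}
inline void setEpoch(SketchEpochEnvelope& env, SketchEpoch ep) {
  env.epoch = ep;
}

// Reading an epoch from the basic envelope always yields "no epoch".
inline SketchEpoch getEpoch(SketchBasicEnvelope const&) {
  return sketch_no_epoch;
}
inline SketchEpoch getEpoch(SketchEpochEnvelope const& env) {
  return env.epoch;
}

// A payload message whose envelope type defaults to the basic one
// (assumption: this mirrors the default described in the text).
template <typename EnvelopeT = SketchBasicEnvelope>
struct SketchPayloadMsg {
  EnvelopeT env;
};

// Models "send under an epoch, then inspect the epoch on arrival".
template <typename MsgT>
SketchEpoch epochSeenByReceiver(SketchEpoch sender_epoch) {
  MsgT msg;
  setEpoch(msg.env, sender_epoch);
  return getEpoch(msg.env);
}
```

With the default envelope, `epochSeenByReceiver<SketchPayloadMsg<>>(42)`
returns `sketch_no_epoch`, so work triggered by the message would not be
attributed to the sender's epoch. Instantiating the message with
`SketchEpochEnvelope` instead preserves the value. Whether the actual
fix selects a larger envelope in this way is an assumption of the
sketch, not something stated in the commit message.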