#1899: envelope: fix bug with lack of epoch propagation on put messages
This is a long-standing bug that started showing up with the changes in #1899, which made the runnable more efficient. TestGroup.test_group_range_construct_2 was consistently breaking on the gcc-8 build with address sanitizer.

After extensive debugging, I tracked the bug down to a missing epoch on the remote group construction message. It occurs when a rooted group is constructed, the constructing node is not included in the group, and the group is specified by a list (not a range). Under these conditions, the group manager sends the group information to a remote node using a packed put, which uses the `PayloadMessage`: namely, `struct GroupListMsg : GroupInfoMsg<GroupMsg<::vt::PayloadMessage>>`. With its default template parameter, `PayloadMessage` uses a basic envelope, which is not large enough to store the epoch, so the epoch is dropped.

As a result, the group construction and the following broadcast escape the `runInEpochCollective`, and the test condition can fail intermittently because it races with the delivery of the group message being broadcast to the group spanning tree.
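
To illustrate the failure mode, here is a minimal, self-contained sketch of how a default envelope type without an epoch field can silently drop the epoch. This is not the actual vt code: `BasicEnvelope`, `EpochEnvelope`, `HypotheticalPayloadMsg`, `setEpoch`, `getEpoch`, and the two `send...` helpers are all hypothetical names chosen for this example, and the real envelope layout and setters may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; these are NOT the real vt types.
using Epoch = std::uint64_t;
constexpr Epoch no_epoch = 0;

// A small envelope with no room for an epoch.
struct BasicEnvelope {
  std::uint16_t handler = 0;
};

// A larger envelope that can carry an epoch.
struct EpochEnvelope {
  std::uint16_t handler = 0;
  Epoch epoch = no_epoch;
};

// Setting an epoch on the basic envelope is a silent no-op,
// which is the behavior this sketch assumes for illustration.
inline void setEpoch(BasicEnvelope&, Epoch) {}
inline void setEpoch(EpochEnvelope& env, Epoch ep) { env.epoch = ep; }

inline Epoch getEpoch(BasicEnvelope const&) { return no_epoch; }
inline Epoch getEpoch(EpochEnvelope const& env) { return env.epoch; }

// The default template parameter selects the envelope without an epoch.
template <typename EnvelopeT = BasicEnvelope>
struct HypotheticalPayloadMsg {
  EnvelopeT env;
};

// Sender stamps the current epoch; returns what the receiver would read.
inline Epoch sendWithDefaultEnvelope(Epoch current) {
  HypotheticalPayloadMsg<> msg;  // default: BasicEnvelope
  setEpoch(msg.env, current);    // dropped: nowhere to store it
  return getEpoch(msg.env);
}

inline Epoch sendWithEpochEnvelope(Epoch current) {
  HypotheticalPayloadMsg<EpochEnvelope> msg;
  setEpoch(msg.env, current);    // stored in the envelope
  return getEpoch(msg.env);
}
```

In this sketch the message built with the default parameter arrives with no epoch, so work triggered by it would not be tracked by the enclosing epoch, while the message with the larger envelope keeps it.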