osc/pt2pt: Fix hang with Put and Win_lock_all
 * When using `MPI_Put` with `MPI_Win_lock_all`, a hang is possible: the
   `put` waits for `eager_send_active` to become `true`, but under
   `MPI_Win_lock_all` that variable might not be reset, depending on
   other incoming events (e.g., `post` or ACKs of lock requests).
 * For lock-type synchronization, long Put and Accumulate messages now
   wait on the target peer's `OMPI_OSC_PT2PT_PEER_FLAG_EAGER` flag
   instead.

Signed-off-by: Joshua Hursey <[email protected]>
markalle authored and jjhursey committed Dec 16, 2016
1 parent 0d1336b commit eec1d5b
Showing 1 changed file with 21 additions and 2 deletions.
23 changes: 21 additions & 2 deletions ompi/mca/osc/pt2pt/osc_pt2pt_comm.c
@@ -15,6 +15,7 @@
* Copyright (c) 2015 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2016 FUJITSU LIMITED. All rights reserved.
+ * Copyright (c) 2016 IBM Corporation. All rights reserved.
* $COPYRIGHT$
*
* Additional copyrights may follow
@@ -336,7 +337,16 @@ static inline int ompi_osc_pt2pt_put_w_req (const void *origin_addr, int origin_

     if (is_long_msg) {
         /* wait for eager sends to be active before starting a long put */
-        ompi_osc_pt2pt_sync_wait_expected (pt2pt_sync);
+        if (pt2pt_sync->type == OMPI_OSC_PT2PT_SYNC_TYPE_LOCK) {
+            OPAL_THREAD_LOCK(&pt2pt_sync->lock);
+            ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
+            while (!(peer->flags & OMPI_OSC_PT2PT_PEER_FLAG_EAGER)) {
+                opal_condition_wait(&pt2pt_sync->cond, &pt2pt_sync->lock);
+            }
+            OPAL_THREAD_UNLOCK(&pt2pt_sync->lock);
+        } else {
+            ompi_osc_pt2pt_sync_wait_expected (pt2pt_sync);
+        }
     }

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
@@ -495,7 +505,16 @@ ompi_osc_pt2pt_accumulate_w_req (const void *origin_addr, int origin_count,

     if (is_long_msg) {
         /* wait for synchronization before posting a long message */
-        ompi_osc_pt2pt_sync_wait_expected (pt2pt_sync);
+        if (pt2pt_sync->type == OMPI_OSC_PT2PT_SYNC_TYPE_LOCK) {
+            OPAL_THREAD_LOCK(&pt2pt_sync->lock);
+            ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
+            while (!(peer->flags & OMPI_OSC_PT2PT_PEER_FLAG_EAGER)) {
+                opal_condition_wait(&pt2pt_sync->cond, &pt2pt_sync->lock);
+            }
+            OPAL_THREAD_UNLOCK(&pt2pt_sync->lock);
+        } else {
+            ompi_osc_pt2pt_sync_wait_expected (pt2pt_sync);
+        }
     }

     header = (ompi_osc_pt2pt_header_acc_t*) ptr;
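
The wait added in both hunks follows the usual "re-check a predicate under a lock" condition-variable pattern, keyed on a per-peer flag rather than a window-wide counter. Below is a minimal standalone sketch of that pattern, assuming plain POSIX threads in place of the OPAL primitives. Every name in it (`peer_t`, `sync_t`, `PEER_FLAG_EAGER`, `flag_setter`, `wait_peer_eager`, `run_eager_wait_demo`) is a hypothetical stand-in, not an Open MPI definition, and the helper thread merely simulates whatever event sets the flag in the real code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not the Open MPI types. */
#define PEER_FLAG_EAGER 0x1u

typedef struct {
    uint32_t flags;            /* per-peer state bits */
} peer_t;

typedef struct {
    pthread_mutex_t lock;      /* protects peer.flags; paired with cond */
    pthread_cond_t  cond;      /* broadcast whenever a peer flag changes */
    peer_t          peer;
} sync_t;

/* Simulates the event that marks this peer as ready for eager sends. */
static void *flag_setter(void *arg)
{
    sync_t *sync = arg;
    pthread_mutex_lock(&sync->lock);
    sync->peer.flags |= PEER_FLAG_EAGER;
    pthread_cond_broadcast(&sync->cond);
    pthread_mutex_unlock(&sync->lock);
    return NULL;
}

/* Block until this specific peer is flagged eager. The predicate is
 * re-checked in a loop, so wakeups caused by unrelated events are harmless. */
static void wait_peer_eager(sync_t *sync)
{
    pthread_mutex_lock(&sync->lock);
    while (!(sync->peer.flags & PEER_FLAG_EAGER)) {
        pthread_cond_wait(&sync->cond, &sync->lock);
    }
    pthread_mutex_unlock(&sync->lock);
}

/* Runs one waiter and one setter; returns the peer flags after the wait. */
static uint32_t run_eager_wait_demo(void)
{
    sync_t sync;
    pthread_t setter;
    int rc;

    sync.peer.flags = 0;
    rc = pthread_mutex_init(&sync.lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&sync.cond, NULL);
    assert(rc == 0);

    rc = pthread_create(&setter, NULL, flag_setter, &sync);
    assert(rc == 0);
    wait_peer_eager(&sync);
    pthread_join(setter, NULL);

    pthread_cond_destroy(&sync.cond);
    pthread_mutex_destroy(&sync.lock);
    return sync.peer.flags;
}
```

Calling `run_eager_wait_demo()` from a `main` (linked with `-pthread`) returns once the flag is set. Because the waiter tests the flag of the one peer it cares about, its progress does not depend on the order in which other events arrive, which is the property the commit message says the shared `eager_send_active` wait lacked.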