Skip to content

Commit

Permalink
Fix race in clear scroll (#31259)
Browse files Browse the repository at this point in the history
Here is the problem: if two threads are racing and one hits a failure
freeing a context and the other succeeded, we can expose the value of
the has failure marker to the succeeding thread before the failing
thread has had a chance to set the failure marker. This is a problem if
the failing thread counted down the expected number of operations, then
be put to sleep by a gentle lullaby from the OS, and then the other
thread could count down to zero. Since the failing thread did not get to
set the failure marker, the succeeding thread would respond that the
clear scroll succeeded and that makes that thread a liar. This commit
addresses by first setting the failure marker before we potentially
expose its value to another thread.
  • Loading branch information
jasontedor authored Jun 12, 2018
1 parent e988ace commit a365435
Showing 1 changed file with 5 additions and 2 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -133,10 +133,13 @@ private void onFreedContext(boolean freed) {

private void onFailedFreedContext(Throwable e, DiscoveryNode node) {
logger.warn(() -> new ParameterizedMessage("Clear SC failed on node[{}]", node), e);
/*
* We have to set the failure marker before we count down otherwise we can expose the failure marker before we have set it to a
* racing thread successfully freeing a context. This would lead to that thread responding that the clear scroll succeeded.
*/
hasFailed.set(true);
if (expectedOps.countDown()) {
listener.onResponse(new ClearScrollResponse(false, freedSearchContexts.get()));
} else {
hasFailed.set(true);
}
}
}

0 comments on commit a365435

Please sign in to comment.