Skip to content

Commit

Permalink
go/worker/storage: Fix sync deadlock
Browse files Browse the repository at this point in the history
Commit 2fdfd28 introduced a possibility of the
storage round syncing process to deadlock when an unfortunate series of events
occur in specific order.

Assume the following happens, in order:

1. Round X has just completed syncing and has been applied, so
   lastFullyAppliedRound has been updated to X and all metadata about round X
   has been removed from syncingRounds and hashCache. Since the round has not
   yet been finalized, cachedLastRound still points to X-1.

2. A new incoming block for round X+1 appears.

3. The code checks what needs to be synced and starts with cachedLastRound which
   still points to round X-1 (because round X has not yet been applied). Because
   sync metadata of round X have been removed, it assumes that round X is not
   yet synced so it starts syncing it by queuing sync requests and adding new
   metadata to the top of the syncingRounds heap.

4. Round X is finalized, so cachedLastRound is updated to X.

Since the top of the sync management loop checks if lastFullyAppliedRound + 1 is
at the top of the heap this leads to a deadlock. The top item contains metadata
for round X while the management loop is waiting for X+1.

This commit fixes the issue by using lastFullyAppliedRound in the incoming block
handler instead of cachedLastRound.
  • Loading branch information
kostko committed Jan 24, 2020
1 parent 199b3f2 commit c607f08
Showing 1 changed file with 9 additions and 8 deletions.
17 changes: 9 additions & 8 deletions go/worker/storage/committee/node.go
Original file line number Diff line number Diff line change
Expand Up @@ -590,25 +590,26 @@ mainLoop:
blk := inBlk.(*block.Block)
n.logger.Debug("incoming block",
"round", blk.Header.Round,
"last_synced", cachedLastRound,
"last_synced", lastFullyAppliedRound,
"last_finalized", cachedLastRound,
)

if _, ok := hashCache[cachedLastRound]; !ok && cachedLastRound == n.undefinedRound {
if _, ok := hashCache[lastFullyAppliedRound]; !ok && lastFullyAppliedRound == n.undefinedRound {
dummy := blockSummary{
Namespace: blk.Header.Namespace,
Round: cachedLastRound + 1,
Round: lastFullyAppliedRound + 1,
}
dummy.IORoot.Empty()
dummy.IORoot.Round = cachedLastRound + 1
dummy.IORoot.Round = lastFullyAppliedRound + 1
dummy.StateRoot.Empty()
dummy.StateRoot.Round = cachedLastRound + 1
hashCache[cachedLastRound] = &dummy
dummy.StateRoot.Round = lastFullyAppliedRound + 1
hashCache[lastFullyAppliedRound] = &dummy
}
// Determine if we need to fetch any old block summaries. In case the first
// round is an undefined round, we need to start with the following round
// since the undefined round may be unsigned -1 and in this case the loop
// would not do any iterations.
startSummaryRound := cachedLastRound
startSummaryRound := lastFullyAppliedRound
if startSummaryRound == n.undefinedRound {
startSummaryRound++
}
Expand All @@ -632,7 +633,7 @@ mainLoop:
hashCache[blk.Header.Round] = summaryFromBlock(blk)
}

for i := cachedLastRound + 1; i <= blk.Header.Round; i++ {
for i := lastFullyAppliedRound + 1; i <= blk.Header.Round; i++ {
syncing, ok := syncingRounds[i]
if ok && syncing.outstanding == maskAll {
continue
Expand Down

0 comments on commit c607f08

Please sign in to comment.