Skip to content

Commit

Permalink
[consensus] fix the race condition with the order of operations
Browse files Browse the repository at this point in the history
We send out the epoch change notification before spawning the commit task, so there's race condition that epoch manager
shuts down the buffer manager before the request created and we lose the final commit task.
  • Loading branch information
Zekun Li authored and zekun000 committed Feb 27, 2024
1 parent 0ad1162 commit c52fdc5
Showing 1 changed file with 10 additions and 11 deletions.
21 changes: 10 additions & 11 deletions consensus/src/pipeline/buffer_manager.rs
Original file line number Diff line number Diff line change
Expand Up @@ -344,17 +344,7 @@ impl BufferManager {
self.commit_proof_rb_handle
.replace(self.do_reliable_broadcast(commit_decision));
}
if aggregated_item.commit_proof.ledger_info().ends_epoch() {
self.commit_msg_tx
.send_epoch_change(EpochChangeProof::new(
vec![aggregated_item.commit_proof.clone()],
false,
))
.await;
// the epoch ends, reset to avoid executing more blocks, execute after
// this persisting request will result in BlockNotFound
self.reset().await;
}
let commit_proof = aggregated_item.commit_proof.clone();
self.persisting_phase_tx
.send(self.create_new_request(PersistingRequest {
blocks: blocks_to_persist,
Expand All @@ -369,6 +359,15 @@ impl BufferManager {
}))
.await
.expect("Failed to send persist request");
// this needs to be done after creating the persisting request to avoid it being lost
if commit_proof.ledger_info().ends_epoch() {
self.commit_msg_tx
.send_epoch_change(EpochChangeProof::new(vec![commit_proof], false))
.await;
// the epoch ends, reset to avoid executing more blocks, execute after
// this persisting request will result in BlockNotFound
self.reset().await;
}
info!("Advance head to {:?}", self.buffer.head_cursor());
self.previous_commit_time = Instant::now();
return;
Expand Down

0 comments on commit c52fdc5

Please sign in to comment.