Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Loop over confirm_slot until no more entries or slot completed #28535

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 21 additions & 14 deletions core/src/replay_stage.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1727,20 +1727,27 @@ impl ReplayStage {
// All errors must lead to marking the slot as dead, otherwise,
// the `check_slot_agrees_with_cluster()` called by `replay_active_banks()`
// will break!
blockstore_processor::confirm_slot(
blockstore,
bank,
&mut w_replay_stats,
&mut w_replay_progress,
false,
transaction_status_sender,
Some(replay_vote_sender),
None,
verify_recyclers,
false,
log_messages_bytes_limit,
prioritization_fee_cache,
)?;

let mut more_entries_to_process = true;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could do bank.is_complete here, but should've been picked up after-the-fact on the previous iteration and not marked for replay

// more entries may have been received while replaying this slot.
// looping over this ensures that slots will be processed as fast as possible with the
// lowest latency.
while more_entries_to_process {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want some kind of exit out of here based on timing as a safe-guard. Otherwise we are potentially not voting for a long time or starting a leader slot which wouldn't be great.

@carllin wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree that'd be the safest thing to do, could limit to 400-ish ms. worst case exec takes longer than expected and we just hit another iteration of replay before hitting this again

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sleep may not be necessary, worst case we just hit one more iteration of reading from blockstore before returning here right:

if slot_entries_load_result.0.is_empty() {
return Ok(false);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think sleep is the right word, more like:

let start = Instant::now();
while more_entries_to_process && start.elapsed() < Duration::from_millis(TIMEOUT) {
    confirm_slot(...);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm yeah that works, should prevent bad leader from DOSing by continually streaming new shreds

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could even probably warn if it hits that timeout

more_entries_to_process = blockstore_processor::confirm_slot(
blockstore,
bank,
&mut w_replay_stats,
&mut w_replay_progress,
false,
transaction_status_sender,
Some(replay_vote_sender),
None,
verify_recyclers,
false,
log_messages_bytes_limit,
prioritization_fee_cache,
)?;
}
let tx_count_after = w_replay_progress.num_txs;
let tx_count = tx_count_after - tx_count_before;
Ok(tx_count)
Expand Down
13 changes: 10 additions & 3 deletions ledger/src/blockstore_processor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -991,7 +991,7 @@ fn confirm_full_slot(
let skip_verification = !opts.poh_verify;
let _ignored_prioritization_fee_cache = PrioritizationFeeCache::new(0u64);

confirm_slot(
let _more_entries_to_process = confirm_slot(
blockstore,
bank,
&mut confirmation_timing,
Expand Down Expand Up @@ -1131,7 +1131,7 @@ pub fn confirm_slot(
allow_dead_slots: bool,
log_messages_bytes_limit: Option<usize>,
prioritization_fee_cache: &PrioritizationFeeCache,
) -> result::Result<(), BlockstoreProcessorError> {
) -> result::Result<bool, BlockstoreProcessorError> {
let slot = bank.slot();

let slot_entries_load_result = {
Expand All @@ -1148,6 +1148,11 @@ pub fn confirm_slot(
load_result
}?;

if slot_entries_load_result.0.is_empty() {
return Ok(false);
}
let slot_full = slot_entries_load_result.2;

confirm_slot_entries(
bank,
slot_entries_load_result,
Expand All @@ -1160,7 +1165,9 @@ pub fn confirm_slot(
recyclers,
log_messages_bytes_limit,
prioritization_fee_cache,
)
)?;

Ok(!slot_full)
}

#[allow(clippy::too_many_arguments)]
Expand Down