Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for newer incremental snapshot before downloading #28143

Closed
wants to merge 6 commits into from
Closed

Check for newer incremental snapshot before downloading #28143

wants to merge 6 commits into from

Conversation

willhickey
Copy link
Contributor

@willhickey willhickey commented Sep 29, 2022

Problem

When a validator downloads snapshots on startup it follows these steps:

  1. gets the latest full and incremental snapshot slots & hashes
  2. downloads the full snapshot
  3. downloads the incremental snapshot.

Step 2 can take a few minutes (eg on my dev machine it takes 10-20 minutes to download the 34GB mainnet-beta snapshot). With default configs (incremental snapshots every 100 slots, retain 4 snapshots) each incremental snapshot persists for about 8 minutes. By the time we get to step 3 the incremental snapshot has been deleted and we get a 404.

Without an incremental snapshot the node attempts to start from the full snapshot which may be very outdated.

Summary of Changes

After downloading the full snapshot get the slot and hash of the latest incremental snapshot before attempting the incremental snapshot download.

download-utils/src/lib.rs Outdated Show resolved Hide resolved
download-utils/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking about this! Thanks for taking a look. I was originally think about being more invasive and changing bootstrap to run the discovery of incremental snapshots after downloading the full snapshot. Your impl is a more straight-forward way to accomplish this task. If we discover issues later, we can also go full-bore with bootstrap.rs.

download-utils/src/lib.rs Outdated Show resolved Hide resolved
download-utils/src/lib.rs Outdated Show resolved Hide resolved
validator/src/bootstrap.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@diman-io diman-io left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I would prefer get last snapshot hash from gossip as in calling code, or send the bootstrap to second round for finding snapshots again (it'll skip downloading full snapshot ... maybe 😀)

Second, the snapshot hash of this newer incremental snapshot must be checked with known validators, if at least one of them is defined (--known-validator option).

// Check and see if we've already got the incremental snapshot; if not, download it
if let Some(incremental_snapshot_hash) = incremental_snapshot_hash {
// If the full snapshot download took more than a few minutes there is probably a newer incremental snapshot
let newest_known_incremental_snapshot = match check_for_newer_incremental_snapshot(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use shadowing here, we don't need original variable below.

let incremental_snapshot_hash = match check_for_newer_incremental_snapshot(
...

And it'll reduce PR too.

PS. I don't have write permission for suggesting.

@diman-io
Copy link
Contributor

diman-io commented Sep 29, 2022

@brooksprumo Maybe good simple solution is just check the difference between last snapshot slot and current slot after bootstrapping. If it greater then --maximum-local-snapshot-age then start bootstrap again?

@brooksprumo
Copy link
Contributor

@brooksprumo Maybe good simple solution is just check the difference between last snapshot slot and current slot after bootstrapping. If it greater then --maximum-local-snapshot-age then start bootstrap again?

Yeah, that sounds good to me! This would also hopefully fix the problem where the gossip tables are not fully populated and cause bootstrap to not see any incremental snapshots to download. With the extra time of downloading a full snapshot, that should be plenty of time to get the gossip tables with better information.

@brooksprumo
Copy link
Contributor

Related issue: #26180

@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Dec 29, 2022
@github-actions github-actions bot closed this Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale [bot only] Added to stale content; results in auto-close after a week.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants