Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SnapshotPackagerService purges old bank snapshots #31511

Conversation

brooksprumo
Copy link
Contributor

@brooksprumo brooksprumo commented May 5, 2023

Problem

We retain more bank snapshots than we need, longer than we need.

Bank snapshots are currently used for one thing: creating a snapshot archive. And currently in development, we'll also use bank snapshots to speed up restart by booting from a bank snapshot instead of a snapshot archive. In any case, an observation is that once a snapshot has been archived, all bank snapshots older are no longer useful.

(We want to keep around the bank snapshot that was just archived for the fast boot case mentioned above.)

Also, bank snapshots contain hard-links to account storage files (for fast boot). We should free those up as fast as possible, since they can artificially take up space if the accounts are stored in a RAM disk.

Note that we still have an outstanding problem to solve: What about bank snapshots that are not archived? This happens for accounts hash verifier requests, and if a node has disabled snapshots (which still must produce bank snapshots for AHV, but not snapshot archives). I'll be working on that separately from this PR.

Summary of Changes

SnapshotPackagerService will purge bank snapshots older than the one that was just archived.

@brooksprumo brooksprumo added the noCI Suppress CI on this Pull Request label May 5, 2023
@brooksprumo brooksprumo self-assigned this May 5, 2023
@brooksprumo brooksprumo force-pushed the snap/purge-bank-snapshots-older-than-slot branch from b169a7a to afd4e00 Compare May 5, 2023 14:22
@brooksprumo brooksprumo changed the title Snap/purge bank snapshots older than slot SnapshotPackagerService purges old bank snapshots May 5, 2023
@brooksprumo brooksprumo force-pushed the snap/purge-bank-snapshots-older-than-slot branch from afd4e00 to e8db8ac Compare May 5, 2023 18:42
@brooksprumo brooksprumo added CI Pull Request is ready to enter CI and removed noCI Suppress CI on this Pull Request labels May 5, 2023
@solana-grimes solana-grimes removed the CI Pull Request is ready to enter CI label May 5, 2023
@codecov
Copy link

codecov bot commented May 5, 2023

Codecov Report

Merging #31511 (e8db8ac) into master (39a092c) will increase coverage by 81.4%.
The diff coverage is 88.8%.

@@             Coverage Diff             @@
##           master   #31511       +/-   ##
===========================================
+ Coverage        0    81.4%    +81.4%     
===========================================
  Files           0      731      +731     
  Lines           0   205114   +205114     
===========================================
+ Hits            0   167019   +167019     
- Misses          0    38095    +38095     

@brooksprumo brooksprumo marked this pull request as ready for review May 5, 2023 19:56
@brooksprumo brooksprumo requested a review from apfitzge May 5, 2023 19:56
Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - logic seems sound that we don't want to keep the snapshots (hardlinks) around. They are no longer necessary after we archive (keeping current).

Taking it a step further, what reason is there for having more than the most reason snapshot?

@brooksprumo
Copy link
Contributor Author

brooksprumo commented May 5, 2023

LGTM - logic seems sound that we don't want to keep the snapshots (hardlinks) around. They are no longer necessary after we archive (keeping current).

Taking it a step further, what reason is there for having more than the most reason snapshot?

We will want to keep around the most recent bank snapshot so that we can boot from it (eventually). Beyond that, there's no need.

When the bank snapshots were put into temp directories, they were cleaned up as part of a normal running validator in AccountsBackgroundService when taking a new snapshot (bank, not archive). That's not really safe now, since the bank snapshots for archival are not in temp dirs... We're kinda hoping that AHV and SPS run fast enough so that ABS doesn't rug the bank snapshots... That's a future PR to fix up.

I'm sure there will be more to learn/fix here as well.

@brooksprumo brooksprumo merged commit a6c39de into solana-labs:master May 5, 2023
@brooksprumo brooksprumo deleted the snap/purge-bank-snapshots-older-than-slot branch May 5, 2023 21:22
brooksprumo added a commit to brooksprumo/solana that referenced this pull request May 6, 2023
brooksprumo added a commit to brooksprumo/solana that referenced this pull request May 9, 2023
brooksprumo added a commit to brooksprumo/solana that referenced this pull request May 9, 2023
brooksprumo added a commit to brooksprumo/solana that referenced this pull request May 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants