
Create bank snapshots #4244

Merged: 10 commits merged into solana-labs:master on May 31, 2019

Conversation

@sambley (Contributor) commented May 10, 2019

Problem

Full node startup is slow when many transactions have been processed, because startup currently involves querying the full ledger and replaying it.

Summary of Changes

Serialize bank state into a snapshot and try to restore from that on boot.

Fixes #2475
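
As context for the change, here is a minimal sketch of the snapshot idea (an editor's illustration only, not the actual solana_runtime code; the `BankSnapshot` struct, its fields, and the file layout are assumptions): serialize the bank state for a rooted slot with serde/bincode, and on boot try to deserialize it before falling back to full ledger replay.

```rust
use std::fs::File;
use std::io::{BufReader, BufWriter};

use serde::{Deserialize, Serialize};

// Hypothetical, heavily simplified stand-in for the serialized bank state.
#[derive(Serialize, Deserialize)]
struct BankSnapshot {
    slot: u64,
    blockhash: [u8; 32],
    // ... accounts, status cache signatures, etc. in the real implementation
}

// Write the bank state for a rooted slot to disk.
fn write_snapshot(path: &str, bank: &BankSnapshot) -> Result<(), Box<dyn std::error::Error>> {
    let file = File::create(path)?;
    bincode::serialize_into(BufWriter::new(file), bank)?;
    Ok(())
}

// On boot: return Some(snapshot) if a usable snapshot exists, otherwise None,
// in which case the node falls back to replaying the full ledger.
fn load_snapshot(path: &str) -> Option<BankSnapshot> {
    let file = File::open(path).ok()?;
    bincode::deserialize_from(BufReader::new(file)).ok()
}
```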

codecov bot commented May 10, 2019

Codecov Report

Merging #4244 into master will increase coverage by <.1%.
The diff coverage is 87.8%.

@@           Coverage Diff            @@
##           master   #4244     +/-   ##
========================================
+ Coverage    78.7%   78.7%   +<.1%     
========================================
  Files         165     166      +1     
  Lines       28639   29223    +584     
========================================
+ Hits        22557   23017    +460     
- Misses       6082    6206    +124

codecov bot commented May 10, 2019

Codecov Report

Merging #4244 into master will increase coverage by 1.6%.
The diff coverage is 88.2%.

@@           Coverage Diff            @@
##           master   #4244     +/-   ##
========================================
+ Coverage    77.5%   79.2%   +1.6%     
========================================
  Files         178     179      +1     
  Lines       32049   32051      +2     
========================================
+ Hits        24857   25403    +546     
+ Misses       7192    6648    -544

@sambley (Contributor, Author) commented May 10, 2019

@aeyakovenko, regarding serializing the BankStatusCache, could you clarify what you mean by "Only the signatures per block are needed"?

@aeyakovenko (Member) commented:

@sambley the status codes are not necessary; we just need the signatures to do a duplicate check. It might be possible to shrink the signatures to 16 random bytes without loss of security.

@aeyakovenko (Member) commented:

@sambley 20 bytes would definitely be secure
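
A sketch of what this "signatures only, truncated" idea could look like (the type and method names are illustrative, not the real status cache API; the 20-byte prefix length follows the discussion above):

```rust
use std::collections::HashSet;

// Prefix length discussed above: 20 bytes is judged sufficient for duplicate detection.
const SIG_PREFIX_LEN: usize = 20;

// Per-block record of the transaction signatures seen; no status codes are stored.
#[derive(Default)]
struct BlockSignatures {
    seen: HashSet<[u8; SIG_PREFIX_LEN]>,
}

impl BlockSignatures {
    /// Record a full 64-byte signature; returns true if it was already present (a duplicate).
    fn insert(&mut self, full_sig: &[u8; 64]) -> bool {
        let mut prefix = [0u8; SIG_PREFIX_LEN];
        prefix.copy_from_slice(&full_sig[..SIG_PREFIX_LEN]);
        !self.seen.insert(prefix)
    }
}
```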

@sambley (Contributor, Author) commented May 13, 2019

@aeyakovenko, to reduce the snapshot size, I have split the status cache entries into two sets: one holding the items already covered by a prior snapshot, and the current active set that still needs to be snapshotted. This allows incremental snapshots of the status cache; do you see any issues with this approach? We can cut the size down further by keeping only 20 bytes of each signature, as you indicated.
https://github.com/solana-labs/solana/pull/4244/files#diff-92c739d9ad61135b886d1a44957fe485
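
A rough illustration of that split (an editor's sketch with illustrative names, not the actual StatusCache layout): entries already covered by an earlier snapshot are kept apart from the active set, so each new snapshot only has to serialize the delta.

```rust
use std::collections::HashSet;
use std::mem;

type SigPrefix = [u8; 20];

#[derive(Default)]
struct IncrementalStatusCache {
    snapshotted: HashSet<SigPrefix>, // already captured by a prior snapshot
    active: HashSet<SigPrefix>,      // accumulated since the last snapshot
}

impl IncrementalStatusCache {
    // Duplicate check consults both halves of the cache.
    fn contains(&self, sig: &SigPrefix) -> bool {
        self.active.contains(sig) || self.snapshotted.contains(sig)
    }

    // Snapshot time: hand back only the active delta and fold it into the
    // "already snapshotted" set for future duplicate checks.
    fn take_snapshot_delta(&mut self) -> HashSet<SigPrefix> {
        let delta = mem::take(&mut self.active);
        self.snapshotted.extend(delta.iter().copied());
        delta
    }
}
```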

@aeyakovenko (Member) commented:

@sambley, I just thought of this, but blocktree already stores all the signatures. @rob-solana @carllin wouldn't it be easier to just scan the last 32 blocks to recover them?

@carllin (Contributor) commented May 13, 2019

@aeyakovenko, blocktree only stores blobs, so you would have to copy the blobs out of storage, deserialize them, and then scan each transaction for its signature, which seems slow. Moreover, somebody could have sent you a bogus transaction that never actually made it through the bank or updated the status cache, which we don't currently track.

@aeyakovenko (Member) commented May 13, 2019

@carllin don't we already know the root fork? We just need to read all the blocks chained to it to rebuild the signature cache. The snapshot for the cache is read from disk anyway.

@carllin (Contributor) commented May 13, 2019

@aeyakovenko, yes, the root fork is stored in blocktree. You can get all the blocks that are children of this root, but how do you propose to rebuild the signature cache from these blocks?

@aeyakovenko (Member) commented:

@carllin read all the transactions and collect their signatures. The status itself is not important.
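
For concreteness, a sketch of what this proposed recovery could look like (the `Blocktree` trait and the entry/transaction types below are placeholders chosen for illustration, not the real blocktree API): walk the frozen blocks chained to the root and collect only the transaction signatures, ignoring statuses.

```rust
use std::collections::HashSet;

struct Transaction {
    signature: [u8; 64],
}

struct Entry {
    transactions: Vec<Transaction>,
}

// Placeholder for the subset of the blocktree API this sketch needs.
trait Blocktree {
    fn parent(&self, slot: u64) -> Option<u64>;
    fn entries(&self, slot: u64) -> Vec<Entry>;
}

// Walk up to `depth` blocks ending at `root` (e.g. the last 32 frozen blocks)
// and collect every transaction signature; statuses are not recovered.
fn recover_signatures(blocktree: &dyn Blocktree, root: u64, depth: usize) -> HashSet<[u8; 64]> {
    let mut sigs = HashSet::new();
    let mut slot = Some(root);
    for _ in 0..depth {
        let s = match slot {
            Some(s) => s,
            None => break,
        };
        for entry in blocktree.entries(s) {
            for tx in entry.transactions {
                sigs.insert(tx.signature);
            }
        }
        slot = blocktree.parent(s);
    }
    sigs
}
```

As the rest of the thread points out, this only works if every transaction stored in blocktree also belongs in the status cache, which is the invariant debated below.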

@carllin (Contributor) commented May 13, 2019

Currently, there could potentially be transactions that get included in blocktree, but their signatures don't get included in the status cache (for instance if they cause transaction errors).

@aeyakovenko (Member) commented:

@carllin anything that is recorded into the ledger should be in the cache, and vice versa. Otherwise it's a RAM spam attack vector.

@carllin (Contributor) commented May 14, 2019

We don't currently mark blobs as invalid or purge them if they cause unexpected errors, and we also don't mark those forks as dead; that's something that still needs to be done. But even if that work were completed, a blob that causes such unexpected errors could still exist in blocktree on restart if the node shuts down and restarts before the purging logic completes.

@aeyakovenko (Member) commented:

@carllin given a root fork, we can’t reconstruct the ledger that got us there?

@carllin (Contributor) commented May 14, 2019

@aeyakovenko, you can't currently reliably derive all the transactions that went into that root fork. You know that some subset of the transactions in the blocks that chain to the root fork were included.

@aeyakovenko (Member) commented:

@carllin, I don't understand what is missing. A root fork depends on frozen blocks, so the entire frozen block path to genesis is available. Each block contains exactly the set of transactions that compose that block. That's all we need.

@carllin (Contributor) commented May 14, 2019

I think the problem is the assumption that every transaction in a block makes it into the status cache, which is not currently true. The signatures of transactions that cause certain types of errors do not get recorded into the status cache.

@aeyakovenko (Member) commented:

@carllin, those transactions don’t need to be present in the cache.

@carllin (Contributor) commented May 14, 2019

Currently blocktree cannot differentiate those transactions that cause errors from those that would succeed without replaying those transactions.

@aeyakovenko (Member) commented:

@carllin we don’t need the status, just the signatures to avoid encoding a dup

@carllin (Contributor) commented May 14, 2019

Right, but some of those signatures wouldn't make it into the cache if we were to replay them, and you can't tell which ones those are just by looking at them. I think the danger is that if we were to include every signature in blocktree, then a validator booting from a snapshot could end up with a signature that the other validators don't have in their signature caches.

@aeyakovenko (Member) commented May 14, 2019

@carllin all signatures that are in the ledger must be in the cache. All signatures that are not in the ledger, must not be in the cache.

@rob-solana (Contributor) commented:

we use the cache to return TX status to RPC clients, though, right?

@aeyakovenko (Member) commented:

@rob-solana yes, but it’s not critical

@sakridge (Member) commented:

I think it's a known issue at the moment.

@sambley force-pushed the snapshot branch 2 times, most recently from 862d7ea to c5fe5d4 on May 28, 2019 00:54
@sambley (Contributor, Author) commented May 28, 2019

@sakridge, @rob-solana, @carllin, could you help review the changes and let me know if you have any comments.

@rob-solana (Contributor) commented:

> @sambley, but these AccountNotFounds are not happening on the other validators, correct? If everybody is playing the same transactions, then all the accounts should match, so the behavior should be the same. So there is some mismatch in accounts happening on the validator that is restarting from the snapshots? Can we figure out what these zero-balance accounts look like on validators that aren't failing?
>
> @carllin, I identified a fix in the restore and have not been able to reproduce the AccountNotFound (ANF) so far (I would have to test more to see if it's really gone). I am noticing a different issue now where, after restore, a few votes get confirmed and then the message below starts appearing and all further votes get dropped. Do you have any clues as to what could be causing it?
>
> [2019-05-25T05:38:22.705442839Z INFO solana::replay_stage] bank frozen 559
> [2019-05-25T05:38:22.706442201Z INFO solana::replay_stage] validator fork confirmed 558 826
> [2019-05-25T05:38:22.707536546Z INFO solana::replay_stage] new fork:568 parent:559
> [2019-05-25T05:38:22.710977032Z INFO solana::replay_stage] bank frozen 568
> [2019-05-25T05:38:22.712022085Z INFO solana::replay_stage] new fork:569 parent:568
> [2019-05-25T05:38:23.298403567Z WARN solana_vote_api::vote_state] dropping vote Vote { slot: 568, hash: 5aotmPDFuLYztKPhyzwgE39VsciMMJwgYasxnLRbhpKC }, no matching slot/hash combination
> [2019-05-25T05:38:23.298434102Z WARN solana_vote_api::vote_state] dropping vote Vote { slot: 569, hash: G2BS8cZFkAYNgPduXkNWTRz7gq4BApq7Ypjypejbc3AM }, no matching slot/hash combination

is this showing up in a single-node test or on a restarted validator?

    map: MmapMut,
    // This mutex forces append to be single threaded, but concurrent with reads
    append_offset: Mutex<usize>,
    current_len: AtomicUsize,
    file_size: u64,
}

impl Drop for AppendVec {
    fn drop(&mut self) {
        let _ = std::fs::remove_dir_all(&self.path.parent().unwrap());
Review comment (Contributor):

convention for this is `_ignored =`

Reply (Contributor Author):

will update accordingly
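
For reference, the requested convention applied to the hunk above would look roughly like this (AppendVec is reduced here to the single field the example needs):

```rust
use std::path::PathBuf;

struct AppendVec {
    path: PathBuf,
}

impl Drop for AppendVec {
    fn drop(&mut self) {
        // Bind the discarded Result to `_ignored` instead of `let _ =`,
        // making the intent to ignore the cleanup error explicit.
        let _ignored = std::fs::remove_dir_all(self.path.parent().unwrap());
    }
}
```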

#[derive(Default)]
pub struct Bank {
pub struct BankRc {
Review comment (Contributor):

what's the Rc suffix mean?

Reply (Contributor Author):

> what's the Rc suffix mean?

This just holds the Arc members of the bank, as they need to be serialized in a special manner; the rest of the members of the bank can be serialized directly.
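
A simplified sketch of that split (illustrative fields only, not the actual Bank definition): the Arc-held members live in BankRc and are handled by dedicated snapshot code, while the plain fields of Bank can derive serde directly.

```rust
use std::sync::Arc;

use serde::{Deserialize, Serialize};

#[derive(Default)]
struct Accounts {
    // account storage; serialized by its own snapshot routines
}

// Groups the Arc members that need special-case serialization.
#[derive(Default)]
struct BankRc {
    accounts: Arc<Accounts>,
    // ... status cache and other shared members
}

#[derive(Default, Serialize, Deserialize)]
struct Bank {
    slot: u64,
    tick_height: u64,
    // Skipped by the derive; the snapshot code serializes BankRc separately.
    #[serde(skip)]
    rc: BankRc,
}
```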

@rob-solana (Contributor) commented:

@sambley I don't see anything obviously wrong with this patch, still looking

@sambley (Contributor, Author) commented May 28, 2019

> is this showing up in a single-node test or on a restarted validator?

I have not seen this happen in a single-node test, but I have seen the issue on testnet when running with 6 nodes. It happens even if the validator has not been restarted. It seems to happen at the point of a fork; when dumping the individual account hashes at that point, I noticed that the hash values differed for 2 or 3 accounts.

@sambley (Contributor, Author) commented May 30, 2019

Let me know if there are any additional comments; if not, I would like to go ahead and merge this one.

@mvines (Member) commented May 31, 2019

@sambley - yeah let's merge it when you're ready. We can always have follow up PRs as needed. Are you planning to look at sharing snapshots between validator nodes next?

run.sh (outdated):
solana-keygen -o "$dataDir"/config/leader-vote-account-keypair.json
solana-keygen -o "$dataDir"/config/leader-stake-account-keypair.json
leader_keypair="$dataDir/config/leader-keypair.json"
if [ -e "$leader_keypair" ]
Review comment (Member):

Oh, instead of

if [ -e "$leader_keypair" ]
then

please use this form:

if [[ -e "$leader_keypair" ]]; then

because

  1. fewer lines of code
  2. [[ is a bash built-in

Review comment (Contributor):

also, with the [[ built-in, the quotes around $leader_keypair are unnecessary

@sambley (Contributor, Author) commented May 31, 2019

> @sambley - yeah let's merge it when you're ready. We can always have follow up PRs as needed. Are you planning to look at sharing snapshots between validator nodes next?

@mvines, yes, I can start looking at sharing the snapshots and accounts between validator nodes. Please let me know if you have ideas on how best to share the data between nodes.

@sambley merged commit 182096d into solana-labs:master on May 31, 2019
@mvines (Member) commented May 31, 2019

Awesome. I don't have anything super concrete to offer. Ideally a validator that's just starting up could download the most recent snapshot from the cluster entrypoint it was given (as v1; later it could potentially try to fetch snapshots from other nodes it finds over gossip too). HTTP seems like a reasonable choice. I'm not sure if we can serve normal HTTP responses over the existing JSON RPC API, so perhaps it would be better to just have a new pure HTTP port that a validator node exports too.
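
One possible shape for that follow-up, sketched here for illustration (the endpoint path, archive name, and use of the `reqwest` crate are assumptions, not anything defined in this PR): a booting validator fetches the latest snapshot archive from its entrypoint over plain HTTP before falling back to full ledger replay.

```rust
use std::fs::File;
use std::path::{Path, PathBuf};

// Download the most recent snapshot from the cluster entrypoint over plain HTTP,
// returning the local path of the archive so boot can restore from it.
fn fetch_snapshot(entrypoint: &str, out_dir: &Path) -> Result<PathBuf, Box<dyn std::error::Error>> {
    // Hypothetical endpoint served from a dedicated HTTP port on the entrypoint node.
    let url = format!("http://{}/snapshot.tar.bz2", entrypoint);
    let mut response = reqwest::blocking::get(url.as_str())?.error_for_status()?;

    let out_path = out_dir.join("snapshot.tar.bz2");
    let mut file = File::create(&out_path)?;
    response.copy_to(&mut file)?;
    Ok(out_path)
}
```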

@aeyakovenko (Member) commented:

@sambley I think the best way to sync the checkpoint would be to paginate the index and the accounts across many validators.

@brooksprumo mentioned this pull request on May 6, 2021.
Successfully merging this pull request may close these issues.

Fullnode startup is slow