Fix zero-lamport accounts preventing slot cleanup #12606

carllin · 2020-10-01T02:44:37Z

Problem

Some slot S containing an update for an account with key K may not be cleaned up if at the time of clean_slots, the latest update in K is in some rooted slot S' > S and K has zero lamports. This is because in clean_slots()

if account_info.lamports == 0 {

and

else if accounts_index.uncleaned_roots.contains(slot)

are mutually exclusive, and uncleaned_roots is purged at the end of the clean_slots() loop, so those slots will not be re-examined if the pubkey is not updated again in some future root that is added to uncleaned root

Summary of Changes

Fixes #

runtime/src/accounts_db.rs

ryoqun · 2020-10-01T03:10:17Z

runtime/src/accounts_db.rs

@@ -681,7 +686,8 @@ impl AccountsDB {
                        let (slot, account_info) = &list[index];
                        if account_info.lamports == 0 {
                            purges.insert(*pubkey, accounts_index.would_purge(pubkey));
-                        } else if accounts_index.uncleaned_roots.contains(slot) {


Thanks for fixing another AccountsDB bug.. ;)

iirc, there was some reason that I intentionally chained with else if. But I can't remember the exact reason. And, sorry for the lack of comment. I'll try to remember and report back here. Anyway, a lot happened since then. So, this is just ok as long as the ci is green and running validator causes no bad snapshot.

@ryoqun, no worries! Was it to avoid deleting the slot such that an AppendVec added earlier to purges wouldn't be deleted, i.e. so this wouldn't panic? https://github.com/solana-labs/solana/blob/master/runtime/src/accounts_db.rs#L720. I suspect this might be the reason because if you change the else if to an if on master, and try running this test: https://github.com/solana-labs/solana/pull/12606/files#diff-2099c5256db4eb5975c8834af38f6456R3118, that unwrap() will panic

(for the record) this line is added #8436.

@carllin Hmm, it seems some of other tests are failing. Could you look at it? Those failing tests will be very cryptic to you. So don't hesitate to ask questions. At first glance, I can't tell these are actually failing by being broken or these test assertions need to just be updated.

I'm still trying to remember something. Still, the memory is blur but it was like this:

clean_old_rooted_accounts should conceptually be complement to the cleaning of zero lamport accounts. So, for every given live account, we do either clean_old_rooted_accounts or cleaning of zero lamport accounts. Even if we don't clean old_rooted accounts for zero lamport accounts, the cleaning of zero lamport logic can purge them eventually. So there is no need to do both; it's kind of waste doing both for each account every time we run clean_accounts().

There was a problem / bug when we're too eager to clean old index entries of zero lamport accounts. So combined the above bullet point concern, I've avoided doing both and chose if else approach. (That's why tests failing I guess).

... And I failed to comment as such because that assumption was so prevalent at the time. It was obvious thing to me. (my bad).

@ryoqun, no worries! Was it to avoid deleting the slot such that an AppendVec added earlier to purges wouldn't be deleted, i.e. so this wouldn't panic? https://github.com/solana-labs/solana/blob/master/runtime/src/accounts_db.rs#L720. I suspect this might be the reason because if you change the else if to an if on master, and try running this test: https://github.com/solana-labs/solana/pull/12606/files#diff-2099c5256db4eb5975c8834af38f6456R3118, that unwrap() will panic

As said in the my above comment, I don't think my thought for the original implementation decision started from these concerns. At the time, I didn't even tried to do if approach. Anyway, I think your observation is correct for the interrelation of two cleaning and panic of wrap with the if approach..

@carllin Hmm, it seems some of other tests are failing. Could you look at it? Those failing tests will be very cryptic to you. So don't hesitate to ask questions. At first glance, I can't tell these are actually failing by being broken or these test assertions need to just be updated.

I'm still trying to remember something. Still, the memory is blur but it was like this:

clean_old_rooted_accounts should conceptually be complement to the cleaning of zero lamport accounts. So, for every given live account, we do either clean_old_rooted_accounts or cleaning of zero lamport accounts. Even if we don't clean old_rooted accounts for zero lamport accounts, the cleaning of zero lamport logic can purge them eventually. So there is no need to do both; it's kind of waste doing both for each account every time we run clean_accounts().

There was a problem / bug when we're too eager to clean old index entries of zero lamport accounts. So combined the above bullet point concern, I've avoided doing both and chose if else approach. (That's why tests failing I guess).

... And I failed to comment as such because that assumption was so prevalent at the time. It was obvious thing to me. (my bad).

Yeah, working on those tests! I think the ones that are failing are a mix of assertions failing and me not handling: #12606 (comment), which is breaking the issue described here: #7157 (This is the issue with old AppendVecs reviving zero-lamport accounts, which should now be addressed by calc_delete_dependencies()

I see! It looks that you passed the CI fully. So happy. :)

So, the once independent cleaning mechanisms finally start to interact each other in one pass after battle-tested for half a year. That sounds cool.

runtime/src/accounts_db.rs

codecov · 2020-10-02T08:21:10Z

Codecov Report

Merging #12606 into master will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@           Coverage Diff            @@
##           master   #12606    +/-   ##
========================================
  Coverage    82.0%    82.0%            
========================================
  Files         359      359            
  Lines       83989    84123   +134     
========================================
+ Hits        68875    69009   +134     
  Misses      15114    15114

runtime/src/accounts_db.rs

ryoqun · 2020-10-02T11:09:49Z

@carllin not strictly, related to this pr, but I found rather worried flaky test failure: https://buildkite.com/solana-labs/solana/builds/32038#b65ccd3b-3b4d-40c6-82bc-51b685ed5948/247-1671

Could you take a look at that before merging this?

---- bank::tests::test_clean_zero_lamport_account_different_hash stdout ----
thread 'bank::tests::test_clean_zero_lamport_account_different_hash' panicked at 'No bank hash was found for this bank, that should not be possible', runtime/src/accounts.rs:671:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/panicking.rs:475
   1: core::panicking::panic_fmt
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/panicking.rs:85
   2: core::option::expect_failed
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/option.rs:1211
   3: core::option::Option<T>::expect
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/option.rs:331
   4: solana_runtime::accounts::Accounts::bank_hash_info_at
             at ./src/accounts.rs:669
   5: solana_runtime::bank::Bank::hash_internal_state
             at ./src/bank.rs:3236
   6: solana_runtime::bank::Bank::rehash
             at ./src/bank.rs:1345
   7: solana_runtime::bank::tests::test_clean_zero_lamport_account_different_hash
             at ./src/bank.rs:9365
   8: solana_runtime::bank::tests::test_clean_zero_lamport_account_different_hash::{{closure}}
             at ./src/bank.rs:9359
   9: core::ops::function::FnOnce::call_once
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/ops/function.rs:233
  10: core::ops::function::FnOnce::call_once
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/ops/function.rs:233
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
 
failures:
    bank::tests::test_clean_zero_lamport_account_different_hash

runtime/src/accounts_db.rs

ryoqun · 2020-10-02T11:25:01Z

@carllin Sorry for late review as always... ;) Also, this time I'm afraid but I have to assign some homework to you.

ryoqun · 2020-10-02T11:30:12Z

Anyway, I see this pr is great progress.

I'm really looking forward to the less bloated snapshots. And slightly increased write thanks to more granular storage lock? I think zero lamport tombstones will be reclaimed much faster with this pr.

Also, please test this for rather long duration, pointing to mainnet-beta and testnet.

carllin · 2020-10-03T02:02:30Z

@ryoqun for sure, thanks for the review! Definitely letting this run against mainnet for a while.

@carllin not strictly, related to this pr, but I found rather worried flaky test failure: https://buildkite.com/solana-labs/solana/builds/32038#b65ccd3b-3b4d-40c6-82bc-51b685ed5948/247-1671

Could you take a look at that before merging this?

---- bank::tests::test_clean_zero_lamport_account_different_hash stdout ----
thread 'bank::tests::test_clean_zero_lamport_account_different_hash' panicked at 'No bank hash was found for this bank, that should not be possible', runtime/src/accounts.rs:671:14
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/panicking.rs:475
   1: core::panicking::panic_fmt
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/panicking.rs:85
   2: core::option::expect_failed
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/option.rs:1211
   3: core::option::Option<T>::expect
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/option.rs:331
   4: solana_runtime::accounts::Accounts::bank_hash_info_at
             at ./src/accounts.rs:669
   5: solana_runtime::bank::Bank::hash_internal_state
             at ./src/bank.rs:3236
   6: solana_runtime::bank::Bank::rehash
             at ./src/bank.rs:1345
   7: solana_runtime::bank::tests::test_clean_zero_lamport_account_different_hash
             at ./src/bank.rs:9365
   8: solana_runtime::bank::tests::test_clean_zero_lamport_account_different_hash::{{closure}}
             at ./src/bank.rs:9359
   9: core::ops::function::FnOnce::call_once
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/ops/function.rs:233
  10: core::ops::function::FnOnce::call_once
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/core/src/ops/function.rs:233
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
 
failures:
    bank::tests::test_clean_zero_lamport_account_different_hash

Addressed this here: #12658

ryoqun · 2020-10-09T09:39:54Z

runtime/src/accounts_db.rs

+                    let count = self
+                        .storage
+                        .read()


@carllin Opps, Sorry to spot this at later stage of this pr. But, I think there is panic! possibility here, because of the shrinker thread, which is running in parallel.

By moving this self.storage.read() lock inside here from the outer loop prologue, now shrinker thread are free to update account storages while this is running. unwrap() at L798 here bite us.

This is because shrinker effectively replaces old AccountStorageEntry with new one while removing the old AccountStorageEntry (we might be accessing this from here). remove_dead_accounts() never cleans up any empty AccountStorageEntries unless all of them for given slot is empty. But the do_shrink_slot do its work diligently by removing any empty StorageEntry manually after handle_reclaims, knowing they can almost certainly can remove empty one of them at the very least.

As a matter of fact, we're not affected from other store codepath, however. Because we're mangling rooted slots. other store codepath only touches non-rooted new slots. The only two which touches the rooted slots are this cleaner and shrinker.

@ryoqun, shrink and clean run serially in AccountsBackgroundService, no?

Yeah, you're right; Even more, the above comment is wrong: we're actually serializing the executions: https://github.com/solana-labs/solana/pull/12606/files?diff=split&w=1#diff-2099c5256db4eb5975c8834af38f6456R703

nis: So, I think this lock can go away now.

ryoqun

LGTM with nits

Great work! Also thanks for addressing various review comments! This got a lot clearer. 👍
And, sorry for the false alarm about the shrinker<=>cleaner data race bug... ;)

ryoqun · 2020-10-09T15:55:43Z

@carllin btw, I'd rather like to wait backporting this to v1.4 for the duration of 1 or 2 point releases. Because this isn't strictly a bug fix, I think we can wait. Thus, I want your previous grand work (#12194) to get ripe a bit on v1.4. What do you think?

carllin · 2020-10-09T19:30:52Z

@ryoqun I think that makes sense! I'll hold off on the backport and let the other PR bake a bit, thanks for the review!

ryoqun · 2020-10-19T05:38:35Z

@carllin Time to backport this? it looks like testnet is doing well with #12194.

Also, we're queueing #12126 and #12787 (preferably as the next patch release after this pr). Also, I rather want to spend most of time with all these prs backported on v1.4 so that we could be confident when the mainnet finally moves to the v1.4.

carllin · 2020-10-19T05:57:23Z

@ryoqun definitely, tagged!

Co-authored-by: Carl Lin <[email protected]> (cherry picked from commit 16d45b8)

Co-authored-by: Carl Lin <[email protected]> (cherry picked from commit 16d45b8) Co-authored-by: carllin <[email protected]>

ryoqun · 2020-10-25T13:25:29Z

it looks like snapshot got bigger a bit.

ryoqun · 2020-10-25T13:25:58Z

https://metrics.solana.com:3000/d/V5LPmn_Zk/testnet-monitor-edge-ryoqun?panelId=67&fullscreen&orgId=2&from=now-30d&to=now&var-datasource=Solana%20Metrics%20(read-only)&var-testnet=tds&var-hostid=All

ryoqun · 2020-10-25T13:29:18Z

As expected?, clean-old-root is reduced alot: https://metrics.solana.com:3000/d/V5LPmn_Zk/testnet-monitor-edge-ryoqun?panelId=45&fullscreen&orgId=2&from=now-30d&to=now&var-datasource=Solana%20Metrics%20(read-only)&var-testnet=tds&var-hostid=All

carllin requested review from ryoqun and sakridge October 1, 2020 02:44

carllin commented Oct 1, 2020

View reviewed changes

runtime/src/accounts_db.rs Show resolved Hide resolved

carllin commented Oct 1, 2020

View reviewed changes