Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactors accounts lt hash startup verification #3397

Merged
merged 1 commit into from
Oct 31, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 35 additions & 10 deletions accounts-db/src/accounts_db.rs
Original file line number Diff line number Diff line change
Expand Up @@ -8597,6 +8597,7 @@ impl AccountsDb {
limit_load_slot_count_from_snapshot: Option<usize>,
verify: bool,
genesis_config: &GenesisConfig,
should_calculate_duplicates_lt_hash: bool,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next, here's the new parameter for generate_index(). We use this parameter to determine whether or not to calculate the duplicates lt hash value, and not checking the cli arg.

A nice fallout is that now we only conditionally malloc the LtHash when handling the duplicates (below). No more perf hit when not using accounts lt hash.

) -> IndexGenerationInfo {
let mut total_time = Measure::start("generate_index");
let mut slots = self.storage.all_slots();
Expand Down Expand Up @@ -8853,7 +8854,7 @@ impl AccountsDb {
accounts_data_len_from_duplicates: u64,
num_duplicate_accounts: u64,
uncleaned_roots: IntSet<Slot>,
duplicates_lt_hash: Box<DuplicatesLtHash>,
duplicates_lt_hash: Option<Box<DuplicatesLtHash>>,
}
impl DuplicatePubkeysVisitedInfo {
fn reduce(mut a: Self, mut b: Self) -> Self {
Expand All @@ -8870,9 +8871,30 @@ impl AccountsDb {
other.accounts_data_len_from_duplicates;
self.num_duplicate_accounts += other.num_duplicate_accounts;
self.uncleaned_roots.extend(other.uncleaned_roots);
self.duplicates_lt_hash
.0
.mix_in(&other.duplicates_lt_hash.0);

match (
self.duplicates_lt_hash.is_some(),
HaoranYi marked this conversation as resolved.
Show resolved Hide resolved
other.duplicates_lt_hash.is_some(),
) {
(true, true) => {
// SAFETY: We just checked that both values are Some
self.duplicates_lt_hash
.as_mut()
.unwrap()
.0
.mix_in(&other.duplicates_lt_hash.as_ref().unwrap().0);
}
(true, false) => {
// nothing to do; `other` doesn't have a duplicates lt hash
}
(false, true) => {
// `self` doesn't have a duplicates lt hash, so pilfer from `other`
self.duplicates_lt_hash = other.duplicates_lt_hash;
}
(false, false) => {
// nothing to do; no duplicates lt hash at all
}
}
Comment on lines +8875 to +8897
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that handling the duplicates can return the LtHash as an Option, we need to update how we handle merging. It's actually quite nice, as now intermediate threads that spin up to just do the merge-reduce don't need to malloc an LtHash either (they can use the allocation from the merged-from side).

}
}

Expand Down Expand Up @@ -8909,6 +8931,7 @@ impl AccountsDb {
pubkeys,
&rent_collector,
&timings,
should_calculate_duplicates_lt_hash,
);
let intermediate = DuplicatePubkeysVisitedInfo {
accounts_data_len_from_duplicates,
Expand Down Expand Up @@ -8937,7 +8960,7 @@ impl AccountsDb {
self.accounts_index
.add_uncleaned_roots(uncleaned_roots.into_iter());
accounts_data_len.fetch_sub(accounts_data_len_from_duplicates, Ordering::Relaxed);
if self.is_experimental_accumulator_hash_enabled() {
if let Some(duplicates_lt_hash) = duplicates_lt_hash {
let old_val = outer_duplicates_lt_hash.replace(duplicates_lt_hash);
assert!(old_val.is_none());
}
Expand Down Expand Up @@ -9049,11 +9072,13 @@ impl AccountsDb {
pubkeys: &[Pubkey],
rent_collector: &RentCollector,
timings: &GenerateIndexTimings,
) -> (u64, u64, IntSet<Slot>, Box<DuplicatesLtHash>) {
should_calculate_duplicates_lt_hash: bool,
) -> (u64, u64, IntSet<Slot>, Option<Box<DuplicatesLtHash>>) {
let mut accounts_data_len_from_duplicates = 0;
let mut num_duplicate_accounts = 0_u64;
let mut uncleaned_slots = IntSet::default();
let mut duplicates_lt_hash = Box::new(DuplicatesLtHash::default());
let mut duplicates_lt_hash =
should_calculate_duplicates_lt_hash.then(|| Box::new(DuplicatesLtHash::default()));
Comment on lines +9080 to +9081
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's where we conditionally malloc the LtHash. And the other change in this function can now switch based on if this is Some/None.

let mut removed_rent_paying = 0;
let mut removed_top_off = 0;
let mut lt_hash_time = Duration::default();
Expand Down Expand Up @@ -9097,7 +9122,7 @@ impl AccountsDb {
removed_rent_paying += 1;
removed_top_off += lamports_to_top_off;
}
if self.is_experimental_accumulator_hash_enabled() {
if let Some(duplicates_lt_hash) = duplicates_lt_hash.as_mut() {
let (_, duration) = meas_dur!({
let account_lt_hash =
Self::lt_hash_account(&loaded_account, pubkey);
Expand Down Expand Up @@ -9696,7 +9721,7 @@ pub mod tests {

let genesis_config = GenesisConfig::default();
assert!(!db.accounts_index.contains(&pubkey));
let result = db.generate_index(None, false, &genesis_config);
let result = db.generate_index(None, false, &genesis_config, false);
HaoranYi marked this conversation as resolved.
Show resolved Hide resolved
// index entry should only contain a single entry for the pubkey since index cannot hold more than 1 entry per slot
let entry = db.accounts_index.get_cloned(&pubkey).unwrap();
assert_eq!(entry.slot_list.read().unwrap().len(), 1);
Expand Down Expand Up @@ -9736,7 +9761,7 @@ pub mod tests {
append_vec.accounts.append_accounts(&storable_accounts, 0);
let genesis_config = GenesisConfig::default();
assert!(!db.accounts_index.contains(&pubkey));
let result = db.generate_index(None, false, &genesis_config);
let result = db.generate_index(None, false, &genesis_config, false);
let entry = db.accounts_index.get_cloned(&pubkey).unwrap();
assert_eq!(entry.slot_list.read().unwrap().len(), 1);
assert_eq!(append_vec.alive_bytes(), aligned_stored_size(0));
Expand Down
37 changes: 21 additions & 16 deletions runtime/src/bank.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1731,6 +1731,7 @@ impl Bank {
// When there is a feature gate for the accounts lt hash, if the feature is enabled
// then it will be *required* that the snapshot contains an accounts lt hash.
let accounts_lt_hash = fields.accounts_lt_hash.unwrap_or_else(|| {
info!("Calculating the accounts lt hash...");
let (accounts_lt_hash, duration) = meas_dur!({
thread_pool.install(|| {
bank.rc
Expand All @@ -1742,6 +1743,7 @@ impl Bank {
)
})
});
info!("Calculating the accounts lt hash... Done in {duration:?}");
calculate_accounts_lt_hash_duration = Some(duration);
accounts_lt_hash
});
Expand Down Expand Up @@ -5619,25 +5621,24 @@ impl Bank {
.wait_for_complete();

let slot = self.slot();
let verify_kind = if self
.rc
.accounts
.accounts_db
.is_experimental_accumulator_hash_enabled()
{
VerifyKind::Lattice
} else {
VerifyKind::Merkle
};

if verify_kind == VerifyKind::Lattice {
// Calculating the accounts lt hash from storages *requires* a duplicates_lt_hash.
// If it is None here, then we must use the index instead, which also means we
// cannot run in the background.
if duplicates_lt_hash.is_none() {
let verify_kind = match (
duplicates_lt_hash.is_some(),
self.rc
.accounts
.accounts_db
.is_experimental_accumulator_hash_enabled(),
) {
(true, _) => VerifyKind::Lattice,
(false, false) => VerifyKind::Merkle,
(false, true) => {
// Calculating the accounts lt hash from storages *requires* a duplicates_lt_hash.
// If it is None here, then we must use the index instead, which also means we
// cannot run in the background.
config.run_in_background = false;
VerifyKind::Lattice
}
}
};
Comment on lines +5625 to +5641
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And finally we get to startup verification! We can now primarily use the Option-ness of the passed-in duplicates_lt_hash to decide which kind of accounts verification to do. We still support using the CLI arg as a way to force using the lattice-based verification too (which is useful in tests, and also experimenting on running validators before the feature gate is enabled).


if config.require_rooted_bank && !accounts.accounts_db.accounts_index.is_alive_root(slot) {
if let Some(parent) = self.parent() {
Expand Down Expand Up @@ -5678,6 +5679,10 @@ impl Bank {
use_bg_thread_pool: config.run_in_background,
};

info!(
"Verifying accounts, in background? {}, verify kind: {verify_kind:?}",
config.run_in_background,
);
if config.run_in_background {
let accounts = Arc::clone(accounts);
let accounts_ = Arc::clone(&accounts);
Expand Down
8 changes: 8 additions & 0 deletions runtime/src/serde_snapshot.rs
Original file line number Diff line number Diff line change
Expand Up @@ -884,6 +884,7 @@ where
bank_fields.epoch_accounts_hash,
capitalizations,
bank_fields.incremental_snapshot_persistence.as_ref(),
bank_fields.accounts_lt_hash.is_some(),
)?;

let bank_rc = BankRc::new(Accounts::new(Arc::new(accounts_db)));
Expand Down Expand Up @@ -1049,6 +1050,7 @@ fn reconstruct_accountsdb_from_fields<E>(
epoch_accounts_hash: Option<Hash>,
capitalizations: (u64, Option<u64>),
incremental_snapshot_persistence: Option<&BankIncrementalSnapshotPersistence>,
has_accounts_lt_hash: bool,
) -> Result<(AccountsDb, ReconstructedAccountsDbInfo), Error>
where
E: SerializableStorage + std::marker::Sync,
Expand Down Expand Up @@ -1232,6 +1234,11 @@ where
})
.unwrap();

// When generating the index, we want to calculate the duplicates lt hash value (needed to do
// the lattice-based verification of the accounts in the background) optimistically.
// This means, either when the cli arg is set, or when the snapshot has an accounts lt hash.
let is_accounts_lt_hash_enabled =
accounts_db.is_experimental_accumulator_hash_enabled() || has_accounts_lt_hash;
Comment on lines +1237 to +1241
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the main change that the rest of the PR falls out from: When loading from a snapshot, here is where we generate the index. And generate_index() needs to know if it should calculate the duplicates lt hash value.

Before snapshot support was added for accounts lt hash, the only thing we needed to check was the CLI arg. Now we need to check the CLI arg and the snapshot. Inside generate_index() we don't have a way to know if the accounts lt hash was in the snapshot, so we pass in a new parameter here.

(More info: And since we don't have a Bank instance here yet, we don't know which feature gates are enabled. In typical snapshot style, we use the presence of the accounts lt hash in the snapshot as a proxy for if the feature is enabled or not.)

let IndexGenerationInfo {
accounts_data_len,
rent_paying_accounts_by_partition,
Expand All @@ -1240,6 +1247,7 @@ where
limit_load_slot_count_from_snapshot,
verify_index,
genesis_config,
is_accounts_lt_hash_enabled,
);
accounts_db
.accounts_index
Expand Down
1 change: 1 addition & 0 deletions runtime/src/serde_snapshot/tests.rs
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,7 @@ mod serde_snapshot_tests {
None,
(u64::default(), None),
None,
false,
HaoranYi marked this conversation as resolved.
Show resolved Hide resolved
)
.map(|(accounts_db, _)| accounts_db)
}
Expand Down
Loading