
Avoid to miss to root for local slots before the hard fork #19912

Merged: 7 commits merged into solana-labs:master from no-root-gap-before-hard-fork on Jun 26, 2022

Conversation

@ryoqun (Member) commented Sep 15, 2021

Problem

some rooted slots could be omitted from being marked as roots after a restart (hard fork), if the hard fork slot is ahead of the validator's latest ledger root before the restart.

this is usually the case, because the current practice for cluster restarts is based on the optimistically-confirmed slot (i.e. no node should have rooted that slot yet).

ultimately, this leads to a hole in bigtable.

cc: @mvines @sakridge @carllin

Summary of Changes

fix it. (this was wip while i worked on fixing the actual hole; the impl is now finished)

related: #13536

@ryoqun ryoqun marked this pull request as draft September 15, 2021 13:08
codecov bot commented Sep 15, 2021

Codecov Report

Merging #19912 (c924ff8) into master (8caced6) will decrease coverage by 0.2%.
The diff coverage is 76.9%.

❗ Current head c924ff8 differs from pull request most recent head 34ad3d5. Consider uploading reports for the commit 34ad3d5 to get more accurate results

@@            Coverage Diff            @@
##           master   #19912     +/-   ##
=========================================
- Coverage    82.1%    81.9%   -0.3%     
=========================================
  Files         628      631      +3     
  Lines      171471   174118   +2647     
=========================================
+ Hits       140878   142615   +1737     
- Misses      30593    31503    +910     

core/src/validator.rs (outdated, resolved)
Comment on lines 1296 to 1300
reconcile_blockstore_roots_with_external_source(
ExternalRootSource::HardFork(hard_fork_restart_slot),
&blockstore,
&mut last_blockstore_root,
)
ryoqun (Member Author):

so this is the meat of this pr

"Reconciling slots as root based on tower root: {:?} ({}..{}) ",
new_roots, tower_root, last_blockstore_root
"Reconciling slots as root based on external root: {:?} ({}..{}) ",
new_roots, external_root, last_blockstore_root
);
blockstore.set_roots(new_roots.iter())?;
ryoqun (Member Author):

@carllin I wonder whether this .set_roots(...) should be accompanied by set_duplicate_confirmed_slots_and_hashes(...), like in blockstore_processor:

blockstore.set_duplicate_confirmed_slots_and_hashes(rooted_slots.into_iter()).expect("Blockstore::set_duplicate_confirmed should succeed");

Contributor:

@ryoqun ah yeah it should, good catch!

Although I think you'll see that there are no calls to fn set_duplicate_confirmed_slots_and_hashes in 1.6 and 1.7 currently 😄

ryoqun (Member Author):

after some code reading, i had to make a kind of compromise regarding the treatment of duplicate-confirmed slots: https://github.com/solana-labs/solana/pull/19912/files#r902224585

so, i've defined a small fn (mark_slots_as_if_rooted_normally_at_startup) for the sole sake of documentation, to make the nuance very explicit, to my taste. i hope it'll be more immune to comment rot. ;)

core/src/consensus.rs (outdated, resolved)
core/src/validator.rs (outdated, resolved)
@ryoqun ryoqun requested review from AshwinSekar and removed request for jbiseda September 28, 2021 06:24
stale bot commented Jun 12, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale bot added the stale label Jun 12, 2022
@CriesofCarrots (Contributor) commented Jun 13, 2022

@ryoqun , any chance of resuscitating this??

stale bot removed the stale label Jun 13, 2022
@ryoqun (Member Author) commented Jun 14, 2022

@ryoqun , any chance of resuscitating this??

@CriesofCarrots thanks for caring about this pr.

yeah, i haven't forgotten this pr at all... it's so stressful that i couldn't get to it for this extended period of time. I'm really planning to work on it this week.

@ryoqun ryoqun force-pushed the no-root-gap-before-hard-fork branch from 350c018 to cb4e410 Compare June 17, 2022 07:27
@ryoqun ryoqun force-pushed the no-root-gap-before-hard-fork branch from 0ef68ea to c924ff8 Compare June 18, 2022 05:22
@ryoqun (Member Author) commented Jun 20, 2022

status: I've resuscitated this pr, with all review so far addressed. I'll write replies to review comments later.

Comment on lines +1371 to +1379
// Unfortunately, we can't supply duplicate-confirmed hashes,
// because it can't be guaranteed to be able to replay these slots
// under this code-path's limited condition (i.e. those shreds
// might not be available, etc...) also correctly overcoming this
// limitation is hard...
blockstore.mark_slots_as_if_rooted_normally_at_startup(
new_roots.into_iter().map(|root| (root, None)).collect(),
false,
)?;
ryoqun (Member Author):

here

Contributor:

hmmm yeah this is yet another tricky edge case to reason about, but I think it should be ok.

Not marking these slots as duplicate confirmed just means:

  1. we can't serve requests to people asking for the correct version of these slots via AncestorHashesRepairType
  2. Should be safe as long as we're not freezing the slot again during replay (not happening b/c rooted)

Comment on lines +1346 to +1348
// blockstore.last_root() might have been updated already.
// so take a &mut param both to input (and output iff we update root)
last_blockstore_root: &mut Slot,
ryoqun (Member Author):

there

@@ -430,7 +429,7 @@ pub mod tests {
OptimisticallyConfirmedBank::locked_from_bank_forks_root(&bank_forks),
)),
&poh_recorder,
tower,
None,
ryoqun (Member Author):

here

info!("Tower state: {:?}", tower);
tower
} else {
warn!("creating default tower....");
ryoqun (Member Author):

here


impl<'a> From<ProcessBlockStore<'a>> for Tower {
@ryoqun (Member Author) commented Jun 21, 2022:

@mvines I don't think this is a correct use of From, which was introduced in #23852 (https://github.com/solana-labs/solana/pull/23852/files#r902237878).

This converts quite different things; ideally, From should be a cast between conceptually similar things. Also, it made code reading a bit harder, because I can't easily spot when ProcessBlockStore::process is actually executed (i.e. searching for the correct .into() is hard).

it seems this trait workaround is only needed for the test test_tvu_exit, and i think an alternative approach should arguably be tolerable (still not ideal; i skipped properly introducing an enum or trait just for this purpose): https://github.com/solana-labs/solana/pull/19912/files#r902243960

btw, i didn't know there is an identity/no-op blanket From/Into impl provided here: https://doc.rust-lang.org/src/core/convert/mod.rs.html#559

// From (and thus Into) is reflexive
#[stable(feature = "rust1", since = "1.0.0")]
#[rustc_const_unstable(feature = "const_convert", issue = "88674")]
impl<T> const From<T> for T {
    /// Returns the argument unchanged.
    fn from(t: T) -> T {
        t
    }
}

granted, this refactor isn't relevant to this pr, but i couldn't resist the urge while code-reading to convince myself of the safety of the scary blockstore mutation with hard fork params.

happy to separate this change into another preparatory pr, if you desire. :)

Member:

Sure, this looks fine to me

@ryoqun ryoqun requested a review from carllin June 21, 2022 07:27
@ryoqun (Member Author) commented Jun 21, 2022

status: ... I'll write replies to review comments later.

this is finally done!

@carllin could you review this? cc: @CriesofCarrots

@ryoqun (Member Author) commented Jun 26, 2022

sanity checked and got:

$ ./target/release/solana-ledger-tool -l config/ledger bounds
[2022-06-26T05:34:56.257536980Z INFO  solana_ledger_tool] solana-ledger-tool 1.11.0 (src:34ad3d51; feat:528226107)
[2022-06-26T05:34:56.257639239Z INFO  solana_ledger::blockstore] Maximum open file descriptors: 500000
[2022-06-26T05:34:56.257647743Z INFO  solana_ledger::blockstore] Opening database at "/home/ryoqun/work/solana/solana/config/ledger/rocksdb"
[2022-06-26T05:34:56.257664842Z INFO  solana_ledger::blockstore_db] Disabling rocksdb's automatic compactions...
[2022-06-26T05:34:56.258457406Z INFO  solana_ledger::blockstore_db] Opening Rocks with secondary (read only) access at: "/home/ryoqun/work/solana/solana/config/ledger/rocksdb/solana-secondary"
[2022-06-26T05:34:56.258475519Z INFO  solana_ledger::blockstore_db] This secondary access could temporarily degrade other accesses, such as by solana-validator
[2022-06-26T05:34:56.293056701Z INFO  solana_ledger::blockstore] "/home/ryoqun/work/solana/solana/config/ledger/rocksdb" open took 35ms
Ledger has data for 225 slots 0 to 224
  with 158 rooted slots from 0 to 192
  and 32 slots past the last root


...

[2022-06-26T05:35:39.983289202Z INFO  solana_ledger::blockstore_processor] ledger processed in 157 µs and 890 ns. root slot is 223, 1 bank: 223
[2022-06-26T05:35:39.983289475Z INFO  solana_metrics::metrics] datapoint: process_blockstore_from_root total_time_us=157i frozen_banks=1i slot=223i forks=1i calculate_capitalization_us=23321i
[2022-06-26T05:35:39.983590734Z INFO  solana_core::consensus] adjusting lockouts (after replay up to 223): [193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223] tower root: 192 replayed root: 223
[2022-06-26T05:35:39.983611982Z INFO  solana_core::consensus] adjusted tower's anchored slot: Some(223)
[2022-06-26T05:35:39.983616519Z INFO  solana_core::consensus] All restored votes were behind; resetting root_slot and last_vote in tower!
[2022-06-26T05:35:39.983665577Z ERROR solana_core::validator] Hard fork is detected; discarding tower restoration result: Ok(Tower { node_pubkey: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm, threshold_depth: 8, threshold_size: 0.6666666666666666, vote_state: VoteState { node_pubkey: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7h
gBhAN68zVm, authorized_withdrawer: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm, commission: 44, votes: [], root_slot: Some(223), authorized_voters: AuthorizedVoters { authorized_voters: {0: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm} }, prior_voters: CircBuf { buf: [(11111111111111111111111111111111, 0, 0), (111111
11111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (1111111111111111111111111111111
1, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111
111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0)
, (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (1111111111111111111111
1111111111, 0, 0)], idx: 31, is_empty: true }, epoch_credits: [(0, 156, 0)], last_timestamp: BlockTimestamp { slot: 0, timestamp: 0 } }, last_vote: Vote(Vote { slots: [], hash: 11111111111111111111111111111111, timestamp: None }), last_vote_tx_blockhash: 11111111111111111111111111111111, last_timestamp: BlockTimestamp
 { slot: 222, timestamp: 1656221556 }, stray_restored_slot: None, last_switch_threshold_check: None })
[2022-06-26T05:35:39.983707287Z ERROR solana_core::validator] Rebuilding a new tower from the latest vote account due to failed tower restore: The tower is useless because of new hard fork: 223
[2022-06-26T05:35:39.983671541Z ERROR solana_metrics::metrics] datapoint: tower_error error="Hard fork is detected; discarding tower restoration result: Ok(Tower { node_pubkey: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm, threshold_depth: 8, threshold_size: 0.6666666666666666, vote_state: VoteState { node_pubkey: 7eV
T4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm, authorized_withdrawer: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm, commission: 44, votes: [], root_slot: Some(223), authorized_voters: AuthorizedVoters { authorized_voters: {0: 7eVT4kj1gJ7dij8bDEeqmAYcWPP7jyDV7hgBhAN68zVm} }, prior_voters: CircBuf { buf: [(11111111111111111
111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (
11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (1111111111111111111111111
1111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111
111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111, 0, 0), (11111111111111111111111111111111,
 0, 0), (11111111111111111111111111111111, 0, 0)], idx: 31, is_empty: true }, epoch_credits: [(0, 156, 0)], last_timestamp: BlockTimestamp { slot: 0, timestamp: 0 } }, last_vote: Vote(Vote { slots: [], hash: 11111111111111111111111111111111, timestamp: None }), last_vote_tx_blockhash: 11111111111111111111111111111111,
 last_timestamp: BlockTimestamp { slot: 222, timestamp: 1656221556 }, stray_restored_slot: None, last_switch_threshold_check: None })"
[2022-06-26T05:35:39.983734623Z ERROR solana_metrics::metrics] datapoint: tower_error error="Unable to restore tower: The tower is useless because of new hard fork: 223"
[2022-06-26T05:35:39.983949656Z INFO  solana_core::consensus] Reconciling slots as root based on external root: [223, 222, 221, 220, 219, 218, 217, 216, 215, 214, 213, 212, 211, 210, 209, 208, 207, 206, 205, 204, 203, 202, 201, 200, 199, 198, 197, 196, 195, 194, 193] (external: HardFork(223), blockstore: 192)
[2022-06-26T05:35:39.984036580Z INFO  solana_core::validator] Waiting for 80% of activated stake at slot 223 to be in gossip...
[2022-06-26T05:35:39.984046941Z INFO  solana_core::validator] Supermajority reached, 100% active stake detected, starting up now.

seems to be working properly.

@ryoqun ryoqun merged commit cd2878a into solana-labs:master Jun 26, 2022
@ryoqun (Member Author) commented Feb 26, 2023

seems this worked for the 2023-02-26 mainnet-beta outage:

$ grep -E "Reconcil|storage-bigtable-upload-block.*17952640" logs/solana-validator.log | sort -k 6
[2023-02-26T01:36:36.707933988Z INFO  solana_core::consensus] Reconciling slots as root based on external root: [179526403, 179526402, 179526401, 179526400] (external: HardFork(179526403), blockstore: 179526391)
[2023-02-26T01:59:17.685949819Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526400i transactions=43i bytes=84527i
[2023-02-26T01:59:22.383623281Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526401i transactions=3353i bytes=1601930i
[2023-02-26T01:59:19.889260232Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526402i transactions=2400i bytes=1395552i
[2023-02-26T01:59:19.276003004Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526403i transactions=2335i bytes=857771i
[2023-02-26T01:59:17.685875247Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526404i transactions=0i bytes=116i
[2023-02-26T01:59:17.686201328Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526405i transactions=958i bytes=391503i
[2023-02-26T01:59:17.686394797Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526406i transactions=1460i bytes=574413i
[2023-02-26T01:59:17.685981530Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526407i transactions=254i bytes=139982i
[2023-02-26T01:59:17.686645294Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526408i transactions=2715i bytes=1006228i
[2023-02-26T01:59:17.686236445Z INFO  solana_metrics::metrics] datapoint: storage-bigtable-upload-block slot=179526409i transactions=906i bytes=370063i

Comment on lines +2363 to +2364
assert_eq!(&slots_a[slots_a.len() - roots_a.len()..].to_vec(), &roots_a);
assert_eq!(&slots_b[slots_b.len() - roots_b.len()..].to_vec(), &roots_b);
ryoqun (Member Author):

here

Comment on lines +217 to +221
/// Dangerous. Currently only needed for a local-cluster test
pub fn unset_parent(&mut self) {
self.parent_slot = None;
}
