Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix tpu tvu bank race #2707

Merged
merged 2 commits into from
Feb 11, 2019
Merged

fix tpu tvu bank race #2707

merged 2 commits into from
Feb 11, 2019

Conversation

carllin
Copy link
Contributor

@carllin carllin commented Feb 8, 2019

Problem

The TVU bank could potentially be behind the TPU bank at the time of a leader -> validator transition. In the case of a leader -> leader transition, the TPU bank is then copied from the trailing TVU bank, and broadcast is started up with a last id that is not up to date (causing verification failures) and banking_stage threads are started up with an old bank state.

Summary of Changes

Wait for the TVU bank to catch up to current tick height before performing rotation.

Fixes #

@garious
Copy link
Contributor

garious commented Feb 8, 2019

Can you add a test for this case?

@codecov
Copy link

codecov bot commented Feb 8, 2019

Codecov Report

Merging #2707 into master will increase coverage by 0.7%.
The diff coverage is 95.6%.

@@           Coverage Diff            @@
##           master   #2707     +/-   ##
========================================
+ Coverage    77.7%   78.4%   +0.7%     
========================================
  Files         114     114             
  Lines       18864   18962     +98     
========================================
+ Hits        14663   14873    +210     
+ Misses       4201    4089    -112

Copy link
Member

@mvines mvines left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't land this without a test. I removed roughly this from my de-bootstraping PR because no test told me to keep it.

@@ -319,6 +320,10 @@ impl Fullnode {
tick_height,
);

while self.bank.tick_height() < tick_height {
sleep(Duration::from_millis(10));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's gotta be a smarter way to do this rather than a semi-random sleep too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but that only further justifies a test. A smarter way can be postponed if we add a good test now.

@mvines
Copy link
Member

mvines commented Feb 9, 2019

@carllin so this patch rolls into the work I’m doing with stressing fullnodes with super low ticks per slot. Want me to take from here?

@carllin
Copy link
Contributor Author

carllin commented Feb 9, 2019

@mvines, I'm currently working on a test for this case, so I'll just push that once its ready?

@carllin carllin force-pushed the RotationFix branch 2 times, most recently from ecdd291 to 67b15f3 Compare February 10, 2019 04:52
@carllin carllin merged commit 4b38ecd into solana-labs:master Feb 11, 2019
andreisilviudragnea pushed a commit to andreisilviudragnea/solana that referenced this pull request Aug 23, 2024
…bs#2707)

ff cleanup: remove enable_gossip_duplicate_proof_ingestion, it's active everywhere.
jeffwashington pushed a commit to jeffwashington/solana that referenced this pull request Aug 27, 2024
…bs#2707)

ff cleanup: remove enable_gossip_duplicate_proof_ingestion, it's active everywhere.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants