Verify number of hashes for each block of entries #6262

Merged: 3 commits from verify-num-hashes merged into solana-labs:master on Oct 31, 2019

Conversation

@jstarry (Member) commented Oct 7, 2019

Problem

We don't currently verify the number of ticks per block or the number of hashes per tick.

Summary of Changes

  • Update create_ticks to accept number of hashes per tick
  • Verify tick count and hash count when processing entry slices (see the sketch below)
  • Verify that all blocks end in a tick entry
  • Remove BlobError and replace it with BlockError
  • Add granularity to BlocktreeProcessorError

Fixes #4282
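
As a rough illustration of the checks listed above, here is a minimal sketch of how an entry slice could be validated. The `Entry` shape, the error variants, and the function name are simplified placeholders, not the PR's actual code; in the final PR the checks run both when the blocktree processor loads the ledger and in the replay stage.

```rust
#[derive(Debug, PartialEq)]
enum BlockError {
    InvalidTickHashCount,
    InvalidTickCount,
    TrailingEntry,
}

// Simplified stand-in for the ledger Entry type.
struct Entry {
    num_hashes: u64,       // PoH hashes since the previous entry
    transactions: Vec<()>, // placeholder; an empty vec means this entry is a tick
}

impl Entry {
    fn is_tick(&self) -> bool {
        self.transactions.is_empty()
    }
}

// Verify that every tick lands exactly `hashes_per_tick` hashes after the previous
// tick, that the block carries exactly `ticks_per_slot` ticks, and that the block
// ends on a tick entry (no trailing transaction entries).
fn verify_block_entries(
    entries: &[Entry],
    hashes_per_tick: u64,
    ticks_per_slot: u64,
) -> Result<(), BlockError> {
    let mut hashes_since_tick = 0u64;
    let mut tick_count = 0u64;

    for entry in entries {
        hashes_since_tick += entry.num_hashes;
        if entry.is_tick() {
            if hashes_per_tick != 0 && hashes_since_tick != hashes_per_tick {
                return Err(BlockError::InvalidTickHashCount);
            }
            hashes_since_tick = 0;
            tick_count += 1;
        }
    }

    if tick_count != ticks_per_slot {
        return Err(BlockError::InvalidTickCount);
    }
    if entries.last().map_or(true, |e| !e.is_tick()) {
        return Err(BlockError::TrailingEntry);
    }
    Ok(())
}
```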

@jstarry force-pushed the verify-num-hashes branch 2 times, most recently from 23a43d3 to 51b67b2 on October 8, 2019.
@codecov bot commented Oct 8, 2019

Codecov Report

Merging #6262 into master will increase coverage by 6.3%.
The diff coverage is 94.7%.

@@           Coverage Diff            @@
##           master   #6262     +/-   ##
========================================
+ Coverage    72.2%   78.5%   +6.3%     
========================================
  Files         212     213      +1     
  Lines       44405   41129   -3276     
========================================
+ Hits        32076   32308    +232     
+ Misses      12329    8821   -3508

@jstarry marked this pull request as ready for review on October 17, 2019.
@jstarry requested review from rob-solana and carllin on October 17, 2019.
@mvines (Member) left a comment

It looks like we're only verifying the hash count when blocktree processor first loads the ledger from disk. We also need to verify the hash count in replay stage as each new block is ingested

@@ -299,6 +287,20 @@ impl EntrySlice for [Entry] {
);
res
}

fn validate_hash_separation(&self, hashes_per_tick: u64) -> bool {
Member commented

I'm not a fan of the name of this function: validate_hash_separation()

validate_hash_count()?

@jstarry (Member, Author) replied

Sure, do you think it would be surprising that validate_hash_count rejects an entry slice that doesn't end in a tick entry? I want to enforce that, but maybe it would be better to do it somewhere else.

Member replied

I don't think we want to do that. There's no guarantee an entry slice always ends with a tick

@jstarry (Member, Author) replied

Ok, what about slots? Should we enforce that they end in ticks?

Member replied

I don't think we need that, since the number of ticks in a slot is computed dynamically by the bank on each node. The incoming ledger data itself has no "slot marker".
e.g.,

solana/runtime/src/bank.rs

Lines 807 to 810 in 5468be2

} else if current_tick_height % self.ticks_per_slot == 0 {
// Register a new block hash if at the last tick in the slot
Some(self.blockhash_queue.write().unwrap())
} else {
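
The excerpt above shows why: block completion is detected from bank state (tick height against ticks_per_slot), not from any end-of-slot marker in the entries. A minimal sketch of that idea, with illustrative names rather than the bank's exact fields:

```rust
// Illustrative only: the bank derives how many ticks a slot must contain from its
// own configuration; the ledger data carries no explicit end-of-slot marker.
struct BankSketch {
    slot: u64,
    ticks_per_slot: u64,
}

impl BankSketch {
    // Tick height at which this bank's slot is complete.
    fn max_tick_height(&self) -> u64 {
        (self.slot + 1) * self.ticks_per_slot
    }

    // A replayed block is considered full once its tick height reaches this value.
    fn is_complete(&self, tick_height: u64) -> bool {
        tick_height == self.max_tick_height()
    }
}
```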

Contributor commented

@aeyakovenko, @rob-solana, maybe marking this type of failure as a dead slot isn't the correct approach? Summary of concerns here: #6498

@jstarry (Member, Author) replied

@carllin so from your last comment here (#6498 (comment)), is the conclusion that this PR is fine with the last-entry check? Between #6498 and #6512 I think we're covered.

@carllin (Contributor) commented Oct 24, 2019

@jstarry So from my understanding, in this PR the situation that will trigger BlockError::TrailingEntry in both replay_stage and blocktree_processor is:

Observed max_tick_height, and the last entry was not a tick (extra trailing entry after the last tick)

This is a slightly larger set of conditions under which BlockError::TrailingEntry should be triggered than what was proposed in #6512, which was:

Observed max_tick_height, and the last tick shred was not marked with LAST_SHRED_IN_SLOT.

Essentially, the difference between the proposal and what's in this PR is: if the last tick shred was marked with LAST_SHRED_IN_SLOT, the slot shouldn't be marked as dead, even if there are trailing entries (because there might be a subset of this slot's transmission that reaches consensus). If the last tick shred wasn't marked with LAST_SHRED_IN_SLOT, then the slot should be marked as dead.
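
Condensing that comparison into code (purely illustrative predicates, not anything that exists in this PR or in #6512):

```rust
// This PR: once max_tick_height is observed, any non-tick entry after the last
// tick (a trailing entry) marks the slot dead.
fn trailing_entry_violation(reached_max_tick_height: bool, last_entry_is_tick: bool) -> bool {
    reached_max_tick_height && !last_entry_is_tick
}

// #6512 proposal: the slot is marked dead only if the last tick shred was not
// flagged LAST_SHRED_IN_SLOT; trailing entries alone would not kill the slot.
fn last_shred_violation(reached_max_tick_height: bool, last_tick_shred_flagged_last: bool) -> bool {
    reached_max_tick_height && !last_tick_shred_flagged_last
}
```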

Contributor replied

@carllin the reason there's a separate issue for the "last tick needs to coincide with last shred" is that this PR doesn't cover it yet, and can land on its own.

Contributor replied

yup agreed, just summarizing the difference for clarity.

@jstarry (Member, Author) commented Oct 18, 2019

> It looks like we're only verifying the hash count when blocktree processor first loads the ledger from disk. We also need to verify the hash count in replay stage as each new block is ingested

Yes, my mistake, working on a fix now


EDIT: Fixed

@jstarry force-pushed the verify-num-hashes branch 4 times, most recently from ef84bdb to 64e4727 on October 20, 2019.
core/src/replay_stage.rs (outdated review thread, resolved)
}

#[test]
fn test_dead_fork_invalid_slot_tick_count() {
@jstarry (Member, Author) commented

@rob-solana in response to this comment: #6262 (comment)...

I have a test here for making sure the fork is marked as dead. Looks like mark_dead_slot sets the BankProgress to dead as well as the blocktree slot 👍

ledger/src/entry.rs (outdated review thread, resolved)
core/src/replay_stage.rs (outdated review thread, resolved)
@jstarry force-pushed the verify-num-hashes branch 2 times, most recently from 694a3b3 to 05685bb on October 22, 2019.
@jstarry force-pushed the verify-num-hashes branch 2 times, most recently from d5f8650 to e1fa7dd on October 24, 2019.
@carllin previously approved these changes on Oct 24, 2019.
@jstarry (Member, Author) commented Oct 24, 2019

Hm CI says not so fast, investigating...

@carllin (Contributor) commented Oct 24, 2019

wow @sagar-solana local-cluster catching things 😛

@mergify bot dismissed carllin's stale review on October 25, 2019: "Pull request has been modified."

@jstarry (Member, Author) commented Oct 25, 2019

@carllin pretty sure it's fixed, CI is just having timeout issues now. Can you review my last change? fad6bda

@jstarry merged commit e8e5ddc into solana-labs:master on Oct 31, 2019.
Successfully merging this pull request may close these issues:

  • Need to verify a block contains the expected number of hashes (#4282)