-
Notifications
You must be signed in to change notification settings - Fork 622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: properly handle genesis as part of stateless validation #10633
Conversation
a8575d1
to
bddbc41
Compare
bddbc41
to
6de4311
Compare
@staffik @jancionear could you review please? If you LGTM it then I'll stamp an approval as well. |
2d92a9d
to
0a408eb
Compare
0a408eb
to
e974b63
Compare
@pugachAG please let me know when I should proceed with review, I see quite of lot changes keep being added now. |
53d8e08
to
dbcfa97
Compare
dbcfa97
to
c5b2c5e
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #10633 +/- ##
==========================================
- Coverage 72.42% 72.38% -0.05%
==========================================
Files 729 729
Lines 148996 149046 +50
Branches 148996 149046 +50
==========================================
- Hits 107912 107884 -28
- Misses 36207 36263 +56
- Partials 4877 4899 +22
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
@staffik @jancionear this PR is ready for review now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// would be skipped when using stateless validation. | ||
this_block_should_be_skipped = height < 3; | ||
// included in the block at all. | ||
this_block_should_be_skipped = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case uses_stateless_validation
it looks like we do not even need to run the containing loop.
What about if height > 1 && !uses_stateless_validation {
and move the comment above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like we can skip the loop since it also updates invalid_chunks_in_this_block
which is used later.
I've updated this to avoid setting this_block_should_be_skipped = false
here since it is noop: 10a99ff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks nice, left some comments
chain/epoch-manager/src/adapter.rs
Outdated
/// This function is only needed to have a 'return true' implementation | ||
/// as part of MockEpochManager to make tests work with stateless validation | ||
/// enabled. | ||
/// TODO(#10640): remove this after we get rid of MockEpochManager | ||
fn is_chunk_producer_for_epoch( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is also used in Chain::should_produce_state_witness_for_this_or_next_epoch
, so I think the comment is wrong. It still will be needed after MockEpochManager
is removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see how it can be confusing. What I meant here is that it doesn't have to be a part of EpochManagerAdapter
trait. I guess it is not that important, so I will just drop the comment and close the issue.
if block.header().is_genesis() { | ||
assert_eq!(receipt_source_blocks.len(), 1); | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel uneasy about having an assert in the validation code. I think it would be better to have an error, there's no reason to risk a panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Maybe it would be nicer to have an explicit condition: if witnes for genesis { source receipts should be empty}
. I feel that it would be easier to reason about than the current solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assert is not part of validation logic, it ensures that our chain is in the valid state (receipt_source_blocks
is not calculated using our own chain state, not state witness). I will replace it with proper error handling so you feel better about it 😉
Maybe it would be nicer to have an explicit condition
It is a bit more code, but indeed should be easier to understand, done
@@ -280,7 +280,7 @@ impl ViewClientActor { | |||
let cps: Vec<AccountId> = shard_ids | |||
.iter() | |||
.map(|&shard_id| { | |||
let cp = epoch_info.sample_chunk_producer(block_height, shard_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to unwrap()
here? The function returns an Error, so maybe it would be better to return an error.
(Also applies to other unwraps()
on the new sample_chunk_producer
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shard id should be valid here let shard_ids = self.epoch_manager.shard_ids(&epoch_id)?
returning error requires a bit of refactoring here since it is part of map
, so we cannot just add ?
, so probably should be done in a separate PR
let sample = v4.chunk_producers_sampler.get(shard_id)?.sample(seed); | ||
v4.chunk_producers_settlement.get(shard_id)?.get(sample).copied() | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_chunk_state_witness_bad_shard_id test started failing: this actually uncovered a real issue which could result in crashing chunk validator when state witness contains invalid shard id, fixed in c5b2c5e
So IIUC we are now validating the shard_id
of ChunkStateWitness
by trying to sample a chunk producer for this shard_id
. That feels a bit hacky, maybe we should have some function that prevalidates the witness header (or just shard_id) before doing anything else. Sampling the producer could potentially work with invalid shard ids (e.g producers[(height + shard) % producers.len()
), so it's not obvious that it can be done with untrusted data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's another time that an invalid shard_id could cause a crash (#10621 ...), I think it would be good to go through the code and just remove all usages of [shard_id]
. It causes problems now, and it's gonna be even worse with resharding :/ I'll make an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #10648
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving to unblock
@@ -118,24 +118,6 @@ impl ChunkStateWitnessInner { | |||
} | |||
} | |||
|
|||
impl ChunkStateWitness { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yayy!
Noticed the introduction of `is_genesis` function in #10633 This PR tries to use the `is_genesis` function in place of comparing prev_hash with CryptoHash::default()
This PR removes "approve anything" shortcut when previous chunk is part of genesis. The only difference in case of genesis is that we don't want to execute main state transition and instead just check that post state root matches genesis state root for that shard.
This also exposed some issue I had to fix to make tests work:
MockEpochManager::get_epoch_chunk_producers
returns empty Vec which results inChain:: should_produce_state_witness_for_this_or_next_epoch
returningfalse
. Fixed by addingis_chunk_producer_for_epoch
as part ofEpochManagerAdapter
soMockEpochManager
can override it.test_chunk_state_witness_bad_shard_id
test started failing: this actually uncovered a real issue which could result in crashing chunk validator when state witness contains invalid shard id, fixed in c5b2c5eCloses #10502.