fix: make adversarial_behaviors tests pass with stateless validation #10571
The tests use `process_all_actor_messages()` to pass `PartialChunk` messages between peers. The problem is that this function consumes (but ignores) `ChunkStateWitness` and `ChunkEndorsement` messages. This means that these messages disappear and never reach their destination. Let's use `handle_filtered` to process only `PartialChunk` messages, without consuming any other ones. Thanks to this we will be able to process the other messages later.
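The difference between draining the whole queue and handling only matching messages can be sketched with a toy model. Everything below is hypothetical: the `Message` enum, `process_all` and `handle_matching` are stand-ins invented for illustration and do not reproduce the real signatures of `process_all_actor_messages()` or `handle_filtered`.

```rust
use std::collections::VecDeque;

// Hypothetical stand-ins for the network messages mentioned above; the real
// nearcore types carry full payloads and are defined elsewhere.
#[derive(Debug, Clone, PartialEq)]
enum Message {
    PartialChunk(u64),
    ChunkStateWitness(u64),
    ChunkEndorsement(u64),
}

/// Models the problematic behavior: every message is consumed, but only
/// `PartialChunk` is acted upon, so the other messages are silently lost.
fn process_all(queue: &mut VecDeque<Message>, mut handler: impl FnMut(Message)) {
    while let Some(msg) = queue.pop_front() {
        if matches!(msg, Message::PartialChunk(_)) {
            handler(msg);
        }
    }
}

/// Models the fix: only messages accepted by `filter` are removed and handled;
/// everything else stays in the queue, in its original order, for later.
fn handle_matching<F, H>(queue: &mut VecDeque<Message>, filter: F, mut handler: H)
where
    F: Fn(&Message) -> bool,
    H: FnMut(Message),
{
    let mut remaining = VecDeque::with_capacity(queue.len());
    while let Some(msg) = queue.pop_front() {
        if filter(&msg) {
            handler(msg);
        } else {
            remaining.push_back(msg);
        }
    }
    *queue = remaining;
}
```

In this model, calling `handle_matching` with a filter such as `|m| matches!(m, Message::PartialChunk(_))` leaves the witness and endorsement messages queued, so a later step can still deliver them.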
…ents Some tests produce invalid chunk state witnesses and/or invalid endorsements to test how malicious behavior is handled. In these tests `process_chunk_state_witness` and `process_chunk_endorsement` will sometimes return an error, which is expected. At the moment the functions used to propagate chunk state witnesses and endorsements don't tolerate any errors: they `unwrap()` the result of processing, which causes a panic on any error. Let's add an argument which allows ignoring errors when propagating chunk witnesses and endorsements. This will make it possible to use them in tests which test adversarial behaviors.
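A minimal sketch of the "tolerate errors" argument follows. The names `ValidationError`, `process_witness`, `propagate_witnesses` and the `allow_errors` flag are assumptions made for illustration; the actual helper names and signatures changed by this PR are not shown here.

```rust
// Hypothetical error type; the real processing functions return their own
// error types.
#[derive(Debug, PartialEq)]
struct ValidationError(String);

// Hypothetical stand-in for processing one witness. A `false` input models a
// witness produced by a malicious chunk producer.
fn process_witness(witness_is_valid: bool) -> Result<(), ValidationError> {
    if witness_is_valid {
        Ok(())
    } else {
        Err(ValidationError("invalid chunk state witness".to_string()))
    }
}

/// Propagates a batch of witnesses. With `allow_errors == false` the first
/// failure panics (the old `unwrap()`-like behavior); with `true`, failures
/// are collected and returned so adversarial tests can continue.
fn propagate_witnesses(witnesses: &[bool], allow_errors: bool) -> Vec<ValidationError> {
    let mut errors = Vec::new();
    for &valid in witnesses {
        match process_witness(valid) {
            Ok(()) => {}
            Err(err) if allow_errors => errors.push(err),
            Err(err) => panic!("unexpected processing error: {err:?}"),
        }
    }
    errors
}
```

Keeping the strict mode as an explicit choice means non-adversarial tests still fail loudly on an unexpected error, while adversarial tests opt in to tolerating them.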
Propagate chunk state witnesses and endorsements in `test_non_adversarial_case`, which makes the test pass with stateless validation. This is a non-adversarial case, so there are no invalid/missing chunks in this test.
Fix two tests which test how the blockchain behaves when there is one malicious chunk producer who produces invalid chunks. The tests didn't work with stateless validation. To fix them we first have to properly propagate chunk state witnesses and chunk endorsements. We should allow errors when processing witnesses and endorsements, as the malicious chunk producer will cause some of the validations to fail. Then it's also necessary to adjust the expected behavior. The previous version of the protocol first included new chunks into a block and then validated the whole block. This means that if some chunk was invalid, it would cause the whole block to be invalid, and the block would be skipped. This is why the current code expected the first block with invalid chunks in an epoch to be skipped. With stateless validation, the situation is different. Chunk validators will refuse to send endorsements for the invalid chunk, which means that it won't be included in any blocks, and no blocks will be skipped. The only exception is the block at height 2, which is skipped. I suspect that this is because the first few blocks in the blockchain still use the old validation logic, not stateless validation.
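The two expectations described above can be written down as a small model. This is only a sketch under simplifying assumptions: the function name, the way heights are mapped to epochs (`(height - 1) / epoch_length`), and the input list of heights are all hypothetical and are not taken from the test code.

```rust
use std::collections::BTreeSet;

/// Returns the block heights expected to be skipped.
///
/// `invalid_chunk_heights` lists, in ascending order, the heights at which the
/// malicious chunk producer would contribute an invalid chunk.
/// Assumption: height `h` belongs to epoch `(h - 1) / epoch_length`.
fn expected_skipped_heights(
    invalid_chunk_heights: &[u64],
    epoch_length: u64,
    stateless_validation: bool,
) -> Vec<u64> {
    assert!(epoch_length > 0, "epoch_length must be positive");
    if stateless_validation {
        // Invalid chunks are never endorsed, so no block is skipped, except
        // for the block at height 2 described above.
        return invalid_chunk_heights.iter().copied().filter(|&h| h == 2).collect();
    }
    // Old protocol: only the first block with an invalid chunk in each epoch
    // is skipped; afterwards the producer is banned for the rest of the epoch.
    let mut epochs_seen = BTreeSet::new();
    let mut skipped = Vec::new();
    for &height in invalid_chunk_heights {
        let epoch = height.saturating_sub(1) / epoch_length;
        if epochs_seen.insert(epoch) {
            skipped.push(height);
        }
    }
    skipped
}
```

For example, with invalid chunks at heights 2, 5, 8, 12 and 15 and an epoch length of 10, this model skips heights 2 and 12 under the old protocol and only height 2 with stateless validation.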
I suspect that my attempt to handle genesis state witness spills over to the rest of the code base; taking a look.
Well, yes. In these tests we are lucky enough to have bad CP producing first chunk after genesis, which gets auto-endorsed in our current code (e.g. nearcore/chain/client/src/stateless_validation/chunk_validator.rs Lines 592 to 595 in 2eec0c5
How about we push this PR forward but tackle first-after-genesis chunk validation in a separate PR? I hope it is enough to put genesis chunk state root into main state transition and skip "apply new chunk" step. I wasn't seeing any bans in logs, however.
Thanks for the explanation. I updated the comment to better reflect what's going on and added a TODO(#10502) to remove this logic once we are able to properly handle the state witness for the first blocks. The test checks the current behavior, so that seems ok. Later, when the behavior changes, we will also change the test.
In my logs (log.txt.zip) I can see the following lines:
0.333s ERROR start_process_block{provenance=PRODUCED block_height=2}: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.337s ERROR start_process_block{provenance=NONE block_height=2}: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.341s ERROR start_process_block{provenance=NONE block_height=2}: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.352s ERROR start_process_block{provenance=NONE block_height=2}: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.372s ERROR process_blocks_with_missing_chunks: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.386s ERROR process_blocks_with_missing_chunks: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.404s ERROR process_blocks_with_missing_chunks: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
0.415s ERROR process_blocks_with_missing_chunks: client: Banning chunk producer for producing invalid chunk chunk_producer=AccountId("test7") epoch_id=EpochId(11111111111111111111111111111111) chunk_hash=ChunkHash(6zGqyVzHRB1cgjCGq1DLfvdanbtrTyPAdcSejHEFPVMv)
Generated using: cargo nextest run --package integration-tests --features nightly,test_features --nocapture test_banning_chunk_producer_when_seeing_invalid_chunk 2>&1 | tee log.txt
Oh, that's right... UPD: could be a good idea to completely remove such unused logic after stateless validation release
I read a bit into the code and I think this is actually the second chunk after genesis.
0.180s DEBUG custominfo: Produced block. Height: 1 Prev block hash: HppBDhchEtKYbxDuoKFa112aEK4j6KNYWmtNr5BjjRLq
0.322s DEBUG custominfo: Produced block. Height: 2 Prev block hash: CVWFUwZDcvT4MUZQpXGQUZ6U8pZNuCr2EcGo2pVZH1qn
The invalid chunk is in block
Block 1 never has chunks, that's another quirk of the code.
Ahh ok. When there's no chunk in block
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #10571 +/- ##
==========================================
+ Coverage 71.92% 71.97% +0.04%
==========================================
Files 724 724
Lines 147133 147140 +7
Branches 147133 147140 +7
==========================================
+ Hits 105830 105908 +78
+ Misses 36437 36380 -57
+ Partials 4866 4852 -14
Description
The tests in `adversarial_behaviors.rs` didn't work with stateless validation, let's make them pass again.
To achieve this we first need to propagate chunk state witnesses and endorsements, and then adjust the expected behavior to match the one exhibited by stateless validation.
The most complex part of this PR is modifying which blocks should be skipped in `test_banning_chunk_producer_when_seeing_invalid_chunk_base()`.
This function creates a blockchain with one malicious chunk producer and runs it for a few epochs.
Which blocks will be skipped depends on whether we are using stateless validation or not.
Without stateless validation
In the old version of the protocol the invalid chunk is included in a block and then the whole block is validated. The chunk is invalid, so the whole block is also invalid and the block should be skipped. After that the malicious chunk producer is banned for the rest of the epoch, which means that block producers will reject chunks sent by this chunk producer, so the subsequent blocks won't contain any invalid chunks and won't be skipped.
Only the first block with invalid chunks in an epoch should be skipped.
With stateless validation
With stateless validation the situation is entirely different. When the malicious chunk producer produces an invalid chunk, chunk validators will refuse to send out endorsements for this chunk, so the chunk won't be included in any block. This means that no blocks should be skipped.
There is one exception: in the test the block produced at height 2 is invalid and gets skipped. I suspect that this is because the first few blocks after genesis don't use stateless validation and are still handled by the old protocol. I'm not 100% sure if that's the case, @Longarithm could you confirm that this explanation makes sense?
Something is definitely different for the first few blocks: in the logs I can see `client: Banning chunk producer for producing invalid` at height 2, which seems to belong to the old protocol. In the new version malicious chunk producers are banned using `BanPeer`, without any log messages.
Refs: #10506, zulip discussion