Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report malice on sibling blocks from the same validator #196

Merged
merged 4 commits into from
Oct 22, 2019

Conversation

afck
Copy link
Collaborator

@afck afck commented Oct 18, 2019

Backport of openethereum#11160

@varasev
Copy link

varasev commented Oct 18, 2019

I fixed gnosischain/posdao-test-setup#60 a bit and tried to use that for manual testing this PR.

Unfortunately, it seems that the duplicated node is not reported.

The steps to reproduce:

  1. Build this branch with cargo build --release --features final: https://github.com/poanetwork/parity-ethereum/tree/afck-siblings

  2. Clone https://github.com/poanetwork/posdao-test-setup/tree/vk-sibling-blocks and start that with npm run all

  3. On the block #8 or later launch the duplicated node (node9 duplicates the node3):

    ../parity-ethereum/target/release/parity --config "./config/node9.toml" >> "./parity-data/node9/log" 2>&1 &
    
  4. After that we expect that there should be the following error in the logs: Validator 0x522d…89b2 produced sibling blocks in the same step, but I don't see that.

In node3's logs I see the following warning for some blocks as expected:

WARN ethcore::engines::authority_round  Attempted to seal block on the same step as parent. Is this authority sealing with more than one node?

@varasev
Copy link

varasev commented Oct 18, 2019

Fixed in poanetwork/posdao-contracts@570f377

Also, I caught some weird case (only once, and cannot reproduce that again):

After I started the duplicated node on the block ~8, I saw the following error in the logs of validator nodes 1, 2, 3:

2019-10-18 14:21:20  Reported malicious validator 0x522d…89b2 at block 1
2019-10-18 14:21:20  Failed to query malice reports "please ensure the contract and method you\'re calling exist! failed to decode empty bytes. if you\'re using jsonrpc this is likely due to jsonrpc returning `0x` in case contract or method don\'t exist", dropping pending report.

Seems it didn't see maliceReportedForBlock getter of the ValidatorSet contract for some reason: https://github.com/poanetwork/parity-ethereum/blob/afck-siblings/ethcore/src/engines/validator_set/safe_contract.rs#L357

I tried to restart everything from scratch and repeat the same on block ~8, but this time there weren't any errors and the duplicated node still wasn't reported as well.

vkomenda and others added 4 commits October 21, 2019 11:23
This was originally written by @vkomenda, then squashed for
easier rebasing on master. Cleanup of `received_step_hashes`
was moved to `verify_block_family`, since `on_prepare_block`
does not exist on master, and a unit test was added.
Original commit messages:

added the map of received block header hashes

do not return an error and remove older received block records

optimised older record removal

block hash comparison optimisation and the weak client ref fix

SIBLING_MALICE_DETECTION_PERIOD constant

review comments

using step numbers instead of block numbers
@varasev
Copy link

varasev commented Oct 22, 2019

Regarding this error:

... Failed to query malice reports ...

Sorry, it seems that was my fault: after POSDAO contracts refactoring the maliceReportedForBlock getter changed the format of the returned array. I fixed that today and will try to retest everything here again: poanetwork/posdao-contracts@570f377

@varasev
Copy link

varasev commented Oct 22, 2019

This comment is still actual: #196 (comment)

But I noticed that works sometimes:

I tried to launch the test about 10 times but only caught the right behavior 2 times. Both times I saw the warning Validator 0x522d…89b2 produced sibling blocks in the same step and the validator was removed as expected.

In both cases I started the duplicated node on block number ~10, and the warning appeared on block #40 (first time) and #42 (second time).

I don't know if that is expected behavior, because the condition https://github.com/poanetwork/parity-ethereum/blob/afck-siblings/ethcore/src/engines/authority_round/mod.rs#L1552 didn't work immediately and took place only sometimes.

@afck
Copy link
Collaborator Author

afck commented Oct 22, 2019

That's strange; could you try again, with sibling_malice_detection_period set to, say, 1000? I'd just like to make sure that it's not just because our LRU cache is too small.

@varasev
Copy link

varasev commented Oct 22, 2019

Unfortunately, that didn't help: I hardcoded sibling_malice_detection_period to 1000 and rebuilt Parity.

But I still see the following warning for some blocks as expected:

WARN ethcore::engines::authority_round  Attempted to seal block on the same step as parent. Is this authority sealing with more than one node?

This time I caught the error Validator 0x522d…89b2 produced sibling blocks in the same step on the block #74.

Maybe the reason is that the verify_block_family is only called for the blocks for which we are not an author? I remember I made some checks which functions are called on every block and by whom: #129 (comment)

If that's the reason, then maybe use verify_block_basic instead of verify_block_family? Because verify_block_basic is called every block regardless if we are an author of the block or not.

@varasev
Copy link

varasev commented Oct 22, 2019

I'm attaching node logs, hope they might be helpful. The node9 is the duplicate of node3.

logs.zip

@afck
Copy link
Collaborator Author

afck commented Oct 22, 2019

So if I understand correctly, you run the network with multiple nodes for a single validator, but without any special code to create bad blocks?
In that case there are two things that can happen in each of that validator's slots:

  • Either node 2 receives node 1's block before it tries to produce one itself. In that case it prints the "Attempted to seal block on the same step as parent." warning and refuses to seal. So it doesn't do anything wrong in the end, and doesn't get reported.
  • Or the two nodes produce their block pretty much at the same time, without getting the other's first. In that case, they should get reported because they "produced sibling blocks in the same step".

In the log files you uploaded, it looks like the latter happened. Did the reporting work in that case?

It should be fine if nodes don't report themselves. After all, most real attackers wouldn't do that either.

@varasev
Copy link

varasev commented Oct 22, 2019

So if I understand correctly, you run the network with multiple nodes for a single validator, but without any special code to create bad blocks?

Yes, there are three validators (node1, node2, node3 in https://github.com/poanetwork/posdao-test-setup/tree/vk-sibling-blocks), and one of them (node3) has the second node (node9) launched a bit later than others (on a block number 10). There is no special code to create bad blocks.

In the log files you uploaded, it looks like the latter happened.

Yes. I just meant that happened not immediately after I had launched the duplicated node (node9 in this case). I launched the node9 at the block number 10, but the reporting (and the corresponding log warning) happened much later (e.g., on blocks 40, 42, 74 for the different test launches). So, the condition https://github.com/poanetwork/parity-ethereum/blob/afck-siblings/ethcore/src/engines/authority_round/mod.rs#L1552 didn't work for the nearest block tried to be created by node9.

Did the reporting work in that case?

Yes, the reportMalicious was called every time I saw the error nodes produce their block pretty much at the same time.

But I mean that I didn't see the error for every block the node9 tried to produce (that happened only for some far block in the future).

Or the two nodes produce their block pretty much at the same time, without getting the other's first.

OK, so the reason that I didn't see the error for every block is that the two nodes (node3 and node9) mostly tried to produce a block on a bit different time. The times of those blocks not always matched, that's why I saw the error Attempted to seal block on the same step for most cases and didn't saw the error produced sibling blocks in the same step.

Is that correct?

@afck
Copy link
Collaborator Author

afck commented Oct 22, 2019

I think so, yes. So it may be working as expected.
However, I would expect to see one of the two messages for every step where both nodes were running and it was that validator's turn. I only see two "Attempted" logs (one in node 3 and one in node 9), and one "sibling blocks" log in each of the other logs.
How many validators (not nodes) are there in total? 9?

@varasev
Copy link

varasev commented Oct 22, 2019

How many validators (not nodes) are there in total? 9?

Three (node1, node2, node3). Others are just candidates (we could just turn them off).

I only see two "Attempted" logs (one in node 3 and one in node 9), and one "sibling blocks" log in each of the other logs.

If you open the logs I attached, you will see Attempted to seal block message for every block produced by the node3 (i.e. every third block) beginning from the block 11 and ending by block 74 (see node3.log and node9.log).

So, I guess everything is OK here and we can merge this.

@afck
Copy link
Collaborator Author

afck commented Oct 22, 2019

Ah, great, I had missed that! That makes sense. I think it's working fine then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants