Failed block sync might leave some sync data in the main chain database #4866
Comments
Rewind will store all the re-orged blocks in the orphan database.
Initial reaction is to disagree. The chain metadata keeps track of the valid full block height. We already handle having a chain of valid headers but no/partial blocks. If we reset on an invalid header, we'll have to download valid headers all over again. Same for blocks. If a synced header/block is invalid after syncing others, the blocks that led up to them are still valid so why should we throw them away?
It happens in rewind, not reorganise
The header sync does this correctly: it only "swaps" if the syncing chain has higher PoW and is valid. It will even do that for a partial chain, which is correct. But going into a waiting state can cause issues, because the database now holds incomplete state while the node accepts new blocks. So what can happen? A header sync attempts to sync a few blocks but fails on the connection. Now the BN goes back to the waiting state, meaning it can accept blocks again, and it accepts new blocks for those same heights.
The same problem applies to block sync. We might download 1000 block headers, and all are valid.
When block propagation is active again and the node receives a block, it should recognise that block as an orphan and not base anything on the headers. This does fit with the error we were seeing, so there could definitely be a bug there. I agree that block sync needs to remove invalid headers up to and after a failed block (the MMRs are invalid). Maybe we need to restore the old state, but since we only change for a higher PoW, I don't think that is critical.
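To make the orphan point concrete, here is a minimal sketch of classifying a propagated block against the tip of full blocks, not the header tip. All names (`Tips`, `BlockPlacement`, `classify_propagated_block`) and the `u64` stand-in for hashes are hypothetical and are not taken from the Tari codebase.

```rust
/// Hypothetical view of the two tips a node tracks after a partial sync.
/// Hashes are modelled as plain u64 values purely for illustration.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Tips {
    /// Hash of the highest block for which the full body is stored.
    pub best_block_hash: u64,
    /// Hash of the highest committed header (may be ahead of the block tip).
    pub best_header_hash: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum BlockPlacement {
    ExtendsMainChain,
    Orphan,
}

/// Decides where a propagated block goes using only the full-block tip.
/// A block that builds on a header-only height is treated as an orphan.
pub fn classify_propagated_block(tips: &Tips, block_prev_hash: u64) -> BlockPlacement {
    if block_prev_hash == tips.best_block_hash {
        BlockPlacement::ExtendsMainChain
    } else {
        BlockPlacement::Orphan
    }
}
```

With this rule, headers left behind by a failed sync never influence where a newly received block is placed.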
I don't really understand the problem or solution, but happy if you want to leave this issue open and make a PR against it, showing the problem.
Description
---
If sync fails, reset the chain to the highest-PoW chain for which the node has the data locally.

Motivation and Context
---
See: #4866

How Has This Been Tested?
---
Unit tests
Currently, syncing is handled by the BN state machine. If the node thinks it is behind, it will try header_sync to download headers.
If it has enough VALID headers, it will commit them as headers on the main chain, and continue to do so for as long as it is provided with VALID headers.
If this fails, the node goes to the waiting state.
It should not. It should either try to sync the VALID blocks from a node, or delete the headers and move on.
But in most cases this will return Ok.
The block sync has the same problem: after failing, we need to check whether we need to reset back to our previous state.
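As a rough illustration of the proposed behaviour, the sketch below wraps a sync attempt and, on failure, removes any headers above the highest full block before the node would return to the waiting state. The types and functions (`MainChain`, `SyncOutcome`, `sync_with_rollback`) are hypothetical and only model the idea; they are not the node's actual database API.

```rust
/// Hypothetical in-memory stand-in for the main chain database.
#[derive(Debug, Default)]
pub struct MainChain {
    /// Heights of committed headers, in ascending order.
    pub header_heights: Vec<u64>,
    /// Highest height for which the full block body is stored.
    pub best_block_height: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SyncOutcome {
    Completed,
    RolledBack { removed_headers: usize },
}

impl MainChain {
    /// Removes every header above the best full block.
    /// Returns the number of headers that were removed.
    pub fn rewind_headers_to_best_block(&mut self) -> usize {
        let before = self.header_heights.len();
        let best = self.best_block_height;
        self.header_heights.retain(|h| *h <= best);
        before - self.header_heights.len()
    }
}

/// Runs one sync attempt. If it fails, header-only state left behind by the
/// attempt is rolled back so the database matches the data the node holds.
pub fn sync_with_rollback<F>(chain: &mut MainChain, attempt: F) -> SyncOutcome
where
    F: FnOnce(&mut MainChain) -> Result<(), String>,
{
    match attempt(&mut *chain) {
        Ok(()) => SyncOutcome::Completed,
        Err(_) => SyncOutcome::RolledBack {
            removed_headers: chain.rewind_headers_to_best_block(),
        },
    }
}
```

The sketch always rewinds to the full-block tip. Keeping valid headers and retrying the block download from another peer, as suggested in the comments, would be an alternative policy at the same hook.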