Fix reversion loop #4439
Comments
Re-orgs do not prune chain state and validators run as archive nodes, so the state should always be available 🤔
We're surely not currently set up to fetch old chain state during disputes, no? It depends on whether adversaries could push us into demanding purged chain state: any chain reversion caused by the approvals system also causes a slash, but not necessarily 100%, depending upon what optimizations we employ there. We should prevent an adversary guilty of 100% slashable soundness offenses from distracting us from checking their offending chain. We've discussed blind availability recovery for that case, so we'd just assume everyone has the chunks; if not, we'll time out eventually. This complicates life for pre-availability disputes.
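To make that blind-recovery fallback concrete, here is a minimal Rust sketch of the idea, assuming a hypothetical `ChunkSource` interface (the real availability-recovery subsystem is message-based; all names here are illustrative): request chunks from every validator without first confirming availability on chain, and give up after a deadline.

```rust
use std::time::{Duration, Instant};

/// Hypothetical chunk-fetching interface, for illustration only.
trait ChunkSource {
    /// Ask one validator for its chunk of `candidate`; `None` on a miss.
    fn request_chunk(&self, validator: usize, candidate: &[u8; 32]) -> Option<Vec<u8>>;
}

/// Blindly try to recover enough chunks for `candidate`, assuming every
/// validator *should* hold one, and bail out after `deadline`.
fn blind_recover<S: ChunkSource>(
    source: &S,
    candidate: &[u8; 32],
    n_validators: usize,
    threshold: usize, // chunks needed for erasure-decoding
    deadline: Duration,
) -> Option<Vec<Vec<u8>>> {
    let started = Instant::now();
    let mut chunks = Vec::new();
    for v in 0..n_validators {
        if started.elapsed() > deadline {
            // Not enough validators actually had the data: time out,
            // exactly the fallback described above.
            return None;
        }
        if let Some(chunk) = source.request_chunk(v, candidate) {
            chunks.push(chunk);
            if chunks.len() >= threshold {
                return Some(chunks);
            }
        }
    }
    None
}
```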
Thanks @bkchr! I guess it could also be a result of this error:
Yes, but that would mean we have 32 blocks imported at the same height, which should not happen :P
On chain reversion, if we still have ongoing disputes concerning the reverted chain, then we rely on that old chain, for example to recover the relevant session of the candidate.
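A rough sketch of what that reliance implies, assuming a hypothetical `DisputePins` helper (not the actual client code): blocks of a reverted fork must stay exempt from pruning while a dispute that references them is live, so the old chain can still be queried.

```rust
use std::collections::HashSet;

/// Illustrative model: while a dispute concerning a reverted fork is
/// live, pin the fork's blocks so pruning skips them.
struct DisputePins {
    pinned: HashSet<[u8; 32]>,
}

impl DisputePins {
    fn new() -> Self {
        Self { pinned: HashSet::new() }
    }

    fn on_dispute_opened(&mut self, fork_blocks: impl IntoIterator<Item = [u8; 32]>) {
        self.pinned.extend(fork_blocks);
    }

    fn on_dispute_concluded(&mut self, fork_blocks: &[[u8; 32]]) {
        for b in fork_blocks {
            self.pinned.remove(b);
        }
    }

    /// Called by the pruning logic: blocks involved in live disputes must
    /// not be pruned, even though the chain containing them was reverted.
    fn can_prune(&self, block: &[u8; 32]) -> bool {
        !self.pinned.contains(block)
    }
}
```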
Assuming we don't have a reversion loop, but it looks like we did - finding out whether we can still have reversion loops was the whole point of that test run, so it's a good thing we tested. 😄
One thing that came to my mind: might you start working on a fork that you have never imported? I.e., you see a dispute that references some block that you are not aware of? Or can that not happen?
Yes, I think remote disputes should work on non-imported forks.
Yep, we want disputes to work even for candidates that appeared on a fork unknown to the node. But the issue with the first error is different - it uses a recent head for querying the chain for the availability-recovery monitors.
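To illustrate the distinction (the identifiers below are hypothetical, not the actual subsystem code): runtime queries for a disputed candidate should be keyed to the fork the candidate appeared on, not to whatever head happens to be best, since after a reversion the best head may be on an entirely different fork.

```rust
/// Illustrative only: a runtime-state lookup keyed by block hash.
trait RuntimeApi {
    fn session_index_at(&self, block_hash: [u8; 32]) -> Option<u32>;
}

struct Candidate {
    relay_parent: [u8; 32],
}

/// Buggy variant: keys the query to the current best head. After a
/// reversion the best head may sit on a different fork, so the lookup
/// can fail or return the wrong session.
fn session_for_candidate_buggy<R: RuntimeApi>(
    api: &R,
    best_head: [u8; 32],
    _candidate: &Candidate,
) -> Option<u32> {
    api.session_index_at(best_head)
}

/// Correct variant: key the query to the fork the candidate actually
/// appeared on, which remains valid as long as its state is kept.
fn session_for_candidate<R: RuntimeApi>(api: &R, candidate: &Candidate) -> Option<u32> {
    api.session_index_at(candidate.relay_parent)
}
```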
I saw this in our testnet (built with polkadot-launch).
It seems the polkadot-v0.9.3 branch, including the Polkadot 0.9.3 relay chain, triggers this often; I saw it 3 times in one week.
The story seems to be: because of #5639, validators randomly panic, then only 1 or 2 validators are still alive, and then the collator gets into trouble.
@jasl your issue is on the parachain side and this is tracked here: paritytech/cumulus#432
The "Too many sibling blocks inserted" was already fixed in paritytech/cumulus#432. Is there anything else we need to do for this one? |
No, I think we are good.
Our testing on Rococo revealed that we might still be susceptible to reversion loops, as can be witnessed by these errors, for example (among others):
Seen logs:
Especially the "Too many sibling blocks inserted" error suggests a reversion loop, because this can only happen if we have at least 32 heads at the same height.
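As a simplified model of why 32 heads is the breaking point (this is illustrative, not the actual database code): if the backend reserves a fixed number of sibling slots per height, the 33rd block imported at the same height has nowhere to go, which is exactly what a reversion loop that keeps producing forks at one height would run into.

```rust
use std::collections::HashMap;

/// Illustrative cap, matching the "at least 32 heads at the same height"
/// observation above; the real backend's constant may differ.
const MAX_SIBLINGS: usize = 32;

/// Simplified model of a backend that assigns each block at a given
/// height one of a fixed number of sibling slots.
struct Backend {
    siblings_at: HashMap<u64, usize>,
}

impl Backend {
    fn import(&mut self, height: u64) -> Result<usize, &'static str> {
        let used = self.siblings_at.entry(height).or_insert(0);
        if *used >= MAX_SIBLINGS {
            // The 33rd sibling at one height has no slot left.
            return Err("Too many sibling blocks inserted");
        }
        let slot = *used;
        *used += 1;
        Ok(slot)
    }
}

fn main() {
    let mut backend = Backend { siblings_at: HashMap::new() };
    // A reversion loop re-importing forks at the same height:
    for i in 0..33 {
        match backend.import(100) {
            Ok(slot) => println!("block {i} stored in slot {slot}"),
            Err(e) => println!("block {i}: {e}"),
        }
    }
}
```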