Drop skip too large condition for liveness safety #6014
Merged
Issue Addressed
In a chat, @michaelsproul noted that Lighthouse still includes a condition that could jeopardize the liveness of the chain under extreme circumstances.
If some codepath (for example, block production) needs to advance a state by multiple epochs, and the total runtime of that advance exceeds `seconds_per_slot`, it simply errors. When this happens the node is stuck: it cannot propose blocks on the chain, which makes the problem worse. If enough nodes hit this condition, the network will stall and never progress. The motivation for the error is to protect the node against very expensive state advance requests, but those advances are necessary for liveness. For context, the condition was introduced in this PR
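To make the failure mode concrete, here is a minimal sketch of a runtime-budgeted state advance. It is not Lighthouse's actual code: `advance_slots`, `AdvanceError`, and the per-slot cost parameter are hypothetical names invented for illustration, and runtime is simulated with a fixed cost per slot rather than measured with a real clock. Passing `Some(budget)` models the condition described above; passing `None` models the behaviour with the condition dropped.

```rust
use std::time::Duration;

/// Hypothetical error type for this sketch (not Lighthouse's real error enum).
#[derive(Debug, PartialEq)]
enum AdvanceError {
    SkipTooLarge { reached_slot: u64, target_slot: u64 },
}

/// Advance from `start_slot` to `target_slot`, charging `cost_per_slot` of
/// simulated runtime for every slot processed.
///
/// With `budget = Some(max)`, the advance aborts once the accumulated runtime
/// exceeds `max`. With `budget = None`, the advance always runs to completion.
fn advance_slots(
    start_slot: u64,
    target_slot: u64,
    cost_per_slot: Duration,
    budget: Option<Duration>,
) -> Result<u64, AdvanceError> {
    let mut slot = start_slot;
    let mut elapsed = Duration::ZERO;

    while slot < target_slot {
        // The guard under discussion: give up if the advance has run too long.
        if let Some(max) = budget {
            if elapsed > max {
                return Err(AdvanceError::SkipTooLarge {
                    reached_slot: slot,
                    target_slot,
                });
            }
        }
        // Stand-in for real per-slot state processing.
        slot += 1;
        elapsed += cost_per_slot;
    }

    Ok(slot)
}
```

In this sketch, a 128-slot advance at an assumed 100 ms per slot needs 12.8 s, so a 12 s budget makes it fail every time it is retried, while the unbudgeted call completes. That repeated failure is the liveness hazard: the work is expensive, but it still has to be done before the node can propose.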
I believe we should simply drop the condition for now. In the future, we may choose to harden the node against corner cases where the network is in a problematic state, such as having no head in the last N epochs.
Proposed Changes
Drop the "skip too large" condition for liveness safety.