-
Notifications
You must be signed in to change notification settings - Fork 982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Terminal Total Difficulty calculation may differ between nodes #2603
Comments
Actually, rather than storing the TTD in the state, we should just add a new field that records the total difficulty of the block referenced in Eth1Data when |
Good catch! 👍 Originally, there was no "the transition store has not already been initialized" requirement in the spec and it was assumed that The moment of calling the |
tbh, I'm more and more in favor of hard-coding TTD again. This would simplify the mechanism and edgecases. And because the TTD could ship in consensus and execution clients, this would also prevent users that run pre-merge execution-engine implementations without a beacon node from accidentally following the PoW chain for a long time after the merge, and instead result in a failure that would be noticed and resolved. |
I second that. In this case won't a single release for the execution client containing hardcoded TTD and the Merge logic be enough? |
@ajsutton I was wondering the same. If a beacon node process more than one epoch boundary such that epoch == MERGE_FORK_EPOCH, should it compute terminal_total_difficulty again and overwrite the existing value in the store? If TTD is hard-coded, what happens if TTD is reached before whatever condition is picked to activate the merge fork? |
I think this is fundamentally the problem - it doesn't matter if you take the first or the last value you always have the potential for a chain split. I think using the last value is more vulnerable to attack since it allows someone to just publish their block late to ensure they're what sets the TTD but neither really works. You'd have to wait until the MERGE_FORK_EPOCH is finalized to make it reliable.
That would go quite badly, likely leading to a chain halt. We'd need to pick a TTD that's far enough in the future that we're confident the merge epoch will be reached first. |
What if we do it as follows:
At the first glance this may work but I might have missed something. This approach would guarantee that TTD hits after the Merge HF. Even if someone would include a non-empty payload into a beacon block signifying the transition before execution clients get shipped with the Merge logic it will be rejected by the Merge logic on the beacon chain side as This approach should not stretch the period of the whole upgrading process as we will likely need to wait a month for the Merge HF, then yet another month (2 weeks) for the transition vs releasing everything at once with TTD hitting in 1.5-2 months with the same intention (a months for the HF and 0.5-1 months for transition). This approach should save us one extra release of the execution client software. In my mental model updating execution clients is more resourceful than updating consensus clients. I might be wrong though and then this proposal might not make sense. |
After this change #2595 this statement became unsound as |
@mkalinin If TTD is hard-coded, what happens if miners stop mining the Ethereum chain early and that total difficulty is never reached? Since their revenue will drop to 0 at day N, it's not a huge difference from stopping getting revenue at day N-1. I guess this scenario affects all approaches discussed in this PR and current master. EDIT: Oh already brought up in #2547 👍 |
Currently the merge transition process (https://github.com/ethereum/consensus-specs/blob/dev/specs/merge/fork.md#initializing-transition-store) computes the terminal total difficulty dynamically as part of
process_slot
into the first slot ofMERGE_FORK_EPOCH
and only if "the transition store has not already been initialized". The transition total difficulty is then stored outside of theBeaconState
so is not fork aware or part of consensus (until the terminal PoW block is actually reached).There are a couple of ways that nodes may wind up with an incorrect terminal total difficulty using the current spec.
Problem Sources
Checkpoint Sync
If a node uses checkpoint sync with a state between
MERGE_FORK_EPOCH
and the terminal PoW block being included, it will never initialise the transition store because thestate.slot % SLOTS_PER_EPOCH == 0, compute_epoch_at_slot(state.slot) == MERGE_FORK_EPOCH
condition will never be met (the state is already past that point). Such a node will then reject any block that includes a non-emptyExecutionPayload
and fail to follow the chain when the merge completes.It has taken quite a bit of time to establish ways for users to easily access a
BeaconState
to use as part of checkpoint sync, so requiring theBeaconState
plus the TTD value to be supplied would be a significant step backwards in usability and the progression away from always replaying every block to sync the beacon chain.Clients supporting checkpoint sync would need to release an update once the TTD is known to hard code it in the client to resolve this.
Forks
As the transition store is initialised the first time a node processes slots into the merge epoch, it's possible for different nodes to use states from different forks when setting the TTD. This is further complicated because many beacon chain clients pre-process slot and epoch transitions as a performance optimisation.
To give a concrete example, lets say MERGE_FORK_EPOCH=100 and thus slot 3199 is the last slot before the merge fork activates and slot 3200 the first one after it. And also assume epoch 100 is the end of the eth1 voting period (probably not actually true but makes numbers nice and round).
At slot 3198, the Eth1Data voting process is one vote short of agreeing on the new Eth1Data.
At slot 3199, a block is published to a minority of nodes which provides that last required vote.
Nodes that receive block 3199 and pre-process the epoch transition, will now initialize their transition store because they're following a fork where block 3199 is included on the chain.
Then block 3200 is received which uses block 3198 as its parent and it is accepted as the canonical chain, orphaning block 3199. Thus the canonical Eth1Data is the old one because 50% was never reached.
At this point there's no visible consensus split - all nodes will reorg over to block 3200 and use the old Eth1Data as canonical for including deposits etc. However the minority of nodes which did receive block 3199 will have initialised TransitionStore based on the new Eth1Data, calculating a different terminal total difficulty to the other nodes. When the earlier of the terminal difficulties is reach the chain split will be revealed as different nodes will expect a different terminal PoW block.
Another issue that arises from not having the TTD stored outside of the
BeaconState
and thus not being fork aware, is that clients may separate processing empty slots from applying a block. In that case even receiving an invalid block could still cause the TTD to be set because theprocess_slots
operation was performed independently and wouldn't be rolled back when the block processing fails. There's a large number of variations of this kind of problem including REST APIs that could cause a beacon node to process empty slots and thus trigger initialisation of the transition store.Fundamentally, introducing side effects to
process_slots
is a very significant change to the design assumptions for beacon node clients.Possible Solutions
Store TTD in
BeaconState
We could remove the side effects from
process_slots
and restore the ability to process any block given only the pre-state but adding a field toBeaconState
which stores the calculated TTD. This would be set as part of theupgrade_to_merge
function. It's still slightly awkward that the total difficulty for the Eth1Data block would need to be known and isn't available from the state but that's readily available from the execution engine and is only required for this special state transition.The downside is this would be a new field that would then be removed again in the next consensus upgrade.
It would also be effectively impossible to override the TTD via an API as it would require an irregular state transition.
Initialize
TransitionStore
on FinalizationThe "Forks" problems could be solved by only initializing the
TransitionStore
when theMERGE_FORK_EPOCH
is finalized. There is then only one possible fork. Checkpoint sync would remain broken and require clients to ship a release with the TTD hard codedHard code TTD
We could scrap calculating the TTD and just hard code it in clients. The key downside of this is that providing sufficient notice for users to upgrade would mean setting a TTD far in the future (see Rationale in #2462).
This could be mitigated by setting expectations that users will need to upgrade their consensus clients twice for the merge. The first upgrade includes sets the
MERGE_FORK_EPOCH
and the second would be shipped shortly afterMERGE_FORK_EPOCH
is reached and include the hard coded TTD. This is what users were told to expect for the beacon chain launch as well - that a client update would be released in the week between genesis being set and the chain actually starting (so that if the genesis was calculated incorrectly the update could fix it - in the end clients agreed but a release still went out to embed the genesis state in that week).The text was updated successfully, but these errors were encountered: