-
Notifications
You must be signed in to change notification settings - Fork 981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Finalize deposit before applying it to a state #3689
Conversation
The implementation complexity of the fork-aware pubkey cache has not been as bad as I originally imagined in Lodestar and Lighthouse. Would like to know from other teams too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since deposits are not applied until after finalization, we don't need activation_eligibility_epoch
anymore, as its role is replaced by deposit.epoch
in a PendingDeposit
. Then we can remove is_eligible_for_activation_queue
from process_registry_updates
, and move its balance check to is_eligible_for_activation
:
def is_eligible_for_activation(state: BeaconState, validator: Validator) -> bool:
"""
Check if ``validator`` is eligible for activation.
"""
return (
# Has not yet been activated
and validator.activation_epoch == FAR_FUTURE_EPOCH
# Has sufficient effective balance
and validator.effective_balance >= MIN_ACTIVATION_BALANCE
)
@@ -1271,12 +1258,14 @@ def process_deposit_receipt(state: BeaconState, deposit_receipt: DepositReceipt) | |||
if state.deposit_receipts_start_index == UNSET_DEPOSIT_RECEIPTS_START_INDEX: | |||
state.deposit_receipts_start_index = deposit_receipt.index | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about checking is_valid_deposit_signature
here? Unless we specifically want to avoid burning a top-up with incorrect signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two things why do signature check as is:
- Some infra might already have top-ups sent with zero signatures (worth checking tho)
- Rate limiting a number of deposit signature checks per epoch is really nice, it alleviates the risk of delaying block processing if someone manages to pack a thousand deposits into it
Co-authored-by: fradamt <[email protected]>
In Teku, we implemented it in a way, where we cache only the finalized indices and for the rest we do an N loop starting from the last finalized index + 1. It could be a problem for chains which are not finalizing for a long time, but otherwise I don't see a big issue with it, but would prefer if this PR makes it into the specs. |
An important consideration w.r.t. the inactivity leak is that this PR prevents top-ups during non-finality. This affects both the ability of leaking validators to maintain their stake share, and non-leaking validators from increasing their share. |
This is an important consideration. Do you think there is any risk related to that? Honest validators’ stake should remain stable as they keep voting — this is at least a desired property, but i don’t know if in practice this property holds in all cases as there might be not enough block space to accommodate all submitted attestations on chain thus honest validators may leak too. Preventing dishonest validators that aren’t voting from increasing their stake share is not strongly desired but a nice property, imo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the introduction of the queue, as it makes sure that BLS signature verifications remain bounded (16 / block). This way, the CL hardware requirements don't become a concern on every future gas limit increase.
Finality also simplifies implementations, because it means that validator indices to pubkey mappings are the same across all valid branches. As new validator activations can only happen while finalizing, this shouldn't affect the underlying PoS security.
The one thing that will no longer be possible is topping up a validator while the network is not finalizing / possibly leaking. I don't personally have concerns with that, but it should be documented at least as a semantic change that affects non-finalizing networks.
@@ -155,7 +153,7 @@ The following values are (non-configurable) constants used throughout the specif | |||
|
|||
| Name | Value | Unit | | |||
| - | - | :-: | | |||
| `PENDING_BALANCE_DEPOSITS_LIMIT` | `uint64(2**27)` (= 134,217,728) | pending balance deposits | | |||
| `PENDING_DEPOSITS_LIMIT` | `uint64(2**27)` (= 134,217,728) | pending deposits | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2^32 would be the theoretical limit with infinite gas fee and a single block that fills the entire deposit tree
if processed_amount + deposit.amount > available_for_processing: | ||
break | ||
increase_balance(state, deposit.index, deposit.amount) | ||
if deposit.epoch >= state.finalized_checkpoint.epoch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also check for the legacy style deposits import to have caught up with the queue.
Otherwise, deposits from the old style (via eth1_data) and from the new queue can get reordered, possibly messing with decentralized staking pools that rely on a linear deposit order.
Same with Prysm. We implemented it the same way as Teku, as mentioned by @StefanBratanov. I'm indifferent to this, but if the sole purpose is to remove the fork-aware pubkey cache, I don't think it's necessary. |
Outcomes of the breakout session on the change proposed by this PR:
|
Benchmarked a block with 1000 deposits using lcli.
(Note: the Even on this relatively small network, the block arrived quite late on the node and took quite a bit to process.
The execution layer processing itself took ~3secs. Consensus block processing is interleaved with the execution processing in lighthouse, so overall it took |
@pawanjay176 thanks a lot for this benchmark!
I am curious why is that happened? It’s hard to believe that the propagation time took that many especially considering a small network. My guess would be that it wasn’t sent to the wire before fully processed by the proposer’s node, wdyt? |
Sorry took a while, was on vacation.
Yeah I think that is right, the propagation seems near instant but producing the big block, sending it to the VC across the http api, getting it back and then publishing it finally took longer than for a smaller block. |
""" | ||
return ( | ||
validator.activation_eligibility_epoch == FAR_FUTURE_EPOCH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct to assume that validators can never have an activation_eligibility_epoch > ELECTRA_FORK_EPOCH
? e.g., by depositing during the last slot before electra activates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those validators are added to the pending deposits queue in upgrade_to_electra
epoch=get_current_epoch(state), | ||
pubkey=validator.pubkey, | ||
withdrawal_credentials=validator.withdrawal_credentials, | ||
amount=balance, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excess_balance
? Because MIN_ACTIVATION_BALANCE
was already credited.
pubkey=validator.pubkey, | ||
withdrawal_credentials=validator.withdrawal_credentials, | ||
amount=balance, | ||
signature=BLSSignature(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For SyncAggregate
, G2_POINT_AT_INFINITY
is used instead of a zero value when noone signed.
Or, None
with EIP-7688 optionals (but I guess zero value or infinity is also fine, as it is not checked anyway).
PendingBalanceDeposit(index=index, amount=excess_balance) | ||
state.pending_deposits.append( | ||
PendingDeposit( | ||
epoch=get_current_epoch(state), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an off-by-one here?
When epoch N finalizes, everything up through and including compute_start_slot_at_epoch(N)
is finalized.
- If activations are processed for
deposit.epoch <= finalized_checkpoint.epoch
, then that would also activate deposits on the non-finalized slots within N (only the initial slot of N is finalized). - If activations are processed for
deposit.epoch < finalized_checkpoint.epoch
, then deposits within the very first slot of N are not processed for another ~6 minutes despite already being finalized.
Currently, it seems we are following the second strategy (deposit.epoch >= state.finalized_checkpoint.epoch
). Probably fine to have the extra ~6 minutes wait for deposits within the first slot, but could be avoided, so still keeping the comment for awareness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally in favour of strategy (2) because of its simplicity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this and I changed my mind. The issue is that clients may (and probably will) filter deposits by blocks or db invariants like "finalized", the presence of this off by one creates edge cases that need to be always checked and tested. While it's not likely to create a consensus bug I feel it will be a source of bugs nonetheless
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we set deposit.epoch
in the queue to get_current_epoch(state) + 1
, will that prevent issues that you have in mind? It can also be renamed to deposit.activation_epoch
to better reflect the semantics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess storing slot
and processing deposits up through deposit_receipt.slot <= compute_start_slot_at_epoch(state.finalized_checkpoint.epoch)
is the most straight forward.
PendingDeposit( | ||
epoch=get_current_epoch(state), | ||
pubkey=validator.pubkey, | ||
withdrawal_credentials=validator.withdrawal_credentials, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are withdrawal_credentials
necessary to be saved here?
- If yes, there could be a race where a
BLSToExecutionChange
gets processed while thePendingDeposit
is in the queue, changing the withdrawal credentials of the validator while thePendingDeposit
retains the stale entry. - If no, we could just set this to a zero value (or to
None
with EIP-7688 optionals).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn’t necessary so I think that setting it to zero value should be fine
Closing in favour of #3818 |
1 similar comment
Closing in favour of #3818 |
Proposal
Extends
state.pending_balance_deposits
to keep the deposit data and an epoch when a deposit was processed by the consensus layer. The deposit is applied to the state only when that epoch gets finalized and if the activation churn has enough capacity.Analysis
(pubkey, index)
cache implementation.100_000
validators in the activation queue the pending deposits queue will consume ~18MB vs 1.5MB of just pending deposit balances.validator.activation_aligibility_epoch
which can be repurposed in the future.