Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In-protocol deposits flow (no queue approach) #3177

Merged
merged 26 commits into from
Mar 14, 2023
Merged

Conversation

mkalinin
Copy link
Contributor

Introduction

On the top of EIP-6110 proposal, this PR introduces a specification of in-protocol deposit processing flow. Advantages of the new deposits machinery are outlined in the EIP, let's list them here for visibility:

  • Significant increase of deposits security by supplanting proposer voting. With the proposed in-protocol mechanism, an honest online node can’t be convinced to process fake deposits even when more than 2/3 portion of stake is adversarial.
  • Decrease of delay between submitting deposit transaction on Execution Layer and its processing on Consensus Layer. That is, ~13 minutes with in-protocol deposit processing compared to ~12 hours with the existing mechanism.
  • Eliminate beacon block proposal dependency on JSON-RPC API data polling that suffers from failures caused by inconsistencies between JSON-RPC API implementations and dependency of API calls processing on the inner state (e.g. syncing) of client software.
  • Eliminate requirement to maintain and distribute deposit contract snapshots (EIP-4881).
  • Reduction of design and engineering complexity of Consensus Layer client software on a component that has proven to be brittle.

Deprecating Eth1Data poll

The proposed design allows for straightforward deprecation of Eth1Data poll. Once EIP-6110 is activated supplied deposits start to be queued in the beacon state in the pending_deposits queue. When the state.eth1_deposit_index reaches the index of the first queued deposit receipt the new mechanism takes over. Once that happens the network stops relying on Eth1Data poll because the poll does always lag behind the new mechanism due to the follow distance and the voting period.

After the new mechanism takes over CL client developers are free to decide when they want to get rid of the old machinery and stop their client from voting on Eth1Data. The deprecation may be rolled out in uncoordinated way without any impact on the network.

DoS vector

Block's data complexity is analysed in the Data complexity section of the EIP and is not considered as a source of additional DoS vector. Beacon state data complexity is trivial as the new IndexedDepositData structure has a smaller size than the Validator structure.

The most consuming computation of deposit processing is signature verification. The complexity is bounded by a maximum number of deposits per block which is around 512 with 30M gas block. So, it is ~512 signature verifications which is roughly about 0.5s of processing without batched verification optimisation. An attacker would need to spend 100 ETH to slow down block processing by a hundred of milliseconds which doesn't look as a sustainable and valuable attack on a long period of time.

An optimistically syncing node may be susceptible to a more severe attack scenario. Such a node can't validate a list of deposits provided in a payload which makes it possible for attacker to include as many deposits as the limitation allows to. Currently, it is 8,192 deposits (1.5MB of data) with rough processing time of 8s. Considering an attacker would need to sign off on this block with its crypto economically viable signature, this attack vector doesn't seem to be valuable as it can't result in a significant slow down of a sync process.

Optimistic sync

An optimistically syncing node have to rely on the honest majority assumption. That is, if adversary is powerful enough to finalize a deposit flow, a syncing node will have to apply these deposits disregarding the validity of deposit receipts with respect to the execution of a given block. Thus, adversary may convince an honest node to accept fake deposits if it can reach finality on invalid chain. The same is applicable to the validity of EL world state today and the proposed deposits machinery doesn't change security model in that regard.

Online nodes can't be tricked into this situation because their EL verifies supplied deposits with respect to the execution.

Pending deposits queue

The proposed design relies on pending_deposits queue in the beacon chain state. This queue servers the following goals:

  1. Bridges the gap between deposits included via the Eth1Data voting and deposit receipts supplied by the new machinery. The gap is inevitable because proposer voting is lagging behind the head by the follow distance plus the size of the voting period.
  2. Limits a number of validator balance top ups per epoch. This limitation is one of the inputs of the weak subjectivity period computation. The queue limits deposit processing to MAX_DEPOSITS * SLOTS_PER_EPOCH to preserve the status quo.
  3. Finalizes deposits flow before creating any validator record. This property allows implementations to avoid making (pubkey, index) cache re-org aware.
    Note that this requirement isn't introduced by this proposal and already exists today. Consider the case when there are two competing forks during a period of non-finality and both forks are longer than the follow distance. Any deposit transaction missed or misordered on any of these forks with respect to another fork may result in a different validator indexes assigned to the same pubkeys on each fork.
    However, this proposal moves the re-org handling problem from an edge case area to an important thing that either spec or implementations must take care of.

Based on historical data analysis, the expectation is to have less than 1 deposit per block which is less than 32 deposits per epoch. With such a pace the pending deposits queue will contain less than 96 deposits at once in the case when finality is advanced every epoch. Based on that, we assume that the state hashing complexity increase is bounded by roughly a hundred deposits per block in a normal case.

The new machinery doesn't use is_valid_merkle_branch function which call implies 32 hashing operations per each deposit. Therefore, the computational complexity is expected to be reduced by the proposed mechanism.

Approaches reducing the queue usage

There is a possibility to narrow down the queue usage to (1) only. The queue may be used until the new deposit processing takes over, after that the queue may stay empty and be removed in one of the next CL upgrades.

Achieving that implies giving up on (2) and (3).

Weak subjectivity

One of the inputs into WS period computation is a number of top ups per epoch that stakers may do to prevent their validators from being ejected and ultimately increase their portion of stake with respect to those that are leaking.

Number of top ups is limited by MAX_DEPOSITS * SLOTS_PER_EPOCH to preserve the status quo. Removing the queue would mean this boundary moved to 512 * SLOTS_PER_EPOCH with 30M gas block.

Modified script shows the following changes in the WS period computations (deviations are highlighted):

Safety Decay Avg. Val. Balance (ETH) Val. Count WS (Epochs), 16 deposits per slot WS (Epochs), 512 deposits per slot
10 20 32768 272 256
10 20 65536 288 257
10 20 131072 320 258
10 20 262144 384 260
10 20 524288 512 264
10 20 1048576 768 272
10 24 32768 310 310
10 24 65536 365 365
10 24 131072 474 474
10 24 262144 692 692
10 24 524288 692 692
10 24 1048576 1041 692
10 28 32768 504 504
10 28 65536 752 752
10 28 131072 1248 1248
10 28 262144 2241 2241
10 28 524288 2241 2241
10 28 1048576 2241 2241
10 32 32768 665 665
10 32 65536 1075 1075
10 32 131072 1894 1894
10 32 262144 3532 3532
10 32 524288 3532 3532
10 32 1048576 3532 3532

The deviations don't look too impactful. Tagging @adiasg to see his thoughts on whether we allowed to increase top ups boundary.

Validator pubkey cache

Client implementations will have to maintain their validator's cache in a way that it dependends on a block tree branch. It can be a composite cache consisting of finalized part and a few small pieces tied to a tip of each branch.

Validator becomes active only after relevant deposits are finalized which means that finalized part of the cache should cover most of the hits. While non-finalized parts are supposed to be used only for look ups during deposit processing.

@mkalinin
Copy link
Contributor Author

mkalinin commented Feb 22, 2023

Replacement of state.pending_deposits

As discussed on in-protocol deposits breakout session during the Edelweiss Interop event, the recent design update attempts to get rid of pending_deposits queue. The queue is replaced by deposit_receipt_start_index state variable which holds an index of the first processed in-protocol deposit. The new deposit operations are processed after the Eth1Data poll deposits in every block, and the transition period looks like as follows:

  • Transition starts with the first processed in-protocol deposit which index is stored in deposit_receipt_start_index variable -- transition starts;
  • state.eth1_deposit_index being incremented by Eth1Data poll mechanism reaches deposit_receipt_start_index;
  • Eth1Data poll machinery keeps including already processed deposits into a block until a subsequent deposit transaction is submitted, processing of these excess deposits are skipped to avoid invalid balance increments;
  • A subsequent in-protocol deposit finishes the transition by starting to increment state.eth1_deposit_index.

Deposits from the latest voting period can overlap with the first in-protocol deposits (as mentioned above) and induce addition data complexity which is considered as negligible.

@hwwhww hwwhww added the general:RFC Request for Comments label Feb 23, 2023
Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkalinin Great work!
It looks like a sound solution to me. Need more tests to verify if it works bug-free though. I agreed to merge it into the feature spec collections soon.

specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
Comment on lines 256 to 258
# Set deposit receipt start index
if state.deposit_receipts_start_index == NOT_SET_DEPOSIT_RECEIPTS_START_INDEX:
state.deposit_receipts_start_index = deposit_receipt.index
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will there be any old Eth1Data Deposit after the first DepositReceipt is processed? That is, will there be a period of two types of deposit data sources in the same block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this period will last until state.eth1_deposit_index reaches state.deposit_receipts_start_index and a subsequent deposit receipt is processed or (if no deposit receipt appears) state.eth1_deposit_index reaches state.eth1_data.deposit_count

@dapplion
Copy link
Member

dapplion commented Mar 1, 2023

Really nice design! Noting a todo to update the PR description to reflect the queue-less design. I'm ok giving up (2) and (3) for protocol simplicity.

@mkalinin mkalinin changed the title In-protocol deposits flow In-protocol deposits flow (no queue approach) Mar 1, 2023
specs/_features/eip6110/beacon-chain.md Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
specs/_features/eip6110/beacon-chain.md Outdated Show resolved Hide resolved
#### New `apply_deposit`

```python
def apply_deposit(state: BeaconState,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might consider giving this new mechanism a name like "synchronous deposits" or something. So that the objects and function names would be further disambiguated from the old machinery -- e.g. apply_synchronous_deposit, SynchronousDepositReceipt, etc

that said, maybe you have a better name than "Synchronous deposits"

domain = compute_domain(DOMAIN_DEPOSIT) # Fork-agnostic domain since deposits are valid across forks
signing_root = compute_signing_root(deposit_message, domain)
# Initialize validator if the deposit signature is valid
if bls.Verify(pubkey, signing_root, signature):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to keep this as parallel to the previous fork process_deposit functions as possible -- so I'd prefer to use the if not gating condition.

Also, we could make a new helper for this and past phases -- verify_and_add_validator() that handles all of this logic (from line 247 to 260 here).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, I guess we could make apply_deposit the generic function, define it starting in phase 0, and just utilize it here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, apply_deposit was introduced here for deduplication purposes, it has the same logic as process_deposit starting from the point after merkle proof verification and bumping deposit index in the state.

Probably, we can introduce this function in Altair spec and reuse it here. If we did this for the Phase0 then it would have to be overridden in Altair.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which isn't the worst. It's a nice consistency throughout even with the altair override


# Signify the end of transition to in-protocol deposit logic
if state.eth1_deposit_index >= state.deposit_receipts_start_index:
state.eth1_deposit_index = deposit_receipt.index + 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what is going on here. Isn't the end signified already by state.eth1_deposit_index >= state.deposit_receipts_start_index and thus we don't need to artificially set eth1_deposit_index to the current receipt+1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't do this then Eth1Data poll will keep including already processed deposits in block.

The other way around is to make a change to the validator spec to prevent Eth1Data poll from including deposits if state.eth1_deposit_index >= state.deposit_receipts_start_index.

# [New in EIP-6110]
# Skip already processed deposits
if state.eth1_deposit_index >= state.deposit_receipts_start_index:
state.eth1_deposit_index += 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused about how this works deposit_receipts_start_index is static once initialized. So why do we need to do this increment here? can't it just be a return and forever return

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once state.eth1_deposit_index reaches the first in-protocol deposit index there is likely a tail of deposits from the same voting period remaining to be processed. Say, no in-protocol deposit is happening for hours from that event. In such a case Eth1Data poll machinery would be attempting to include the same portion of deposits because the index wouldn't be update.

If we make a change to the validator spec to stop including old fashioned deposits once the first in-protocol index is reached then we can do a simple return. Current spec is made in a way that no changes to the validator logic is required, and all the new changes are introduced into the state transition function where we can ensure to have a good test coverage without additional developer's work.

Copy link
Contributor Author

@mkalinin mkalinin Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do change the validator spec then we can leave process_deposit unchanged by enforcing the following condition in the process_operations:

# Disable former deposit mechanism once all prior deposits are processed
eth1_deposit_index_limit = min(state.eth1_data.deposit_count, state.deposit_receipts_start_index)
if state.eth1_deposit_index < eth1_deposit_index_limit:
    assert len(body.deposits) == min(MAX_DEPOSITS, eth1_deposit_index_limit - state.eth1_deposit_index)
else:
    assert len(body.deposits) == 0

This actually would look cleaner from the spec perspective

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it into the spec. Lmk, if it looks fine and I will update validator guide respectively

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UPD: changes to the validator guide are in place now

Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice edits!

I'm happy to merge the feature and iterate on testing and other things from here in other PRs

# [Modified in EIP-6110]
# Disable former deposit mechanism once all prior deposits are processed
eth1_deposit_index_limit = min(state.eth1_data.deposit_count, state.deposit_receipts_start_index)
if state.eth1_deposit_index < eth1_deposit_index_limit:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! yes, much cleaner and easier to reason about. 🙏

Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

specs/_features/eip6110/validator.md Outdated Show resolved Hide resolved

### Deposits

The expected number of deposits MUST be changed from `min(MAX_DEPOSITS, eth1_data.deposit_count - state.eth1_deposit_index)` to the result of the following function:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to myself: I just figured out that this PR kind of deprecates MAX_DEPOSITS preset. MAX_DEPOSITS doesn't represent the upper bound of deposits anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:RFC Request for Comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants