
Added standalone light client patch #2130

Merged · 34 commits · Dec 17, 2020

Conversation

@vbuterin (Contributor)

This adds the standalone beacon chain changes for light client support as a separate spec patch, independent of phase 1, so that it can be tested and potentially rolled out sooner than the rest of phase 1.

@JustinDrake (Collaborator)

Two main substantive suggestions:

  1. Increase LIGHT_CLIENT_COMMITTEE_SIZE to 256 to address the RANDAOtage attack vector.
  2. Add a pubkeys_sum field to CompactCommittee which sums all the pubkeys. This unlocks an optimisation whereby light clients only have to download pubkeys of validators that do not vote. For a 2/3 threshold this is at least a 2x optimisation. In the expected/optimistic scenario where 95% of the light client committee votes this is a ~20x optimisation.
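
A rough sketch of how a light client could use the suggested `pubkeys_sum`; integers stand in for BLS G1 points here, and the function and parameter names are illustrative, not spec names:

```python
from typing import Sequence

def aggregate_voting_pubkey(pubkeys_sum: int, nonvoter_pubkeys: Sequence[int]) -> int:
    # Start from the precomputed sum of all committee pubkeys and subtract the
    # (typically few) non-voters, instead of downloading and adding every voter's
    # pubkey. With a 2/3 participation threshold at most 1/3 of the pubkeys are
    # needed; with ~95% participation, roughly 1/20.
    aggregate = pubkeys_sum
    for pubkey in nonvoter_pubkeys:
        aggregate -= pubkey  # point negation + addition on G1 in a real implementation
    return aggregate
```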

@JustinDrake (Collaborator) commented Nov 15, 2020

Fine-combed the document :)

  1. (bug fix) Called compute_shuffled_index with active_validator_count (as opposed to active_validator_indices).
  2. Used a Bitlist (as opposed to a Bitvector) for BeaconBlockBody.sync_committee_bits for consistency with SyncCommittee.pubkeys and SyncCommittee.compact_validators.
  3. Added assert len(body.sync_committee_bits) == len(committee_indices).
  4. Should current_sync_committee and next_sync_committee be initialised, e.g. in initialize_beacon_state_from_eth1?
  5. Should we sign over both previous_block_root and previous_slot? This may facilitate gossipping sync committee signatures as well as simplify and optimise light client implementations.
  6. Should we rotate current_sync_committee and next_sync_committee only if the previous rotation falls under finality? If not, how are long-range (greater than EPOCHS_PER_SYNC_COMMITTEE_PERIOD) forks handled?
  7. Should sync committees explicitly sign over state.finalized_checkpoint? This would simplify light client implementations and save them from downloading and verifying unnecessary Merkle branches.
  8. What do you suggest as the precise honest behaviour of a sync committee member? Explicitly spelling it out would help me go through edge cases.
  9. Should we consider adding sync committee slashing conditions? For example, we don't want sync committee members equivocating over state.finalized_checkpoint.
  10. In the exceptional case where active_validator_count < MAX_SYNC_COMMITTEE_SIZE, what do you think of "padding" the sync committee with duplicate validators? This handling of the exceptional case (which really ought to never happen on mainnet) would simplify light client implementations because the variable-size Bitlist and List would be fixed-sized Bitvector and Vector. The state transition function would also be simplified a bit.
  11. What do you think of increasing MAX_SYNC_COMMITTEE_SIZE further, e.g. to 512 or 1024? The rationale is that the marginal on-chain overhead is negligible (especially with fancy batched point addition) and light clients could choose to increase their committee size to improve robustness against adaptive attacks (as well as improve liveness in the worst case by shielding themselves against sampling variance). It also allows fancier light clients that sample random subsets of the sync committee, again for robustness. This does go against the pubkeys_aggregate optimisation.
  12. Does the micro-incentivisation break down when the networking latency is poor enough that it takes more than one slot for the block proposer to gather a good sync committee signature? What about relaxing the latency for the purpose of micro-incentivisation? (Note that sync committee signatures that don't make it onchain would still be valuable offchain for light clients.)
  13. Do the cosmetic changes look good to you? (I may have introduced a bug or two.)

@vbuterin (Contributor, Author)

Should we rotate current_sync_committee and next_sync_committee only if the previous rotation falls under finality? If not, how are long-range (greater than EPOCHS_PER_SYNC_COMMITTEE_PERIOD) forks handled?

I'm inclined to say no need for special cases here: if none of the committees in a given period get up to 2/3, the client should just pick the block that has the most light client participants. Obviously in such a case the client would wait for some time to verify that no better block is available.

Should we sign over....

What's wrong with just having the light client proof package that gets sent over the wire contain SSZ branches? Seems more general-purpose...

What do you suggest as the precise honest behaviour of a sync committee member?

Halfway through each slot, publish an attestation to what you think is the head of the chain.

Should we consider adding sync committee slashing conditions?

Hmm... I'm inclined to say don't bother, because if the committee is faulty then nothing stops them from feeding clients the wrong chain consistently, and there's no way for the sync committee to have that high a level of security anyway.

In the exceptional case where active_validator_count < MAX_SYNC_COMMITTEE_SIZE, what do you think of "padding" the sync committee with duplicate validators?

Hmm.... instinctively it seems wasteful and unnatural. Is it really that complicated to have variable size lists? I guess another option would be to have them be vectors and pad the pubkeys and bits with zeroes to represent that there's nothing there.

What do you think of increasing MAX_SYNC_COMMITTEE_SIZE further, e.g. to 512 or 1024?

Seems okay as long as light clients have a choice to only use part of the committee if they so desire, so eg. sum_of_pubkeys would need to be extended into an array that contains the sum of the first 2**k pubkeys for each k. That said, are we that worried about a committee of size 256 being attacked?
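
A minimal sketch of that prefix-sum structure, again with integers standing in for BLS point sums (the function name is illustrative):

```python
from typing import List

def build_pubkey_prefix_sums(pubkeys: List[int]) -> List[int]:
    # prefix_sums[k] holds the sum of the first 2**k committee pubkeys, so a light
    # client that only wants to track the first 2**k members can use the matching
    # precomputed aggregate instead of the full-committee sum.
    prefix_sums = []
    k = 0
    while 2 ** k <= len(pubkeys):
        prefix_sums.append(sum(pubkeys[: 2 ** k]))
        k += 1
    return prefix_sums
```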

Does the micro-incentivisation break down when the networking latency is poor enough that it takes more than one slot for the block proposer to gather a good sync committee signature?

Remember that the signatures would still be floating around the gossip net even if only a few of them make it onto the chain, so technically as long as there's even a small percent chance of being included the incentives to publish should work fine. But if we want to make it easier to read the signatures from just the chain, I guess one route would be to keep retrying a slot if it gets <50% participation (eg. for a maximum of 4 retries).

@vbuterin (Contributor, Author)

Cosmetic changes look good to me!

@hwwhww (Contributor) left a comment


@vbuterin

I made the linter pass: #2133

@JustinDrake (Collaborator)

I'm inclined to say no need for special cases here: if none of the committees in a given period get up to 2/3, the client should just pick the block that has the most light client participants. Obviously in such a case the client would wait for some time to verify that no better block is available.

What about safety?

Let's say there's a 1/3 attacker, 1/3 honest attesting (and light-client voting) for fork A, 1/3 honest attesting (and light-client voting) for fork B.

Now the attacker attests for fork A and light-client votes for fork B. Fork A finalises but light clients are stuck on fork B.

What's wrong with just having the light client proof package that gets sent over the wire contain SSZ branches?

Light client implementations will often be hyper-secure and/or hyper-optimised. For example, consider a light client implementation in a zkSNARK circuit. Or a light client implementation in a metered virtual machine like the EVM. Or a light client implementation in a secure enclave. Or a light client implementation in a formal verification framework. The implementation costs are orders of magnitude higher than Python.

Avoiding unnecessary data like Merkle branches, as well as unnecessary complexities like variable-sized Merkle branches, may pay off big time.

Seems more general-purpose

Having just the previous root is definitely more minimalist :)

instinctively it seems wasteful

Do you mean wasteful when there are less than 256 active validators? Optimising for that edge case which should never happen in practice feels like a micro-optimisation.

are we that worried about a committee of size 256 being attacked?

I'm worried! A single committee break simultaneously breaks every Eth2-to-X bridge (where X = Cosmos, Near, Polkadot, etc.) and there could easily be $1B+ (or $10B+) in extractable value for an attacker.

Because EPOCHS_PER_SYNC_COMMITTEE_PERIOD is on the order of 1 day, it has "human scale". By that I mean that 1 day is sufficient for an attacker to 1) publicise a $1m-per-validator bribing attack (e.g. on Twitter and Reddit) and 2) have corrupted validators manually perform an action to claim their $1m bribe. How many rational validators would say no to a $1m bribe? Also note that the attack cost is zero because there is no enshrined slashing.

@vbuterin (Contributor, Author)

Avoiding unnecessary data like Merkle branches, as well as unnecessary complexities like variable-sized Merkle branches, may pay off big time.

We don't have any variable-sized Merkle branches though; SSZ was deliberately designed to make any access path neatly correspond to a single generalized index. I'd also add that if you want to know the state root, usually it's not because you care about the root, it's because you care about something in the state, so you would need Merkle paths anyway.

Let's say there's a 1/3 attacker, 1/3 honest attesting (and light-client voting) for fork A, 1/3 honest attesting (and light-client voting) for fork B.

Why would this situation persist for an entire day? Wouldn't the fork resolve and there be a block that has 2/3 voting?

To be clear, the fork choice rule I am proposing is not slot-by-slot. It's "start from the last block you are confident about, take the validator set from it, then find the descendant of that block that was signed by the most participants of that validator set, and repeat".
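
A rough sketch of that loop, with the chain-access functions passed in as parameters (they are assumptions, not spec helpers):

```python
from typing import Callable, Sequence, TypeVar

Block = TypeVar("Block")

def light_client_head(
    trusted_block: Block,
    get_descendants: Callable[[Block], Sequence[Block]],
    participation: Callable[[Block, Block], int],  # (committee-source block, candidate) -> signer count
) -> Block:
    # Start from the last block the client is confident about, take the sync
    # committee defined by that block, move to the descendant signed by the most
    # participants of that committee, and repeat until no descendants are known.
    head = trusted_block
    while True:
        candidates = get_descendants(head)
        if not candidates:
            return head
        head = max(candidates, key=lambda child: participation(head, child))
```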

Do you mean wasteful when there are less than 256 active validators? Optimising for that edge case which should never happen in practice feels like a micro-optimisation.

I mean wasteful code-wise. It feels like it would still lead to annoying complexity (eg. where else in the code can the same validator be part of a committee twice??!). If we want fixed size the better path does seem to be to pad the committee with fake pubkeys that no one can sign with.

Because EPOCHS_PER_SYNC_COMMITTEE_PERIOD is on the order of 1 day, it has "human scale". By that I mean that 1 day is sufficient for an attacker to 1) publicise a $1m-per-validator bribing attack (e.g. on Twitter and Reddit) and 2) have corrupted validators manually perform an action to claim their $1m bribe. How many rational validators would say no to a $1m bribe? Also note that the attack cost is zero because there is no enshrined slashing.

OK fair! I'm okay with pushing the committee size up as long as the pubkeys_aggregate is either removed or turns into a list of prefix-sums of powers of two.

On second thought, I'd be ok with some kind of slashing if we can figure out the right conditions; I guess something like "if you vote for block X but in the same epoch you used a block that conflicts with X as a last-justified-block you get slashed" could work. Main challenge is that we'd like to make sure that it's easy for anti-slash DBs to verify that a validator isn't slashing themselves.


## Light client state updates

The state of a light client is stored in a `memory` object of type `LightClientMemory`. To advance its state a light client requests an `update` object of type `LightClientUpdate` from the network by sending a request containing `(memory.shard, memory.header.slot, slot_range_end)`. It calls `validate_update(memory, update)` on each update that it receives in response. If `sum(update.aggregate_bits) * 3 > len(update.aggregate_bits) * 2` for any valid update, it accepts that update immediately; otherwise, it waits around for some time and then finally calls `update_memory(memory, update)` on the valid update with the highest `sum(update.aggregate_bits)`.
@ralexstokes (Member) commented Nov 16, 2020


If sum(update.aggregate_bits) * 3 > len(update.aggregate_bits) * 2 for any valid update

Is there a reason we are not summing effective stake (via the effective_balance found in the appropriate SyncCommittee in memory)?

Relying on the bits is a good approximation... until it is not, and the edge case results in committee corruption.

@vbuterin (Contributor, Author) commented Nov 17, 2020


It vastly simplifies the light client spec to rely only on the bits. We can make up for it by making membership in the sync committee itself probabilistic and balance-dependent.
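
For reference, a sketch of the bits-only acceptance logic from the quoted sync-protocol text; the networking and waiting hooks are passed in as parameters and are assumptions, and `validate_update` is assumed here to return True/False rather than assert:

```python
def advance_light_client(memory, updates, validate_update, update_memory, wait):
    # `updates` are the LightClientUpdate objects received for the requested slot range;
    # validate_update / update_memory are the spec functions named in the quoted text.
    valid_updates = [u for u in updates if validate_update(memory, u)]
    for update in valid_updates:
        # A 2/3 supermajority by bit count is accepted immediately.
        if sum(update.aggregate_bits) * 3 > len(update.aggregate_bits) * 2:
            update_memory(memory, update)
            return
    # Otherwise wait for a while, then take the valid update with the most participation.
    wait()
    if valid_updates:
        update_memory(memory, max(valid_updates, key=lambda u: sum(u.aggregate_bits)))
```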


```python
# Verify signature
active_pubkeys = [p for (bit, p) in zip(update.aggregation_bits, committee.pubkeys) if bit]
domain = compute_domain(DOMAIN_SYNC_COMMITTEE, memory.version)
```
@ralexstokes (Member) commented Nov 16, 2020


I think it is worth considering adding the genesis_validators_root as input to compute_domain, for the same reasons we have it on the beacon chain.

It only adds a constant 32-byte overhead for the light client (which can even be hard-coded into client code) and reduces the scope of admissible updates a light client would even bother with.

Happy to make a PR on top of this one adding the changes if there is support for the change.
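
A sketch of the suggested change, assuming the phase 0 `compute_domain` signature (which accepts an optional `genesis_validators_root`) and a hypothetical `memory.genesis_validators_root` field:

```python
# Sketch of the suggestion: mix the genesis validators root into the domain so that
# signatures from other chains/testnets are never admissible for this light client.
domain = compute_domain(DOMAIN_SYNC_COMMITTEE, memory.version, memory.genesis_validators_root)
```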

Comment on lines +39 to +40
| `FINALIZED_ROOT_INDEX` | `Index(BeaconState, 'finalized_checkpoint', 'root')` |
| `NEXT_SYNC_COMMITTEE_INDEX` | `Index(BeaconState, 'next_sync_committee')` |

Do these Index helpers equal get_generalized_index in https://github.com/ethereum/eth2.0-specs/blob/5f9112ad4227d12ea03c001517d53518e6e355f0/ssz/merkle-proofs.md?

If so, we may need to either make merkle-proofs.md executable again or implement it in remerkleable.


Do these Index helpers equal get_generalized_index

Yes :)

we may need to either make merkle-proofs.md executable again or implement it in remerkleable

When we do so I suggest shortening "generalized_index" to something like "node_index", "tree_index", "Merkle_index", or just "index" for short.

```python
# Verify update header root is the finalized root of the finality header, if specified
if update.finality_header == BeaconBlockHeader():
    signed_header = update.header
    assert update.finality_branch == [ZERO_HASH for _ in range(log2(FINALIZED_ROOT_INDEX))]
```

It seems that we need to add a uint64 casting to all these log2, or reuse get_generalized_index_length.
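
One possible shape for that cast, wrapped in a small helper (the name `floorlog2` and the use of remerkleable's `uint64` are assumptions, not necessarily what the PR adopts):

```python
from remerkleable.basic import uint64  # the SSZ uint type used by the pyspec

def floorlog2(x: int) -> uint64:
    # floor(log2(x)) as a uint64, e.g. the Merkle branch length for a generalized
    # index such as FINALIZED_ROOT_INDEX, usable directly in range(...).
    return uint64(int(x).bit_length() - 1)
```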

@dankrad (Contributor) commented Nov 26, 2020

My main concern here is what happens when a committee does not achieve a 2/3 vote during EPOCHS_PER_SYNC_COMMITTEE_PERIOD. Realistically, blockchain/smart contract light clients will only use 2/3 links because everything else is too dangerous. But when a sync committee's tenure ends, there is no incentive to keep trying to get one of the votes over the threshold.

Intuitively I would have said sync committees should only be switched when the next sync committee has been confirmed by a 2/3 vote. But that of course can lead to terrible situations where a sync committee intentionally withholds confirming the next one.

One idea would be that there is a mechanism to record the "success" of each sync committee to achieve 2/3, and if it never reached it, a final vote on the last block can be included at any time for some reward.

@dankrad (Contributor) commented Nov 26, 2020

Further remark in a similar direction: Intuitively, the previous sync committee has almost the same "validity" in epoch EPOCHS_PER_SYNC_COMMITTEE_PERIOD and EPOCHS_PER_SYNC_COMMITTEE_PERIOD+1 (after committees have switched). What about giving the old committee the opportunity for one, and only one, more vote? The nice thing is that they could vote for the "next next" sync committee, effectively allowing one committee at a time to be skipped. Also, in case the new committee is poor and doesn't come to a consensus, it gives the old committee one opportunity to create a 2/3+1 link.

@djrtwo (Contributor) left a comment


Did a quick pass.

```python
class BeaconBlockBody(phase0.BeaconBlockBody):
    # Sync committee aggregate signature
    sync_committee_bits: Bitvector[SYNC_COMMITTEE_SIZE]
```

We might consider putting this in the outer BeaconBlock/BeaconBlockHeader container(s). This would allow for sync to occur with just the BeaconBlockHeader and no additional proofs.

Seems reasonable to elevate this to baseline verification (similar to the proposer signature) rather than in the payload of the block. Otherwise, sync via this light mechanism will always require a BeaconBlockHeader plus a small sync_signature proof into the body.
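
A sketch of the alternative being described, in the same spec style as the quoted hunk (not what the PR currently does; the field placement is exactly what is under discussion):

```python
class BeaconBlockHeader(phase0.BeaconBlockHeader):
    # Lifting the sync aggregate out of the body would let a light client verify it
    # from the header alone, without an extra proof into BeaconBlockBody.
    sync_committee_bits: Bitvector[SYNC_COMMITTEE_SIZE]
    sync_committee_signature: BLSSignature
```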

```python
active_validator_count = uint64(len(active_validator_indices))
seed = get_seed(state, base_epoch, DOMAIN_SYNC_COMMITTEE)
i, sync_committee_indices = 0, []
while len(sync_committee_indices) < SYNC_COMMITTEE_SIZE:
```

I suggest we create a helper compute_weighted_committee or something, to (1) make it clear why we aren't using the previous committee shuffling algorithms, and (2) allow for better code reuse in the event we want to reuse or just directly test this functionality.
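
A hedged sketch of what such a helper could look like, following the quoted loop and the balance-weighted sampling pattern of phase 0's `compute_proposer_index`; the exact body and the helper's name are assumptions, not the PR's final code:

```python
def compute_weighted_committee(state: BeaconState, seed: Bytes32, size: uint64) -> Sequence[ValidatorIndex]:
    """
    Sample ``size`` validator indices, shuffled by ``seed`` and accepted with
    probability proportional to effective balance, so that sync committee
    membership is stake-weighted (unlike the phase 0 committee shuffling).
    """
    MAX_RANDOM_BYTE = 2**8 - 1
    active_validator_indices = get_active_validator_indices(state, get_current_epoch(state))
    active_validator_count = uint64(len(active_validator_indices))
    i, committee = 0, []
    while len(committee) < size:
        shuffled_index = compute_shuffled_index(uint64(i % active_validator_count), active_validator_count, seed)
        candidate_index = active_validator_indices[shuffled_index]
        random_byte = hash(seed + uint_to_bytes(uint64(i // 32)))[i % 32]
        effective_balance = state.validators[candidate_index].effective_balance
        if effective_balance * MAX_RANDOM_BYTE >= MAX_EFFECTIVE_BALANCE * random_byte:
            committee.append(candidate_index)
        i += 1
    return committee
```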

```python
def process_sync_committee(state: BeaconState, body: BeaconBlockBody) -> None:
    # Verify sync committee aggregate signature signing over the previous slot block root
    previous_slot = max(state.slot, Slot(1)) - Slot(1)
```

Maybe a get_previous_slot accessor like get_previous_epoch
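
A sketch of such an accessor; the name mirrors `get_previous_epoch`, and the clamping matches the inline expression quoted above:

```python
def get_previous_slot(slot: Slot) -> Slot:
    # Previous slot, clamped so that slot 0 maps to slot 0, matching
    # ``max(state.slot, Slot(1)) - Slot(1)`` in the quoted code.
    return Slot(max(int(slot), 1) - 1)
```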

```python
# Reward sync committee participants
participant_rewards = Gwei(0)
active_validator_count = uint64(len(get_active_validator_indices(state, get_current_epoch(state))))
for participant_index in participant_indices:
```

Should we exclude slashed validators from rewards?


```python
def process_block(state: BeaconState, block: BeaconBlock) -> None:
    phase0.process_block(state, block)
```

Note that by calling phase0.process_block, this uses the phase 0 package.

  1. The constants and the functions called inside process_block would be the phase 0 versions. This may cause side effects that we could easily overlook.
  2. You are passing (lightclient_patch.state, lightclient_patch.block) to process_block(state: phase0.BeaconState, block: phase0.BeaconBlock).

Saving 3 lines probably is not worth it. 😅
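
A sketch of the spelled-out alternative being alluded to; the phase 0 sub-functions exist with these names, but whether the patch ultimately duplicates `process_block` this way is the question under discussion:

```python
def process_block(state: BeaconState, block: BeaconBlock) -> None:
    # Define the patch's own process_block instead of delegating to the phase 0
    # package, so every sub-call and constant resolves to the light-client-patch
    # versions. Ordering follows phase 0's process_block, plus the new sync step.
    process_block_header(state, block)
    process_randao(state, block.body)
    process_eth1_data(state, block.body)
    process_operations(state, block.body)
    process_sync_committee(state, block.body)
```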

@djrtwo (Contributor) commented Dec 7, 2020

Saving 3 lines probably is not worth it. 😅

Yes! I would tend to agree

* Bump remerkleable to 0.1.18

* Disable `sync-protocol.md` for now. Make linter pass

* Enable lightclient tests

* Use *new* `optional_fast_aggregate_verify`

* Fix ToC and codespell

* Do not run phase1 tests with Lightclient patch

* Fix the Eth1Data casting bug. Add a workaround.

* Fix `run_on_attestation` testing helper

* Revert

* Rename `optional_fast_aggregate_verify` to `eth2_fast_aggregate_verify`

* Apply Proto's suggestion

* Apply Danny's suggestion

* Fixing tests

* Fix after rebasing

* Rename `LIGHTCLIENT` -> `LIGHTCLIENT_PATCH`

* New doctoc

* Add lightclient patch configs

* fix gitignore light client patch generator output

* Upgrade state for light client patch

* Add `lightclient-fork.md` to deal with the fork boundary and fix `process_block_header`

* Misc cleanups

1) Add a summary note for every function that is changed.
2) Avoid changing `process_block` (instead only change `process_block_header`).
3) Rename `G2_INFINITY_POINT_SIG` to `G2_POINT_AT_INFINITY` to avoid `SIG` contraction.
4) Misc cleanups

* Update block.py

* Update beacon-chain.md

* Fix typo "minimal" -> "mainnet"

Co-authored-by: Marin Petrunić <[email protected]>

* Use the new `BeaconBlockHeader` instead of phase 0 version

* Update config files

* Move `sync_committee_bits` and `sync_committee_signature` back to `BeaconBlockBody`

Co-authored-by: protolambda <[email protected]>
Co-authored-by: Justin <[email protected]>
Co-authored-by: Marin Petrunić <[email protected]>
@hwwhww (Contributor) left a comment


Giving the initial version a green light. The tests & iterations will be added with other PRs.

@djrtwo (Contributor) commented Dec 17, 2020

Added some sanity tests. Merging!
