Make eth1 caching work with fast synced node #709
Conversation
This reverts commit e3d0325.
Currently, the issue is that eth1 block caching starts simultaneously with deposit caching, and therefore we get wrong values. We could potentially stall the block caching until the deposit caching is in sync to get around this.
/// and returns the deposit count as `index + 1`.
///
/// Returns 0 if no logs are present.
/// Note: This function assumes that `block_number` > `deposit_contract_deploy_block`
Can't think of a good way to handle the "before deposit contract deployed" case.
I think we will have to take note of when the deposit contract was deployed and return an error if we search prior to this. This scenario should never happen, only if a testnet has been badly configured.
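A minimal sketch of that guard, assuming a hypothetical `deposit_contract_deploy_block` field on the cache and an illustrative error variant (names are placeholders, not the PR's API):

```rust
/// Illustrative error type; not the PR's actual error enum.
#[derive(Debug)]
pub enum Error {
    /// Requested a block that pre-dates the deposit contract deployment.
    BlockBeforeContractDeployment { requested: u64, deploy_block: u64 },
}

/// Simplified stand-in for the deposit cache.
pub struct DepositCacheSketch {
    /// Block number at which the deposit contract was deployed (from testnet config).
    deposit_contract_deploy_block: u64,
}

impl DepositCacheSketch {
    /// Errors out if `block_number` is earlier than the contract deployment block.
    /// This should only ever trigger on a badly configured testnet.
    pub fn check_block_in_range(&self, block_number: u64) -> Result<(), Error> {
        if block_number < self.deposit_contract_deploy_block {
            Err(Error::BlockBeforeContractDeployment {
                requested: block_number,
                deploy_block: self.deposit_contract_deploy_block,
            })
        } else {
            Ok(())
        }
    }
}
```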
Yep, it looks like you're on the right track!
The next step will be the hard part, that is, implementing it in the beacon chain. I'm thinking you'll likely have to modify the following function such that it can read blocks from the block cache and "fill in" their deposit count and root from the deposit cache:
lighthouse/beacon_node/beacon_chain/src/eth1_chain.rs
Lines 408 to 454 in 4eba265
/// Calculates and returns `(new_eth1_data, all_eth1_data)` for the given `state`, based upon the
/// blocks in the `block` iterator.
///
/// `prev_eth1_hash` is the `eth1_data.block_hash` at the start of the voting period defined by
/// `state.slot`.
fn eth1_data_sets<'a, I>(
    blocks: I,
    prev_eth1_hash: Hash256,
    voting_period_start_seconds: u64,
    spec: &ChainSpec,
    log: &Logger,
) -> Option<(Eth1DataBlockNumber, Eth1DataBlockNumber)>
where
    I: DoubleEndedIterator<Item = &'a Eth1Block> + Clone,
{
    let eth1_follow_distance = spec.eth1_follow_distance;

    let in_scope_eth1_data = blocks
        .rev()
        .skip_while(|eth1_block| eth1_block.timestamp > voting_period_start_seconds)
        .skip(eth1_follow_distance as usize)
        .filter_map(|block| Some((block.clone().eth1_data()?, block.number)));

    if in_scope_eth1_data
        .clone()
        .any(|(eth1_data, _)| eth1_data.block_hash == prev_eth1_hash)
    {
        let new_eth1_data = in_scope_eth1_data
            .clone()
            .take(eth1_follow_distance as usize);
        let all_eth1_data =
            in_scope_eth1_data.take_while(|(eth1_data, _)| eth1_data.block_hash != prev_eth1_hash);

        Some((
            HashMap::from_iter(new_eth1_data),
            HashMap::from_iter(all_eth1_data),
        ))
    } else {
        error!(
            log,
            "The previous eth1 hash is not in cache";
            "previous_hash" => format!("{:?}", prev_eth1_hash)
        );
        None
    }
}
Given that we might be collecting several hundred blocks whilst we're voting, having the O(log n) binary search through the deposits would be nice.
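A rough, self-contained sketch (not the PR's code) of what that "fill in" step could look like, with a binary search so each lookup is O(log n). `Eth1Block` is simplified here (`Hash256` replaced with a plain byte array), and `DepositCacheView` is a hypothetical stand-in for the real deposit cache:

```rust
/// Simplified eth1 block; the real `Eth1Block` carries more fields.
#[derive(Clone)]
pub struct Eth1Block {
    pub number: u64,
    pub deposit_count: Option<u64>,
    pub deposit_root: Option<[u8; 32]>,
}

/// Stand-in for the deposit cache: `(block_number, deposit_count, deposit_root)`
/// per deposit log, sorted by block number.
pub struct DepositCacheView {
    entries: Vec<(u64, u64, [u8; 32])>,
}

impl DepositCacheView {
    /// Deposit count and root as of `block_number`, found by binary search.
    /// (For brevity this sketch ignores multiple deposits in the same block.)
    pub fn lookup(&self, block_number: u64) -> Option<(u64, [u8; 32])> {
        match self.entries.binary_search_by(|e| e.0.cmp(&block_number)) {
            Ok(i) => Some((self.entries[i].1, self.entries[i].2)),
            Err(0) => None, // no deposits at or before this block
            Err(i) => Some((self.entries[i - 1].1, self.entries[i - 1].2)),
        }
    }
}

/// "Fill in" the deposit count and root for every block in the block cache.
pub fn fill_in_deposit_data(blocks: &mut [Eth1Block], deposits: &DepositCacheView) {
    for block in blocks.iter_mut() {
        if let Some((count, root)) = deposits.lookup(block.number) {
            block.deposit_count = Some(count);
            block.deposit_root = Some(root);
        }
    }
}
```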
Some(
    self.logs
        .iter()
        .take_while(|log| block_number >= log.block_number)
Considering that the logs are ordered, we could do a binary search here.
fb3b5f5 to e709f7e
Looking good. A couple of minor comments and a potential optimization that I think is necessary.
Keen to hear your thoughts.
@@ -3,6 +3,8 @@ use eth2_hashing::hash;
use tree_hash::TreeHash;
use types::{Deposit, Hash256};

const DEPOSIT_CONTRACT_TREE_DEPTH: usize = 32;
This const tends to keep getting duplicated around, perhaps we can import this instead:
lighthouse/eth2/types/src/deposit.rs
Line 10 in 24e941d
pub const DEPOSIT_TREE_DEPTH: usize = 32;
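For example, assuming the constant is re-exported by the `types` crate (adjust the path if it is not):

```rust
// Re-use the constant defined in `eth2/types/src/deposit.rs` instead of
// redeclaring `DEPOSIT_CONTRACT_TREE_DEPTH` locally.
use types::DEPOSIT_TREE_DEPTH;
```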
DepositCache {
    logs: Vec::new(),
    roots: Vec::new(),
    // 0 to be compatible with Service::Config. Should ideally be 1.
Perhaps we change Service::Config? Your suggestion seems quite reasonable.
pub fn get_deposit_root_from_cache(&self, block_number: u64) -> Option<Hash256> {
    let index = self.get_deposit_count_from_cache(block_number)?;
    let roots = self.roots.get(0..index as usize)?;
    let tree = DepositDataTree::create(roots, index as usize, DEPOSIT_CONTRACT_TREE_DEPTH);
As you mentioned, this is going to be fairly inefficient. For our testnet, this means that if we sync a cache of 4,096 blocks we're going to have to find the root of a list of 16,384 hashes for each of those blocks. I just benched a tree hash root of 16k hashes at 8ms, so we're looking at ~32 secs of hashing just to fill the cache.
I think a fairly easy solution to this would be to attach a DepositDataTree and a Vec<(u64, Hash256)> to the deposit cache. Each time we import a deposit we add the deposit to the DepositDataTree and then push the new root into the Vec. In this scenario, we incrementally build the deposit tree once, and when we want to resolve a block number to a deposit root we can just binary search the Vec.

Thoughts?
Yup, this sounds perfect 👍
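A sketch of that incremental scheme (not the PR's code): `IncrementalTree` stands in for `DepositDataTree`, whose real push/root API is not assumed here, and the root computation is faked. The point is that each imported deposit appends one leaf and records one `(block_number, root)` pair, so resolving a block number to a deposit root is just a binary search over `roots_by_block`, as in the earlier sketch.

```rust
/// Stand-in for `DepositDataTree`; the real type maintains a merkle tree of
/// deposit leaves. The root computation below is a placeholder.
#[derive(Default)]
pub struct IncrementalTree {
    leaves: Vec<[u8; 32]>,
}

impl IncrementalTree {
    pub fn push_leaf(&mut self, leaf: [u8; 32]) {
        self.leaves.push(leaf);
    }

    pub fn root(&self) -> [u8; 32] {
        // Placeholder: the real implementation hashes up the merkle tree.
        self.leaves.last().copied().unwrap_or([0u8; 32])
    }
}

/// Deposit cache that builds the tree once, incrementally, recording the
/// deposit root after every imported deposit.
#[derive(Default)]
pub struct IncrementalDepositCache {
    tree: IncrementalTree,
    /// `(block_number, deposit_root)` after each deposit, in insertion order.
    roots_by_block: Vec<(u64, [u8; 32])>,
}

impl IncrementalDepositCache {
    /// Import one deposit: push its leaf, then record the new root.
    pub fn insert(&mut self, block_number: u64, leaf: [u8; 32]) {
        self.tree.push_leaf(leaf);
        self.roots_by_block.push((block_number, self.tree.root()));
    }
}
```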
}

/// Mirrors the merkle tree of deposits in the eth1 deposit contract.
///
/// Provides `Deposit` objects with merkle proofs included.
#[derive(Default)]
pub struct DepositCache {
    logs: Vec<DepositLog>,
    roots: Vec<Hash256>,
@paulhauner Perhaps this should be named leafs instead of roots?
Agreed
Looks good! There are a couple of tiny comments, but I'm happy to merge this in once you've looked at them :)
/// and queries the `deposit_roots` map to get the corresponding `deposit_root`.
pub fn get_deposit_root_from_cache(&self, block_number: u64) -> Option<Hash256> {
    let index = self.get_deposit_count_from_cache(block_number)?;
    Some(self.deposit_roots.get(index as usize)?.clone())
I like how you indexed by deposit count, this is better than my suggestion.
.binary_search_by(|deposit| deposit.block_number.cmp(&block_number));
match index {
    Ok(index) => return self.logs.get(index).map(|x| x.index + 1),
    Err(prev) => {
Should this be next instead of prev?
Yup you are right
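For reference, the standard library's `binary_search` returns `Err(i)` where `i` is the insertion point, i.e. the index of the *next* element, which is why `next` is the clearer name. A tiny illustration:

```rust
fn main() {
    // Sorted block numbers of deposit logs.
    let block_numbers = [10u64, 20, 30];

    // Exact hit: index of the matching element.
    assert_eq!(block_numbers.binary_search(&20), Ok(1));

    // Miss: `Err(i)` is where the key would be inserted, i.e. the index of
    // the next element. Every log strictly before index 2 has a block
    // number <= 25, so the deposit count at block 25 is 2 in this toy example.
    assert_eq!(block_numbers.binary_search(&25), Err(2));
}
```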
Great work @pawanjay176!
I didn't think this would make it into v0.1.1 but you made it happen!
Issue Addressed

Addresses #637

Proposed Changes

Adds methods to fetch deposit_root and deposit_count at any block height and uses them to populate the BlockCache in place of the calls to read contract state directly at previous blocks.