Make eth1 caching work with fast synced node #709
Conversation
This reverts commit e3d0325.
Currently, the issue is that eth1 block caching starts simultaneously with deposit caching, and therefore we get wrong values. We could potentially stall the block caching until the deposit caching is in sync to get around this.
/// and returns the deposit count as `index + 1`.
///
/// Returns 0 if no logs are present.
/// Note: This function assumes that `block_number` > `deposit_contract_deploy_block`
Can't think of a good way to handle the "before deposit contract deployed" case.
I think we will have to take note of when the deposit contract was deployed and return an error if we search prior to this. This scenario should never happen, only if a testnet has been badly configured.
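A minimal sketch of that guard, assuming a hypothetical `deposit_contract_deploy_block` field on the cache and an illustrative error variant (names are placeholders, not the PR's API):

```rust
/// Illustrative error type; not the PR's actual error enum.
#[derive(Debug)]
pub enum Error {
    /// Requested a block that pre-dates the deposit contract deployment.
    BlockBeforeContractDeployment { requested: u64, deploy_block: u64 },
}

/// Simplified stand-in for the deposit cache.
pub struct DepositCacheSketch {
    /// Block number at which the deposit contract was deployed (from testnet config).
    deposit_contract_deploy_block: u64,
}

impl DepositCacheSketch {
    /// Errors out if `block_number` is earlier than the contract deployment block.
    /// This should only ever trigger on a badly configured testnet.
    pub fn check_block_in_range(&self, block_number: u64) -> Result<(), Error> {
        if block_number < self.deposit_contract_deploy_block {
            Err(Error::BlockBeforeContractDeployment {
                requested: block_number,
                deploy_block: self.deposit_contract_deploy_block,
            })
        } else {
            Ok(())
        }
    }
}
```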
Yep, it looks like you're on the right track!
The next step will be the hard part, that is, implementing it in the beacon chain. I'm thinking you'll likely have to modify the following function such that it can read blocks from the block cache and "fill in" their deposit count and root from the deposit cache:
lighthouse/beacon_node/beacon_chain/src/eth1_chain.rs
Lines 408 to 454 in 4eba265
/// Calculates and returns `(new_eth1_data, all_eth1_data)` for the given `state`, based upon the
/// blocks in the `block` iterator.
///
/// `prev_eth1_hash` is the `eth1_data.block_hash` at the start of the voting period defined by
/// `state.slot`.
fn eth1_data_sets<'a, I>(
    blocks: I,
    prev_eth1_hash: Hash256,
    voting_period_start_seconds: u64,
    spec: &ChainSpec,
    log: &Logger,
) -> Option<(Eth1DataBlockNumber, Eth1DataBlockNumber)>
where
    I: DoubleEndedIterator<Item = &'a Eth1Block> + Clone,
{
    let eth1_follow_distance = spec.eth1_follow_distance;

    let in_scope_eth1_data = blocks
        .rev()
        .skip_while(|eth1_block| eth1_block.timestamp > voting_period_start_seconds)
        .skip(eth1_follow_distance as usize)
        .filter_map(|block| Some((block.clone().eth1_data()?, block.number)));

    if in_scope_eth1_data
        .clone()
        .any(|(eth1_data, _)| eth1_data.block_hash == prev_eth1_hash)
    {
        let new_eth1_data = in_scope_eth1_data
            .clone()
            .take(eth1_follow_distance as usize);
        let all_eth1_data =
            in_scope_eth1_data.take_while(|(eth1_data, _)| eth1_data.block_hash != prev_eth1_hash);

        Some((
            HashMap::from_iter(new_eth1_data),
            HashMap::from_iter(all_eth1_data),
        ))
    } else {
        error!(
            log,
            "The previous eth1 hash is not in cache";
            "previous_hash" => format!("{:?}", prev_eth1_hash)
        );
        None
    }
}
Given that we might be collecting several hundred blocks whilst we're voting, having the O(log n) binary search through the deposits would be nice.
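A rough, self-contained sketch (not the PR's code) of what that "fill in" step could look like, with a binary search so each lookup is O(log n). `Eth1Block` is simplified here (`Hash256` replaced with a plain byte array), and `DepositCacheView` is a hypothetical stand-in for the real deposit cache:

```rust
/// Simplified eth1 block; the real `Eth1Block` carries more fields.
#[derive(Clone)]
pub struct Eth1Block {
    pub number: u64,
    pub deposit_count: Option<u64>,
    pub deposit_root: Option<[u8; 32]>,
}

/// Stand-in for the deposit cache: `(block_number, deposit_count, deposit_root)`
/// per deposit log, sorted by block number.
pub struct DepositCacheView {
    entries: Vec<(u64, u64, [u8; 32])>,
}

impl DepositCacheView {
    /// Deposit count and root as of `block_number`, found by binary search.
    /// (For brevity this sketch ignores multiple deposits in the same block.)
    pub fn lookup(&self, block_number: u64) -> Option<(u64, [u8; 32])> {
        match self.entries.binary_search_by(|e| e.0.cmp(&block_number)) {
            Ok(i) => Some((self.entries[i].1, self.entries[i].2)),
            Err(0) => None, // no deposits at or before this block
            Err(i) => Some((self.entries[i - 1].1, self.entries[i - 1].2)),
        }
    }
}

/// "Fill in" the deposit count and root for every block in the block cache.
pub fn fill_in_deposit_data(blocks: &mut [Eth1Block], deposits: &DepositCacheView) {
    for block in blocks.iter_mut() {
        if let Some((count, root)) = deposits.lookup(block.number) {
            block.deposit_count = Some(count);
            block.deposit_root = Some(root);
        }
    }
}
```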
Some(
    self.logs
        .iter()
        .take_while(|log| block_number >= log.block_number)
Considering that the logs are ordered, we could do a binary search here.
fb3b5f5 to e709f7e
Looking good. A couple of minor comments and a potential optimization that I think is necessary.
Keen to hear your thoughts.
@@ -3,6 +3,8 @@ use eth2_hashing::hash;
use tree_hash::TreeHash;
use types::{Deposit, Hash256};

const DEPOSIT_CONTRACT_TREE_DEPTH: usize = 32;
This const tends to keep getting duplicated around, perhaps we can import this instead:
lighthouse/eth2/types/src/deposit.rs
Line 10 in 24e941d
pub const DEPOSIT_TREE_DEPTH: usize = 32;
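For example, assuming the constant is re-exported by the `types` crate (adjust the path if it is not):

```rust
// Re-use the constant defined in `eth2/types/src/deposit.rs` instead of
// redeclaring `DEPOSIT_CONTRACT_TREE_DEPTH` locally.
use types::DEPOSIT_TREE_DEPTH;
```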
DepositCache {
    logs: Vec::new(),
    roots: Vec::new(),
    // 0 to be compatible with Service::Config. Should ideally be 1.
Perhaps we change Service::Config? Your suggestion seems quite reasonable.
pub fn get_deposit_root_from_cache(&self, block_number: u64) -> Option<Hash256> {
    let index = self.get_deposit_count_from_cache(block_number)?;
    let roots = self.roots.get(0..index as usize)?;
    let tree = DepositDataTree::create(roots, index as usize, DEPOSIT_CONTRACT_TREE_DEPTH);
As you mentioned, this is going to be fairly inefficient. For our testnet, this means that if we sync a cache of 4,096 blocks we're going to have to find the root of a list of 16,384 hashes for each of those blocks. I just benched a tree hash root of 16k hashes at 8ms, so we're looking at ~32 secs of hashing just to fill the cache.
I think a fairly easy solution to this would be to attach a DepositDataTree and a Vec<(u64, Hash256)> to the deposit cache. Each time we import a deposit we add the deposit to the DepositDataTree and then push the new root into the Vec. In this scenario, we incrementally build the deposit tree once, and when we want to resolve a block number to a deposit root we can just binary search the Vec.

Thoughts?
Yup, this sounds perfect 👍
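A sketch of that incremental scheme (not the PR's code): `IncrementalTree` stands in for `DepositDataTree`, whose real push/root API is not assumed here, and the root computation is faked. The point is that each imported deposit appends one leaf and records one `(block_number, root)` pair, so resolving a block number to a deposit root is just a binary search over `roots_by_block`, as in the earlier sketch.

```rust
/// Stand-in for `DepositDataTree`; the real type maintains a merkle tree of
/// deposit leaves. The root computation below is a placeholder.
#[derive(Default)]
pub struct IncrementalTree {
    leaves: Vec<[u8; 32]>,
}

impl IncrementalTree {
    pub fn push_leaf(&mut self, leaf: [u8; 32]) {
        self.leaves.push(leaf);
    }

    pub fn root(&self) -> [u8; 32] {
        // Placeholder: the real implementation hashes up the merkle tree.
        self.leaves.last().copied().unwrap_or([0u8; 32])
    }
}

/// Deposit cache that builds the tree once, incrementally, recording the
/// deposit root after every imported deposit.
#[derive(Default)]
pub struct IncrementalDepositCache {
    tree: IncrementalTree,
    /// `(block_number, deposit_root)` after each deposit, in insertion order.
    roots_by_block: Vec<(u64, [u8; 32])>,
}

impl IncrementalDepositCache {
    /// Import one deposit: push its leaf, then record the new root.
    pub fn insert(&mut self, block_number: u64, leaf: [u8; 32]) {
        self.tree.push_leaf(leaf);
        self.roots_by_block.push((block_number, self.tree.root()));
    }
}
```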
}

/// Mirrors the merkle tree of deposits in the eth1 deposit contract.
///
/// Provides `Deposit` objects with merkle proofs included.
#[derive(Default)]
pub struct DepositCache {
    logs: Vec<DepositLog>,
    roots: Vec<Hash256>,
@paulhauner Perhaps this should be named leafs instead of roots?
Agreed
Looks good! There are a couple of tiny comments, but I'm happy to merge this in once you've looked at them :)
/// and queries the `deposit_roots` map to get the corresponding `deposit_root`.
pub fn get_deposit_root_from_cache(&self, block_number: u64) -> Option<Hash256> {
    let index = self.get_deposit_count_from_cache(block_number)?;
    Some(self.deposit_roots.get(index as usize)?.clone())
I like how you indexed by deposit count, this is better than my suggestion.
.binary_search_by(|deposit| deposit.block_number.cmp(&block_number));
match index {
    Ok(index) => return self.logs.get(index).map(|x| x.index + 1),
    Err(prev) => {
Should this be next instead of prev?
Yup you are right
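For reference, the standard library's `binary_search` returns `Err(i)` where `i` is the insertion point, i.e. the index of the *next* element, which is why `next` is the clearer name. A tiny illustration:

```rust
fn main() {
    // Sorted block numbers of deposit logs.
    let block_numbers = [10u64, 20, 30];

    // Exact hit: index of the matching element.
    assert_eq!(block_numbers.binary_search(&20), Ok(1));

    // Miss: `Err(i)` is where the key would be inserted, i.e. the index of
    // the next element. Every log strictly before index 2 has a block
    // number <= 25, so the deposit count at block 25 is 2 in this toy example.
    assert_eq!(block_numbers.binary_search(&25), Err(2));
}
```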
Great work @pawanjay176!
I didn't think this would make it into v0.1.1 but you made it happen!
Issue Addressed

Addresses #637

Proposed Changes

Adds methods to fetch deposit_root and deposit_count at any block height and uses them to populate the BlockCache in place of the calls to read contract state directly at previous blocks.