-
Notifications
You must be signed in to change notification settings - Fork 622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: introduce cached flat storage deltas #8662
Conversation
core/store/src/flat_state.rs
Outdated
/// Merge two deltas. Values from `other` should override values from `self`. | ||
pub fn merge(&mut self, other: &Self) { | ||
pub fn merge(&mut self, other: Self) { | ||
self.0.extend(other.0.iter().map(|(k, v)| (k.clone(), v.clone()))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we consume other
here, so I guess clone()
can be avoided
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, could be nice. But there is no way to extract values from container without mut
, and having mut
here feels wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what you mean here, wouldn't self.0.extend(other.0.into_iter())
work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Um, right.
core/store/src/flat_state.rs
Outdated
|
||
/// Size of all entries. May be changed if we implement inlining of `ValueRef`s. | ||
fn total_size(&self) -> u64 { | ||
(self.0.len() as u64) * Self::ENTRY_SIZE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try to be a bit more precise here. std HashMap
is based on Swiss Table. In my understanding it stores all the data in flat array + a couple of bits of "control data" per entry (that is how I understood it from the talk). So in our case I suggest using self.0.capacity() * (sizeof<Key> + sizeof<Value>)
and ignore control data overhead to keep it simple.
core/store/src/flat_state.rs
Outdated
impl CachedFlatStateDelta { | ||
/// Assumed size in bytes for the cache entry. Estimated as 69 bytes for `CryptoHash` and `Option<ValueRef>` + | ||
/// 70% hashmap overhead per entry ~= 117 bytes. | ||
const ENTRY_SIZE: u64 = 117; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid hardcoding such values, in my understanding rust allows compile time computations using size_of: std::mem::size_of::< CryptoHash>() + std::mem::size_of::<Option<ValueRef>>()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Introduce
CachedFlatStateDelta
and store it in memory instead of largerFlatStateDelta
. There we store only hashes of trie keys instead of full trie keys. It still allows to serve read-only queries, but greatly reduces max size of deltas per block - from ~50 MB of raw entries to ~5 MB. Full motivation can be found here: #8436.Based on https://ntietz.com/blog/rust-hashmap-overhead/, hashmap overhead is 70% per entry, so we estimate full entry size as 117.
Proof that it doesn't impact performance: node with full blocks and no finality for 50 blocks, node without finality lags. You can see that the first node has up to 50 cached blocks very regularly, but time for getting value ref doesn't change.
Testing
Existing tests must not fail.