Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce cached flat storage deltas #8662

Merged
merged 11 commits into from
Mar 3, 2023
Merged

Conversation

Longarithm
Copy link
Member

@Longarithm Longarithm commented Mar 1, 2023

Introduce CachedFlatStateDelta and store it in memory instead of larger FlatStateDelta. There we store only hashes of trie keys instead of full trie keys. It still allows to serve read-only queries, but greatly reduces max size of deltas per block - from ~50 MB of raw entries to ~5 MB. Full motivation can be found here: #8436.

Based on https://ntietz.com/blog/rust-hashmap-overhead/, hashmap overhead is 70% per entry, so we estimate full entry size as 117.

Proof that it doesn't impact performance: node with full blocks and no finality for 50 blocks, node without finality lags. You can see that the first node has up to 50 cached blocks very regularly, but time for getting value ref doesn't change.

Testing

Existing tests must not fail.

@Longarithm Longarithm self-assigned this Mar 1, 2023
@Longarithm Longarithm requested a review from pugachAG March 2, 2023 17:04
@Longarithm Longarithm marked this pull request as ready for review March 2, 2023 17:04
@Longarithm Longarithm requested a review from a team as a code owner March 2, 2023 17:04
@Longarithm Longarithm requested a review from nikurt March 2, 2023 17:04
@Longarithm Longarithm changed the title draft: introduce cached flat storage deltas feat: introduce cached flat storage deltas Mar 2, 2023
/// Merge two deltas. Values from `other` should override values from `self`.
pub fn merge(&mut self, other: &Self) {
pub fn merge(&mut self, other: Self) {
self.0.extend(other.0.iter().map(|(k, v)| (k.clone(), v.clone())))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we consume other here, so I guess clone() can be avoided

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, could be nice. But there is no way to extract values from container without mut, and having mut here feels wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure what you mean here, wouldn't self.0.extend(other.0.into_iter()) work?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Um, right.


/// Size of all entries. May be changed if we implement inlining of `ValueRef`s.
fn total_size(&self) -> u64 {
(self.0.len() as u64) * Self::ENTRY_SIZE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to be a bit more precise here. std HashMap is based on Swiss Table. In my understanding it stores all the data in flat array + a couple of bits of "control data" per entry (that is how I understood it from the talk). So in our case I suggest using self.0.capacity() * (sizeof<Key> + sizeof<Value>) and ignore control data overhead to keep it simple.

impl CachedFlatStateDelta {
/// Assumed size in bytes for the cache entry. Estimated as 69 bytes for `CryptoHash` and `Option<ValueRef>` +
/// 70% hashmap overhead per entry ~= 117 bytes.
const ENTRY_SIZE: u64 = 117;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid hardcoding such values, in my understanding rust allows compile time computations using size_of: std::mem::size_of::< CryptoHash>() + std::mem::size_of::<Option<ValueRef>>()

@Longarithm Longarithm requested a review from pugachAG March 3, 2023 12:09
Copy link
Contributor

@pugachAG pugachAG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@Longarithm Longarithm merged commit 37793cf into master Mar 3, 2023
@Longarithm Longarithm deleted the fs-cached-deltas branch March 3, 2023 14:38
nikurt pushed a commit to nikurt/nearcore that referenced this pull request Mar 6, 2023
nikurt pushed a commit to nikurt/nearcore that referenced this pull request Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants