Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hash stored accounts in bg #16157

Merged
merged 17 commits into from
Mar 31, 2021
Merged

Conversation

jeffwashington
Copy link
Contributor

@jeffwashington jeffwashington commented Mar 26, 2021

Problem
While processing a slot, the same account can be written to many times.
Hashing accounts is not free and gets expensive with large data. Hashing an account that is not the final update is not usually necessary.

Summary of Changes
When an account is stored to the accounts_cache, don't hash the contents first - instead use None. When a caller requires a hash, and a hash hasn't been calculated, calculate it and store it. A background thread processes items that are added to the accounts_cache, calculates the hash and updates the cached item in accounts_cache.

Fixes #

@jeffwashington jeffwashington changed the title Cache 2hash Hash stored accounts in bg during replay Mar 26, 2021
@jeffwashington
Copy link
Contributor Author

jeffwashington commented Mar 26, 2021

this pr: (improvement is in store_us and overall time)
ledger processing timing ExecuteTimings load_us 10909296 execute_us 55929281 store_us 5,725,080 details ExecuteDetailsTimings serialize_us 10248310 create_vm_us 1681543 execute_us 7627294 deserialize_us 13642778 changed_account_count 57938 total_account_count 1113864 total_data_size 48123505204 data_size_changed 49975074
ledger processed in 34 seconds. 4571 MB allocated. root slot is 69585630 1 fork at 69585662 with 32 frozen banks

master:
ledger processing timing ExecuteTimings load_us 10516422 execute_us 54218813 store_us 21,934,006 details ExecuteDetailsTimings serialize_us 9357363 create_vm_us 1677496 execute_us 7424016 deserialize_us 12894874 changed_account_count 57938 total_account_count 1113864 total_data_size 48123505204 data_size_changed 49975074
ledger processed in 41 seconds. 5092 MB allocated. root slot is 69585630 1 fork at 69585662 with 32 frozen banks

@jeffwashington
Copy link
Contributor Author

first prototype: #15617

@codecov
Copy link

codecov bot commented Mar 29, 2021

Codecov Report

Merging #16157 (829231e) into master (527adbe) will increase coverage by 0.0%.
The diff coverage is 85.7%.

@@           Coverage Diff           @@
##           master   #16157   +/-   ##
=======================================
  Coverage    80.0%    80.0%           
=======================================
  Files         410      410           
  Lines      109506   109592   +86     
=======================================
+ Hits        87674    87761   +87     
+ Misses      21832    21831    -1     

@jeffwashington jeffwashington requested a review from carllin March 30, 2021 03:48
runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
@jeffwashington jeffwashington requested a review from carllin March 30, 2021 06:29
runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
LoadedAccount::Stored(stored_account_meta) => &stored_account_meta.hash,
LoadedAccount::Cached((_, cached_account)) => &cached_account.hash,
LoadedAccount::Stored(stored_account_meta) => *stored_account_meta.hash,
LoadedAccount::Cached((_, cached_account)) => cached_account.hash(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah clever, so if the hash isn't computed yet by the background service by this point, (for things like bank.freeze() that need the hash), it'll compute it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I am just implementing the required blocking, lazy behavior in a way that seems correct, consistent, performant. I think it fits Rust idioms sufficiently.

runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
@jeffwashington jeffwashington marked this pull request as ready for review March 31, 2021 14:28
@jeffwashington jeffwashington changed the title Hash stored accounts in bg during replay Hash stored accounts in bg Mar 31, 2021
runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
runtime/src/accounts_db.rs Outdated Show resolved Hide resolved
runtime/src/accounts_cache.rs Outdated Show resolved Hide resolved
runtime/src/accounts_cache.rs Outdated Show resolved Hide resolved
@jeffwashington jeffwashington requested a review from sakridge March 31, 2021 19:18
Copy link
Member

@sakridge sakridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good and super cool to have higher performance

@jeffwashington jeffwashington added the automerge Merge this Pull Request automatically once CI passes label Mar 31, 2021
@mergify
Copy link
Contributor

mergify bot commented Mar 31, 2021

automerge label removed due to a CI failure

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants