exp/ingest/io: Skip storing live entries seen in the oldest bucket #2618
// 1. Ledger keys are unique within a single bucket.
// 2. This is the last bucket we process so there's no need to track
//    seen last entries in this bucket.
if oldestBucket {
I think you can apply the same logic to the `entry.Type == xdr.BucketEntryTypeDeadentry` case.
The oldest bucket does not contain any dead entries. @jonjove is it always true?
Do we need to call `msr.tempStore.Exist(h)` on the last bucket?
Yes, because it's possible that a live or dead entry for a given ledger key has already been seen in one of the previous buckets.
@bartekn It is always true that the oldest bucket does not contain any dead entries.
Confirm the change works correctly for the latest checkpoint using
@bartekn amazing job! This is such a good optimization. 🎉 🎉
cool. @bartekn you also need to smoke test this with older protocol versions (before 11), before "INIT" entries got added
Works correctly for pre-11 ledger: 20406335.
What
This commit updates `SingleLedgerStateReader` to skip updating a temp ledger key store when processing the oldest bucket. This lowered memory usage of `SingleLedgerStateReader` from 550MB to 325MB when processing recent pubnet ledgers.
Why
After checking memory usage of `SingleLedgerStateReader` and `BucketEntryType` distribution in public network buckets, I observed that the last bucket contains a large number of live entries (~4.3M), but there's no point in storing them in `tempSet` as seen because this is the last bucket being processed and ledger keys are unique in each bucket (confirmed by @jonjove).
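The reasoning above can be illustrated with a minimal, self-contained sketch. It is not the actual `SingleLedgerStateReader` code: the types `bucketEntry` and `entryType`, the functions `readBuckets` and `exampleBuckets`, and the in-memory map standing in for the temp store are all hypothetical names chosen for illustration. It assumes buckets are processed from newest to oldest and that ledger keys are unique within a bucket, as discussed in the review thread.

```go
package main

import "fmt"

// entryType is a hypothetical stand-in for the bucket entry types
// discussed in this PR (live vs. dead).
type entryType int

const (
	liveEntry entryType = iota
	deadEntry
)

// bucketEntry is a simplified, hypothetical bucket entry:
// a ledger key (as a string) plus its type.
type bucketEntry struct {
	key string
	typ entryType
}

// readBuckets walks buckets from newest (index 0) to oldest and returns
// the keys of the live entries that form the current state, together
// with the number of keys that ended up in the "seen" set.
// When skipOldest is true, keys from the oldest bucket are not stored.
func readBuckets(buckets [][]bucketEntry, skipOldest bool) ([]string, int) {
	seen := map[string]struct{}{} // stand-in for the temp ledger key store
	var state []string

	for i, bucket := range buckets {
		oldestBucket := i == len(buckets)-1
		for _, e := range bucket {
			// The existence check is still required in the oldest bucket:
			// a newer bucket may already hold a live or dead entry for this key.
			if _, ok := seen[e.key]; ok {
				continue
			}
			if e.typ == liveEntry {
				state = append(state, e.key)
			}
			// Keys are unique within a bucket and nothing is processed after
			// the oldest bucket, so remembering its keys serves no purpose.
			if skipOldest && oldestBucket {
				continue
			}
			seen[e.key] = struct{}{}
		}
	}
	return state, len(seen)
}

// exampleBuckets returns made-up data: a newer bucket followed by the oldest one.
func exampleBuckets() [][]bucketEntry {
	return [][]bucketEntry{
		{{"A", deadEntry}, {"B", liveEntry}},
		{{"A", liveEntry}, {"B", liveEntry}, {"C", liveEntry}, {"D", liveEntry}},
	}
}

func main() {
	state, stored := readBuckets(exampleBuckets(), false)
	fmt.Println(state, stored) // [B C D] 4

	state, stored = readBuckets(exampleBuckets(), true)
	fmt.Println(state, stored) // [B C D] 2
}
```

Both runs produce the same state, but the second keeps only the keys from the newer bucket in the set. In this toy data that is 2 keys instead of 4; the saving grows with the number of live entries in the oldest bucket.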