backfill: OOM from AddSSTable when adding SSTs that span many ranges #36769

Closed
thoszhang opened this issue Apr 11, 2019 · 0 comments · Fixed by #36765

During an index backfill, if bulk.AddSSTable tries to add an SST that spans many ranges (and the range descriptor cache isn't populated with the ranges overlapping that span), the request fails and AddSSTable recursively calls itself with smaller SSTs, split at the first range boundary it discovers. When this happens, we can run out of memory, because the SST byte arrays allocated at each level of the recursion are not freed until the whole recursive call returns.
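To make the memory pattern concrete, here is a minimal Go sketch of the behavior described above. It is a toy model, not the CockroachDB code: `tracker`, `addRecursive`, and `peakRecursive` are hypothetical names, every range is assumed to hold one unit of data, and each split is assumed to produce a one-range left piece plus the remainder.

```go
package main

import "fmt"

// tracker counts bytes held by live SST buffers (a stand-in for the Go heap).
// Hypothetical helper for this sketch only.
type tracker struct{ live, peak int }

func (t *tracker) alloc(n int) {
	t.live += n
	if t.live > t.peak {
		t.peak = t.live
	}
}

func (t *tracker) free(n int) { t.live -= n }

// addRecursive models the failure mode: an SST of `size` units spans `size`
// ranges, so it is split at the first range boundary and each piece is retried.
// The split buffers stay allocated until both recursive calls return.
func addRecursive(t *tracker, size int) {
	if size <= 1 {
		return // fits in a single range: the add succeeds
	}
	left, right := 1, size-1
	t.alloc(left + right) // two freshly built SSTs
	addRecursive(t, left)
	addRecursive(t, right)
	t.free(left + right) // released only on the way back up
}

// peakRecursive returns the peak live bytes for an SST spanning n ranges.
func peakRecursive(n int) int {
	t := &tracker{}
	t.alloc(n) // the original SST
	addRecursive(t, n)
	t.free(n)
	return t.peak
}

func main() {
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("ranges=%d peak=%d\n", n, peakRecursive(n))
	}
}
```

Under these assumptions the peak works out to n + n(n+1)/2 - 1 units for an SST spanning n ranges, so it grows quadratically with the number of ranges spanned.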

These are from heap profiles for a node that crashed when I rolled back a CANCEL JOB (causing the backfiller to generate SSTs that span many nonempty ranges):

[Image: heap profile screenshot, 2019-04-10 3:20:35 PM]

[Image: heap profile screenshot, 2019-04-10 3:21:50 PM]

craig bot pushed a commit that referenced this issue Apr 14, 2019
36765: bulk: change AddSSTable to not be recursive r=vivekmenezes a=vivekmenezes

AddSSTable was recursive to deal with range splits.
Unfortunately, each recursive call would create new SSTs
without freeing the older ones, so memory use grew
quadratically. We've seen memory buildup on the order
of GBs due to this recursion.

fixes #36769
fixes #36381

Release note: None

Co-authored-by: Vivek Menezes <[email protected]>
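As a rough illustration of why removing the recursion helps, the following Go sketch processes the same kind of oversized SST in a loop and releases each buffer as soon as it has been split or added. This is not the code from #36765: `tracker` and `peakIterative` are hypothetical names, every range is assumed to hold one unit of data, and each split is assumed to peel off a one-range piece from the front.

```go
package main

import "fmt"

// tracker counts bytes held by live SST buffers (a stand-in for the Go heap).
// Hypothetical helper for this sketch only.
type tracker struct{ live, peak int }

func (t *tracker) alloc(n int) {
	t.live += n
	if t.live > t.peak {
		t.peak = t.live
	}
}

func (t *tracker) free(n int) { t.live -= n }

// peakIterative returns the peak live bytes when an SST spanning n ranges is
// split in a loop, dropping each buffer once it is no longer needed.
func peakIterative(n int) int {
	t := &tracker{}
	t.alloc(n) // the original SST
	remaining := n
	for remaining > 1 {
		left, right := 1, remaining-1
		t.alloc(left + right) // build the two pieces
		t.free(remaining)     // release the buffer that was just split
		t.free(left)          // left piece fits one range: add it, then release it
		remaining = right
	}
	t.free(remaining) // the last piece fits a single range
	return t.peak
}

func main() {
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("ranges=%d peak=%d\n", n, peakIterative(n))
	}
}
```

In this model at most two generations of buffers are alive at once, so the peak is 2n units for n greater than 1, which is linear rather than quadratic in the number of ranges spanned.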
@craig craig bot closed this as completed in #36765 Apr 14, 2019