xtrabackup engine can use tons of vttablet memory #5613
Were you able to see the mem usage of vttablet separately from xtrabackup (even though they're both in the same container) to know that the unexpected growth in usage is from vttablet and not xtrabackup itself? I noticed your flamegraph is only for vttablet. Also, do you have any time-series graph of memory usage during a backup? It would be helpful to see if there is any sawtooth pattern, which would indicate we might be doing a lot of allocations that get cleaned up periodically by the GC.
Does this mean that you found more stripes leads to more memory usage? If so, then it seems like the main problem is the big chunk in your flamegraph attributed to pgzip, since we create a separate compressor for each stripe. We do expose the underlying settings that are passed to pgzip as flags, so if this is the problem, it should be solvable with some tuning. The memory used by pgzip will scale like: stripes × block size × number of blocks.
If you want to increase …
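As a rough illustration of that scaling, here's a minimal sketch (assuming Go 1.16+; not Vitess code -- the stripe count and the pgzip block settings are illustrative stand-ins for the flags mentioned above) of how per-stripe pgzip writers multiply buffer memory:

```go
package main

import (
	"fmt"
	"io"

	"github.com/klauspost/pgzip"
)

func main() {
	const (
		stripes   = 8      // stand-in for the stripe-count flag
		blockSize = 250000 // stand-in for the pgzip block-size flag (bytes)
		blocks    = 2      // stand-in for the pgzip block-count flag
	)
	writers := make([]*pgzip.Writer, 0, stripes)
	for i := 0; i < stripes; i++ {
		zw, err := pgzip.NewWriterLevel(io.Discard, pgzip.BestSpeed)
		if err != nil {
			panic(err)
		}
		// Each writer keeps up to `blocks` compression blocks of
		// `blockSize` bytes in flight at once.
		if err := zw.SetConcurrency(blockSize, blocks); err != nil {
			panic(err)
		}
		writers = append(writers, zw)
	}
	// Rough lower bound on compressor buffer memory across all stripes.
	fmt.Printf("~%d bytes of pgzip buffers across %d stripes\n",
		stripes*blockSize*blocks, len(writers))
}
```

Since every stripe gets its own compressor, buffer memory grows linearly with the stripe count, which matches the observation that fewer stripes means less usage.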
Yes, when the OOM killer runs we see …
Sure -- here's one recent replica pod. Guess when xtrabackup started and when the OOM killer stepped in 😁
Yes, more stripes means more usage; we've avoided the problem on a few keyspaces by lowering the number of stripes. Unfortunately it's a bit of guess-and-check for where it lands in total usage.
This is a good suggestion, I can look into trying that in our environment. Admittedly I haven't dug super deep into the code here, but I'm generally a little puzzled that …
At a practical level, there should be a soft bound based on how many concurrent users of the pool there are. So the …
The pgzip lib claims that it blocks if its fixed compression buffers fill up. Do you see a way that a data-throughput mismatch could still cause memory bloat?
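To make the kind of soft bound described above concrete, here's a hypothetical channel-backed buffer pool (illustrative only, not vttablet's actual pool): total memory is capped at maxBuffers × bufSize, and callers block -- i.e. get backpressure -- once all buffers are in use:

```go
// Package bufpool is a hypothetical bounded buffer pool sketch.
package bufpool

// Pool hands out fixed-size buffers; at most cap(buffers) exist.
type Pool struct {
	buffers chan []byte
}

// New pre-allocates maxBuffers buffers of bufSize bytes each, fixing
// the pool's memory footprint at maxBuffers*bufSize.
func New(maxBuffers, bufSize int) *Pool {
	p := &Pool{buffers: make(chan []byte, maxBuffers)}
	for i := 0; i < maxBuffers; i++ {
		p.buffers <- make([]byte, bufSize)
	}
	return p
}

// Get blocks until a buffer is free, applying backpressure to callers.
func (p *Pool) Get() []byte { return <-p.buffers }

// Put returns a buffer, unblocking one waiting caller.
func (p *Pool) Put(b []byte) { p.buffers <- b }
```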
Thanks for jumping on that -- it might be a bit before we can test it out, but I'm keen to see what happens. I'll do a bit more digging into the pgzip lib as well; perhaps it is more limited than I thought. If so, then maybe we can get some estimate of the memory used, which would help us determine which settings to use.
Can you share what values you're using for those flags? Also, what was the total mem usage represented by that flamegraph at the top? I'm trying to get an idea of the absolute sizes of these things to sanity-check my theory. If the usage is way over the formula I proposed, you may be right that we need to fix pgzip to introduce true bounds.
For that one: …
Unfortunately it doesn't look like I have the original dataset for that particular flamegraph. The …
But I'll see if I can pull a flamegraph with actual sizes from a tablet running into the same issues shortly.
Hm, that definitely seems like a lot of the mem usage is not accounted for. Almost all of the heap space is released or idle. If the space usage is idle heap space, it might even be that the flamegraph won't show it. The graph might only be showing us objects that are still allocated. This might be a case of Go not releasing unused memory back to the OS. Usually that becomes a problem if you have tons of churn in objects being allocated and then collected, so it's possible that #5666 will help more than I thought it would. We'll have to wait and see.
If we suspect pgzip is using too much memory, we could try https://godoc.org/golang.org/x/build/pargzip.
We rolled out #5666 and it hasn't changed overall memory usage, though the flamegraph looks pretty different now. With: …
We get this flamegraph: … Total vttablet process RSS was at 1.845388 GB; the total heap usage there is 444MB, and the whole tree from … @acharis pointed out that our S3 uploader runs with a patch (!): https://gist.github.com/pH14/caee0c2be14e5db09c69e480be9f8a42 -- I don't think it affects this particular backup, since the file size is calculable and we're doing a striped backup, but I'm including it for full disclosure.
Given that the heap usage according to Go is only 444MB while the RSS is 1.8GB, it does feel like this might be a case of unused memory not being released to the OS. Since this memory is unused, it won't show up in a heap snapshot. A quick-and-dirty way to test whether this is the problem would be to patch vttablet to periodically call debug.FreeOSMemory(). This will degrade performance/latency but will make the RSS more representative of actual allocated objects, which will give us evidence as to whether we're on the right track. If you don't have a safe way to run that experimental patch, we may be able to find another way to test this, but I can't think of one off the top of my head.
We'll be able to run that safely; I'll try hacking that in -- any suggested interval? Every 30s?
If you don't care about the performance of this instance, you could do it every 1s just to be really sure that we're getting clean experimental results. If you're worried about this instance, 30s is a good compromise.
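For reference, a minimal sketch of that experimental patch (illustrative; the interval and the hook point into vttablet are assumptions):

```go
package main

import (
	"runtime/debug"
	"time"
)

// startFreeOSMemoryLoop forces a GC and returns as much memory as
// possible to the OS on every tick, making RSS track live objects
// more closely at the cost of some CPU and latency.
func startFreeOSMemoryLoop(interval time.Duration) (stop func()) {
	done := make(chan struct{})
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				debug.FreeOSMemory()
			case <-done:
				return
			}
		}
	}()
	return func() { close(done) }
}

func main() {
	stop := startFreeOSMemoryLoop(1 * time.Second) // or 30s as the compromise above
	defer stop()
	select {} // stand-in for the real vttablet workload
}
```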
Haven't forgotten about this... hoping to test this out later this week 🤞
@pH14 can you record the size of the database and the size of the largest table when you test this?
Phew, finally got this one running. Settings were (accidentally) slightly different from earlier runs, but they were the same between running with the per-second execution of debug.FreeOSMemory() and without it:
On the default run (without the patch): … And the heap flamegraph: … Notably, the backup also only took ~7 minutes, but it took twice as long for the vttablet memory to come back down; e.g. we can see the CPU came down 8 minutes before the memory usage did: … With the patch to run debug.FreeOSMemory() every second: … Note that the absolute memory usage is about halved from before. The flamegraph is of a similar shape, but with ~20% less overall usage (this is also sampling, though): …
Looks like forcing …
That's definitely interesting! I looked up how to see more about those not-in-use objects, and it looks like it should be as simple as grabbing …
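One way to grab that view, assuming the standard net/http/pprof handlers (the port here is illustrative; vttablet exposes its own debug endpoints):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on the default mux
)

func main() {
	// With this running, compare two sample indexes of the same heap profile:
	//   go tool pprof -sample_index=inuse_space http://localhost:6060/debug/pprof/heap
	//   go tool pprof -sample_index=alloc_space http://localhost:6060/debug/pprof/heap
	// alloc_space minus inuse_space approximates objects that were
	// allocated but are no longer live (the "not in use" portion).
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```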
^ I don't have all the data for that one handy at this point, but for posterity, we discussed that offline. It looked like …

Interestingly, we recently deployed our vttablets with go1.13, up from go1.11, and saw memory usage degrade even further. vttablet container memory usage after go1.13: … We can see it spike instantaneously, and Go never fully released memory back to the OS, even hours after the backup finished. It remained at nearly full usage (despite very low active heap usage) until we restarted the container. To revert to the previous behavior, we set GODEBUG=madvdontneed=1. Curious if any others have seen such behavior going from go1.11 to 1.12 or 1.13.
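For context: starting with go1.12, the runtime on Linux returns memory with MADV_FREE, which lets the kernel reclaim pages lazily, so RSS can stay high long after Go has released the memory; GODEBUG=madvdontneed=1 restores the older eager MADV_DONTNEED behavior. A small sketch for telling the two cases apart from inside the process (illustrative, not vttablet code): HeapIdle minus HeapReleased is memory Go is still holding, while HeapReleased has been given back but may still be counted in RSS under MADV_FREE.

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapInuse:    %d MiB (live objects)\n", m.HeapInuse>>20)
	fmt.Printf("HeapIdle:     %d MiB (spans with no objects)\n", m.HeapIdle>>20)
	fmt.Printf("HeapReleased: %d MiB (returned to the OS)\n", m.HeapReleased>>20)
	fmt.Printf("Sys:          %d MiB (total from the OS)\n", m.Sys>>20)
}
```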
We've been running on go1.13 and haven't seen this ourselves, but maybe our shards are small enough to not hit it? At this point, I think the most promising route is to try swapping in https://godoc.org/golang.org/x/build/pargzip. @pH14 If I made a branch with that swap, would you be willing to test it on a shard that exhibits this problem?
Yep! We could give that a whirl.
@pH14 Here's an experimental branch you could try: … In addition to memory usage, it would also be interesting to compare how long the backup takes with pargzip vs. pgzip.
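Not the actual branch, but a sketch of what the swap looks like at a call site (ChunkSize and Parallel values are illustrative; pargzip buffers fixed-size chunks and caps its own parallelism):

```go
package main

import (
	"io"
	"os"

	"golang.org/x/build/pargzip"
)

// compress gzips src into dst using pargzip. Fields must be set
// before the first Write.
func compress(dst io.Writer, src io.Reader) error {
	zw := pargzip.NewWriter(dst)
	zw.ChunkSize = 1 << 20 // bytes buffered per chunk
	zw.Parallel = 4        // max chunks compressed concurrently
	if _, err := io.Copy(zw, src); err != nil {
		zw.Close()
		return err
	}
	return zw.Close()
}

func main() {
	if err := compress(io.Discard, os.Stdin); err != nil {
		panic(err)
	}
}
```

Worst-case buffer memory is then roughly ChunkSize × Parallel per stream, which is easier to reason about than pgzip's behavior under backpressure.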
Using the `xtrabackup` engine can skyrocket vttablet memory usage. As a result, we often run into OOM kills on the `xtrabackup` child process it spawns, due to container memory limits.

We can use `xtrabackup_stripes` as a coarse-grained control for memory usage, but it'd be better to more precisely set an upper bound on how much memory the backup engine can use.

When looking at the heap flamegraph, we can see that, unsurprisingly, all of our allocations are going into various byte buffers along the way.