
Possible memory leak in tracing-subscriber #515

Closed
rklaehn opened this issue Jan 8, 2020 · 7 comments
Labels
crate/subscriber Related to the `tracing-subscriber` crate kind/bug Something isn't working

Comments

@rklaehn

rklaehn commented Jan 8, 2020

Bug Report

Version

tracing-subscriber v0.2.0-alpha.2

Platform

Linux linux-box 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Description

There seems to be a memory leak, or at least something that looks like one, related to tracing-subscriber. Here is a screenshot of heaptrack for an app that heavily uses tracing.

[heaptrack screenshot: overall allocation growth]

Several of the growing wedges seem to be related to tracing, e.g.:

[three further heaptrack screenshots showing tracing-related allocation growth]

Is it possible that there is some per-span data that is not being properly cleaned up? Even if this is not a real memory leak but just something that will grow until there is memory pressure, it makes it very difficult to find the real memory leak.

I will remove all tracing spans again and check whether this makes a difference with respect to memory usage.

@davidbarsky
Member

I believe there was a memory leak and it might’ve been addressed in #514.

@hawkw
Member

hawkw commented Jan 8, 2020

I'm fairly certain that David is correct that this is caused by the two issues fixed in PR #514, both of which resulted in some closed spans not having their per-span data removed from the registry. @rklaehn, would you mind trying against the current master, now that #514 has merged? Hopefully you won't see out-of-control heap growth.

If that's not the case, it's possible something else is going on. There is also the fact that tracing-subscriber doesn't currently ever shrink the span slab and release unused capacity back to the allocator; however, if we're not leaking per-span data, this shouldn't be an actual leak — instead, once we allocate enough storage for the maximum number of concurrent spans the system will produce, we just stop churning per-span allocations. We might want to introduce safeguards here, though...this is why tracing-subscriber 0.2 is still in alpha.
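To illustrate why a slab that reuses freed slots plateaus rather than leaks, here is a simplified sketch (this is not tracing-subscriber's actual slab, which is the `sharded-slab` crate; all names here are made up for illustration). Freed slots go on a free list and are handed out again, so the backing storage only grows until it reaches the peak number of concurrently live entries:

```rust
// Simplified, non-concurrent sketch of a slot-reusing slab.
struct Slab<T> {
    slots: Vec<Option<T>>,
    free: Vec<usize>, // indices of vacated slots, reused before growing
}

impl<T> Slab<T> {
    fn new() -> Self {
        Slab { slots: Vec::new(), free: Vec::new() }
    }

    fn insert(&mut self, value: T) -> usize {
        match self.free.pop() {
            // Reuse a previously freed slot if one exists...
            Some(idx) => {
                self.slots[idx] = Some(value);
                idx
            }
            // ...otherwise grow the backing storage.
            None => {
                self.slots.push(Some(value));
                self.slots.len() - 1
            }
        }
    }

    fn remove(&mut self, idx: usize) -> Option<T> {
        let value = self.slots[idx].take();
        if value.is_some() {
            self.free.push(idx);
        }
        value
    }

    fn capacity(&self) -> usize {
        self.slots.len()
    }
}

fn main() {
    let mut slab = Slab::new();
    // Simulate many short-lived "spans" with at most 2 alive at once.
    for _ in 0..1000 {
        let a = slab.insert("span");
        let b = slab.insert("span");
        slab.remove(a);
        slab.remove(b);
    }
    // Capacity tracks peak concurrency (2), not total churn (2000).
    assert_eq!(slab.capacity(), 2);
}
```

In a heap profiler this shows up as a step up to the peak and then a flat line, whereas a genuine leak (closed spans never removed) keeps climbing.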

@rklaehn
Author

rklaehn commented Jan 9, 2020

Thanks for the quick response. I will try out the latest version, but I don't have time to do it today; it will be some time.

> There is also the fact that tracing-subscriber doesn't currently ever shrink the span slab and release unused capacity back to the allocator; however, if we're not leaking per-span data, this shouldn't be an actual leak — instead, once we allocate enough storage for the maximum number of concurrent spans the system will produce, we just stop churning per-span allocations.

I am currently on the hunt for an unrelated slow memory leak in a fairly complex application, so everything that slowly grows over time is annoying, even if it might stabilize at some point. There is no good way to distinguish this from a real leak when looking at heaptrack graphs.

Would it be possible to provide an initial size for the span slab, so I could just allocate something that should be enough for my purposes? Then I would not have this slowly growing thing...

What size are we talking about here? The per-span info shouldn't be that big, so that times the number of threads, times the worst-case span depth, times some fudge factor?
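That back-of-the-envelope estimate can be sketched as follows (every number here is a hypothetical placeholder, not a measured value from tracing-subscriber):

```rust
fn main() {
    // All figures below are made-up placeholders for illustration.
    let per_span_bytes = 512; // assumed per-span bookkeeping size
    let threads = 16;         // worker threads in the app
    let max_span_depth = 32;  // worst-case nesting of spans per thread
    let fudge_factor = 4;     // headroom for bursts

    let spans = threads * max_span_depth * fudge_factor;
    let estimated_bytes = per_span_bytes * spans;
    println!(
        "preallocate for ~{} spans (~{} KiB)",
        spans,
        estimated_bytes / 1024
    );
}
```

With these placeholder numbers that works out to 2048 spans, about 1 MiB of preallocated slab, which is why a fixed initial capacity seems attractive here.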

@hawkw
Member

hawkw commented Jan 9, 2020

> > There is also the fact that tracing-subscriber doesn't currently ever shrink the span slab and release unused capacity back to the allocator; however, if we're not leaking per-span data, this shouldn't be an actual leak — instead, once we allocate enough storage for the maximum number of concurrent spans the system will produce, we just stop churning per-span allocations.
>
> I am currently on the hunt for an unrelated slow memory leak in a fairly complex application, so everything that slowly grows over time is annoying, even if it might stabilize at some point. There is no good way to distinguish this from a real leak when looking at heaptrack graphs.
>
> Would it be possible to provide an initial size for the span slab, so I could just allocate something that should be enough for my purposes? Then I would not have this slowly growing thing...
>
> What size are we talking about here? The per-span info shouldn't be that big, so that times the number of threads, times the worst-case span depth, times some fudge factor?

Hmm...I think we could probably add a constructor that takes an initial capacity, and immediately allocates that much capacity every time a new thread tries to access the span slab. That's not a bad idea!

I would also like to add the option to eagerly release empty slab pages back to the allocator. We might allow configuring an amount of capacity over which the slab should start releasing pages, so we can still benefit from allocation reuse when the slab is "warmed up"...
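A hypothetical shape for that configuration might look like the sketch below. None of these names (`RegistryConfig`, `with_span_capacity`, `release_pages_above`) exist in tracing-subscriber; this is only a sketch of what the proposed knobs could look like:

```rust
// Hypothetical API sketch, not real tracing-subscriber code.
struct RegistryConfig {
    initial_span_capacity: usize,
    release_threshold: Option<usize>,
}

impl RegistryConfig {
    fn new() -> Self {
        RegistryConfig { initial_span_capacity: 0, release_threshold: None }
    }

    // Preallocate slab capacity up front, so the registry never grows
    // at runtime (assuming the estimate covers the real peak).
    fn with_span_capacity(mut self, spans: usize) -> Self {
        self.initial_span_capacity = spans;
        self
    }

    // Start returning empty slab pages to the allocator once total
    // capacity exceeds this many spans, keeping allocation reuse below
    // the threshold while capping worst-case growth above it.
    fn release_pages_above(mut self, spans: usize) -> Self {
        self.release_threshold = Some(spans);
        self
    }
}

fn main() {
    let config = RegistryConfig::new()
        .with_span_capacity(2048)
        .release_pages_above(4096);
    assert_eq!(config.initial_span_capacity, 2048);
    assert_eq!(config.release_threshold, Some(4096));
}
```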

@hawkw hawkw added kind/bug Something isn't working crate/subscriber Related to the `tracing-subscriber` crate labels Jan 10, 2020
@davidbarsky
Member

davidbarsky commented Jan 12, 2020

We published tracing-subscriber 0.2.0-alpha.4, which should address the issues you’re seeing.

Thanks so much for putting this library through its paces and being diligent in cutting tickets when you see issues. It's really hardened this library.

@hawkw
Member

hawkw commented Jan 27, 2020

@rklaehn have you had the chance to try tracing-subscriber 0.2.0-alpha.4 yet? Would love to know if you're still seeing surprising memory usage or if this issue can be closed.

@hawkw
Member

hawkw commented Feb 1, 2020

@rklaehn a quick update: we've found and fixed an additional memory leak in sharded-slab (see 3c35048); I strongly suspect that's what you're seeing. We've published 0.2.0-alpha.5 with a fix for that issue. If you have a chance, we'd love it if you could test that out in your application — "leak free" is definitely a blocker for 0.2.0 stable.
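For anyone following along who wants to try the fix, pinning the pre-release in Cargo.toml would look something like this (the version string is the one named above; pre-release versions must be requested explicitly):

```toml
[dependencies]
tracing-subscriber = "0.2.0-alpha.5"
```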

Thanks!
