proxy RSS grows with traffic #3998
Closed
olix0r opened this issue on Jan 29, 2020 · 1 comment · Fixed by linkerd/linkerd2-proxy#418 or linkerd/linkerd2-proxy#423
olix0r changed the title from "proxy RSS grows until being OOMKilled" to "proxy RSS grows with traffic" on Jan 29, 2020
olix0r added a commit to linkerd/linkerd2-proxy that referenced this issue on Jan 31, 2020
linkerd/linkerd2#3998 describes an issue where the proxy's memory grows with traffic. After testing, we've identified that this is caused by logging emitted by `tracing`.

07667b8 upgraded the `tracing-subscriber` crate, which reduced memory pressure; however, we continue to observe heap usage grow in large leaps when running with the default log level `linkerd=info,warn`. This appears to be due to span creation, which occurs on every new connection.

By changing uses of `info_span!` to `debug_span!`, we can avoid this allocation path at the expense of losing contextual logging. This seems like a suitable tradeoff until we can address the underlying issues in `tracing`. I've tested this overnight and memory usage remains effectively flat.
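To make the tradeoff in this commit message concrete, here is a minimal std-only sketch of why demoting a per-connection span from `info` to `debug` avoids the allocation when the filter is set to `info`. It is a toy model, not the `tracing` or `tracing-subscriber` API: `Level`, `ToyRegistry`, `new_span`, and `spans_allocated` are hypothetical names invented for illustration, and the real crates store span data quite differently.

```rust
// Toy model only: these types are hypothetical and are NOT part of `tracing`.

/// Verbosity levels, declared from least to most verbose so the derived
/// ordering gives Warn < Info < Debug.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Level {
    Warn,
    Info,
    Debug,
}

/// Stand-in for a subscriber that allocates storage for every enabled span.
struct ToyRegistry {
    max_level: Level,
    spans: Vec<String>,
}

impl ToyRegistry {
    fn new(max_level: Level) -> Self {
        ToyRegistry { max_level, spans: Vec::new() }
    }

    /// Creates a span only if its level is enabled by the filter.
    /// A disabled span returns early and allocates nothing.
    fn new_span(&mut self, level: Level, name: &str) -> Option<usize> {
        if level > self.max_level {
            return None;
        }
        self.spans.push(name.to_string());
        Some(self.spans.len() - 1)
    }
}

/// Simulates `connections` new connections, each creating one span at
/// `span_level`, under a filter of `Level::Info`. Returns how many spans
/// ended up with allocated storage.
fn spans_allocated(span_level: Level, connections: usize) -> usize {
    let mut registry = ToyRegistry::new(Level::Info);
    for _ in 0..connections {
        let _ = registry.new_span(span_level, "accept");
    }
    registry.spans.len()
}
```

In this model, `spans_allocated(Level::Info, 100)` allocates one entry per connection, while `spans_allocated(Level::Debug, 100)` allocates none. The cost, as the commit message notes, is that the span's context is no longer attached to log lines at the default level.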
One source of RSS growth with TCP connections only is probably #4006.
hawkw added a commit to linkerd/linkerd2-proxy that referenced this issue on Feb 4, 2020
Version 0.0.7 of `sharded-slab` contains a bug where, when the `remove` method is called with the index of a slot that is not being accessed concurrently, the slot is emptied but **not** placed on the free list. This issue meant that, under `tracing-subscriber`'s usage pattern, where slab entries are almost always uncontended when reused, allocated slab pages are almost never reused, resulting in unbounded slab growth over time (i.e. a memory leak).

This commit updates `tracing-subscriber` to version 0.2.0-alpha.6, which in turn bumps the `sharded-slab` dependency to v0.0.8, which includes commit hawkw/sharded-slab@dfdd7ae. That commit fixes this bug.

I've empirically verified that, after running `linkerd2-proxy` under load with a global `trace` filter that enables a *lot* of spans, heap usage remains stable, and the characteristic stair-step heap growth pattern of doubling slab allocations doesn't occur. This indicates that freed slots are actually being reused and that, once fully warmed up, the slab will only grow when the number of active spans in the system increases.

![mem_plot](https://user-images.githubusercontent.com/2796466/73581369-cd859900-443d-11ea-8522-abeace03d745.png)

Closes linkerd/linkerd2#3998

Signed-off-by: Eliza Weisman <[email protected]>
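The failure mode described in this commit message can be sketched with a toy, single-threaded slab. This is not the `sharded-slab` implementation, which is concurrent and page-based: `ToySlab`, its `recycle` flag, and `churn` are hypothetical names, and the flag exists only to switch between the buggy behavior (emptied slots never reach the free list) and the fixed one.

```rust
// Toy model only: NOT the `sharded-slab` implementation.

struct ToySlab<T> {
    slots: Vec<Option<T>>,
    /// Indices of emptied slots that `insert` may reuse.
    free: Vec<usize>,
    /// `false` models the bug: `remove` empties a slot without recording it
    /// on the free list, so the slot can never be reused.
    recycle: bool,
}

impl<T> ToySlab<T> {
    fn new(recycle: bool) -> Self {
        ToySlab { slots: Vec::new(), free: Vec::new(), recycle }
    }

    /// Stores `value`, reusing a free slot if one exists and growing the
    /// slab otherwise. Returns the slot index.
    fn insert(&mut self, value: T) -> usize {
        if let Some(idx) = self.free.pop() {
            self.slots[idx] = Some(value);
            idx
        } else {
            self.slots.push(Some(value));
            self.slots.len() - 1
        }
    }

    /// Empties the slot at `idx`. Only the fixed variant pushes the index
    /// onto the free list.
    fn remove(&mut self, idx: usize) -> Option<T> {
        let value = self.slots.get_mut(idx)?.take();
        if value.is_some() && self.recycle {
            self.free.push(idx);
        }
        value
    }

    /// Number of slots ever allocated, occupied or not.
    fn slot_count(&self) -> usize {
        self.slots.len()
    }
}

/// Inserts and immediately removes one entry per round, like a short-lived
/// span, and returns how many slots the slab allocated in total.
fn churn(recycle: bool, rounds: usize) -> usize {
    let mut slab = ToySlab::new(recycle);
    for i in 0..rounds {
        let idx = slab.insert(i);
        slab.remove(idx);
    }
    slab.slot_count()
}
```

With recycling enabled, `churn(true, 1000)` keeps reusing a single slot; with it disabled, `churn(false, 1000)` allocates a fresh slot every round even though at most one entry is ever live. That is the same shape as the leak described above, where the slab should only grow with the number of concurrently active spans.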
On edge-20.1.4, proxy RSS grows with both HTTP and TCP load.
Similar behavior is observed on more recent branches that change buffering/backpressure behavior, so we should probably continue to debug this on master.
I've put together a small repro.