tracing: guardrail against memory leaks #59370
Conversation
Reviewed 1 of 1 files at r1, 3 of 3 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @tbg)
pkg/util/tracing/crdbspan.go, line 329 at r2 (raw file):
// Only record the child if the parent still has room.
if len(s.mu.recording.children) < maxChildrenPerSpan {
	s.mu.recording.children = append(s.mu.recording.children, child)
Would it make sense to discard the oldest recording instead? So we cap the number of entries but let traces expose the most recent activity?
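For context, here is a minimal sketch of the two capping strategies in play: what the PR does (stop registering once the cap is hit) versus what this comment suggests (evict the oldest entry so the trace reflects recent activity). The types and names are simplified stand-ins, not the actual crdbspan.go code.

package main

import "fmt"

const maxChildrenPerSpan = 1000 // same cap as the PR; value only illustrative here

type span struct{ name string }

// addChildDropNewest is what the guardrail in this PR does: once the cap is
// reached, new children are simply not registered.
func addChildDropNewest(children []*span, child *span) []*span {
	if len(children) < maxChildrenPerSpan {
		return append(children, child)
	}
	return children // silently dropped
}

// addChildEvictOldest is the alternative raised here: keep the cap, but shift
// out the oldest entry so the trace exposes the most recent activity.
// (A real implementation would likely use a ring buffer rather than re-slicing.)
func addChildEvictOldest(children []*span, child *span) []*span {
	if len(children) >= maxChildrenPerSpan {
		children = children[1:] // drop the oldest registered child
	}
	return append(children, child)
}

func main() {
	var a, b []*span
	for i := 0; i < maxChildrenPerSpan+5; i++ {
		c := &span{name: fmt.Sprintf("child-%d", i)}
		a = addChildDropNewest(a, c)
		b = addChildEvictOldest(b, c)
	}
	fmt.Println(len(a), a[len(a)-1].name) // 1000 child-999: newest arrivals were dropped
	fmt.Println(len(b), b[len(b)-1].name) // 1000 child-1004: oldest entries were evicted
}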
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)
pkg/util/tracing/crdbspan.go, line 329 at r2 (raw file):
Previously, knz (kena) wrote…
Would it make sense to discard the oldest recording instead? So we cap the number of entries but let traces expose the most recent activity?
I would gently push back on trying to do anything fancier in this PR. This is a guardrail, and unfortunately we are dealing with a number of release blockers right now that are a result of this issue. So let's get this PR merged.
In a wider sense, long-running or otherwise large spans are always going to be problematic with our current "collect the trace at the gateway" model. In my view, it currently isn't that useful to focus on making large/long spans work very well. It's more about damage control and detection.
Oh, I see the context now. Sure, go ahead.

Although: can you perhaps replace the last entry with a synthetic one that tells the viewer of the trace it was truncated? This will avoid head-scratching.
I'm limiting the number of child spans, not messages (we already cap those too), so there isn't a good place to add such a message. There also wouldn't really be any observability for it unless you had an external tracer set up and were actively searching for it.
No objection to merging, though - my review approval from earlier stands.
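For what it's worth, the truncation-marker idea would look roughly like this; purely a sketch of the declined suggestion, reusing simplified stand-in types rather than the actual crdbspan.go code.

package tracingsketch

const maxChildrenPerSpan = 1000 // illustrative; matches the cap this PR introduces

type span struct{ name string }

// addChild sketches the suggestion above: keep the cap, but when the final
// slot is about to be used, record a synthetic child so a trace viewer can
// tell the list was truncated rather than silently incomplete.
func addChild(children []*span, child *span) []*span {
	switch {
	case len(children) < maxChildrenPerSpan-1:
		return append(children, child)
	case len(children) == maxChildrenPerSpan-1:
		return append(children, &span{name: "...children truncated..."})
	default:
		return children // cap (including the marker) already reached
	}
}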
We should add some test-build-only infrastructure to detect these leaks (similar to the goroutine leak test). Not super sure how, but we could brainstorm.
We were discussing capping the number of structured events stored within a Span, and then separately introducing a limit on the tracing registry as a whole (possibly doing both through something like #59310). Something the testing harness could do then is assert against some memory usage limit? I don't know that we would've caught this issue though; I think it would only surface for long-running clusters.
Yes, this is sort of hard to find in unit tests. It's akin to "asserting" that we don't leak goroutines: you need to know how many you are expecting, and even then you will only catch what you exercise in tests. I think we should (and, independently of this, will need to anyway) make the memory management for tracing robust enough. We already make sure that no trace spans leak in every test that uses TestCluster (there are 1-2 exceptions, and those are understood & have issues filed).

The issue at hand is easy to find in a sense, because it materializes under basically any write workload with replication. So any long-running test would have caught it, via an OOM. This is further evidence that just about any long-running test would be helpful; roachtest doesn't really let us do those. CC canary deployments fill the gap to a small extent (is there even workload on those?).
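To make the goroutine-leak-test analogy concrete, the kind of teardown check being floated might look roughly like this. The registry interface here is a hypothetical stand-in for whatever view of currently open spans the tracer exposes; the real API may differ.

package tracingtest

import "testing"

// spanInfo and spanRegistry are hypothetical stand-ins, not the actual
// pkg/util/tracing interfaces.
type spanInfo struct {
	Operation   string
	NumChildren int
}

type spanRegistry interface {
	// VisitSpans calls fn for every span still registered with the tracer.
	VisitSpans(fn func(spanInfo))
}

// failOnLeakedSpans sketches the teardown assertion discussed above: after a
// test has shut everything down, no spans should remain registered, and in
// particular none should have accumulated children up to the cap.
func failOnLeakedSpans(t *testing.T, reg spanRegistry, maxChildren int) {
	t.Helper()
	reg.VisitSpans(func(s spanInfo) {
		t.Errorf("span %q still open at teardown (%d children)", s.Operation, s.NumChildren)
		if s.NumChildren >= maxChildren {
			t.Errorf("span %q reached the child cap; this looks like the leak the guardrail papers over", s.Operation)
		}
	})
}

As noted above, a check like this only catches what tests actually exercise, which is why the discussion drifts toward long-running tests.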
Force-pushed from 216052b to 4b1a643.
bors p=99

Build failed:
In cockroachdb#59347 we saw an OOM due to more and more children being registered with a long-running trace Span (held open by a raft scheduler worker goroutine). This commit adds a simple guardrail: no more than 1000 spans can be registered directly with a single parent; any in excess are simply not registered at all.

Fixes cockroachdb#59347.

Release note: None
Ugh, lots of tests relied on behavior that was problematic. Going to rip the cleanups out and go for the minimal fix.

bors p=99

Build succeeded:
pkg/util/tracing/crdbspan.go, line 329 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…

What do you think about asserting in crdb_test builds if we reach the limit?
I tried roughly that in TestMonotonicIncrements and it didn't fire, because it takes a significantly higher number of operations than the test performs (I had to run kv0 on a three-node cluster for ca. 100s to trigger the limit). What we should reasonably do is add an on-teardown check in roachtest that verifies we don't have any long-open spans (via #55733) with children attached. I'll x-ref this issue to that.
We were previously propagating spans across RPC boundaries only when they were verbose. We now propagate any span (except the noop span) regardless of current verbosity. This ensures that SQL (which always creates real, non-verbose spans by default) can trace the entirety of its operations. This enables the background collection of contention metadata in 21.1 and, more generally, the idea of always-on tracing.

Note that we had previously considered always-on tracing to mean avoiding the noop span altogether; due to memory retention issues[1][2] encountered, we are taking a step back from that idea for the 21.1 release.

Somehow creating additional trace spans seems to have fouled up some tracing-based tests. I can't boil that ocean here, so I filed a follow-up issue: cockroachdb#60672

[1]: cockroachdb#59370
[2]: cockroachdb#59431

Release note: None
59992: tracing: propagate non-recording spans across rpc boundaries r=knz,irfansharif,raduberinde a=tbg

We were previously propagating spans across RPC boundaries only when they were verbose. We now propagate any span (except the noop span) regardless of current verbosity. This ensures that SQL (which always creates real, non-verbose spans by default) can trace the entirety of its operations. This enables the background collection of contention metadata in 21.1 and, more generally, the idea of always-on tracing.

Note that we had previously considered always-on tracing to mean avoiding the noop span altogether; due to memory retention issues[1][2] encountered, we are taking a step back from that idea for the 21.1 release.

[1]: #59370
[2]: #59431

Release note: None

Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: Tobias Schottdorf <[email protected]>
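As a rough illustration of the propagation change described in that commit: before, only verbose spans crossed RPC boundaries; after, every real span does, and only the noop span is skipped. The names below are simplified stand-ins, not the actual pkg/util/tracing API.

package main

import "fmt"

type span struct {
	noop    bool // the tracer's shared no-op span carries no data
	verbose bool // verbose spans record everything; non-verbose real spans still exist
}

// shouldPropagateOld reflects the previous behavior: only verbose spans were
// sent across RPC boundaries.
func shouldPropagateOld(s *span) bool { return s.verbose }

// shouldPropagateNew reflects the new behavior: every real span crosses the
// boundary regardless of verbosity; only the noop span is skipped.
func shouldPropagateNew(s *span) bool { return !s.noop }

func main() {
	sqlSpan := &span{noop: false, verbose: false} // SQL's default real, non-verbose span
	fmt.Println(shouldPropagateOld(sqlSpan))      // false: the span stopped at the boundary
	fmt.Println(shouldPropagateNew(sqlSpan))      // true: always-on tracing sees the whole operation
}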
tracing: limit number of direct children
In #59347 we saw an OOM due to more and more children being registered
with a long-running trace Span (held open by a raft scheduler worker
goroutine). This commit adds a simple guardrail: no more than 1000 spans
can be registered directly with a single parent; any in excess are
simply not registered at all.
Fixes #59347.
Fixes #59355.
Fixes #59344.
Release note: None