Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: heap, allocs, block, and mutex profiles truncate stacks at 32 frames #43669

Open
aclements opened this issue Jan 13, 2021 · 8 comments
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Milestone

Comments

@aclements
Copy link
Member

The runtime currently (as of Go 1.15) limits recorded stacks of several profile types to 32 frames. This causes deep stacks to become unrooted in pprof call graphs, leading to confusing results and interfering with aggregate statistics up the call graph.

These are hard-coded array bounds in the public runtime API for StackRecord (used in BlockProfileRecord) and MemProfileRecord. Unfortunately, this means these are part of the Go 1 runtime API, so we can't just change them. If we were to increase them, this would impact memory for relatively little gain, since stacks typically aren't this deep. Finally, we probably need some reasonable bound on stack depth just to avoid blowing up profiles.

The "right" fix, in my opinion, is to do what we did with CPU profiles: deprecate the runtime package interface in favor of runtime/pprof, create a log-based back channel from runtime to runtime/pprof that can handle variable-sized stack traces, and perform accumulation in runtime/pprof. That's quite a bit of engineering for a fairly obscure problem, though it would also help with labeling non-CPU profiles (#23458).

We don't have plans to fix this right now. This is just a tracking issue.

@aclements aclements added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jan 13, 2021
@aclements aclements added this to the Unplanned milestone Jan 13, 2021
@prattmic
Copy link
Member

prattmic commented Jan 13, 2021

Note for clarity: CPU profiles also have a limit: 64 frames. However, the back channel interface to runtime/pprof means that we could change that limit easily without breaking the API, and the variable size interface avoid extra memory use unless frames are actually that large.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/572396 mentions this issue: runtime/pprof: increase mem profile stack depth to 128

@felixge
Copy link
Contributor

felixge commented Mar 20, 2024

I come across this problem every once in a while when looking at customer profiles and profiles from internal applications, especially for memory profiles.

Today I was looking at a live heap profile with > 50% of truncated stacks and got sufficiently annoyed to actually take another look at this issue and see if there is something that could be done that wouldn't involve a major overhaul of the memory profiler as discussed in this issue.

2024-03-20 Continuous Profiler - Explorer  Datadog at 17 41 54@2x

(note: we only show 3 frames for truncated frames b/c they can cause a lot of storage/perf problems in some cases)

I ended up submitting the idea I came up with as a new CL. It increases the stack size to 128 and deals with the MemProfileRecord API problem by embedding a reference to a bigger stack trace as a private struct field when needed. This shouldn't cause any Go 1 API compatibility issues, while fixing this problem.

Memory usage will only be increased for applications that have memory profiling enabled and experience problems with truncated stack traces.

PTAL and let me know what you think. If the change seems like a reasonable improvement, I can do the remaining work for mutex/block profiles as well to bring them in sync.

@felixge
Copy link
Contributor

felixge commented Mar 20, 2024

cc @prattmic @mknyszek

@simonswine
Copy link

simonswine commented Apr 3, 2024

I remember the first time I've encountered the stack depth limit in Go profiles and I found it quite confusing and obscure.

I wonder, if we should at the same time make it more explicit that truncation has happened. Maybe that could be done fairly simply by inserting a logical root like [truncated].

@moio
Copy link

moio commented Apr 10, 2024

At the very minimum, this phenomenon could be prominently documented - it took me a while to find out this issue, although it is obvious in retrospect looking at the flame graph!

@rsc
Copy link
Contributor

rsc commented May 8, 2024

The implementation in CL 572396 is a bit odd, in that it makes Stack allocate where it didn't before.

Given that I think we agree the right fix is to establish some other kind of runtime backchannel that avoids these structs entirely, I don't think the CL's approach is the right one.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/584356 mentions this issue: runtime: increase profiling stack depth to 128

gopherbot pushed a commit that referenced this issue May 21, 2024
The current stack depth limit for alloc, mutex, block, threadcreate and
goroutine profiles of 32 frequently leads to truncated stack traces in
production applications. Increase the limit to 128 which is the same
size used by the execution tracer.

Create internal/profilerecord to define variants of the runtime's
StackRecord, MemProfileRecord and BlockProfileRecord types that can hold
arbitrarily big stack traces. Implement internal profiling APIs based on
these new types and use them for creating protobuf profiles and to act
as shims for the public profiling APIs using the old types.

This will lead to an increase in memory usage for applications that
use the impacted profile types and have stack traces exceeding the
current limit of 32. Those applications will also experience a slight
increase in CPU usage, but this will hopefully soon be mitigated via CL
540476 and 533258 which introduce frame pointer unwinding for the
relevant profile types.

For #43669.

Change-Id: Ie53762e65d0f6295f5d4c7d3c87172d5a052164e
Reviewed-on: https://go-review.googlesource.com/c/go/+/572396
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Austin Clements <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made.
Projects
None yet
Development

No branches or pull requests

8 participants
@felixge @rsc @simonswine @moio @prattmic @aclements @gopherbot and others