runtime/pprof: heap/allocs profile with legacy format yields different results from the profile with proto format #25096
Comments
What output do you see when using the -inuse_space flag with the legacy format? The new 'allocs' profile should be identical to the 'heap' profile in the legacy format.
They are similar but slightly different:
Neither is an accurate
Seems like that should be documented, although that is separate from the issue here.
What made you think alloc_space is not reporting alloc_space? The protobuf-based alloc profile reports 32809kB total. What is the heap sampling rate?
Years of looking at this particular profile. :) The total number might be right but the listed functions definitely are not. Also the results for those same commands are different if we ask for pb format. And I know from experience the pb format is correct. I’d paste output here but I’m on my phone.
Can you try collecting a profile with memprofilerate=1?
Attached are the profiles collected with memprofilerate=1. (tip rev ef53de8)
Proto format:

$ compilebench -alloc -memprofile=proto.p -run=Template -memprofilerate=1
BenchmarkTemplate 1 2048325873 ns/op 2412000000 user-ns/op 0 B/op 0 allocs/op
$ go tool pprof -alloc_space `go tool -n compile` proto.p
File: compile
Type: alloc_space
Time: May 29, 2018 at 2:10pm (EDT)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top15
Showing nodes accounting for 17077.06kB, 49.01% of 34844.14kB total
Dropped 798 nodes (cum <= 174.22kB)
Showing top 15 nodes out of 268
      flat  flat%   sum%        cum   cum%
 3138.25kB  9.01%  9.01%  3138.25kB  9.01%  cmd/compile/internal/gc.nodl
    1960kB  5.63% 14.63%     1960kB  5.63%  cmd/compile/internal/gc.init
 1718.83kB  4.93% 19.56%  2546.77kB  7.31%  cmd/compile/internal/ssa.(*regAllocState).init
 1565.16kB  4.49% 24.06%  1565.16kB  4.49%  cmd/compile/internal/gc.newnamel
 1443.92kB  4.14% 28.20%  1443.92kB  4.14%  cmd/compile/internal/types.New
 1404.48kB  4.03% 32.23%  1404.48kB  4.03%  cmd/compile/internal/ssa.liveValues
 1204.22kB  3.46% 35.69%  1758.28kB  5.05%  cmd/compile/internal/ssa.cse
 1010.50kB  2.90% 38.59%  1025.57kB  2.94%  cmd/compile/internal/ssa.schedule
  702.23kB  2.02% 40.60%      722kB  2.07%  cmd/compile/internal/ssa.(*regAllocState).computeLive
  586.05kB  1.68% 42.28%   586.05kB  1.68%  cmd/internal/obj.(*Link).LookupInit
  522.08kB  1.50% 43.78%  1714.98kB  4.92%  cmd/compile/internal/ssa.(*regAllocState).regalloc
  486.50kB  1.40% 45.18%   631.91kB  1.81%  cmd/internal/obj.(*LSym).WriteAddr
  483.39kB  1.39% 46.57%   483.39kB  1.39%  cmd/compile/internal/gc.scopePCs
  434.25kB  1.25% 47.81%   434.25kB  1.25%  cmd/internal/obj.(*LSym).Grow
  417.22kB  1.20% 49.01%   615.39kB  1.77%  cmd/compile/internal/gc.(*Liveness).epilogue

Legacy format:

$ compilebench -alloc -memprofile=legacy.p -run=Template -memprofilerate=1
BenchmarkTemplate 1 16227441634 ns/op 16956000000 user-ns/op 36131488 B/op 346778 allocs/op
$ go tool pprof -alloc_space `go tool -n compile` legacy.p
File: compile
Type: alloc_space
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top15
Showing nodes accounting for 18882.81kB, 54.19% of 34845.70kB total
Dropped 826 nodes (cum <= 174.23kB)
Showing top 15 nodes out of 281
      flat  flat%   sum%        cum   cum%
 3138.25kB  9.01%  9.01%  3138.25kB  9.01%  cmd/compile/internal/gc.nodl
 2214.36kB  6.35% 15.36%  2214.36kB  6.35%  runtime.makeBucketArray
    1960kB  5.62% 20.99%     1960kB  5.62%  cmd/compile/internal/gc.init
 1709.69kB  4.91% 25.89%  2546.75kB  7.31%  cmd/compile/internal/ssa.(*regAllocState).init
 1565.16kB  4.49% 30.38%  1565.16kB  4.49%  cmd/compile/internal/gc.newnamel
 1445.42kB  4.15% 34.53%  1445.42kB  4.15%  cmd/compile/internal/types.New
 1404.48kB  4.03% 38.56%  1404.48kB  4.03%  cmd/compile/internal/ssa.liveValues
 1121.23kB  3.22% 41.78%  1759.83kB  5.05%  cmd/compile/internal/ssa.cse
 1010.50kB  2.90% 44.68%  1025.57kB  2.94%  cmd/compile/internal/ssa.schedule
  702.23kB  2.02% 46.70%   721.98kB  2.07%  cmd/compile/internal/ssa.(*regAllocState).computeLive
  580.56kB  1.67% 48.36%   580.56kB  1.67%  runtime.rawstringtmp
  545.44kB  1.57% 49.93%   886.59kB  2.54%  runtime.mapassign_fast64ptr
  515.61kB  1.48% 51.41%  1715.38kB  4.92%  cmd/compile/internal/ssa.(*regAllocState).regalloc
  486.50kB  1.40% 52.80%   631.91kB  1.81%  cmd/internal/obj.(*LSym).WriteAddr
  483.39kB  1.39% 54.19%   483.39kB  1.39%  cmd/compile/internal/gc.scopePCs

These look slightly different, but if we let pprof hide 'runtime' functions, the legacy profile's top15 looks the same as the output from the proto format.

$ go tool pprof -alloc_space `go tool -n compile` legacy.p
File: compile
Type: alloc_space
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) hide=runtime.*
(pprof) top15
Active filters:
   hide=runtime.*
Showing nodes accounting for 17078.84kB, 49.01% of 34845.70kB total
Dropped 782 nodes (cum <= 174.23kB)
Showing top 15 nodes out of 267
      flat  flat%   sum%        cum   cum%
 3138.25kB  9.01%  9.01%  3138.25kB  9.01%  cmd/compile/internal/gc.nodl
    1960kB  5.62% 14.63%     1960kB  5.62%  cmd/compile/internal/gc.init
 1718.83kB  4.93% 19.56%  2546.75kB  7.31%  cmd/compile/internal/ssa.(*regAllocState).init
 1565.16kB  4.49% 24.06%  1565.16kB  4.49%  cmd/compile/internal/gc.newnamel
 1445.42kB  4.15% 28.20%  1445.42kB  4.15%  cmd/compile/internal/types.New
 1404.48kB  4.03% 32.23%  1404.48kB  4.03%  cmd/compile/internal/ssa.liveValues
 1205.77kB  3.46% 35.69%  1759.83kB  5.05%  cmd/compile/internal/ssa.cse
 1010.50kB  2.90% 38.59%  1025.57kB  2.94%  cmd/compile/internal/ssa.schedule
  702.23kB  2.02% 40.61%   721.98kB  2.07%  cmd/compile/internal/ssa.(*regAllocState).computeLive
  584.83kB  1.68% 42.29%   584.83kB  1.68%  cmd/internal/obj.(*Link).LookupInit
  522.08kB  1.50% 43.79%  1715.38kB  4.92%  cmd/compile/internal/ssa.(*regAllocState).regalloc
  486.50kB  1.40% 45.18%   631.91kB  1.81%  cmd/internal/obj.(*LSym).WriteAddr
  483.39kB  1.39% 46.57%   483.39kB  1.39%  cmd/compile/internal/gc.scopePCs
  434.22kB  1.25% 47.82%   434.22kB  1.25%  cmd/internal/obj.(*LSym).Grow
  417.20kB  1.20% 49.01%   615.27kB  1.77%  cmd/compile/internal/gc.(*Liveness).epilogue

So, this seems like a bug caused by differences in legacy/proto format profile handling (cc @aalexand)
@hyangah Is the bug here that the legacy format does not hide the runtime functions like the proto format does, or am I misreading the whole thing?
@aalexand I don't know if that's the whole story of the issue Josh sees, but making the legacy profile handler hide the runtime functions will help in investigating and verifying the original issue. FYI, https://github.com/golang/go/blob/master/src/runtime/pprof/protomem.go#L38 is where runtime functions are skipped.
This CL fixes a long-standing bug that prevented pprof from recognizing legacy heap profiles produced by Go. Go reports four types of samples at once, so the profile includes alloc_objects, alloc_space, inuse_objects, and inuse_space. The bug caused pprof to misclassify Go heap profile data and prevented selection of the correct filtering/pruning patterns in analysis.

Update golang/go#25096

Tested with the profile samples included in golang.org/issues/25096 (pprof hides the runtime functions with this change)
Change https://golang.org/cl/115295 mentions this issue:
This includes changes in pprof to support
- the new -diff_base flag
- fix for a bug in handling of legacy Go heap profiles

Update #25096

Change-Id: I826ac9244f31cc2c4415388c44a0cbe77303e460
Reviewed-on: https://go-review.googlesource.com/115295
Run-TryBot: Hyang-Ah Hana Kim
TryBot-Result: Gobot Gobot
Reviewed-by: Brad Fitzpatrick
With the newly built pprof, I verified that the profiles in the legacy format and the proto format report almost identical results if we collect profiles using -memprofilerate=1. With the default memprofile sampling rate, however, the results look different. They also differ from the results with -memprofilerate=1, and the order in the top15 list varies a lot. This looks to me like error from sampling-based estimation, not the bug the original report suggested. @josharian, is there anything you want to check before closing this issue? Improving the accuracy of sampling is one thing I can think of, but it should be tracked separately (independent of the go1.11 release).
Thanks! I’ll take a look as soon as I can.
@hyangah the new output seems much more plausible. Thanks! And glancing over the diff, mishandling of the -alloc_space flag strikes me as exactly the sort of thing that would have explained what I was seeing.
Indeed. I've filed #25653. I'll close this now. Thanks again!
Reproduce:
Though the pprof output says "alloc_space" at the top, this looks conspicuously like "inuse_space" to me.
If you change writeLegacyFormat in cmd/compile/internal/gc/util.go from 1 to 0, so that it writes using the pb format, the results look correct again:
The choice of legacy format is so that compilebench can parse the stats dumped at the top; see e8d5989.
I discovered all this because I wanted to switch the compiler to use the new "allocs" profile and was confused that it appeared not to work. :)
We could presumably change compilebench to work somehow with the new pprof format, but depending on the root cause this bug might also impact others. Is this a runtime/pprof bug? A cmd/pprof bug?
cc @hyangah