Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/pprof: collecting mutex or block profiles add ~100ms of wall time #63078

Closed
rhysh opened this issue Sep 19, 2023 · 5 comments
Closed
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@rhysh
Copy link
Contributor

rhysh commented Sep 19, 2023

What version of Go are you using (go version)?

go1.21 version
go version go1.21.1 darwin/arm64

Does this issue reproduce with the latest release?

Yes, Go 1.21 is the latest release series.

What operating system and processor architecture are you using (go env)?

go env Output
$ go1.21 env
GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/rhys/Library/Caches/go-build'
GOENV='/Users/rhys/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/rhys/go/pkg/mod'
GONOPROXY='*'
GONOSUMDB='*'
GOOS='darwin'
GOPATH='/Users/rhys/go'
GOPRIVATE='*'
GOPROXY='direct'
GOROOT='/Users/rhys/go/version/go1.21'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/Users/rhys/go/version/go1.21/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.21.1'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/lf/n4ry0qv5639_3z0nhyhy3z7c0000gs/T/go-build1797745903=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

$ cat slowprof.go
package main

import (
	"flag"
	"io"
	"log"
	"runtime"
	"runtime/pprof"
)

func main() {
	cpu := flag.Bool("cpu", false, "collect a CPU profile")
	mutex := flag.Int("mutex", -1, "collect a mutex profile with specified debug level")
	block := flag.Int("block", -1, "collect a block profile with specified debug level")
	flag.Parse()

	if *cpu {
		err := pprof.StartCPUProfile(io.Discard)
		if err != nil {
			log.Fatalf("pprof.StartCPUProfile: %v", err)
		}
		pprof.StopCPUProfile()
	}

	if *mutex >= 0 {
		runtime.SetMutexProfileFraction(2)
		err := pprof.Lookup("mutex").WriteTo(io.Discard, *mutex)
		if err != nil {
			log.Fatalf(`pprof.Lookup("mutex").WriteTo: %v`, err)
		}
	}

	if *block >= 0 {
		runtime.SetBlockProfileRate(2)
		err := pprof.Lookup("block").WriteTo(io.Discard, *block)
		if err != nil {
			log.Fatalf(`pprof.Lookup("block").WriteTo: %v`, err)
		}
	}
}

$ go1.21 build -o /tmp/slowprof ./slowprof.go

$ for i in {1..10}; do time /tmp/slowprof; done                          
/tmp/slowprof  0.00s user 0.01s system 58% cpu 0.022 total
/tmp/slowprof  0.00s user 0.00s system 36% cpu 0.009 total
/tmp/slowprof  0.00s user 0.00s system 55% cpu 0.007 total
/tmp/slowprof  0.00s user 0.00s system 41% cpu 0.010 total
/tmp/slowprof  0.00s user 0.00s system 39% cpu 0.009 total
/tmp/slowprof  0.00s user 0.00s system 36% cpu 0.006 total
/tmp/slowprof  0.00s user 0.00s system 37% cpu 0.005 total
/tmp/slowprof  0.00s user 0.00s system 33% cpu 0.008 total
/tmp/slowprof  0.00s user 0.00s system 29% cpu 0.007 total
/tmp/slowprof  0.00s user 0.00s system 29% cpu 0.007 total


$ for i in {1..10}; do time GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0; done
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.01s system 7% cpu 0.120 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 4% cpu 0.117 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.01s system 7% cpu 0.120 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 6% cpu 0.120 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.115 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.113 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.112 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 6% cpu 0.113 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 5% cpu 0.111 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -mutex=0  0.00s user 0.00s system 6% cpu 0.118 total

$ for i in {1..10}; do time GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0; done
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 5% cpu 0.116 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.119 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.00s system 6% cpu 0.119 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 7% cpu 0.119 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 6% cpu 0.118 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 7% cpu 0.116 total
GODEBUG=asyncpreemptoff=1 /tmp/slowprof -block=0  0.00s user 0.01s system 7% cpu 0.119 total

What did you expect to see?

Somewhat small overhead for using block and mutex profiles in otherwise lightweight tools. Small impact to app startup (which may include the call to runtime.SetBlockProfileRate).

What did you see instead?

The first call to runtime.tickspersecond calls usleep(100 * 1000), which (if uninterrupted) takes 100ms. There's one such call in runtime.SetBlockProfileRate https://github.com/golang/go/blob/go1.21.1/src/runtime/mprof.go#L471 when configuring the block profiler, even if that profile is never read. There are others in the code to write out mutex and block profiles.

@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Sep 19, 2023
@thanm thanm added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 25, 2023
@mknyszek mknyszek added this to the Backlog milestone Oct 4, 2023
@mknyszek
Copy link
Contributor

mknyszek commented Oct 4, 2023

@rhysh Any chance you're interested in working on this? :) I added the "help wanted" tag for now since I don't think we'll get around to it for the next release.

@rhysh
Copy link
Contributor Author

rhysh commented Oct 4, 2023

I'm not sure what the right product choice is on how little (or absent) the delay can/should be. Today, it's usually much faster than this because of async preemption / #63103.

At the moment I'm happy to have an open issue that describes where the profilers are in need of improvement, instead of having to find the skeletons one by one and then describing them by word of mouth. Thanks for adding the label; I won't be able to get to this for Go 1.22 either.

@thepenguinco
Copy link

I'm happy to pick this up as a starter task, what's the expected behaviour?

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/538898 mentions this issue: runtime: improve tickspersecond

@mknyszek
Copy link
Contributor

mknyszek commented Nov 1, 2023

@thepenguinco Thanks for offering! Sorry for not replying. It turns out to be on the critical path (ish) for something I was already working on, so I tried to improve the situation a bit.

I think the goals is just to reduce the amount of wall time added.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge help wanted NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants