runtime: long GC STW pauses (≥80ms) #19378
Here is another
Is there anything I can do, or any more information I can gather, to help debug this? I know this probably isn't very high priority, but it is having a fairly severe effect on our system, so I'd love to do anything I can to get to the bottom of it.
@obeattie, sorry, I just assigned it to @aclements. It's likely nobody on the runtime team saw it until now.
@obeattie, thanks for the execution traces. Unfortunately, I spent a while looking at them and wasn't able to glean much about the actual problem. It's interesting that these are always periodic GCs and that your heap is under 4 MB. Looking through sweep termination, I could make guesses about what's going on, but they would all be shots in the dark. Background sweeping has long finished sweeping before we enter sweep termination, so I don't think the sweeping part of sweep termination is the problem. Could you collect a pprof profile? Given how idle your program generally is, I suspect you would get some samples under runtime.forcegchelper that may give us a better idea of what's holding it up. You might want to crank up the profile sampling rate with runtime.SetCPUProfileRate.
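For anyone following along, here is a minimal sketch of that kind of collection, assuming the profile is written from inside the program with runtime/pprof; the output path, the 500 Hz rate, and the sleep are illustrative choices, not values taken from this thread:

```go
package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
	"time"
)

func main() {
	f, err := os.Create("cpu.pprof") // illustrative output path
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	// Raise the sampling rate above the default 100 Hz so a mostly idle
	// program still yields useful samples. Calling this before
	// StartCPUProfile is a known workaround: StartCPUProfile will print a
	// complaint about the rate already being set, but the profile is still
	// collected at the higher rate.
	runtime.SetCPUProfileRate(500)

	if err := pprof.StartCPUProfile(f); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	// ... run the normal workload long enough to catch a slow GC ...
	time.Sleep(5 * time.Minute)
}
```

The resulting file can then be inspected with go tool pprof.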
Thanks for getting back to me 😄 I'll get this data as soon as I can.
@obeattie, ping.
So sorry for leaving this open for so long with no follow-up. 🤦‍♂️ Thank you for the ping! We have been able to mitigate the impact of this issue somewhat, but it is still happening. Our investigations were hampered by some infrastructure issues; we've resolved those in the last week, so we should be able to look into this more deeply now. My current thinking is that some behaviour of the Linux scheduler is involved: the fact that the GC pause times are so closely clustered around 100ms – which is the default CFS bandwidth period (cpu.cfs_period_us) – seems unlikely to be a coincidence.
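One way to test that theory directly is to watch the cgroup's CFS throttling counters while the pauses happen. A minimal sketch, assuming cgroup v1 with the cpu controller mounted at the conventional /sys/fs/cgroup/cpu path (an assumption, not a detail from this thread):

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	// cpu.stat reports nr_periods, nr_throttled and throttled_time (ns)
	// for the CFS bandwidth controller. A rising nr_throttled that lines
	// up with the GC pauses points at quota enforcement rather than the
	// Go runtime itself.
	data, err := ioutil.ReadFile("/sys/fs/cgroup/cpu/cpu.stat")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read cpu.stat:", err)
		os.Exit(1)
	}
	fmt.Print(string(data))
}
```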
Hello @obeattie, I got interested in your issue and decided to see whether I could prove or disprove your theory about CFS before digging into it from a Go point of view. I wrote a little program that does network transactions (talking with itself, via localhost) and dies as soon as it sees one transaction that takes more than 80ms. It is here: https://play.golang.org/p/7ZXSoCeb18 (Put it in a directory called "server" and do "go build" on it.) Running "./server" does not trigger the long transactions, even with significant other load on the machine. (See the -spin argument for how I was generating load.) But it is possible to trigger the 80 ms transactions by putting the server inside of a cgroup which is using cpu.cfs_quota_us. To see it in action:
In another window, do the following:
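As a rough sketch of that kind of setup (not the exact commands used here): with cgroup v1 and the cpu controller mounted at /sys/fs/cgroup/cpu, something like the following puts a running process under a CFS quota. The group name and the quota/period values are illustrative.

```go
// Sketch: create a cpu cgroup with a CFS quota and move a process into it.
// Needs root. The mount path, group name, and values are assumptions.
package main

import (
	"fmt"
	"io/ioutil"
	"log"
	"os"
	"path/filepath"
	"strconv"
)

func main() {
	if len(os.Args) != 2 {
		log.Fatalf("usage: %s <pid>", os.Args[0])
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}

	cg := "/sys/fs/cgroup/cpu/gctest" // assumed cgroup v1 cpu hierarchy
	if err := os.MkdirAll(cg, 0755); err != nil {
		log.Fatal(err)
	}
	write := func(name, value string) {
		if err := ioutil.WriteFile(filepath.Join(cg, name), []byte(value), 0644); err != nil {
			log.Fatal(err)
		}
	}
	write("cpu.cfs_period_us", "100000") // 100ms period (the kernel default)
	write("cpu.cfs_quota_us", "50000")   // 50ms of CPU time per period
	write("cgroup.procs", fmt.Sprint(pid))
}
```

With ./server moved into such a group, the throttling behaviour described below kicks in once the quota is spent.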
As quota approaches period, the max latency goes up, eventually reaching 80ms and the log.Fatal triggers. If quota is at 200% of period (i.e. quota = 200000, period = 100000), you can even run "./server -spin" next to "./server" and you'll end up with max latencies of 50ms. (I can't quite explain why that is; I expected the max latency to still be 100 ms in that case, maybe I just got lucky.) What's happening is that when the quota is exhausted, all processes in the cgroup are starved of time slices until the next period starts. If period is 100ms, you can eventually expect to get unlucky and have one transaction take up to 100ms. The solution is to use a lower period, so that the worst case wait to finish any given transaction is lower. This config allows the spinner and the server to coexist with max latencies of 50 ms, at least on my machine:
Are you running your server inside of Docker, managed by Kubernetes? If so, there's still some work to do to figure out how to get your cgroups configured with a shorter period.
The second URL seems to indicate that Kubernetes will not allow you to change the period. :(
@jeffallen Thank you so much for contributing your time to investigate this. Your findings match mine, and indeed these problems are seen when running under Kubernetes with CFS quotas enforced. I will need to work out some way to change the period under k8s… Based on these findings I'm now confident that this is not a Go bug and I'm going to close the ticket. I think that fixing #19497 would be very helpful to anyone facing similar issues; debugging behaviour like this is somewhat like trying to find a needle in a haystack at the moment. Thank you to everyone who's helped with this ❤️
What version of Go are you using (go version)?
go1.8 linux/amd64
What operating system and processor architecture are you using (go env)?
The host where the application is running does not have go (just the compiled binary), so this is actually from the host which built it.

What did you do?
Following from a discussion on golang-nuts, I was advised to create an issue.
In our applications we are regularly seeing GC pauses of ≥80ms, even for extremely simple programs with light workloads. These long pauses seem to be sporadic; most GCs take fractions of a millisecond, as I would expect, but in some runs it seems that the sweep termination phase (which I understand is STW) is taking a very long time. Given that these applications are network RPC servers, pauses like these are extremely problematic for tail latencies.
I have captured a runtime/trace here showing one such pause, along with the output of GODEBUG=gctrace=1 below for the same period. The long GC is at 451,008.477 ms in the trace and corresponds to gc 12 below.

I'd really appreciate any help investigating what could cause this.
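For reference, a minimal sketch of capturing such an execution trace with runtime/trace; the file name and the one-minute window are arbitrary, and this is not necessarily how the trace above was produced:

```go
package main

import (
	"log"
	"os"
	"runtime/trace"
	"time"
)

func main() {
	f, err := os.Create("trace.out") // illustrative output path
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	if err := trace.Start(f); err != nil {
		log.Fatal(err)
	}
	defer trace.Stop()

	// ... serve traffic long enough for a slow GC to occur ...
	time.Sleep(time.Minute)
}
```

The resulting file is viewed with go tool trace, which is where per-GC events like the one at 451,008.477 ms show up.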