runtime: GoroutineProfile causes latency spikes in high-goroutine-count application #33250
Comments
@aclements what do you think is the right approach here? Today, Could it instead inspect only the What level of consistency is required in the goroutine profile? Currently it's completely consistent, since the world is stopped. The function's docs also promise that the result slice won't be modified unless it writes out the full profile. The The GC has a couple mechanisms for visiting all goroutines: stack scanning and stack shrinking. Extending those seems complex, but might lead to a low execution overhead. The simpler option I can think of involves adding a field to |
I also want to see GoroutineProfile scale. On the other hand, we often use GoroutineProfile when investigating issues that involve inspecting locks and the relationships among consumer/producer goroutines. It would be nice if we could still perform that kind of analysis with goroutine profiles after this issue is addressed.
Change https://go.dev/cl/387414 mentions this issue: |
Change https://go.dev/cl/387415 mentions this issue: |
Change https://go.dev/cl/387416 mentions this issue: |
For #33250 Change-Id: Ic7aa74b1bb5da9c4319718bac96316b236cb40b2 Reviewed-on: https://go-review.googlesource.com/c/go/+/387414 Run-TryBot: Rhys Hiltner <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Michael Knyszek <[email protected]> Reviewed-by: David Chase <[email protected]>
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
The issue reproduces on go1.12.7 and go1.13beta1.
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I have a (production, latency-sensitive) Go service that has a median goroutine count of about 300,000 during steady state operations. The program periodically collects profiling data, including goroutine profiles.
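For illustration only, periodic collection of this kind might look like the sketch below. It is not the service's actual code: the helper name `collectGoroutineProfile`, the 100ms interval, and the headroom of 10 records are assumptions. The only runtime calls used are `runtime.NumGoroutine` and `runtime.GoroutineProfile`, which reports `ok == false` when the slice passed in is too small.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// collectGoroutineProfile (hypothetical helper) returns one stack record per
// live goroutine. runtime.GoroutineProfile returns ok=false when the slice is
// too small, so the loop retries with the reported count plus some headroom.
func collectGoroutineProfile() []runtime.StackRecord {
	size := runtime.NumGoroutine() + 10 // headroom value is an assumption
	for {
		records := make([]runtime.StackRecord, size)
		n, ok := runtime.GoroutineProfile(records)
		if ok {
			return records[:n]
		}
		size = n + 10
	}
}

func main() {
	// Hypothetical collection interval; a real service would pick its own.
	ticker := time.NewTicker(100 * time.Millisecond)
	defer ticker.Stop()

	for i := 0; i < 3; i++ {
		<-ticker.C
		start := time.Now()
		records := collectGoroutineProfile()
		fmt.Printf("collected %d goroutine stacks in %v\n", len(records), time.Since(start))
	}
}
```

Timing each call this way gives a rough view of how collection cost changes as the goroutine count grows.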
What did you expect to see?
I expected the application’s performance behavior to be only slightly affected by collecting the goroutine profile (as is the case for collecting a heap profile, mutex contention profile, or CPU profile). I expected the magnitude of the latency impact to not scale with the number of goroutines (as the performance impact of heap profiling doesn’t change significantly with the size of the heap).
I expect some latency impact from a stop-the-world pause to prepare for the profile. I do not expect the duration of that pause to scale in proportion to the number of goroutines.
What did you see instead?
When goroutine profiling is enabled, the 99th-percentile latency for certain operations increases significantly. I have other programs that collect goroutine profiles with far fewer goroutines, and they do not incur the same latency spikes.
The benchmark attempts to replicate the latency spike by emulating the work done by the real application. The real application has one goroutine that periodically requests a goroutine profile while the other goroutines do some operation and report how long it took. The benchmark sets up a scenario with one goroutine that continuously collects goroutine profiles, n idle goroutines, and one goroutine that reports the duration of sleeping for 1 millisecond. The benchmark runs with several n values to demonstrate the latency impact as n grows. The results here show that when a goroutine profile is being collected, the p90, p99, and p99.9 wall-clock time the reproducer takes to sleep for 1ms increases with the number of live goroutines.
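A minimal sketch of a benchmark with this shape follows. It is a reconstruction from the description above, not the original reproducer: the function names `measureSleepLatency` and `percentile`, the sample count, and the chosen n values are all assumptions.

```go
package main

import (
	"fmt"
	"runtime"
	"sort"
	"sync"
	"time"
)

// measureSleepLatency (hypothetical) starts n idle goroutines and one
// goroutine that collects goroutine profiles back to back, then measures the
// wall-clock time of `samples` 1ms sleeps. The result is sorted ascending.
func measureSleepLatency(n, samples int) []time.Duration {
	stop := make(chan struct{})
	var wg sync.WaitGroup

	// n idle goroutines, parked until stop is closed.
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-stop
		}()
	}

	// One goroutine that continuously requests goroutine profiles.
	wg.Add(1)
	go func() {
		defer wg.Done()
		records := make([]runtime.StackRecord, n+100)
		for {
			select {
			case <-stop:
				return
			default:
				runtime.GoroutineProfile(records)
			}
		}
	}()

	// The measuring goroutine: time each 1ms sleep.
	durations := make([]time.Duration, 0, samples)
	for i := 0; i < samples; i++ {
		start := time.Now()
		time.Sleep(time.Millisecond)
		durations = append(durations, time.Since(start))
	}

	close(stop)
	wg.Wait()
	sort.Slice(durations, func(i, j int) bool { return durations[i] < durations[j] })
	return durations
}

// percentile returns the value at quantile q (0 to 1) of a sorted slice.
func percentile(sorted []time.Duration, q float64) time.Duration {
	return sorted[int(q*float64(len(sorted)-1))]
}

func main() {
	for _, n := range []int{1000, 10000, 100000} { // assumed n values
		d := measureSleepLatency(n, 200)
		fmt.Printf("n=%d p90=%v p99=%v p99.9=%v\n",
			n, percentile(d, 0.90), percentile(d, 0.99), percentile(d, 0.999))
	}
}
```

With only 200 samples per n the p99.9 figure is coarse; a larger sample count would be needed for stable tail numbers.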
Reproducer:
CC @aclements @rhysh