profiler: Fix infinite loop in maxPauseNs #927
I noticed this issue when running the TestProfilerInternal test by itself, which consistently produced the following error:

```
$ go test ./profiler -run TestProfilerInternal
--- FAIL: TestProfilerInternal (0.20s)
    --- FAIL: TestProfilerInternal/collect (0.20s)
        profiler_test.go:189: missing batch
FAIL
FAIL	gopkg.in/DataDog/dd-trace-go.v1/profiler	0.395s
```

The reason for this timeout is that the loop inside of maxPauseNs was not hitting its termination conditions when stats.NumGC is 0. The reason for this is subtle:

- i is uint32, so it will always be >= 0
- periodStart is "1729-02-04" when running the metrics profile for the first time. This is caused by another issue which will be addressed in a follow-up patch.

As far as I can tell this issue should not occur in the real world because the periodStart will not be 1729 after `profiler.run` resets the metrics collectedAt timestamp.
@pmbauer I noticed this while trying to play around with the GoLand IDE for non-work related reasons. I don't think this issue is hitting in the real world, but I think we shouldn't have any kind of loops that rely on precarious invariants for termination.
okay; as you pointed out, not a real world concern
also, not a bad idea to have a failing test
You mean adding a new test? Yeah, I guess that's doable. Will try to add one and ping you.
You mean adding a new test?
I meant a failing test. The code without this change passes CI okay. The test failure in the pull description requires specific conditions.
Ok, now I'm confused : p. You want me to have a failing test, but not add a new test? So are you suggesting to change the CI YAML to add `go test ./profiler -run TestProfilerInternal`? Sorry if I'm being dense here 🙈
Sorry, yes, failing and new test. Again, sorry ... it's not a big deal.
I'm fine merging as is
@pmbauer got it. I'll do this in the follow-up PR that will try to fix the "problem" that