-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util/tracing: disable span reuse under deadlock detector #75423
Conversation
In cockroachdb#73883 we started reusing tracing spans through a sync.Pool. This causes the mutexes in the spans to be acquired by a goroutine repeatedly, in a random order. Multiple spans can be locked at the same time, but the ordering well defined (a child can be locked while the parent is also locked). This erroneously looks like a possible deadlock to the deadlock detector. The following test demonstrates the problem - it's flagged by the deadlock detector even though the code is deadlock-free. This patch fixes the issue by disabling span reuse under deadlock builds. I guess the moral of the story is that recycling structs with locks through sync.Pool is incompatible with the deadlock detector? One option, I guess, is to contribute a way to mark some mutexes as excluded from the deadlock detector. func TestXXX(t *testing.T) { p := sync.Pool{New: func() interface{} { return new(syncutil.Mutex) }, } var m1, m2 *syncutil.Mutex m1 = p.Get().(*syncutil.Mutex) m2 = p.Get().(*syncutil.Mutex) m1.Lock() m2.Lock() m2.Unlock() m1.Unlock() p.Put(m1) p.Put(m2) m2 = p.Get().(*syncutil.Mutex) m1 = p.Get().(*syncutil.Mutex) m1.Lock() m2.Lock() m2.Unlock() m1.Unlock() } Fixes cockroachdb#75351 and a million others. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than disabling span reuse, we could add a syncutil.ReusableMutex
or something like that that is always a simple mutex, even for deadlock builds. We should file an issue for github.com/sasha-s/go-deadlock to provide some kind of Reset function and then we could expose that through ReusableMutex
.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @erikgrinaker, and @nvanbenschoten)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was thinking similar things. But let's do this patch first, I'd say.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @erikgrinaker, and @nvanbenschoten)
bors r+ |
Build failed: |
bors r+ |
Build failed: |
keeps flaking on #68395 bors r+ |
Build failed (retrying...): |
Build succeeded: |
As explained in cockroachdb#75423, span reuse is incompatible with the deadlock detector, which flags this reuse as false-positives. That patch generally disabled reuse when running under the test detector. This patch skips a few tests that explicitly ask for span reuse. I'll separately try to improve the deadlock detector such that it can be used with our code. Fixes cockroachdb#75433 Fixes cockroachdb#75488 Release note: None
75439: util/tracing: skip a few tests under deadlock detector r=andreimatei a=andreimatei As explained in #75423, span reuse is incompatible with the deadlock detector, which flags this reuse as false-positives. That patch generally disabled reuse when running under the test detector. This patch skips a few tests that explicitly ask for span reuse. I'll separately try to improve the deadlock detector such that it can be used with our code. Fixes #75433 Fixes #75488 Release note: None Co-authored-by: Andrei Matei <[email protected]>
In #73883 we started reusing tracing spans through a sync.Pool. This
causes the mutexes in the spans to be acquired by a goroutine
repeatedly, in a random order. Multiple spans can be locked at the same
time, but the ordering well defined (a child can be locked while the
parent is also locked). This erroneously looks like a possible deadlock
to the deadlock detector. The following test demonstrates the problem -
it's flagged by the deadlock detector even though the code is
deadlock-free.
This patch fixes the issue by disabling span reuse under deadlock
builds.
I guess the moral of the story is that recycling structs with locks
through sync.Pool is incompatible with the deadlock detector? One
option, I guess, is to contribute a way to mark some mutexes as excluded
from the deadlock detector.
Fixes #75351 and a million others.
Release note: None