fix(servstate): don't hold both servicesLock and state lock at once #359
This avoids the 3-lock deadlock described in canonical#314. Other goroutines may be holding the state lock while waiting for the services lock, so it's problematic to acquire both locks at once. This change breaks that part of the cycle.

We could do this inside serviceForStart/serviceForStop by manually calling Unlock() sooner, but that's error-prone. Instead, continue using defer, and have the caller write the task log (which needs the state lock) after the services lock is released.
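For readers unfamiliar with the pattern, here is a minimal sketch of the idea, not the actual Pebble code. The `State` and `Manager` types, the `doStart` helper, and the signature of `serviceForStart` are all assumptions made for illustration. The helper takes only the services lock and releases it via `defer`; instead of logging, it returns the message, and the caller writes the log under the state lock afterwards.

```go
package main

import (
	"fmt"
	"sync"
)

// State is a hypothetical stand-in for the state whose lock guards the task log.
type State struct {
	mu  sync.Mutex
	log []string
}

// Log appends a task log entry while holding the state lock.
func (s *State) Log(msg string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.log = append(s.log, msg)
}

// Manager is a hypothetical stand-in for the service manager.
type Manager struct {
	servicesLock sync.Mutex
	services     map[string]bool // service name -> already started
	state        *State
}

// serviceForStart holds only servicesLock, released via defer.
// It never touches the state lock: any task log message is
// returned to the caller instead of being written here.
func (m *Manager) serviceForStart(name string) (start bool, taskLog string) {
	m.servicesLock.Lock()
	defer m.servicesLock.Unlock()

	if m.services[name] {
		return false, fmt.Sprintf("Service %q already started.", name)
	}
	m.services[name] = true
	return true, ""
}

// doStart writes the task log only after serviceForStart has returned,
// so servicesLock and the state lock are never held at the same time.
func (m *Manager) doStart(name string) bool {
	start, taskLog := m.serviceForStart(name)
	if taskLog != "" {
		m.state.Log(taskLog) // acquires the state lock; servicesLock is already released
	}
	return start
}

func main() {
	m := &Manager{services: map[string]bool{}, state: &State{}}
	fmt.Println(m.doStart("web")) // first call marks the service as started
	fmt.Println(m.doStart("web")) // second call only produces a task log entry
	fmt.Println(m.state.log)
}
```

Returning the message keeps the `defer`-based unlock in the helper, at the cost of the caller having to remember to write the log.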
This feels like a reasonable compromise, but I'd like to see something like #356 actually solve the problem. The nature of logging is that it can occur in a variety of places, and as you can see here, it is not the most convenient of things to pass around.
I cannot find any issues with the proposal.
Add regression test for the deadlock issue. This test consistently fails without the fix in this PR and consistently passes with it. The repro essentially follows the instructions in a comment on canonical#314.
TestDeadlock looks reasonable too. Tested locally.
This is in preference to the more invasive change in #356