syscall: cgo calling of glibc nptl:setxid functions timing out in the face of pthread termination pressure #42494
Comments
Please feel free to assign this issue to me. |
It would appear that this is an issue at HEAD and at least as far back as go1.10.8 (I've not looked further back). The test code is a cgo program that performs one privileged system call via glibc:
For the testing, I downloaded sdk builds for
This is the test log:
As can be seen above:
|
I get essentially the same result using a modified version of the same program which uses the
The result being:
I think I'm going to explore this issue using this |
So these tests are just standalone cgo programs, to focus on getting to the bottom of these failures from #42462 (comment):
When I filed this present bug, I had assumed the issue was generic to all Linux architectures, but on the face of it 21 build tests passed. Is there a list of what the other 21 tests are? Do any others include cgo builds executed as uid=0? Could this issue just be an x86 thing? |
I guess for completeness, here is the same code which runs the new (targeted to 1.16) syscall.AllThreadsSyscall() function (this is just a more explicit variant of what the syscall.Seteuid() function does when not needing cgo):
The result being the intended one:
|
Exploring the behavior against the As such, the workaround for the I've tagged that one
Runs as follows:
Success! Next up, try to figure out how to make the |
Change https://golang.org/cl/269799 mentions this issue: |
The above performs as follows:
|
I'm wondering if this should be marked a release blocker? Also, is there an annotation to consider back-porting the fix to one of the earlier releases? |
It seems like this is not a new problem, so it shouldn't be a release blocker. Of course it would be good if we can fix it for 1.16, but we shouldn't block the release over it. Similarly, this doesn't meet our usual backport criteria. |
What version of Go are you using (go version)?
HEAD (1.16) release branch, but suspect earlier go toolchains might also suffer from this.
Does this issue reproduce with the latest release?
Don't know yet, but plan to start my investigation by writing a standalone go test case.
Update 2020-11-25: Yes, all versions of Go at least as far back as 1.10 (but probably earlier too, untested).
What operating system and processor architecture are you using (go env)?
All variants of Linux architectures.
What did you do?
Aggressively terminate threads while invoking glibc-backed nptl:setxid functions linked with cgo.
What did you expect to see?
No crashes or timeouts.
What did you see instead?
In the 1.16 build tree, this yields timeouts when running syscall.TestSetuidEtc() as root with a modification to the test explored in:
#42462 (comment)
The nocgo support for this test, featuring a new syscall.AllThreadsSyscall() implementation, does not lock up. It is just the cgo redirection to the glibc implementation of the various functions that times out in the face of this thread pressure, inside glibc code.
Example: