Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: musl uses signal 34 for setgid #39343

Closed
nmeum opened this issue Jun 1, 2020 · 19 comments
Closed

runtime: musl uses signal 34 for setgid #39343

nmeum opened this issue Jun 1, 2020 · 19 comments
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@nmeum
Copy link

nmeum commented Jun 1, 2020

What version of Go are you using (go version)?

$ go version
go version go1.14.3 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/soeren/.cache/go-build"
GOENV="/home/soeren/.config/go/env"
GOEXE=""
GOFLAGS="-buildmode=pie"
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/soeren/src/go"
GOPRIVATE=""
GOPROXY="direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD=""
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build010166581=/tmp/go-build -gno-record-gcc-switches"

What did you do?

Started the go test suite using:

./run.bash -no-rebuild -run='cgo_test'

What did you expect to see?

A successful test run.

What did you see instead?

A failed test run:

##### ../misc/cgo/test
SIGQUIT: quit
PC=0x46e261 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x7409e8, 0x80, 0x0, 0x0, 0x7fff00000000, 0x7fffd799a608, 0x440ef3, 0xc000034848, 0x7fffd799a628, 0x41249f, ...)
	/home/soeren/src/aports/community/go/src/go/src/runtime/sys_linux_amd64.s:567 +0x21
runtime.futexsleep(0x7409e8, 0x7fff00000000, 0xffffffffffffffff)
	/home/soeren/src/aports/community/go/src/go/src/runtime/os_linux.go:45 +0x46
runtime.notesleep(0x7409e8)
	/home/soeren/src/aports/community/go/src/go/src/runtime/lock_futex.go:151 +0x9f
runtime.stoplockedm()
	/home/soeren/src/aports/community/go/src/go/src/runtime/proc.go:1971 +0x88
runtime.schedule()
	/home/soeren/src/aports/community/go/src/go/src/runtime/proc.go:2454 +0x4a6
runtime.park_m(0xc000082600)
	/home/soeren/src/aports/community/go/src/go/src/runtime/proc.go:2690 +0x9d
runtime.mcall(0x5)
	/home/soeren/src/aports/community/go/src/go/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [chan receive]:
testing.(*T).Run(0xc0000a4120, 0x5cd2ae, 0xa, 0x5d8420, 0x486c26)
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:1043 +0x37e
testing.runTests.func1(0xc0000a4000)
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:1284 +0x78
testing.tRunner(0xc0000a4000, 0xc000043e10)
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:991 +0xdc
testing.runTests(0xc0000a0000, 0x726e20, 0x59, 0x59, 0x0)
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:1282 +0x2a7
testing.(*M).Run(0xc00009a000, 0x0)
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:1199 +0x15f
main.main()
	_testmain.go:226 +0x135

goroutine 18 [select]:
misc/cgo/test.runTestSetgid(0xc0000a8180)
	/home/soeren/src/aports/community/go/src/go/misc/cgo/test/setgid_linux.go:30 +0xeb
misc/cgo/test.testSetgid(0xc0000a4120)
	/home/soeren/src/aports/community/go/src/go/misc/cgo/test/setgid_linux.go:46 +0x98
misc/cgo/test.TestSetgid(0xc0000a4120)
	/home/soeren/src/aports/community/go/src/go/misc/cgo/test/cgo_linux_test.go:16 +0x2b
testing.tRunner(0xc0000a4120, 0x5d8420)
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:991 +0xdc
created by testing.(*T).Run
	/home/soeren/src/aports/community/go/src/go/src/testing/testing.go:1042 +0x357

goroutine 20 [runnable]:
os/signal.loop()
	/home/soeren/src/aports/community/go/src/go/src/os/signal/signal_unix.go:21
created by os/signal.Notify.func1
	/home/soeren/src/aports/community/go/src/go/src/os/signal/signal.go:127 +0x44

goroutine 22 [syscall]:
misc/cgo/test._Cfunc_setgid(0x0, 0x0)
	_cgo_gotypes.go:2114 +0x49
misc/cgo/test.runTestSetgid.func1(0xc0000a2360)
	/home/soeren/src/aports/community/go/src/go/misc/cgo/test/setgid_linux.go:27 +0x29
created by misc/cgo/test.runTestSetgid
	/home/soeren/src/aports/community/go/src/go/misc/cgo/test/setgid_linux.go:26 +0x6a

rax    0xca
rbx    0x7408a0
rcx    0x46f273
rdx    0x0
rdi    0x7409e8
rsi    0x80
rbp    0x7fffd799a5f0
rsp    0x7fffd799a5a8
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0xc0000a2300
r13    0xc0000a2300
r14    0xc0000a22a0
r15    0x0
rip    0x46f271
rflags 0x286
cs     0x33
fs     0x0
gs     0x0
*** Test killed with quit: ran too long (11m0s).
exit status 2
FAIL	misc/cgo/test	660.005s
2020/06/01 16:22:03 Failed: exit status 1
PASS
ok  	misc/cgo/testtls	0.009s
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
skipped due to earlier error
go tool dist: FAILED

Not sure if this is just a timeout problem on my hardware (i.e. if the timeout needs to be increased) or if this is some kind of deadlock. Any ideas how I could debug this further? I am unfourtunatly not farmilar with go internals but currently trying to make the entire go testsuite pass with our Alpine Linux package and this is the only remaining test which keeps failing.

@ianlancetaylor
Copy link
Member

According to the stack traces, it looks like the test calls the C function setgid and setgid never returns. Why would that happen?

Separately, it looks like select statement that has a 5 second timeout never times out in 11 minutes. That doesn't make any sense at all.

You can put this in a separate program by creating two files in an empty directory:

x.go

package cgotest

/*
#include <sys/types.h>
#include <unistd.h>
*/
import "C"

import (
	"os"
	"os/signal"
	"syscall"
	"testing"
	"time"
)

func runTestSetgid() bool {
	c := make(chan bool)
	go func() {
		C.setgid(0)
		c <- true
	}()
	select {
	case <-c:
		return true
	case <-time.After(5 * time.Second):
		return false
	}

}

func testSetgid(t *testing.T) {
	if !runTestSetgid() {
		t.Error("setgid hung")
	}

	// Now try it again after using signal.Notify.
	signal.Notify(make(chan os.Signal, 1), syscall.SIGINT)
	if !runTestSetgid() {
		t.Error("setgid hung after signal.Notify")
	}
}

x_test.go:

package cgotest

import "testing"

func TestSetgid(t *testing.T) {
	testSetgid(t)
}

What happens when you run go test in that directory?

@ianlancetaylor ianlancetaylor added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 1, 2020
@nmeum
Copy link
Author

nmeum commented Jun 1, 2020

What happens when you run go test in that directory?

Doesn't return, outputs nothing at all.

Output with SIGQUIT:

SIGQUIT: quit
PC=0x558020e66791 m=0 sigcode=0

goroutine 0 [idle]:
runtime.futex(0x558021046dc8, 0x80, 0x0, 0x0, 0x7ffe00000000, 0x7ffe41a0ab18, 0x558020e38655, 0xc000032bc8, 0x7ffe41a0ab38, 0x558020e0a273, ...)
	/usr/lib/go/src/runtime/sys_linux_amd64.s:567 +0x21
runtime.futexsleep(0x558021046dc8, 0x7ffe00000000, 0xffffffffffffffff)
	/usr/lib/go/src/runtime/os_linux.go:45 +0x46
runtime.notesleep(0x558021046dc8)
	/usr/lib/go/src/runtime/lock_futex.go:151 +0xa3
runtime.stoplockedm()
	/usr/lib/go/src/runtime/proc.go:1971 +0x8c
runtime.schedule()
	/usr/lib/go/src/runtime/proc.go:2454 +0x4aa
runtime.park_m(0xc000001c80)
	/usr/lib/go/src/runtime/proc.go:2690 +0xa3
runtime.mcall(0x2)
	/usr/lib/go/src/runtime/asm_amd64.s:318 +0x53

goroutine 1 [chan receive]:
testing.(*T).Run(0xc000110120, 0x558020f010f5, 0xa, 0x558020f70ad8, 0x558020e789a8)
	/usr/lib/go/src/testing/testing.go:1043 +0x380
testing.runTests.func1(0xc000110000)
	/usr/lib/go/src/testing/testing.go:1284 +0x7a
testing.tRunner(0xc000110000, 0xc000041e10)
	/usr/lib/go/src/testing/testing.go:991 +0xde
testing.runTests(0xc00000e0c0, 0x55802102caa0, 0x1, 0x1, 0x0)
	/usr/lib/go/src/testing/testing.go:1282 +0x2a9
testing.(*M).Run(0xc00010e000, 0x0)
	/usr/lib/go/src/testing/testing.go:1199 +0x161
main.main()
	_testmain.go:44 +0x137

goroutine 6 [select]:
_/tmp/test.runTestSetgid(0xc000058300)
	/tmp/test/x.go:23 +0xed
_/tmp/test.testSetgid(0xc000110120)
	/tmp/test/x.go:39 +0x9a
_/tmp/test.TestSetgid(0xc000110120)
	/tmp/test/x_test.go:6 +0x2d
testing.tRunner(0xc000110120, 0x558020f70ad8)
	/usr/lib/go/src/testing/testing.go:991 +0xde
created by testing.(*T).Run
	/usr/lib/go/src/testing/testing.go:1042 +0x359

goroutine 8 [runnable]:
os/signal.loop()
	/usr/lib/go/src/os/signal/signal_unix.go:21
created by os/signal.Notify.func1
	/usr/lib/go/src/os/signal/signal.go:127 +0x46

goroutine 10 [syscall]:
_/tmp/test._Cfunc_setgid(0x0, 0x0)
	_cgo_gotypes.go:47 +0x4b
_/tmp/test.runTestSetgid.func1(0xc000064420)
	/tmp/test/x.go:20 +0x2b
created by _/tmp/test.runTestSetgid
	/tmp/test/x.go:19 +0x6c

rax    0xca
rbx    0x558021046c80
rcx    0x558020e66793
rdx    0x0
rdi    0x558021046dc8
rsi    0x80
rbp    0x7ffe41a0ab00
rsp    0x7ffe41a0aab8
r8     0x0
r9     0x0
r10    0x0
r11    0x286
r12    0xc0000643c0
r13    0xc0000643c0
r14    0xc000064360
r15    0x39
rip    0x558020e66791
rflags 0x286
cs     0x33
fs     0x0
gs     0x0
exit status 2
FAIL	_/tmp/test	34.767s

Not sure if helpful but here is a strace -f output: strace-go-test.txt

@ianlancetaylor
Copy link
Member

Thanks for the strace output. What would be really helpful is to run

go test -c
strace -f ./x.test

Right now the strace output is mixing in all the go test system calls. Thanks.

@nmeum
Copy link
Author

nmeum commented Jun 2, 2020

While doing that I noticed that this is definitely flaky sometimes the program terminates and sometimes (most of the times) it doesn't here is the strace -f ./test.test output of both runs:

@ianlancetaylor
Copy link
Member

Thanks. Looks like the C library is using signal 34 to implement setgid, perhaps in the way that glibc uses signal 33. That is fairly annoying.

I see it here in the musl sources: https://git.musl-libc.org/cgit/musl/tree/src/internal/pthread_impl.h#n117

The fix will be to change runtime/sigtab_linux_generic.go and runtime/sigtab_linux_mipsx.go to handle signal 34 the way we handle signals 32 and 33.

That will have a chance of breaking some existing programs, so marking this for 1.16.

@ianlancetaylor ianlancetaylor added help wanted NeedsFix The path to resolution is known, but the work has not been done. labels Jun 2, 2020
@ianlancetaylor ianlancetaylor added this to the Go1.16 milestone Jun 2, 2020
@ianlancetaylor ianlancetaylor changed the title cgo_test fails on Alpine Linux Edge runtime: musl uses signal 34 for setgid Jun 2, 2020
@ianlancetaylor ianlancetaylor removed the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label Jun 2, 2020
algitbot pushed a commit to alpinelinux/aports that referenced this issue Jun 3, 2020
@GeorgeTsilias
Copy link
Contributor

Hello,
I'm looking to start contributing in Go and after your investigation this seems like a straightforward issue to start with.
Is it okay with you if I try to open a PR and get it fixed?
I'll get a standalone reproduction in an Alpine container, ensure that the fix works as intended and see if I can add any tests.
Thanks in advance!

@ianlancetaylor
Copy link
Member

@GeorgeTsilias Sure, go for it. Note that at this stage in the release process any change will not appear until the 1.16 release. Thanks.

@gopherbot
Copy link
Contributor

Change https://golang.org/cl/236518 mentions this issue: runtime: handle signal 34 for musl setgid

@fabled
Copy link

fabled commented Jun 20, 2020

This was reported earlier in musl list, and their comment is at https://www.openwall.com/lists/musl/2018/10/09/3. The suggestion there was to use SIGRTMIN which resolves to a function call to check what is the range for libc reserved signals.

@fabled
Copy link

fabled commented Jun 20, 2020

With the suggested patch, I don't see the hangs anymore. And most of the time things work. But on my application I also see sometimes gopanic with backtrace:

go version go1.14.3 linux/amd64

fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x55813a8 pc=0x465b65]

goroutine 4 [running]:
runtime.throw(0xc43ebe, 0x2a)
	/usr/lib/go/src/runtime/panic.go:1116 +0x72 fp=0xc00005cdf0 sp=0xc00005cdc0 pc=0x4343d2
runtime.sigpanic()
	/usr/lib/go/src/runtime/signal_unix.go:679 +0x46a fp=0xc00005ce20 sp=0xc00005cdf0 pc=0x44a47a
runtime.memmove(0xc0036a8000, 0x0, 0x5581428)
	/usr/lib/go/src/runtime/memmove_amd64.s:392 +0x485 fp=0xc00005ce28 sp=0xc00005ce20 pc=0x465b65
runtime.growslice(0xb04520, 0x0, 0xab0285, 0x0, 0xab0286, 0x406842, 0xc000070058, 0xc000070060)
	/usr/lib/go/src/runtime/slice.go:188 +0x19e fp=0xc00005ce90 sp=0xc00005ce28 pc=0x44b70e
runtime.doaddtimer(0xc000041800, 0xc00007a000)
	/usr/lib/go/src/runtime/time.go:273 +0x131 fp=0xc00005cee8 sp=0xc00005ce90 pc=0x4537d1
runtime.modtimer(0xc00007a000, 0x18def59750, 0x0, 0xc5c640, 0x0, 0x0, 0x0)
	/usr/lib/go/src/runtime/time.go:462 +0x22c fp=0xc00005cf28 sp=0xc00005cee8 pc=0x4540fc
runtime.resettimer(...)
	/usr/lib/go/src/runtime/time.go:514
time.resetTimer(...)
	/usr/lib/go/src/runtime/time.go:223
runtime.scavengeSleep(0xd19458, 0x134b140)
	/usr/lib/go/src/runtime/mgcscavenge.go:210 +0x8b fp=0xc00005cf78 sp=0xc00005cf28 pc=0x421f2b
runtime.bgscavenge(0xc000070000)
	/usr/lib/go/src/runtime/mgcscavenge.go:337 +0x1db fp=0xc00005cfd8 sp=0xc00005cf78 pc=0x42216b
runtime.goexit()
	/usr/lib/go/src/runtime/asm_amd64.s:1373 +0x1 fp=0xc00005cfe0 sp=0xc00005cfd8 pc=0x464531
created by runtime.gcenable
	/usr/lib/go/src/runtime/mgc.go:215 +0x7e

@nmeum
Copy link
Author

nmeum commented Jun 20, 2020

I also tested CL 236518 by backporting it to our Alpine go 1.14.3 package. When doing so the cgo_test does not hang anymore but it fails with the following error message now:

--- FAIL: TestCrossPackageTests (3.72s)
    pkg_test.go:67: go test -short: exit status 1
        --- FAIL: Test9400 (0.00s)
            issue9400_linux.go:55: entry 844 of test pattern is wrong; 0x7fd858b79c6d != 0x123456789abcdef
        FAIL
        exit status 1
        FAIL    cgotest 0.008s
FAIL
exit status 1
FAIL    misc/cgo/test   5.779s
2020/06/20 12:18:19 Failed: exit status 1
scatter = 0x4027f0
--- FAIL: TestCrossPackageTests (2.18s)
    pkg_test.go:67: go test -short: exit status 1
        --- FAIL: Test9400 (0.00s)
            issue9400_linux.go:55: entry 844 of test pattern is wrong; 0x7f575db46c6d != 0x123456789abcdef
        FAIL
        exit status 1
        FAIL    cgotest 0.004s
FAIL

@ianlancetaylor
Copy link
Member

@nmeum Thanks for the comments. It's best if we can keep issues focused on specific topics, so that we know when they are fixed. For problems with musl that are not related to the use of signal 34 for setgid, please open separate issues. Thanks.

@nmeum
Copy link
Author

nmeum commented Jul 11, 2020

With CL 236518 applied there still seem to be occasional crashes in the runTestSetgid code. For instance, the following backtrace is from our Alpine x86 CI. I wasn't yet able to reliable reproduce this, just wanted to post this here in case anyone has an idea what might be causing this.

##### ../misc/cgo/test
fatal error: index out of range

goroutine 34 [running]:
runtime.throw(0x81cf9ec, 0x12)
	/builds/nmeum/aports/community/go/src/go/src/runtime/panic.go:1116 +0x6a fp=0x899a610 sp=0x899a5fc pc=0x807c72a
runtime.panicCheck1(0x808df24, 0x81cf9ec, 0x12)
	/builds/nmeum/aports/community/go/src/go/src/runtime/panic.go:34 +0xb9 fp=0x899a624 sp=0x899a610 pc=0x807a1c9
runtime.goPanicIndex(0x8d88, 0x2)
	/builds/nmeum/aports/community/go/src/go/src/runtime/panic.go:87 +0x37 fp=0x899a648 sp=0x899a624 pc=0x807a277
runtime.sellock(0x899a748, 0x2, 0x2, 0x899a73c, 0x2, 0x2)
	/builds/nmeum/aports/community/go/src/go/src/runtime/select.go:48 +0x84 fp=0x899a65c sp=0x899a648 pc=0x808df24
runtime.selectgo(0x899a748, 0x899a738, 0x2, 0x0, 0x80b270b)
	/builds/nmeum/aports/community/go/src/go/src/runtime/select.go:319 +0xca8 fp=0x899a724 sp=0x899a65c pc=0x808ed18
runtime: unexpected return pc for misc/cgo/test.runTestSetgid called from 0xf0
stack: frame={sp:0x899a724, fp:0x899a77c} stack=[0x899a000,0x899a800)
0899a6a4:  00000000  08810010  00000018  00000002 
0899a6b4:  00000004  00000001  00000001  00000000 
0899a6c4:  088980c0  0808444d <runtime.wakeNetPoller+77>  08314ff0  0898c1ec 
0899a6d4:  0899a73c  089880e0  00000000  0881a480 
0899a6e4:  08898040  08898040  08898040  0882c000 
0899a6f4:  08988124  08898084  08988194  0899a760 
0899a704:  08898080  96220681  0019c9b0  96220681 
0899a714:  0019c9b0  08898080  08898040  0817b5f0 <misc/cgo/test.runTestSetgid+192> 
0899a724: <0899a748  0899a738  00000002  00000000 
0899a734:  080b270b <sync.(*Mutex).Unlock+43>  00010000  e7b98d88  00000000 
0899a744:  00000000  f7f391a8  f7ef827b  e7b98db0 
0899a754:  000000f0  f7f391a8  00000080  ffffffff 
0899a764:  00000000  00000000  00000000  00000000 
0899a774:  00000000 !000000f0 >00000000  0899a78c 
0899a784:  00000001  00000001  00000080  ffffffff 
0899a794:  f7f36f98  00000000  f7ef75de  000000f0 
0899a7a4:  f7f391a8  00000080  ffffffff  00000000 
0899a7b4:  00000000  00000000  0899a824  00000000 
0899a7c4:  00000000  00000000  00000000  00000000 
0899a7d4:  089920a0  081d9c60  f7ef83b3  0899a824 
0899a7e4:  f7f39198  f7efab9b  f7efac34  f7f391a8 
0899a7f4:  f7f36f98  f7f36f98 
misc/cgo/test.runTestSetgid(0x0)
	/builds/nmeum/aports/community/go/src/go/misc/cgo/test/setgid_linux.go:30 +0xc0 fp=0x899a77c sp=0x899a724 pc=0x817b5f0
created by testing.(*T).Run
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1042 +0x2ad

goroutine 1 [chan receive]:
testing.(*T).Run(0x89920a0, 0x81cd72d, 0xa, 0x81d8490, 0x37e)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1043 +0x2cc
testing.runTests.func1(0x8992000)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1284 +0x5a
testing.tRunner(0x8992000, 0x887070c)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:991 +0xb4
testing.runTests(0x898e000, 0x8300000, 0x59, 0x59, 0x0)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1282 +0x22d
testing.(*M).Run(0x898c000, 0x0)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1199 +0x114
main.main()
	_testmain.go:226 +0x104

goroutine 36 [syscall]:
os/signal.signal_recv(0x80ab0d6)
	/builds/nmeum/aports/community/go/src/go/src/runtime/sigqueue.go:147 +0x12f
os/signal.loop()
	/builds/nmeum/aports/community/go/src/go/src/os/signal/signal_unix.go:23 +0x1a
created by os/signal.Notify.func1
	/builds/nmeum/aports/community/go/src/go/src/os/signal/signal.go:127 +0x33
exit status 2
FAIL	misc/cgo/test	0.006s
2020/07/04 11:47:20 Failed: exit status 1
fatal error: index out of range

goroutine 18 [running]:
runtime.throw(0x81cbe0c, 0x12)
	/builds/nmeum/aports/community/go/src/go/src/runtime/panic.go:1116 +0x6a fp=0x886c610 sp=0x886c5fc pc=0x807d27a
runtime.panicCheck1(0x808ea74, 0x81cbe0c, 0x12)
	/builds/nmeum/aports/community/go/src/go/src/runtime/panic.go:34 +0xb9 fp=0x886c624 sp=0x886c610 pc=0x807ad19
runtime.goPanicIndex(0xad88, 0x2)
	/builds/nmeum/aports/community/go/src/go/src/runtime/panic.go:87 +0x37 fp=0x886c648 sp=0x886c624 pc=0x807adc7
runtime.sellock(0x886c748, 0x2, 0x2, 0x886c73c, 0x2, 0x2)
	/builds/nmeum/aports/community/go/src/go/src/runtime/select.go:48 +0x84 fp=0x886c65c sp=0x886c648 pc=0x808ea74
runtime.selectgo(0x886c748, 0x886c738, 0x2, 0x0, 0x80b327b)
	/builds/nmeum/aports/community/go/src/go/src/runtime/select.go:319 +0xca8 fp=0x886c724 sp=0x886c65c pc=0x808f868
runtime: unexpected return pc for misc/cgo/test.runTestSetgid called from 0xf0
stack: frame={sp:0x886c724, fp:0x886c77c} stack=[0x886c000,0x886c800)
0886c6a4:  00000000  08810230  00000018  00000002 
0886c6b4:  00000004  00000001  00000001  00000000 
0886c6c4:  08898000  08084f9d <runtime.wakeNetPoller+77>  08312150  089181ec 
0886c6d4:  0886c73c  089022a0  00000000  0881a5c0 
0886c6e4:  08898100  08898100  08898100  0882c000 
0886c6f4:  089022e4  08898144  08902354  0886c760 
0886c704:  08898140  8e478faf  0019c9b0  8e478faf 
0886c714:  0019c9b0  08898140  08898100  0817c160 <misc/cgo/test.runTestSetgid+192> 
0886c724: <0886c748  0886c738  00000002  00000000 
0886c734:  080b327b <sync.(*Mutex).Unlock+43>  00010000  e7c0ad88  00000000 
0886c744:  00000000  f7f881a8  f7f4727b  e7c0adb0 
0886c754:  000000f0  f7f881a8  00000080  ffffffff 
0886c764:  00000000  00000000  00000000  00000000 
0886c774:  00000000 !000000f0 >00000000  0886c78c 
0886c784:  00000001  00000001  00000080  ffffffff 
0886c794:  f7f85f98  00000000  f7f465de  000000f0 
0886c7a4:  f7f881a8  00000080  ffffffff  00000000 
0886c7b4:  00000000  00000000  0886c824  00000000 
0886c7c4:  00000000  00000000  00000000  00000000 
0886c7d4:  0891e0a0  081d6080  f7f473b3  0886c824 
0886c7e4:  00000000  f7f49b9b  f7f49c34  f7f881a8 
0886c7f4:  f7f85f98  f7f85f98 
misc/cgo/test.runTestSetgid(0x0)
	/builds/nmeum/aports/community/go/src/go/misc/cgo/test/setgid_linux.go:30 +0xc0 fp=0x886c77c sp=0x886c724 pc=0x817c160
created by testing.(*T).Run
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1042 +0x2ad

goroutine 1 [chan receive]:
testing.(*T).Run(0x891e0a0, 0x81c9b4d, 0xa, 0x81d48b0, 0x37e)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1043 +0x2cc
testing.runTests.func1(0x891e000)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1284 +0x5a
testing.tRunner(0x891e000, 0x887070c)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:991 +0xb4
testing.runTests(0x891a000, 0x830ef40, 0x59, 0x59, 0x0)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1282 +0x22d
testing.(*M).Run(0x8918000, 0x0)
	/builds/nmeum/aports/community/go/src/go/src/testing/testing.go:1199 +0x114
main.main()
	_testmain.go:226 +0x104

goroutine 20 [syscall]:
os/signal.signal_recv(0x80abc46)
	/builds/nmeum/aports/community/go/src/go/src/runtime/sigqueue.go:147 +0x12f
os/signal.loop()
	/builds/nmeum/aports/community/go/src/go/src/os/signal/signal_unix.go:23 +0x1a
created by os/signal.Notify.func1
	/builds/nmeum/aports/community/go/src/go/src/os/signal/signal.go:127 +0x33
exit status 2
FAIL	misc/cgo/test	0.008s
2020/07/04 11:47:20 Failed: exit status 1
scatter = 0x8186e38
PASS
ok  	misc/cgo/test	1.917s
PASS
ok  	misc/cgo/testtls	0.003s

@richfelker
Copy link

The fix here was not correct and will likely break when musl changes/reduces implementation-internal use of signals. See 49b017f#commitcomment-47808694

@ianlancetaylor
Copy link
Member

How else could we fix it?

@richfelker
Copy link

In the strace, I see

[pid 14291] rt_sigprocmask(SIG_SETMASK, ~[HUP INT QUIT ILL TRAP ABRT BUS FPE SEGV TERM STKFLT CHLD PROF SYS RTMIN RT_1], NULL, 8) = 0

which indicates go code passed a sigset_t that it bitwise-manipulated itself, bypassing libc sigaddset, sigfillset, etc., to sigprocmask. Ideally don't do this (musl does not have a contract that the bit representation of sigset_t have any correspondence to the kernel type; it's an opaque type), but if you really musl, at least ensure that no bits except the standard non-RT signals and the range SIGRTMIN..SIGRTMAX are set.

In particular, musl does not guarantee that 32-34 are reserved for impl-internal use. We might free one or two of them up at some point in the future (the timer one is gratuitous and not needed), or even allocate them dynamically (probing for one that works) so that the implementation-internal signals work with qemu-user emulation.

@ianlancetaylor
Copy link
Member

Go on Linux doesn't use the C library. It makes system calls directly.

Here we are dealing with cases in which Go code is linked with C code. The C code naturally does use the C library. We're trying to make the Go code, making direct system calls, work well with that C code, using the C library.

According to this issue and my looking at the source code, some versions of musl use signal 34 to implement setgid. This is similar to how glibc uses signal 33 to implement setgid. We already had a workaround for glibc's signal 33, so we just extended it to musl's signal 34.

I don't see any perfect solutions here.

@richfelker
Copy link

If you want to make FFI calls to C code using the C std library (or to libc itself), you can't be bypassing that runtime and changing its state out from under it, which is what happens if you manipulate the signal mask this way. What you're doing is fine in programs without C FFI, but not fine when you call to C.

@ianlancetaylor
Copy link
Member

I don't see any perfect solutions here.

@golang golang locked and limited conversation to collaborators Mar 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge help wanted NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

6 participants