Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: "traceback stuck" during TestTraceCPUProfile #52829

Closed
rhysh opened this issue May 10, 2022 · 6 comments
Closed

runtime: "traceback stuck" during TestTraceCPUProfile #52829

rhysh opened this issue May 10, 2022 · 6 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Milestone

Comments

@rhysh
Copy link
Contributor

rhysh commented May 10, 2022

Looking over the failures for #52693, I see a new variety that also doesn't match #52764.

In this crash, runtime.gentraceback as called from runtime.sigprof complains of "fatal error: traceback stuck". The failure is on 32-bit linux/arm. That assertion is new in 74f0009, for #52116. CC @cherrymui

Both the assertion in gentraceback and the test that triggered the crash are new in the Go 1.19 cycle, which I think makes this a release blocker. CC @bcmills

greplogs -dashboard -l -e 'traceback stuck' --since=2022-04-01 -md
2022-05-10T14:22:16-59ef3a9/linux-arm-aws

ok  	runtime/internal/sys	0.002s
ok  	runtime/metrics	0.001s
ok  	runtime/pprof	8.903s
runtime: traceback stuck. pc=0x86748 sp=0x18edc30
stack: frame={sp:0x18edc30, fp:0x18edc30} stack=[0x18ed000,0x18ee000)
0x018edbb0:  0x00000000  0x00000000  0x41409000  0x00000000 
0x018edbc0:  0x41f00000  0x00000000  0x00000000  0x00000000 
0x018edbd0:  0x00000000  0x00000000  0x00000000  0x00000000 
0x018edbe0:  0x00000000  0x00000000  0x00000000  0x00000000 
0x018edbf0:  0x00000000  0x00000000  0x00000000  0x00000000 
0x018edc00:  0x00000000  0x00000000  0x00000000  0x00000000 
0x018edc10:  0x00000000  0x00000000  0x00000000  0x00000000 
0x018edc20:  0x00000000  0x00212000  0x00000000  0x00000000 
0x018edc30: >0x001889e8 <runtime/trace_test.cpuHog1+0x00000024>  0x001d2d4c  0x0000afc8  0x00000000 
0x018edc40:  0x0018892c <runtime/trace_test.cpuHogger+0x00000040>  0x00000000  0x000186a0 <runtime.gogetenv+0x00000100>  0x00000000 
0x018edc50:  0x0018887c <runtime/trace_test.TestTraceCPUProfile.func1.1+0x00000034>  0x00000000  0x00000000  0x2eb6be37 
0x018edc60:  0x00000000  0x002e9988  0x01cb6e00  0x0013682c <runtime/pprof.SetGoroutineLabels+0x00000034> 
0x018edc70:  0x000001ed  0x002fb888  0x882ac3f9  0xc096bbbb 
0x018edc80:  0x2eb6be37  0x00000000  0x002e9988  0x00136924 <runtime/pprof.Do+0x000000b8> 
0x018edc90:  0x001d2edc  0x002fb890  0x05f5e100  0x00000000 
0x018edca0:  0x001887b8 <runtime/trace_test.TestTraceCPUProfile.func1+0x00000224>  0x00202778  0x01950210  0x01cb6e00 
fatal error: traceback stuck

goroutine 0 [idle]:
runtime.throw({0x1c730b, 0xf})
	/workdir/go/src/runtime/panic.go:1028 +0x4c fp=0x18f7888 sp=0x18f7874 pc=0x4ec04
runtime.gentraceback(0x86748, 0x18edc30, 0x86748, 0x19064b0, 0x0, 0x18f7ab4, 0x40, 0x0, 0x0, 0x6)
	/workdir/go/src/runtime/traceback.go:488 +0x10d4 fp=0x18f7a7c sp=0x18f7888 pc=0x7918c
runtime.sigprof(0x86748, 0x18edc30, 0x86748, 0x19064b0, 0x1831440)
	/workdir/go/src/runtime/proc.go:4556 +0x1dc fp=0x18f7bc8 sp=0x18f7a7c pc=0x5c16c
runtime.sighandler(0x1b, 0x18f7c90, 0x18f7d10, 0x19064b0)
	/workdir/go/src/runtime/signal_unix.go:617 +0x588 fp=0x18f7c24 sp=0x18f7bc8 pc=0x67c98
runtime.sigtrampgo(0x1b, 0x18f7c90, 0x18f7d10)
	/workdir/go/src/runtime/signal_unix.go:479 +0x190 fp=0x18f7c5c sp=0x18f7c24 pc=0x673bc
runtime.sigtramp()
	/workdir/go/src/runtime/sys_linux_arm.s:585 +0x30 fp=0x18f7c70 sp=0x18f7c5c pc=0x865a4

goroutine 111 [running]:
runtime.Gosched(...)
	/workdir/go/src/runtime/proc.go:318
runtime/trace_test.cpuHog0(0x0, 0x186a0)
	/workdir/go/src/runtime/trace/trace_test.go:760 +0x60 fp=0x18edc40 sp=0x18edc30 pc=0x188a68
runtime/trace_test.cpuHog1(0x0)
	/workdir/go/src/runtime/trace/trace_test.go:752 +0x24 fp=0x18edc50 sp=0x18edc40 pc=0x1889e8
runtime/trace_test.cpuHogger(0x1d2edc, 0x2fb890, 0x5f5e100)
	/workdir/go/src/runtime/trace/trace_test.go:739 +0x40 fp=0x18edc8c sp=0x18edc50 pc=0x18892c
runtime/trace_test.TestTraceCPUProfile.func1.1({0x202778, 0x1950210})
	/workdir/go/src/runtime/trace/trace_test.go:613 +0x34 fp=0x18edca0 sp=0x18edc8c pc=0x18887c
runtime/pprof.Do({0x202738, 0x192a000}, {{0x1cb6e00, 0x1, 0x1}}, 0x18edd18)
	/workdir/go/src/runtime/pprof/runtime.go:40 +0xb8 fp=0x18edcdc sp=0x18edca0 pc=0x136924
runtime/trace_test.TestTraceCPUProfile.func1(0x5f5e100)
	/workdir/go/src/runtime/trace/trace_test.go:612 +0x224 fp=0x18edd4c sp=0x18edcdc pc=0x1887b8
runtime/trace_test.TestTraceCPUProfile(0x1803a40)
	/workdir/go/src/runtime/trace/trace_test.go:619 +0x19c fp=0x18edf98 sp=0x18edd4c pc=0x1877c4
testing.tRunner(0x1803a40, 0x1d2e90)
	/workdir/go/src/testing/testing.go:1449 +0x118 fp=0x18edfe0 sp=0x18edf98 pc=0xf29ac
testing.(*T).Run.func1()
	/workdir/go/src/testing/testing.go:1496 +0x30 fp=0x18edfec sp=0x18edfe0 pc=0xf3804
runtime.goexit()
	/workdir/go/src/runtime/asm_arm.s:824 +0x4 fp=0x18edfec sp=0x18edfec pc=0x853a4
created by testing.(*T).Run
	/workdir/go/src/testing/testing.go:1496 +0x3a0
@bcmills bcmills added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker labels May 10, 2022
@bcmills bcmills added this to the Go1.19 milestone May 10, 2022
@bcmills
Copy link
Contributor

bcmills commented May 10, 2022

(CC @golang/runtime)

@bcmills
Copy link
Contributor

bcmills commented May 10, 2022

@cherrymui
Copy link
Member

Thanks. I'll take a look.

@rhysh
Copy link
Contributor Author

rhysh commented May 10, 2022

@cherrymui
Copy link
Member

I think the signal PC is

runtime.save_g:
  mov     r11, lr
  bl      #-64 <runtime.read_tls_fallback>
  mov     lr, r11                             <-- here

At that very instruction, PC=LR. save_g is a leaf function with no frame, so the unwinder uses the return PC from the LR register, which equals PC, which would not make any progress.

That call is injected by the assembler at https://cs.opensource.google/go/go/+/master:src/cmd/internal/obj/arm/obj5.go;l=69

I'll see if we can make save_g have a frame and save the LR on stack in the usual way.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/405482 mentions this issue: runtime: not mark save_g NOFRAME on ARM

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker
Projects
None yet
Development

No branches or pull requests

4 participants