Nodes crashing on stack overflow due to context finalizer endless recursive call #3312

Closed
asafm opened this issue Mar 2, 2021 · 8 comments · Fixed by #3401

@asafm
Contributor

asafm commented Mar 2, 2021

We had 12 db nodes crash, one after another, in a span of ~10 min, due to a stack overflow (see stderr below).

2021-03-01T12:45:13.853Z 2021/03/01 12:45:13 Go Runtime version: go1.13.15
2021-03-01T12:45:13.853Z 2021/03/01 12:45:13 Build Version:      v1.0.0
2021-03-01T12:45:13.853Z 2021/03/01 12:45:13 Build Revision:     a3853ee56
2021-03-01T12:45:13.853Z 2021/03/01 12:45:13 Build Branch:       HEAD
2021-03-01T12:45:13.853Z 2021/03/01 12:45:13 Build Date:         2020-11-17-23:19:06
2021-03-01T12:45:13.853Z 2021/03/01 12:45:13 Build TimeUnix:     1605655146
2021-03-01T12:45:11.516Z runtime: goroutine stack exceeds 1000000000-byte limit
2021-03-01T12:45:11.516Z fatal error: stack overflow
2021-03-01T12:45:11.519Z 
2021-03-01T12:45:11.519Z runtime stack:
2021-03-01T12:45:11.520Z runtime.throw(0x1f08d7a, 0xe)
2021-03-01T12:45:11.520Z     /usr/local/go/src/runtime/panic.go:774 +0x72
2021-03-01T12:45:11.520Z runtime.newstack()
2021-03-01T12:45:11.520Z     /usr/local/go/src/runtime/stack.go:1047 +0x6e9
2021-03-01T12:45:11.520Z runtime.morestack()
2021-03-01T12:45:11.520Z     /usr/local/go/src/runtime/asm_amd64.s:449 +0x8f
2021-03-01T12:45:11.520Z 
2021-03-01T12:45:11.520Z goroutine 2771452249 [running]:
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).parentCtx(0xc17207fdc0, 0x0, 0x0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:318 +0xad fp=0xc5666c8378 sp=0xc5666c8370 pc=0xbac0fd
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:100 +0x2f fp=0xc5666c83b0 sp=0xc5666c8378 pc=0xbab50f
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c83e8 sp=0xc5666c83b0 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8420 sp=0xc5666c83e8 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8458 sp=0xc5666c8420 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8490 sp=0xc5666c8458 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c84c8 sp=0xc5666c8490 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8500 sp=0xc5666c84c8 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8538 sp=0xc5666c8500 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8570 sp=0xc5666c8538 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c85a8 sp=0xc5666c8570 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c85e0 sp=0xc5666c85a8 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8618 sp=0xc5666c85e0 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8650 sp=0xc5666c8618 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8688 sp=0xc5666c8650 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c86c0 sp=0xc5666c8688 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c86f8 sp=0xc5666c86c0 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8730 sp=0xc5666c86f8 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8768 sp=0xc5666c8730 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.520Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c87a0 sp=0xc5666c8768 pc=0xbab53c
2021-03-01T12:45:11.520Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c87d8 sp=0xc5666c87a0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8810 sp=0xc5666c87d8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8848 sp=0xc5666c8810 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8880 sp=0xc5666c8848 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c88b8 sp=0xc5666c8880 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c88f0 sp=0xc5666c88b8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8928 sp=0xc5666c88f0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8960 sp=0xc5666c8928 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8998 sp=0xc5666c8960 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c89d0 sp=0xc5666c8998 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8a08 sp=0xc5666c89d0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8a40 sp=0xc5666c8a08 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8a78 sp=0xc5666c8a40 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ab0 sp=0xc5666c8a78 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ae8 sp=0xc5666c8ab0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8b20 sp=0xc5666c8ae8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8b58 sp=0xc5666c8b20 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8b90 sp=0xc5666c8b58 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8bc8 sp=0xc5666c8b90 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8c00 sp=0xc5666c8bc8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8c38 sp=0xc5666c8c00 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8c70 sp=0xc5666c8c38 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ca8 sp=0xc5666c8c70 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ce0 sp=0xc5666c8ca8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8d18 sp=0xc5666c8ce0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8d50 sp=0xc5666c8d18 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8d88 sp=0xc5666c8d50 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8dc0 sp=0xc5666c8d88 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8df8 sp=0xc5666c8dc0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8e30 sp=0xc5666c8df8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8e68 sp=0xc5666c8e30 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ea0 sp=0xc5666c8e68 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ed8 sp=0xc5666c8ea0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8f10 sp=0xc5666c8ed8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8f48 sp=0xc5666c8f10 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8f80 sp=0xc5666c8f48 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8fb8 sp=0xc5666c8f80 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c8ff0 sp=0xc5666c8fb8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9028 sp=0xc5666c8ff0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9060 sp=0xc5666c9028 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9098 sp=0xc5666c9060 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c90d0 sp=0xc5666c9098 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9108 sp=0xc5666c90d0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9140 sp=0xc5666c9108 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9178 sp=0xc5666c9140 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c91b0 sp=0xc5666c9178 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c91e8 sp=0xc5666c91b0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9220 sp=0xc5666c91e8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9258 sp=0xc5666c9220 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9290 sp=0xc5666c9258 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c92c8 sp=0xc5666c9290 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9300 sp=0xc5666c92c8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9338 sp=0xc5666c9300 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9370 sp=0xc5666c9338 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c93a8 sp=0xc5666c9370 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c93e0 sp=0xc5666c93a8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9418 sp=0xc5666c93e0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9450 sp=0xc5666c9418 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9488 sp=0xc5666c9450 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c94c0 sp=0xc5666c9488 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c94f8 sp=0xc5666c94c0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9530 sp=0xc5666c94f8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9568 sp=0xc5666c9530 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c95a0 sp=0xc5666c9568 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c95d8 sp=0xc5666c95a0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9610 sp=0xc5666c95d8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9648 sp=0xc5666c9610 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9680 sp=0xc5666c9648 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c96b8 sp=0xc5666c9680 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c96f0 sp=0xc5666c96b8 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9728 sp=0xc5666c96f0 pc=0xbab53c
2021-03-01T12:45:11.521Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.521Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9760 sp=0xc5666c9728 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9798 sp=0xc5666c9760 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c97d0 sp=0xc5666c9798 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9808 sp=0xc5666c97d0 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dabc70, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9840 sp=0xc5666c9808 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc17207fdc0, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9878 sp=0xc5666c9840 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc1b2407b20, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c98b0 sp=0xc5666c9878 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab730, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c98e8 sp=0xc5666c98b0 pc=0xbab53c
2021-03-01T12:45:11.522Z github.com/m3db/m3/src/x/context.(*ctx).RegisterFinalizer(0xc022dab960, 0x22aefc0, 0xc04b8d74a0)
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/context/context.go:102 +0x5c fp=0xc5666c9920 sp=0xc5666c98e8 pc=0xbab53c
2021-03-01T12:45:11.522Z ...additional frames elided...
2021-03-01T12:45:11.522Z created by github.com/m3db/m3/src/x/sync.(*workerPool).GoWithTimeout
2021-03-01T12:45:11.522Z     /go/src/github.com/m3db/m3/src/x/sync/worker_pool.go:82 +0x195
2021-03-01T12:45:11.522Z 

Attaching the complete stderr:
stderr.txt

@asafm
Contributor Author

asafm commented Mar 2, 2021

It seems that golang/go#7181 prevents us from knowing the exact origin of the problem. RegisterFinalizer has a large number of callers, and since the stack trace can't reveal which one started the recursion, I was thinking of rewriting the function below to use a loop instead of recursion, adding a sensible depth limit, and failing when that limit is exceeded. That way we would see the original source of the error. WDYT?

	parent := c.parentCtx()
	if parent != nil {
		parent.RegisterFinalizer(f)
		return
	}

	c.registerFinalizeable(finalizeable{finalizer: f})
}

@gibbscullen gibbscullen self-assigned this Mar 2, 2021
@gibbscullen
Collaborator

@asafm -- are you able to reproduce this issue? This may be an issue related to a high number of index blocks being accessed for a single query. There may be something that can be done with the child / parent relationship of our context objects.

@asafm
Contributor Author

asafm commented Mar 3, 2021

@gibbscullen This error is from production, 2 days ago. It happened again yesterday, 3 times on the same cluster: it knocked out 8 nodes, then 5 nodes an hour later, then 4 nodes an hour after that.

I can't reproduce it, because the panic stack trace doesn't reveal the code running inside that goroutine that ends up calling RegisterFinalizer endlessly.

Can you please explain how you arrived at many index blocks as the likely cause?

@benraskin92 I saw in the Git history that you wrote the original recursive call. WDYT about my refactor idea to help find the cause?

@asafm
Contributor Author

asafm commented Mar 8, 2021

After reading the code and gathering some background, it seems that:

  1. The huge stack trace we see above is caused by RegisterFinalizer. That method keeps going up the context tree until it reaches the root context and registers the finalizer there. In this case, it seems that the context tree is far deeper than usual, hence the stack exceeds its maximum size.
  2. Given (1), the root cause is whatever creates such a very long context tree.
  3. In context.go, it seems that the only way to create a parent-child relationship is through StartSampledTraceSpan or its caller StartTraceSpan.
  4. Thus the only way to create a long context tree is some place in the code where we either create a loop by mistake (a -> b -> c -> a) or somehow get into a state in which we call StartTraceSpan too many times.
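One observation on the loop hypothesis in (4): in the trace above, the receiver addresses passed to RegisterFinalizer repeat every five frames (0xc17207fdc0, 0xc1b2407b20, 0xc022dab730, 0xc022dab960, 0xc022dabc70, then 0xc17207fdc0 again), which looks consistent with a cycle rather than a merely long chain. A loop in a parent chain can be detected without any extra memory. The sketch below uses Floyd's tortoise-and-hare on a hypothetical `node` type; neither the type nor the `hasParentCycle` helper comes from the m3 code base.

```go
package main

// node is a hypothetical stand-in for a context that links to its parent.
type node struct {
	parent *node
}

// hasParentCycle reports whether following parent pointers from n ever
// revisits a node. slow advances one hop per step and fast advances two;
// the two pointers can only meet again if the chain loops back on itself.
func hasParentCycle(n *node) bool {
	slow, fast := n, n
	for fast != nil && fast.parent != nil {
		slow = slow.parent
		fast = fast.parent.parent
		if slow == fast {
			return true
		}
	}
	return false
}
```

On a chain that ends at a root, `fast` reaches nil and the function returns false; on a ring of five nodes like the one suggested by the trace, it returns true after a handful of steps.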

Since the number of callers to StartSampledTraceSpan and StartTraceSpan is quite large, I thought of adding a depth counter to the context and reporting an error when we pass a reasonable depth (100?). Since neither function returns an error, one way to go about it is to log an error when we cross that limit, with the span names concatenated, so we can pinpoint the problematic code that creates the long context tree and use that to fix the bug.
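A rough sketch of that diagnostic idea, assuming each context records its span name and depth. The `spanCtx` type, the `newSpanCtx` and `spanPath` helpers, and the `maxSpanDepth` value are hypothetical names chosen for illustration.

```go
package main

import (
	"log"
	"strings"
)

// maxSpanDepth is a hypothetical threshold for a "reasonable" depth.
const maxSpanDepth = 100

// spanCtx is a stand-in for a context that remembers its span name.
type spanCtx struct {
	parent *spanCtx
	name   string
	depth  int
}

// newSpanCtx creates a child of parent (or a root when parent is nil) and
// logs the full span path once, at the moment the limit is first crossed.
func newSpanCtx(parent *spanCtx, name string) *spanCtx {
	c := &spanCtx{parent: parent, name: name}
	if parent != nil {
		c.depth = parent.depth + 1
	}
	if c.depth == maxSpanDepth+1 {
		log.Printf("context depth %d exceeds limit %d: %s",
			c.depth, maxSpanDepth, spanPath(c))
	}
	return c
}

// spanPath joins the span names from the root down to c. The walk is
// bounded by c.depth hops, so it terminates even on a malformed chain.
func spanPath(c *spanCtx) string {
	names := make([]string, 0, c.depth+1)
	for cur, hops := c, 0; cur != nil && hops <= c.depth; cur, hops = cur.parent, hops+1 {
		names = append(names, cur.name)
	}
	// Names were collected leaf-first; reverse them to read root-first.
	for i, j := 0, len(names)-1; i < j; i, j = i+1, j-1 {
		names[i], names[j] = names[j], names[i]
	}
	return strings.Join(names, " -> ")
}
```

Logging only when the depth equals the limit plus one keeps the output to a single line per runaway chain instead of one line per additional span.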

WDYT? @benraskin92

@asafm
Contributor Author

asafm commented Mar 14, 2021

OK, here's the proposed fix I will write in my branch. Any feedback would be appreciated.

  1. Add a depth field to ctx, measuring how deep this context is in the tree of contexts.
  2. In StartSampledTraceSpan, if depth == 100, add a special tag to the span and mark it as an error. If depth > 100, use a no-op span. This will prevent the creation of a very long chain (above 100).
  3. a) RegisterFinalizer will be changed to be non-recursive.
    b) When climbing up the context tree to find the root, we can use the depth to detect a cycle in the tree: the depth would suddenly increase instead of decreasing. I think a cycle is impossible, though, since the parent is set only when a new child context is created, so no code injects an existing parent into an existing node.

@gibbscullen
Collaborator

@asafm thanks for the proposal -- we'll take a look and follow up with any feedback.

@asafm
Contributor Author

asafm commented Apr 6, 2021

@gibbscullen I implemented my proposal. Can one of the maintainers review it?

@gibbscullen
Collaborator

@asafm yes will do. Thanks!
