Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: "found bad pointer in Go heap" when running net/http test #42944

Closed
bcmills opened this issue Dec 2, 2020 · 7 comments
Closed

runtime: "found bad pointer in Go heap" when running net/http test #42944

bcmills opened this issue Dec 2, 2020 · 7 comments
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker
Milestone

Comments

@bcmills
Copy link
Contributor

bcmills commented Dec 2, 2020

2020-12-01T23:14:14-6f84993/linux-386

CC @aclements @mknyszek

@bcmills bcmills added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Dec 2, 2020
@mknyszek
Copy link
Contributor

mknyszek commented Dec 2, 2020

Looks like it found an invalid pointer on the stack. This could be a compiler liveness thing, possibly?

@cherrymui
Copy link
Member

It could be. But it seems weird, though. The bad pointer is found at 0x9cf38f4, which is the argument area of this frame:

goroutine 10864 [preempted (scan)]:
runtime.slicebytetostring(0x0, 0x9e1d0f6, 0xaf, 0xaf82077, 0xed6f2ae7)
	/workdir/go/src/runtime/string.go:80 +0xbb fp=0x9cf38e8 sp=0x9cf38e4 pc=0x8096abb

It is FP+12, i.e. the fourth word. The function signature is

func slicebytetostring(buf *tmpBuf, ptr *byte, n int) (str string)

So it is the first word of the result. PC 0xbb is the instruction right after the morestack call. So it is preempted synchronously at function entry, where the result shouldn't be live. And it is indeed not, if I decode the liveness map correctly.

@cherrymui
Copy link
Member

cherrymui commented Dec 2, 2020

Since it is not statically live, another possibility is that it is scanned as a stack object. From the stack trace

runtime.scanstack(0x9e35a40, 0x9c2cb84)
	/workdir/go/src/runtime/mgcmark.go:833 +0x3e6 fp=0xe725e228 sp=0xe725e0d8 pc=0x8066596

It is actually scanning stack objects.

str in slicebytetostring is address-taken, so it is a stack object. From the object dump,

*(object+180) = 0x9cf38f4

points to the object. object+180 is 0x9cf39a8, which is SP+192 in this frame

crypto/x509.(*CertPool).AppendCertsFromPEM(0x9c96db0, 0xa5c861a, 0x1fbc4, 0x1fdc4, 0x3a3de)
	/workdir/go/src/crypto/x509/cert_pool.go:226 +0x1d8 fp=0x9cf39b8 sp=0x9cf38e8 pc=0x81f2cb8

which is autotmp_45. autotmp_45 is actually live at the call of runtime.slicebytetostring, and it is holding the value of SP+12 (=0x9cf38f4). I think it is autotmp_45 being live, which holds the stack object (str in slicebytetostring) live, which isn't initialized at function entry.

I can see a few possibilities for fixing:

  • autotmp_45 is just a spill of LEAL 12(SP), an address in the callee's arg area. I think it is live so it can be reused in the loop in AppendCertsFromPEM. Why it is not rematerializable? Typically things like this is rematerialized so it never spilled and picked up by the GC. Maybe we should make it rematerializable.
  • When scanning stack objects, a reference in the caller's frame (or above) should not hold the stack object live, because we cannot install a stack object's address into the caller's frame, as the escape analysis should treat that as an escape.
  • Stack objects should never be live at function entry.

Any one of them would fix this.

@randall77
Copy link
Contributor

autotmp_45 is just a spill of LEAL 12(SP), an address in the callee's arg area. I think it is live so it can be reused in the loop in AppendCertsFromPEM. Why it is not rematerializable? Typically things like this is rematerialized so it never spilled and picked up by the GC. Maybe we should make it rematerializable.

It should be, it is marked as such in 386Ops.go.
The correctness of generated code shouldn't depend on whether rematerialization happens, though.

I think it's basically never a good idea to have pointers into the callee args area. At least, those pointers should be very short lived - just enough to be passed to memmove. Maybe we could type such pointers as uintptr instead of a real pointer, as the GC should never care about them.

@cherrymui
Copy link
Member

cherrymui commented Dec 2, 2020

It should be, it is marked as such in 386Ops.go.

LEAL is rematerializeable. But in this case it is ADDLconst <*[28]byte> [12] v2 (v2 is SP). Maybe we want to rewrite ADDLconst SP to LEAL.

Maybe we could type such pointers as uintptr instead of a real pointer, as the GC should never care about them.

Sounds like a good idea.

@cherrymui
Copy link
Member

Maybe we could type such pointers as uintptr instead of a real pointer, as the GC should never care about them.

Hmmm, if the stack moves, these pointers still need to be adjusted. So we probably cannot simply type them as uintptr. Maybe we really need to ensure they are short lived (e.g. always rematerialized, never spilled).

@aclements aclements added this to the Go1.16 milestone Dec 3, 2020
@gopherbot
Copy link
Contributor

Change https://golang.org/cl/275174 mentions this issue: cmd/compile: make sure address of offset(SP) is rematerializeable

@aclements aclements added the okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 label Dec 3, 2020
@golang golang locked and limited conversation to collaborators Dec 3, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FrozenDueToAge NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. okay-after-beta1 Used by release team to mark a release-blocker issue as okay to resolve either before or after beta1 release-blocker
Projects
None yet
Development

No branches or pull requests

6 participants