Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: inconsistent integer arithmetic result on Go 1.22+arm64 with/without -race #68227

Closed
mathetake opened this issue Jun 28, 2024 · 9 comments
Assignees
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. Critical A critical problem that affects the availability or correctness of production systems built using Go
Milestone

Comments

@mathetake
Copy link
Contributor

Go version

go version go1.22.3 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE='on'
GOARCH='arm64'
GOBIN=''
GOCACHE='/Users/mathetake/Library/Caches/go-build'
GOENV='/Users/mathetake/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/mathetake/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/mathetake/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.3'
GCCGO='gccgo'
AR='ar'
CC='clang'
CXX='clang++'
CGO_ENABLED='1'
GOMOD='/Users/mathetake/wazero/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/zv/cj1015lx5bndpw3n_qsx9mh40000gn/T/go-build2746109211=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

The following is the small repro that I managed to craft, and I run go test . and go test -race ..

type someType []uint64

func (s *someType) push(v uint64) {
	*s = append(*s, v)
}

func (s *someType) problematicFn(x1Lo, x1Hi, x2Lo, x2Hi uint64) {
	r1 := int32(int16(x1Lo>>0)) * int32(int16(x2Lo>>0))
	r3 := int32(int16(x1Lo>>32)) * int32(int16(x2Lo>>32))
	r4 := int32(int16(x1Lo>>48)) * int32(int16(x2Lo>>48))
	r5 := int32(int16(x1Hi>>0)) * int32(int16(x2Hi>>0))
	r7 := int32(int16(x1Hi>>32)) * int32(int16(x2Hi>>32))
	r8 := int32(int16(x1Hi>>48)) * int32(int16(x2Hi>>48))
	s.push(uint64(uint32(r1)) | (uint64(uint32(r3+r4)) << 32))
	s.push(uint64(uint32(r5)) | (uint64(uint32(r7+r8)) << 32))
}

func TestProblematicFn(t *testing.T) {
	s := &someType{}
	s.problematicFn(281479271743489, 281479271743489, 18446744073709551615, 18446744073709551615)
	if (*s)[0] != 18446744069414584319 {
		t.Fatal("unexpected result")
	}
	if (*s)[1] != 18446744073709551615 {
		t.Fatal("unexpected result")
	}
}

What did you see happen?

I got the assertion failure only for the case with -race. Note that this also happens for linux/arm64 but only when Go 1.22.
When I briefly run with the different Go version, I got the consistent result but the assertion was failing, meaning that the compiled machine code has some undefined behavior. I suspect that seems the root case.

What did you expect to see?

Consistent results regardless of with or withour -race flag.

@mathetake mathetake changed the title Inconsistent arithmetic computation result on Go 1.22+arm64 with or without -race Inconsistent integer arithmetic result on Go 1.22+arm64 with/without -race Jun 28, 2024
@ianlancetaylor
Copy link
Contributor

CC @randall77 @golang/arm

@randall77
Copy link
Contributor

randall77 commented Jun 28, 2024

I can reproduce. -race is not necessary, any function call causes the problem. Adding fmt.Printf("%x\n", r1) after calculating r1 is enough.

I see the problem, it's a missing zero-extension.

We start with this code:

v21 (15) = MULW <int32> v20 v17 (r1[int32])
v108 (+22) = MOVWUreg <uint64> v21

which is fine.

Then in late lower, we decide that because MULW already zeroes the high 32 bits, we don't need the zero extension. All uses of v108 are changed to use v21 directly.

But then during regalloc, we decide that we need to save/restore v21 around a call, and we use a 32-bit store to save it and a 32-bit signed load to restore it. Boom, that value no longer has zeroed high bits. That causes one of the ORs at the very end to set a bit it shouldn't.

We've seen this kind of error before #21963 #66066, and also kind of similar is #59367. My commit message text in CL 568616, which fixed #66066, reads

I am also worried about spilling and restoring. Spilling and restoring
doesn't preserve the high bits, but instead sets them to a known value
(often 0, but in some cases it could be sign-extended). I am unable
to come up with a case that would cause a problem here, so leaving for
another time.

This issue is the case I needed :)

The underlying confusion is that when a type only fills part of a register, the upper part is junk. So just because MULW clears the upper 32 bits, it doesn't mean that those upper bits will be preserved from the MULW to the use. The MULW would have to be typed 64 bits wide to preserve its upper bits. (Which is one possible fix here, but I will look further.)

@randall77 randall77 self-assigned this Jun 28, 2024
@randall77 randall77 added this to the Go1.23 milestone Jun 28, 2024
@mathetake
Copy link
Contributor Author

cool, yeah I also noticed that this happens with or without fmt.Println, so I suspected something wrong around regalloc. Glad to hear my repro was what you needed!

@randall77 randall77 added the Critical A critical problem that affects the availability or correctness of production systems built using Go label Jun 28, 2024
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/595675 mentions this issue: cmd/compile: don't elide zero extension on top of signed values

@randall77
Copy link
Contributor

@gopherbot please open backport issue for 1.22.

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #68230 (for 1.22).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@seankhliao seankhliao changed the title Inconsistent integer arithmetic result on Go 1.22+arm64 with/without -race cmd/compile: inconsistent integer arithmetic result on Go 1.22+arm64 with/without -race Jun 28, 2024
@seankhliao seankhliao added compiler/runtime Issues related to the Go compiler and/or runtime. arch-amd64 arch-arm64 and removed arch-amd64 labels Jun 28, 2024
@thanm
Copy link
Contributor

thanm commented Jul 3, 2024

DIscussed during the weekly release checkin -- does this bug need to be fixed in 1.21 as well?

@randall77
Copy link
Contributor

I don't believe so. The rules that triggered the bug were added in 1.22.

Mchnan pushed a commit to Mchnan/go-sylixos that referenced this issue Jul 9, 2024
v = ... compute some value, which zeros top 32 bits ...
w = zero-extend v

We want to remove the zero-extension operation, as it doesn't do anything.
But if v is typed as a signed value, and it gets spilled/restored, it
might be re-sign-extended upon restore. So the zero-extend isn't actually
a NOP when there might be calls or other reasons to spill in between v and w.

Fixes golang#68227

Change-Id: I3b30b8e56c7d70deac1fb09d2becc7395acbadf8
Reviewed-on: https://go-review.googlesource.com/c/go/+/595675
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Joedian Reid <[email protected]>
Reviewed-by: Cuong Manh Le <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. Critical A critical problem that affects the availability or correctness of production systems built using Go
Projects
None yet
Development

No branches or pull requests

7 participants