Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compress/lwz hangs building tests for target=wasi gc=conservative #2525

Closed
dgryski opened this issue Jan 14, 2022 · 29 comments
Closed

compress/lwz hangs building tests for target=wasi gc=conservative #2525

dgryski opened this issue Jan 14, 2022 · 29 comments
Labels
wasm WebAssembly

Comments

@dgryski
Copy link
Member

dgryski commented Jan 14, 2022

~/go/src/github.com/tinygo-org/tinygo $ time tinygo test  -x -target=wasi compress/lzw

^C

real	66m56.459s
user	66m31.609s
sys	0m11.294s

Just realized I had this running for over a hour trying to build. This isn't the interp phase (which would time out) nor the test itself (since we would see the wasmtime command line.). Just building. Maybe an infinite-ish loop somewhere?

@dgryski dgryski added the wasm WebAssembly label Jan 14, 2022
@dgryski
Copy link
Member Author

dgryski commented Jan 16, 2022

With -gc=leaking, it passes. With -scheduler=none it fails with panic: expected inttoptr. The hang seems to be associated with -gc=conservative.

@dgryski dgryski changed the title compress/lwz hangs building tests for target=wasi compress/lwz hangs building tests for target=wasi gc=conservative Jan 16, 2022
@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

How I usually debug these issues is by running it for a while and then killing it with SIGABRT. That will result in a panic-like message with stack trace.

It seems to be stuck in tinygo.org/x/go-llvm._Cfunc_LLVMTargetMachineEmitToMemoryBuffer, which is interesting. Sounds like a LLVM bug.

@dgryski
Copy link
Member Author

dgryski commented Jan 17, 2022

Yeah, I found that last night. I wonder if something about the data structure we create is creating a loop in the EmitToMemoryBuffer code, although given that the memory size doesn't seem to be increasing at a huge rate, it's possible it's looping trying to look something up rather than looping trying to add something into the buffer.

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

...but I can produce bitcode, which is useful for independent debugging:

FAIL
Command exited with non-zero status 1
83.64user 0.37system 1:23.57elapsed 100%CPU (0avgtext+0avgdata 417308maxresident)k
0inputs+8808outputs (0major+157320minor)pagefaults 0swaps

(The FAIL there is probably a bug, it shouldn't print that with -c, @niaow).

I wonder if something about the data structure we create is creating a loop in the EmitToMemoryBuffer code

That step is basically the whole machine code generation step, so a lot of stuff is happening there.

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

Huh, that bitcode file is 4.1MB. That's pretty big.
Here is what happens when I run it through llc and kill it after a while:

~/src/github.com/tinygo-org/tinygo$ llc-13 test.bc
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: llc-13 test.bc
1.	Running pass 'Function Pass Manager' on module 'test.bc'.
2.	Running pass 'WebAssembly Instruction Selection' on function '@"(*compress/lzw.Reader).Reset"'
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm3sys15PrintStackTraceERNS_11raw_ostreamEi+0x31)[0x7f2d87a13bb1]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm3sys17RunSignalHandlersEv+0x50)[0x7f2d87a11d70]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(+0xdd60b2)[0x7f2d87a140b2]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12730)[0x7f2d86c2f730]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(+0x135b56a)[0x7f2d87f9956a]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(+0x1359eea)[0x7f2d87f97eea]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm12SelectionDAG7CombineENS_12CombineLevelEPNS_9AAResultsENS_10CodeGenOpt5LevelE+0x903)[0x7f2d87f99233]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm16SelectionDAGISel17CodeGenAndEmitDAGEv+0xf6)[0x7f2d88163656]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm16SelectionDAGISel20SelectAllBasicBlocksERKNS_8FunctionE+0x1865)[0x7f2d88162f35]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm16SelectionDAGISel20runOnMachineFunctionERNS_15MachineFunctionE+0x7cc)[0x7f2d88160a8c]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE+0xfe)[0x7f2d87d5b55e]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE+0x390)[0x7f2d87b48940]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE+0x33)[0x7f2d87b4e533]
/lib/x86_64-linux-gnu/libLLVM-13.so.1(_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE+0x3a8)[0x7f2d87b48f48]
llc-13(main+0x209a)[0x40fc1a]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb)[0x7f2d8675f09b]
llc-13(_start+0x2a)[0x40d1ba]

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

Trying bugpoint now. It's rather frustrating because I need to keep killing the process every time after 10 seconds or so.

@dgryski
Copy link
Member Author

dgryski commented Jan 17, 2022

I'm trying to see if it happens with llvm 12, looking at the commit just before llvm13 support was merged into tinygo.

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

Getting close. The problem is (*compress/lzw.Reader).Reset.

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

FYI: I'm running bugpoint-13 --compile-custom llc-13 test.bc (with test.bc from tinygo test -c ...).

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

Actually, there's a much better command: bugpoint-13 --compile-custom llc-13 --timeout 1 test.bc. That timeout flag is very useful! I can now go and watch it figure out what kind of code is causing the crash.

@dgryski
Copy link
Member Author

dgryski commented Jan 17, 2022

Still happens with llvm12 at the checkout before llvm13 change landed. Building llvm11 and we'll see if it happens there too..

@dgryski
Copy link
Member Author

dgryski commented Jan 17, 2022

Still happens with llvm11 when the testenv stub was merged. Seems like this never worked, we just never got around to testing / noticing.

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

Here is the buggy file: test.zip

Extract it, and run llc-13 test.bc (or however your variant of llc is called).

@dgryski
Copy link
Member Author

dgryski commented Jan 17, 2022

Trying to minimize the lwz test source code, after a while I decided to try with -opt=0 and got the following error instead:

Caused by:
    0: WebAssembly translation error
    1: Invalid input WebAssembly code at offset 2573209: locals exceed maximum

Will keep plugging away at TestReaderReset ...

@aykevl
Copy link
Member

aykevl commented Jan 17, 2022

I think I might see what's going on here. Let me check...

@aykevl
Copy link
Member

aykevl commented Jan 18, 2022

Ok, so I think what's going wrong here.

The Reset function itself looks pretty harmless:

func (r *Reader) Reset(src io.Reader, order Order, litWidth int) {
	*r = Reader{}
	r.init(src, order, litWidth)
}

The issue however is that Reader is a really, really large struct. Like 20kB in size.
Unfortunately, in Go SSA it looks like this:

# Name: (*compress/lzw.Reader).Reset
# Package: compress/lzw
# Location: /usr/local/go1.17/src/compress/lzw/reader.go:237:18
# Locals:
#   0:	t0 Reader
func (r *Reader) Reset(src io.Reader, order Order, litWidth int):
0:                                                                entry P:0 S:0
	; var r *compress/lzw.Reader @ 238:3 is r
	; address of *ast.StarExpr @ 238:2 is r
	t0 = local Reader ()                                            *Reader
	t1 = *t0                                                         Reader
	; address of *ast.CompositeLit @ 238:7 is r
	*r = t1
	; var r *compress/lzw.Reader @ 239:2 is r
	; var src io.Reader @ 239:9 is src
	; var order compress/lzw.Order @ 239:14 is order
	; var litWidth int @ 239:21 is litWidth
	t2 = (*Reader).init(r, src, order, litWidth)                         ()
	; *ast.CallExpr @ 239:2 is t2
	return

That would be roughly equivalent to the following Go code:

func (r *Reader) Reset(src io.Reader, order Order, litWidth int):
	var t0 Reader{}
	t1 = *t0
	*r = t1
	r.init(src, order, litWidth)                         ()
	return

So it allocates a whole Reader{} (of 20kB!), loads it, and stores it to the r.
This is lowered more or less directly to LLVM IR as-is, but LLVM has an optimization that converts large loads/stores to lots of small loads/stores because processors can't do these large loads/stores directly anyway and it usually makes later optimizations more effective.

Clang does this a bit differently. For any zero store, it calls the LLVM memset function intrinsic which is later converted back to a regular store instruction or to an actual memset call. So Clang avoids this issue.

...I can't think of a good fix right away.

@dgryski
Copy link
Member Author

dgryski commented Jan 18, 2022

Can we call the memset intrinsic ourselves somehow?

@aykevl
Copy link
Member

aykevl commented Jan 18, 2022

The problem is that this information is lost in Go SSA form, unless we recompute it. Which is a bit difficult.

@aykevl
Copy link
Member

aykevl commented Jan 18, 2022

But yeah, once we know that a given store stores a zero value, we can trivially convert that to a memset. In fact, I already did that (but of course it didn't help):

diff --git a/compiler/compiler.go b/compiler/compiler.go
index 6c0c98b2..14375e48 100644
--- a/compiler/compiler.go
+++ b/compiler/compiler.go
@@ -1192,6 +1199,14 @@ func (b *builder) createInstruction(instr ssa.Instruction) {
                        // nothing to store
                        return
                }
+               if llvmVal == llvm.ConstNull(llvmVal.Type()) {
+                       if llvmAddr.Type() != b.i8ptrType {
+                               llvmAddr = b.CreateBitCast(llvmAddr, b.i8ptrType, "")
+                       }
+                       size := llvm.ConstInt(b.uintptrType, b.targetData.TypeAllocSize(llvmVal.Type()), false)
+                       b.createMemset0Call(llvmAddr, size)
+                       return
+               }
                b.CreateStore(llvmVal, llvmAddr)
        default:
                b.addError(instr.Pos(), "unknown instruction: "+instr.String())
--- a/compiler/intrinsics.go
+++ b/compiler/intrinsics.go
@@ -33,6 +33,13 @@ func (b *builder) createMemoryCopyCall(fn *ssa.Function, args []ssa.Value) (llvm
 // memory, declaring the function if needed. These calls will be lowered to
 // regular libc memset calls if they aren't optimized out in a different way.
 func (b *builder) createMemoryZeroCall(args []ssa.Value) (llvm.Value, error) {
+       ptr := b.getValue(args[0])
+       len := b.getValue(args[1])
+       b.createMemset0Call(ptr, len)
+       return llvm.Value{}, nil
+}
+
+func (b *builder) createMemset0Call(ptr, len llvm.Value) {
        fnName := "llvm.memset.p0i8.i" + strconv.Itoa(b.uintptrType.IntTypeWidth())
        llvmFn := b.mod.NamedFunction(fnName)
        if llvmFn.IsNil() {
@@ -40,13 +47,12 @@ func (b *builder) createMemoryZeroCall(args []ssa.Value) (llvm.Value, error) {
                llvmFn = llvm.AddFunction(b.mod, fnName, fnType)
        }
        params := []llvm.Value{
-               b.getValue(args[0]),
+               ptr,
                llvm.ConstInt(b.ctx.Int8Type(), 0, false),
-               b.getValue(args[1]),
+               len,
                llvm.ConstInt(b.ctx.Int1Type(), 0, false),
        }
        b.CreateCall(llvmFn, params, "")
-       return llvm.Value{}, nil
 }
 
 var mathToLLVMMapping = map[string]string{

@dgryski
Copy link
Member Author

dgryski commented Jan 18, 2022

So this could hit any other large stack-allocated structure?

@aykevl
Copy link
Member

aykevl commented Jan 18, 2022

Yes.

@dgryski
Copy link
Member Author

dgryski commented Jan 18, 2022

So why doesn't this trigger with gc=leaking?

@dgryski
Copy link
Member Author

dgryski commented Jan 18, 2022

The reason your patch didn't work was because the if wasn't triggering. It looks like the llvm.ConstNull that's stored from the ssa.Alloc in createExpr isn't matching when we check later in the ssa.Store case.

@aykevl
Copy link
Member

aykevl commented Jan 19, 2022

The reason your patch didn't work was because the if wasn't triggering. It looks like the llvm.ConstNull that's stored from the ssa.Alloc in createExpr isn't matching when we check later in the ssa.Store case.

Yes. That's because it's not a constant (as I expected) but a load from that alloca (ssa.Alloc). A load is an instruction, not a constant.

@dgryski
Copy link
Member Author

dgryski commented Jan 19, 2022

Last night I was playing with trying to detect "if the previous use of this ssa value was in an alloc" , but rapidly ran into my knowledge of both llvm and Go's SSA package. Just an idea, anyway.

@dgryski
Copy link
Member Author

dgryski commented Feb 2, 2023

(Just to update: This is still happening)

@dgryski
Copy link
Member Author

dgryski commented Apr 14, 2023

This is now fixed (but not sure why/how).

@dgryski
Copy link
Member Author

dgryski commented Apr 14, 2023

#3628 <-- compress/lzw now runs tests under wasi.

@dgryski dgryski added the next-release Will be part of next release label Apr 14, 2023
@deadprogram
Copy link
Member

This is part of the v0.28 release so now closing this issue. Thanks!

@deadprogram deadprogram removed the next-release Will be part of next release label Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasm WebAssembly
Projects
None yet
Development

No branches or pull requests

3 participants