Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arm64: Segfault when enabling use_egraphs=true #5126

Closed
alexcrichton opened this issue Oct 25, 2022 · 4 comments
Closed

arm64: Segfault when enabling use_egraphs=true #5126

alexcrichton opened this issue Oct 25, 2022 · 4 comments
Labels
fuzz-bug Bugs found by a fuzzer

Comments

@alexcrichton
Copy link
Member

Given this input wasm module:

(module
  (type (;0;) (func))
  (func (;0;) (type 0)
    global.get 0
    i32.eqz
    if  ;; label = @1
      unreachable
    end
    global.get 0
    i32.const 1
    i32.sub
    global.set 0
    i32.const -320017172
    v128.load32_splat
    i8x16.bitmask
    v128.load32_zero align=1
    i64x2.extend_high_i32x4_u
    i64x2.extend_high_i32x4_u
    v128.const i32x4 0x9bececec 0x00007171 0x00000000 0x00000000
    drop
    drop
  )
  (memory (;0;) 0 3)
  (global (;0;) (mut i32) i32.const 1000)
  (export "" (func 0))
  (export "1" (memory 0))
  (data (;0;) (i32.const 0) "")
)

I get these results on a local arm64 marchine:

$ cargo run -q testcase0.wat --invoke ''
Error: failed to run main module `testcase0.wat`

Caused by:
    0: failed to invoke ``
    1: wasm trap: out of bounds memory access
       wasm backtrace:
           0:   0x44 - <unknown>!<wasm function 0>

vs

$ cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true
zsh: segmentation fault  cargo run -q testcase0.wat --invoke '' --cranelift-set use_egraphs=true

The faulting instruction appears to be:

(gdb) disas $pc,$pc+20
Dump of assembler code from 0xfffff7b25038 to 0xfffff7b2504c:
=> 0x0000fffff7b25038:  ld1r    {v1.4s}, [x5]
   0x0000fffff7b2503c:  sshr    v25.16b, v1.16b, #7
   0x0000fffff7b25040:  mov     x4, #0x201                      // #513
   0x0000fffff7b25044:  movk    x4, #0x804, lsl #16
   0x0000fffff7b25048:  movk    x4, #0x2010, lsl #32
End of assembler dump.
(gdb) print/x $x5
$1 = 0xffff5cececec

so I think this may be an alignment issue? I'm not 100% familiar with vector instructions on arm myself.

cc @cfallin

@alexcrichton
Copy link
Member Author

Oh actually reading over the code for more than half a second I realize that this is expected to be an out-of-bounds segfault, so the actual issue here may be that the metadata for "this instruction is allowed to trap" may be getting accidentally lost during the egraphs translation

@alexcrichton alexcrichton added the fuzz-bug Bugs found by a fuzzer label Oct 25, 2022
@alexcrichton
Copy link
Member Author

Perhaps related, but https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=52721 is an issue on x86_64 which also may be related to egraphs since the fuzz configuration there has egraphs enabled. The problem in the fuzz bug is that it's failing to execute this function where it should be successful but a trap is showing up. The fuzz test case is using a CustomUnaligned memory which means that the base address of linear memory is forcibly unaligned. That may or may not be a valid thing to do at runtime, but it seems like it may be related to this egraphs issue?

@cfallin
Copy link
Member

cfallin commented Oct 25, 2022

Thanks!

I have a meta-thought here actually: we've had some discussions in the last few days about a way to restructure the egraph storage to be more directly integrated into the DataFlowGraph (core CLIF structure) which would alter the specific details leading to some of these recent fuzzbugs (storing types appropriately, storing trap metadata). We're pretty sure we want to do it, we just don't have the time quite yet (perhaps in the next few weeks). Rather than spending time chasing down these fuzzbugs and then refactoring again in a few weeks, I think it might make more sense to just turn off egraph mode in fuzzing for now; to be re-enabled later. I'll go ahead and create a PR for this and we can discuss further there...

cfallin added a commit to cfallin/wasmtime that referenced this issue Oct 25, 2022
As per [this comment], with a few recent discussions it's become clear
that we want to refactor egraphs in a way that will subsume, or make
irrelevant, some of the recent fuzzbugs that have arisen (and likely
lead to others, which we'll want to fix!). Rather than chase these down
then refactor later, it probably makes sense not to spend the human time
or fuzzing time doing so. This PR turns off egraphs support in fuzzing
configurations for now, to be re-enabled later.

[this comment]: bytecodealliance#5126 (comment)
cfallin added a commit that referenced this issue Oct 25, 2022
* Cranelift: disable egraphs in fuzzing for now.

As per [this comment], with a few recent discussions it's become clear
that we want to refactor egraphs in a way that will subsume, or make
irrelevant, some of the recent fuzzbugs that have arisen (and likely
lead to others, which we'll want to fix!). Rather than chase these down
then refactor later, it probably makes sense not to spend the human time
or fuzzing time doing so. This PR turns off egraphs support in fuzzing
configurations for now, to be re-enabled later.

[this comment]: #5126 (comment)

* Disable in cranelift-fuzzgen as well.
@alexcrichton
Copy link
Member Author

Handled by #5128 for now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fuzz-bug Bugs found by a fuzzer
Projects
None yet
Development

No branches or pull requests

2 participants