Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: wasip1 wasmtime builder is broken #59583

Closed
johanbrandhorst opened this issue Apr 12, 2023 · 33 comments
Closed

build: wasip1 wasmtime builder is broken #59583

johanbrandhorst opened this issue Apr 12, 2023 · 33 comments
Labels
arch-wasm WebAssembly issues FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@johanbrandhorst
Copy link
Member

johanbrandhorst commented Apr 12, 2023

This is a tracking issue for problems relating to the wasip1-wasmtime builder. The wasmtime runner fails a number of tests, because of differences in how wasmtime and wazero (the primary wasip1 runner) have implemented the wasi_snapshot_preview1 syscall API.

This issue will track efforts to make the wasip1 port and wasmtime runner pass the standard library tests, or document which tests are impossible to fix because of runtime issues.

List of known failing tests on wasmtime:

Package Test name Error Test log Fixed
archive/tar TestFileInfoHeaderSymlink Symlink permission error Test log ✔️
archive/zip TestOpenReaderInsecurePath Directory open error Test log ✔️
archive/zip FuzzReader Directory open error Test log ✔️
net/http/pprof Unknown Text file busy Test log ✔️
cmd/internal/testdir Test/env.go PATH is empty Test log ✔️
@johanbrandhorst johanbrandhorst added the arch-wasm WebAssembly issues label Apr 12, 2023
@ianlancetaylor
Copy link
Member

ianlancetaylor commented Apr 12, 2023

Just checking: is wasmtime supposed to implement wasi preview 1? We already confused ourselves quite a bit before coming up with GOOS=wasip1. I would want a really good reason before we start having to support multiple different versions of wasip1. Thanks.

@johanbrandhorst
Copy link
Member Author

Yes, wasmtime does support preview 1, we did test it extensively during the implementation phase, but ran into some tests that were hard or impossible to satisfy on both runtimes (mostly because the Go tests tend to assume POSIX like behavior from the OS and wasmtime seems to have a slightly different philosophy to implementing certain things). We'll document each of the failings here once we get the runner up and running, so we can discuss how to proceed.

@ianlancetaylor
Copy link
Member

Thanks.

@dr2chase
Copy link
Contributor

CC @golang/wasm (I realize this is redundant for the bug submitter).

@dr2chase dr2chase added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 13, 2023
@mathetake
Copy link
Contributor

mathetake commented Apr 18, 2023

@johanbrandhorst would you mind listing the failing tests here (or notable ones)? That would be helpful for both of wazero and wasmtime to recognize the differences.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/485615 mentions this issue: misc/wasm: support wasmtime in wasip1

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/485556 mentions this issue: dashboard: set RUNTIME in wasmtime builder

gopherbot pushed a commit to golang/build that referenced this issue Apr 19, 2023
The new GOWASIRUNTIME variable will be used to select the WASI runtime
in the executable script.

For golang/go#59583

Change-Id: Ie2d5f9373393b4c40ba592f956e5ad1e4b1fd207
Reviewed-on: https://go-review.googlesource.com/c/build/+/485556
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Heschi Kreinick <[email protected]>
Auto-Submit: Johan Brandhorst-Satzkorn <[email protected]>
Run-TryBot: Johan Brandhorst-Satzkorn <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/487015 mentions this issue: os/user: skip tests that invoke Current if it returns an expected error

gopherbot pushed a commit that referenced this issue Apr 21, 2023
Today Current may fail if the binary is not built with cgo
and USER and/or HOME is not set in the environment.
That should not cause the test to fail.

After this change,

	GOCACHE=$(go env GOCACHE) CGO_ENABLED=0 USER= HOME= go test os/user

now passes on linux/amd64.

For #59583.

Change-Id: Id290cd1088051e930d73f0dd554177124796e8f2
Reviewed-on: https://go-review.googlesource.com/c/go/+/487015
Reviewed-by: Michael Pratt <[email protected]>
Run-TryBot: Bryan Mills <[email protected]>
Reviewed-by: Johan Brandhorst-Satzkorn <[email protected]>
Auto-Submit: Bryan Mills <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
@johanbrandhorst
Copy link
Member Author

johanbrandhorst commented Apr 21, 2023

As @bcmills noted in https://go-review.googlesource.com/c/go/+/485615/comments/7b08ced3_cf6d1646, the first real error on wasmtime was due to a symlink permission error. He suggested we add a runtime probe on wasip1 to internal/testenv.hasSymlink to check for this error and skip tests accordingly. I will try this in a follow up CL.

I've also added a table to track test failures to the original topic.

gopherbot pushed a commit that referenced this issue Apr 21, 2023
Allow switching to wasmtime through the GOWASIRUNTIME variable. This
will allow builders to run the wasip1 standard library tests against
the wasmtime WASI runtime.

For #59583

Change-Id: I4d5200df7bb27b66e041f00e89d4c2e585f5da7c
Reviewed-on: https://go-review.googlesource.com/c/go/+/485615
Reviewed-by: Bryan Mills <[email protected]>
TryBot-Bypass: Johan Brandhorst-Satzkorn <[email protected]>
Run-TryBot: Johan Brandhorst-Satzkorn <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/490115 mentions this issue: internal/testenv: probe for symlink on wasip1

@johanbrandhorst
Copy link
Member Author

The workaround for the symlink problem implemented in https://go.dev/cl/490115 seems to have worked and has given us another set of test failures:

--- FAIL: TestOpenReaderInsecurePath (0.00s)
    testing.go:1225: TempDir RemoveAll cleanup: open /tmp/TestOpenReaderInsecurePath1427476482: Is a directory
--- FAIL: FuzzReader (0.00s)
    fuzz_test.go:18: failed to read testdata directory: open testdata: Is a directory
FAIL
FAIL	archive/zip	2.827s

I'll continue to investigate these errors, and I've updated the table at the top.

@jakebailey
Copy link

jakebailey commented Apr 28, 2023

I've been trying to get the new wasip1 target working in the browser for a few days (to get esbuild in the browser with a VFS) and have experienced many of the same "is a directory" / "not capable" errors in effectively every (non-wazero) runtime or shim I've tried. (e.g. the npm package @wasmer/wasi is itself WASM code from the wasmer runtime to implement preview1 in JS).

There seems to be a somewhat universal problem among runtimes where they don't seem to support os.Open on directories. I was planning on reporting these to wasmtime/wasmer, but just haven't had the time. (That, or it actually is a Go bug?)

I didn't report something on this tracker since I think these are all issues with the runtimes, but here's a quick example if it's helpful getting a minimal repro: https://github.com/jakebailey/go-wasip1-directory-test

@achille-roussel
Copy link
Contributor

achille-roussel commented Apr 29, 2023

Thanks for reporting @jakebailey, your code example was very useful to track down the issue!

I submitted https://go.dev/cl/490755 which should address the issue (Go was requesting too many rights when opening directories).

@jakebailey
Copy link

Excellent! I'll give the CL a try later.

@jakebailey
Copy link

That patch does make wasmtime succeed with my example, though doesn't change wasmer (which is unfortunately the most complete browser shim as well). I have to retry everything but it's probable that there's something off in wasmer / the various shims (I'm still testing), but great to have wasmtime fixed. I know the list of WASI runtimes is huge, but it'd be great to keep expanding what's tested in CI.

(I know this isn't the thread for either; I only commented since the builder had the same directory error I was experiencing. Sorry for the other off-topic stuff!)

@jakebailey
Copy link

jakebailey commented Apr 30, 2023

The CL doesn't appear to be quite perfect; in the dir branch, it accidentally(?) uses fileRights, probably a copy-paste error.

I recognized after writing the below that this issue is specifically about wasmtime, so I'm happy to move this to a new issue instead (e.g. add a wasmer builder? or just a general issue about file/dir Open issues in wasip1?), but for completeness:

wasmer returns ENOTCAPABLE on the initial path_open, so the second branch of the CL doesn't actually apply. It seems like some runtimes (probably correctly?) check capabilities first, likely to avoid leaking information about the filesystem. So roughly:

  • wazero just lets you open a directory as a file, so was working before as-is. This feels weird.
  • wasmtime will return EISDIR when you try to open a directory as a file.
  • wasmer rejects the first open because the passed rights don't apply to a directory.

It might be enough to just treat ENOTCAPABLE as EISDIR here to try a fallback, except that if the file really was a file but failed because of rights issues, errno is replaced with the directory error instead.

For now, I'm just cheating by changing os.Open to first Stat to figure out what to do, leaving me with the next problem; that Open returns ENOTCAPABLE which confuses code which assumes that it can see EISDIR as an indiciator that it was actually a file. I have to back through my rotation of WASI shims to see if this is universal, though.

@johanbrandhorst
Copy link
Member Author

Thanks for the thorough investigating @jakebailey, I do think we'll want a separate issue for the wasmer errors. I've created #59907 to track the progress on supporting the wasmer runtime.

@johanbrandhorst
Copy link
Member Author

johanbrandhorst commented May 1, 2023

With https://go.dev/cl/490755 merged we have an exciting new error:

Error: failed to run main module `/workdir/tmp/go-build2450374490/b001/pprof.test`

Caused by:
    0: failed to invoke command default
    1: error while executing at wasm backtrace:
           0: 0x11fe06 - <unknown>!<wasm function 1413>
           1: 0x121d9f - <unknown>!<wasm function 1421>
           2: 0x166140 - <unknown>!<wasm function 1695>
           3: 0x16753c - <unknown>!<wasm function 1704>
           4: 0x164cb6 - <unknown>!<wasm function 1688>
           5: 0x2b4157 - <unknown>!<wasm function 2783>
           6: 0x2cef40 - <unknown>!<wasm function 2851>
           7: 0x2cf508 - <unknown>!<wasm function 2853>
           8: 0x2c7ef0 - <unknown>!<wasm function 2842>
           9: 0x2d4f57 - <unknown>!<wasm function 2860>
          10: 0x2bd0a0 - <unknown>!<wasm function 2811>
          11: 0x10d5e9 - <unknown>!<wasm function 1235>
          12: 0x10d681 - <unknown>!<wasm function 1239>
    2: Unknown OS error
    3: Text file busy (os error 26)
FAIL	net/http/pprof	1.924s
FAIL

I've updated the table at the top. Full test log here.

@codefromthecrypt
Copy link

wazero just lets you open a directory as a file, so was working before as-is. This feels weird.

This isn't entirely true. wasi has an extended flag O_DIRECTORY to clarify if it should fail if the file is a directory.

We have tests to show we return ENOTDIR on that flag when it is a file not a directory.

@codefromthecrypt
Copy link

p.s. I don't think anyone should assume a heuristic to use rights to hack semantics that layer on other flags. As "secret handshakes" come up, we can evaluate them pragmatically. One example of wandering this line is @achille-roussel recently in tetratelabs/wazero#1421

Just remember there are other compilers out there who may not have the same heuristics. Doing things like this without supporting tests in wasi-testsuite may lead to surprises for more people later, and possibly backtracking.

@jakebailey
Copy link

wazero just lets you open a directory as a file, so was working before as-is. This feels weird.

This isn't entirely true. wasi has an extended flag O_DIRECTORY to clarify if it should fail if the file is a directory.

We have tests to show we return ENOTDIR on that flag when it is a file not a directory.

Sure, but os.Open doesn't use it. Perhaps it should if this would guarantee detection of a directory regardless of rights, such that ENOTCAPABLE can be avoided?

Regardless, I finally got a working WASI in the browser via wasi-js, so I'm already very happy with the state of tip! Thank you all for your hard work!

@codefromthecrypt
Copy link

Sure, but os.Open doesn't use it. Perhaps it should if this would guarantee detection of a directory regardless of rights, such that ENOTCAPABLE can be avoided?

I think that wasmtime should use O_DIRECTORY and not assume the opposite heuristically regardless of fdRights.

@codefromthecrypt
Copy link

soapbox warning:

At least wasip1 should stick to POSIX in other words, and not go further than that when interpreting flags. It shouldn't be the case that the entire ecosystem should do something weird vs say out loud it is weird. It is effectively POSIX that at least gives us a way out of subjectivity in calls like this.. we can refer to https://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html and see if it is wrong in general to allow open to succeed on a file or a directory.

It is exactly things like this that scare me as WASI as an org distance themselves from POSIX, perhaps with a goal to not be like POSIX. We get more subjectivity than most OS the farther we go from things like this. WASI did cite some faults in the p1 design, but didn't at the same time say it is a goal for the successor to align closer to POSIX. This worries me.

gopherbot pushed a commit that referenced this issue May 2, 2023
Certain WASI runtimes do not support generic symlinks, and
instead return permission errors when they are attempted.
Perform a runtime probe of symlink support in hasSymlink
on wasip1 to determine whether the runtime supports
generic symlinks.

Also perform the same probe on android.

For #59583

Change-Id: Iae5b704e670650d38ee350a5a98f99dcce8b5b28
Reviewed-on: https://go-review.googlesource.com/c/go/+/490115
Auto-Submit: Johan Brandhorst-Satzkorn <[email protected]>
Reviewed-by: Achille Roussel <[email protected]>
TryBot-Bypass: Johan Brandhorst-Satzkorn <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Run-TryBot: Johan Brandhorst-Satzkorn <[email protected]>
Reviewed-by: Bryan Mills <[email protected]>
@johanbrandhorst
Copy link
Member Author

I'm unable to reproduce the net/http/pprof test faliure locally on wasmtime 8.0.1. I'm going to create a CL with some debug logging to see if I can get more information out of the runners.

@johanbrandhorst
Copy link
Member Author

With some help from the wasmtime team I added some runtime debug logging to the most recent test run. We can now see the following logs preceding the failure:

 TRACE wasi_common::snapshots::preview_1::wasi_snapshot_preview1 > wiggle abi; module="wasi_snapshot_preview1" function="path_open"
 TRACE wasi_common::snapshots::preview_1::wasi_snapshot_preview1 > fd=Fd(3) dirflags=SYMLINK_FOLLOW path=*guest 0x1848621/22 oflags=(empty) fs_rights_base=FD_READ | FD_SEEK | FD_FDSTAT_SET_FLAGS | FD_SYNC | FD_TELL | FD_ADVISE | PATH_CREATE_DIRECTORY | PATH_CREATE_FILE | PATH_LINK_SOURCE | PATH_LINK_TARGET | PATH_OPEN | FD_READDIR | PATH_READLINK | PATH_RENAME_SOURCE | PATH_RENAME_TARGET | PATH_FILESTAT_GET | PATH_FILESTAT_SET_TIMES | FD_FILESTAT_GET | FD_FILESTAT_SET_SIZE | FD_FILESTAT_SET_TIMES | PATH_SYMLINK | PATH_REMOVE_DIRECTORY | PATH_UNLINK_FILE | POLL_FD_READWRITE fs_rights_inheriting=FD_DATASYNC | FD_READ | FD_SEEK | FD_FDSTAT_SET_FLAGS | FD_SYNC | FD_TELL | FD_WRITE | FD_ADVISE | FD_ALLOCATE | PATH_CREATE_DIRECTORY | PATH_CREATE_FILE | PATH_LINK_SOURCE | PATH_LINK_TARGET | PATH_OPEN | FD_READDIR | PATH_READLINK | PATH_RENAME_SOURCE | PATH_RENAME_TARGET | PATH_FILESTAT_GET | PATH_FILESTAT_SET_SIZE | PATH_FILESTAT_SET_TIMES | FD_FILESTAT_GET | FD_FILESTAT_SET_SIZE | FD_FILESTAT_SET_TIMES | PATH_SYMLINK | PATH_REMOVE_DIRECTORY | PATH_UNLINK_FILE | POLL_FD_READWRITE fdflags=(empty)
 TRACE wasi_common::snapshots::preview_1::wasi_snapshot_preview1 > result=Err(Error { inner: Unknown OS error

Caused by:
    Text file busy (os error 26) })
Error: failed to run main module `/workdir/tmp/go-build3551866731/b001/pprof.test`

Caused by:
    0: failed to invoke command default
    1: error while executing at wasm backtrace:
           0: 0x1237f4 - <unknown>!<wasm function 1451>
           1: 0x12591e - <unknown>!<wasm function 1462>
           2: 0x16a6d2 - <unknown>!<wasm function 1741>
           3: 0x16bbe9 - <unknown>!<wasm function 1750>
           4: 0x169264 - <unknown>!<wasm function 1734>
           5: 0x2b98b3 - <unknown>!<wasm function 2808>
           6: 0x2da622 - <unknown>!<wasm function 2876>
           7: 0x2dabea - <unknown>!<wasm function 2878>
           8: 0x2d08b8 - <unknown>!<wasm function 2867>
           9: 0x2e0153 - <unknown>!<wasm function 2884>
          10: 0x2c33ca - <unknown>!<wasm function 2836>
          11: 0x110a34 - <unknown>!<wasm function 1265>
          12: 0x110acc - <unknown>!<wasm function 1269>
    2: Unknown OS error
    3: Text file busy (os error 26)
FAIL	net/http/pprof	1.732s
FAIL
go tool dist: Failed: exit status 1

Not sure it helps us that much, the error still seems to be Text file busy (os error 26). It's still unclear why it only seems to happen on the runners, and also why wazero isn't failing in the same way.

@codefromthecrypt
Copy link

codefromthecrypt commented May 18, 2023

why wazero isn't failing in the same way

a lot of wasm runtimes share implementation code, but wazero has no deps, everything is from scratch. so we will fail in different ways than others ;)

@johanbrandhorst
Copy link
Member Author

It appears this failure may have exposed an issue in wasmtime, I've been told the team is looking into it.

@dgryski
Copy link
Contributor

dgryski commented May 19, 2023

The next wasmtime release (in ~two weeks) that includes bytecodealliance/wasmtime#6265 will fix the net/http/pprof ETXTBSY issue.

@johanbrandhorst
Copy link
Member Author

I'll look out for the next release, thanks!

@dmitshur
Copy link
Contributor

dmitshur commented May 25, 2023

https://build.golang.org/log/a1da7514866e849d7af09fd7ff1af7fc8d11030f looks like the first build using a builder image rebuilt with wasmtime v9.0.1 (CL 496920).

It made it past net/http/pprof and got as far cmd/internal/testdir.Test/env.go:

--- FAIL: Test (0.01s)
    --- FAIL: Test/env.go (4.19s)
        testdir_test.go:132: exit status 1
            PATH is empty
            exit status 1
            
FAIL
FAIL	cmd/internal/testdir	33.624s

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/498616 mentions this issue: misc/wasm: set PATH variable in exec

@johanbrandhorst
Copy link
Member Author

https://go-review.googlesource.com/c/go/+/498616 unexpectedly fixed all the remaining test failures, so this issue is fixed 🎉.

@dmitshur dmitshur added this to the Go1.21 milestone May 26, 2023
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels May 26, 2023
gopherbot pushed a commit that referenced this issue May 26, 2023
The PATH variable is required to run the testenv tests.
Set it for all the runtime invocations where we don't
already set it by inheriting from the environment.

For #59583
For #59907
For #60097

Change-Id: If582dd8f086e3f40bc58d555f6034dcffe6f8e5f
Reviewed-on: https://go-review.googlesource.com/c/go/+/498616
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Reviewed-by: Bryan Mills <[email protected]>
Run-TryBot: Johan Brandhorst-Satzkorn <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/498655 mentions this issue: dashboard: remove wasmtime known issue

gopherbot pushed a commit to golang/build that referenced this issue May 30, 2023
The wasmtime WASI runtime now passes the standard library test suite.

For golang/go#59583.

Change-Id: I8e87cb46f3b7799996d4f1312478885e22fca1f8
Reviewed-on: https://go-review.googlesource.com/c/build/+/498655
Reviewed-by: Dmitri Shuralyov <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Auto-Submit: Johan Brandhorst-Satzkorn <[email protected]>
Reviewed-by: Achille Roussel <[email protected]>
Reviewed-by: Michael Knyszek <[email protected]>
Run-TryBot: Johan Brandhorst-Satzkorn <[email protected]>
@golang golang locked and limited conversation to collaborators May 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
arch-wasm WebAssembly issues FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

10 participants