Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rust 1.67.0 stopped initializing the WASI environment for exported functions on target="wasm32-wasi" #107635

Open
bkolobara opened this issue Feb 3, 2023 · 11 comments
Labels
C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ P-medium Medium priority

Comments

@bkolobara
Copy link

bkolobara commented Feb 3, 2023

The code:

fn main() {}

#[export_name="test"]
pub fn test() {}

would produce the following Wasm assembly on Rust 1.66.1 compiled with cargo build --target="wasm32-wasi":

;; ... truncated ...
(func $test.command_export (type 0)
    call $__wasm_call_ctors ;; <- WASI initialized here
;; ... truncated ...
(export "test" (func $test.command_export))
(export "main" (func $main.command_export))
;; ... truncated ...

On 1.67.0 it produces this one:

;; ... truncated ...
(func $test (type 0)
    return)
;; ... truncated ...
(export "test" (func $test))
(export "_start" (func $_start))
;; ... truncated ...

Notice that the call to the $__wasm_call_ctors function is missing in the exported test. This means that when entering the Wasm instance through the exported function, it will not have a correctly set up WASI environment and calls to open files or look up environment variables will fail. The _start function in both versions calls $__wasm_call_ctors and works correctly.

I'm not sure what commit broke this. The only PR that is related to this seems to be this one #105405.

EDIT: Or more likely, this one #105395

@bkolobara bkolobara added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Feb 3, 2023
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 3, 2023
@workingjubilee
Copy link
Member

cc @sunfishcode re: apparent wasm32-wasi functionality regression

@workingjubilee workingjubilee added the O-wasm Target: WASM (WebAssembly), http://webassembly.org/ label Feb 3, 2023
@sunfishcode
Copy link
Member

I recognize that it worked in wasm32-wasi in earlier versions, however it was by accident. Rust isn't meant to support having programs with a main function which can be entered without calling the main function.

Are you able to switch the crate to a cdylib, by eg. adding this to Cargo.toml, and removing the main function?

[lib]
crate-type = ["cdylib"]

This configuration is not entirely stable yet, but may work as a temporary workaround.

@bkolobara
Copy link
Author

The workaround works, but it has some significant drawbacks for us. We have been relying on this functionality for the last 2 years. Just to give a bit more context about our use case. We expose an API to wasm modules so that they can create new instances from the same module, but use a different entry point. This requires us to allow entering the module without calling main. And this is one of our core features.

The update to 1.67.0 is a breaking change for all our users. Changing to the [lib] approach without main will also require updating all documentation and tutorials. It also significantly reduces the developer experience around our tooling. For example, it would not be possible to just use cargo run with a set runner anymore to start up applications. The setup would need to be much more complicated.

We have a fairly big community around our products. Just our discord consists of 800+ members and Rust is our primary language. So this change has a significant impact on us. The best outcome for us would be to allow the previous behaviour, both main and explicitly exported functions being allowed to be used as entry points into the instance.

@sunfishcode
Copy link
Member

Directly reverting the PR in question would cause regressions for other users, including users who specifically need the change to not run the initializers on arbitrary exports, and who requested this change. I'm looking into other ways to fix this.

@alexcrichton had the idea that we could remove all use of __attribute__((constructor)) from wasi-libc and convert the last few eager initializations to lazy initialization. That will require some care to make sure we can initialize preopens at the right time, but I'll look into it.

@workingjubilee
Copy link
Member

It's worth noting that while WASI is definitely expected to land in some form, it's not a finished, stable part of WebAssembly, because it's big enough as an idea to have its own WASI subproposals.

If I understand correctly, WASI programs have to initialize their own environment, rather than the system function definitions being included via some dynamic linker equivalent transcluding libc-or-whatnot? And I would guess the same for other things, instead of simply having certain things happen implicitly due to the operating system's program loader? Interesting. On most systems, somehow bypassing those two components of the system could have equally annoying consequences, it just doesn't happen because it's (mostly) not possible.

@apiraino
Copy link
Contributor

apiraino commented Feb 8, 2023

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-medium

@rustbot rustbot added P-medium Medium priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Feb 8, 2023
sunfishcode added a commit to sunfishcode/rust that referenced this issue Feb 10, 2023
Use `__wasilibc_get_environ()` to read the environment variable list
from wasi-libc instead of using `environ`. `environ` is a global
variable which effectively requires wasi-libc to initialize the
environment variables eagerly, and `__wasilibc_get_environ()` is
specifically designed to be an alternative that lets wasi-libc
intiailize its environment variables lazily.

This should have the side effect of fixing at least some of the cases
of rust-lang#107635.
@sunfishcode
Copy link
Member

I've now submitted #107866 which changes Rust's use of the global environ variable, which requires wasi-libc to eagerly initialize environment variables to use the wasi-libc __wasilibc_get_environ function which allows it to lazily initialize them. This should have the side effect of fixing at least some instances of the regression here, as eager initialization means that initialization is never done if _start isn't called, while lazy initialization allows it to be done on demand when a function calls std::env::vars() or similar.

@hbina
Copy link
Contributor

hbina commented Feb 10, 2023

I recognize that it worked in wasm32-wasi in earlier versions, however it was by accident. Rust isn't meant to support having programs with a main function which can be entered without calling the main function.

Are you able to switch the crate to a cdylib, by eg. adding this to Cargo.toml, and removing the main function?

[lib]
crate-type = ["cdylib"]

This configuration is not entirely stable yet, but may work as a temporary workaround.

This workaround doesn't seem to work for me. I compiled, linked and initialize a wasm blob here https://github.com/lapce/lapce/blob/master/lapce-proxy/src/plugin/wasi.rs#L360-L520

And it is trying to compile a wasm of this repo https://github.com/hbina/lapce-prettier

And with some grep-foo, I can see that it is trying to initialize the environment variable eagerly?

Compiling as a main.rs,

hbina@akarin ~/g/lapce-prettier (master)> just diagnose
cargo +nightly build --target="wasm32-wasi"
   Compiling lapce_prettier v0.0.0 (/home/hbina/git/lapce-prettier)
    Finished dev [unoptimized + debuginfo] target(s) in 0.75s
wasm-tools print target/wasm32-wasi/debug/lapce_prettier.wasm > ./lapce_prettier.txt
rg -w "export" lapce_prettier.txt
1262454:  (export "memory" (memory 0))
1262455:  (export "_start" (func $_start))
1262456:  (export "__main_void" (func $__main_void))
1262457:  (export "handle_rpc" (func $handle_rpc))

Where _start looks like this

  (func $_start (;9;) (type 2)
    (local i32)
    block ;; label = @1
      block ;; label = @2
        i32.const 0
        i32.load offset=1353400
        br_if 0 (;@2;)
        i32.const 0
        i32.const 1
        i32.store offset=1353400
        call $__wasm_call_ctors
        call $__main_void
        local.set 0
        call $__wasm_call_dtors
        local.get 0
        br_if 1 (;@1;)
        return
      end
      unreachable
      unreachable
    end
    local.get 0
    call $__wasi_proc_exit
    unreachable
  )

And handle_rpc here goes into the actual function immediately without initializing anything.

Compiling as lib.rs with the suggestion above,

hbina@akarin ~/g/lapce-prettier (master)> just diagnose
cargo +nightly build --target="wasm32-wasi"
    Finished dev [unoptimized + debuginfo] target(s) in 0.42s
wasm-tools print target/wasm32-wasi/debug/lapce_prettier.wasm > ./lapce_prettier.txt
rg -w "export" lapce_prettier.txt
1260914:  (export "memory" (memory 0))
1260915:  (export "handle_rpc" (func $handle_rpc.command_export))

Where handle_rpc looks like

  (func $handle_rpc.command_export (;8781;) (type 10)
    call $handle_rpc
    call $__wasm_call_dtors
  )

It doesn't call ctors because....the dtor is just a dummy?

  (func $dummy (;8593;) (type 10))
  (func $__wasm_call_dtors (;8594;) (type 10)
    call $dummy
    call $dummy
  )

but it doesn't seem to work still.

Edit: Is there a tool to do this for me? Something like wasm-tools print --export $name to print the body of the exported function called $name. Is there a jq equivalent for WASM yet?

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 10, 2023
…environ, r=workingjubilee

Allow wasi-libc to initialize its environment variables lazily.

Use `__wasilibc_get_environ()` to read the environment variable list from wasi-libc instead of using `environ`. `environ` is a global variable which effectively requires wasi-libc to initialize the environment variables eagerly, and `__wasilibc_get_environ()` is specifically designed to be an alternative that lets wasi-libc intiailize its environment variables lazily.

This should have the side effect of fixing at least some of the cases of rust-lang#107635.
@workingjubilee
Copy link
Member

@hbina Can you try again with the latest nightly? It should be carrying the partial fix mentioned, which should cover your case.

@workingjubilee workingjubilee removed the regression-untriaged Untriaged performance or correctness regression. label Feb 12, 2023
@hbina
Copy link
Contributor

hbina commented Feb 12, 2023

Hi @workingjubilee I have updated my previous post to add some clarity and use +nightly and it still doesn't work :( Whatever happened to _initialize?

@sunfishcode
Copy link
Member

The latest nightly now includes #107866 and lazily initializes environment variables, at least in simple examples, which I hope will fix the immediate bug with environment variables.

For the more general fix, I've now submitted #108097 which switches Rust's cdylib on Wasm to always contain an _initialize function. This is debatable, and my comment here discusses the options.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. O-wasm Target: WASM (WebAssembly), http://webassembly.org/ P-medium Medium priority
Projects
None yet
Development

No branches or pull requests

6 participants