Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Underlying Wasm runtime, wasmer -> wasmtime #336

Closed
brooksmtownsend opened this issue Jun 9, 2022 · 17 comments
Closed

Underlying Wasm runtime, wasmer -> wasmtime #336

brooksmtownsend opened this issue Jun 9, 2022 · 17 comments

Comments

@brooksmtownsend
Copy link
Contributor

Slightly related to #282

So wasmer is the underlying runtime at the moment for wasmex, would you consider switching the underlying runtime of this project to wasmtime like https://github.com/viniarck/wasmtime-ex has?

We've loved using this project and are looking to take advantage of the features that come first to wasmtime, and in a perfect world this would be done through swappable runtimes like #282. In the shorter term, is there any reservation to changing the runtime?

Thanks @tessi 🚀

@tessi
Copy link
Owner

tessi commented Jun 10, 2022

Thanks for asking! 💜
I would love to have wasmtime as the first other engine. I even have a local (currently not working) fork porting wasmex to wasmtime - I think having a fork is the first step to figure out the differences we need to be aware of when implementing #282

The only problem is my limited free time. Do you have the time to take over and/or help here?

@brooksmtownsend
Copy link
Contributor Author

@tessi would love to help out here, especially if you're close! Would you be willing to publish that fork and give a quick TL;DR on what you've tried and what's not working?

@tessi
Copy link
Owner

tessi commented Jul 27, 2022

Hey @brooksmtownsend 👋
sorry for letting you hang so long. But I have mildly good news for a wasmtime fork of wasmex. I used some of my vacation time to get my fork working enough to make it public.
Please find it here.

It turned out to be a way bigger refactoring than anticipated and, unfortunately, it is not fully working at the moment. This is where I could really use some of your help :)

  • the biggest issue right now is a deadlock where I am not sure how to solve it best. The reason is that wasmtime needs a store (a place that hold memory for WASM instances/modules) for almost all API calls. I added elixir APIs to create new stores so that we can pass down the store to all wasm calls. The problem is that we need to place the store (on the rust side) behind a mutex to make it thread safe. Now, when calling a WASM function we hold a mutex-lock on the store. When that WASM function call requires a callback (an elixir func being executed), that elixir function callback needs the same store for many API calls - e.g. manipulating WASM memory. But these API calls fail since we cannot unlock the mutex again. It's hard to explain, but there is a test that runs this scenario and results in the described deadlock. If you could solve that (and I don't know how exactly yet) that would be huge
  • I haven't tested (or fixed tests for) WASI support yet. The same probably applies for other areas (std in/out/err, pipes).
  • docs are blindly copied from wasmex and are half-wrong after the refactoring

@tessi
Copy link
Owner

tessi commented Aug 4, 2022

Regarding the store-deadlock problem, I just realized that our wrapped functions receive a Caller which happens to implement AsContextMut/AsContext and can act like a Store when called within the function scope.

I think we need to find a way to extract this caller into the elixir fn callback_context that we create here.If we have that done, we could pass the caller (acting as a store) instead of the original deadlocked store.
I stumbled across lifetime issues when attempting that, but maybe you can figure out a way to forward the Caller to Elixir somehow?

@tessi
Copy link
Owner

tessi commented Aug 4, 2022

Maybe we can get inspiration from wasmtime-py, they interface the C API, but they do exactly what I proposed above. They take the caller and pass them as a reference to the python function. I guess they can get around lifetime issues since C doesn't have lifetimes :D but maybe there is a (unsafe?) Rust way to do similar.

@brooksmtownsend
Copy link
Contributor Author

Thank you for the great updates @tessi, I'm following along here but haven't had much free time to look into it yet

@tessi
Copy link
Owner

tessi commented Aug 12, 2022

no worries. I'm in the same boat :)

@tessi
Copy link
Owner

tessi commented Aug 31, 2022

I think i can report progress. I just pushed another (still very broken) version of wasmex-wasmtime which introduces a new concept

  • during function execution we get the caller struct passed from wasmtime, that caller is only valid during function execution and can not safely be passed into elixir land because of lifetimes
  • I introduced a "caller registry" where a caller reference (using an unsafe pointer hack) can be stored in exchange for a "token". That token can be passed up into elixir land. We wrap that token into a StoreOrCaller struct because stores and callers can be passed interchangeably into many functions (e.g. Memory.read).
  • from elixir you can now use that StoreOrCaller within a function call to read/write memory (or do other things where you'd normally need a store) without running into deadlocks.

So far for the theory :D In practice, when doing the above, wasmtime panics because it thinks we're using a different store somehow and I don't know why yet.

I added some debug code here which illustrates the problem and pointer-hack, but simplifies it as much as possible.

Within a wrapped function, I can read memory using the originally provided caller:

let mut buffer = [0];
memory.read(&caller, 0, &mut buffer).unwrap();
println!("buffer 1: {:?}", buffer);

however, after doing that pointer hack - which should give us the same caller because we read from the same memory address - we get a panic:

let raw_ptr: *mut Caller<StoreData> = &mut caller as *mut Caller<StoreData>;
let the_other_caller: &mut Caller<StoreData> = unsafe { &mut *raw_ptr as &mut Caller<StoreData> };

memory.read(the_other_caller, 0, &mut buffer).unwrap();
println!("buffer 2: {:?}", buffer);

// thread '<unnamed>' panicked at 'object used with the wrong store',
// /Users/tessi/.cargo/registry/src/github.com-1ecc6299db9ec823/wasmtime-0.39.1/src/store/data.rs:258:5

you can play with that scenario, when checking out main and running RUST_BACKTRACE=1 mix test test/wasmex_wasmtime_test.exs:125

I hope someone reading this can help me find out why wasmtime thinks this is a new/different store and maybe even fix the problem 🙏

@tessi
Copy link
Owner

tessi commented Sep 1, 2022

Good news! together with @hamptokr we could fix the memory problem and reading/writing memory from within function calls now fully works. In fact, all tests - with the exception of WASI tests as I didn't look at WASI yet - work \o/

Next steps: Fixing WASI and cleaning things up

@tessi
Copy link
Owner

tessi commented Sep 5, 2022

Keeping you updated here: WASI seems to work roughly - I still have to fix file access permissions, but 5/10 wasi tests work. This is getting closer :)

@brooksmtownsend
Copy link
Contributor Author

Perfect timing here, Wasmtime just announced 1.0! Very exciting 😄

@tessi
Copy link
Owner

tessi commented Oct 6, 2022

super exciting indeed! :)

I worked some more on the wasmtime fork and all test are green now 🎉

Screenshot 2022-10-06 at 18 55 13

However, it works only on my wasmtime fork. Next task is to contribute my wasmtime changes back so we can have proper access permissions for preopened directories. I'm currently asking the wasmtime folks how to contribute my patch back here.

@tessi
Copy link
Owner

tessi commented Oct 6, 2022

However, @brooksmtownsend it would be great if you could give that fork a try and see if it works for you.

Some things (especially the WASI API) changed, but I hope the changes are easy to spot reading the source. But if not, please, ask :) Writing docs and a migration guide is another open TODO.

@tessi
Copy link
Owner

tessi commented Dec 30, 2022

Next task is to contribute my wasmtime changes back so we can have proper access permissions for preopened directories.

Turns out that the wasmtime folks recommended to not have access permissions for WASI preopens yet but rather wait it to be implemented "proper" in the future (ticket).

So I removed use of my custom wasmtime fork and instead upgraded to wasmtime 4.0. Any preopened directory will come with full access permissions until wasmtime allows us to restrict access.

In addition, I moved all WASI opts (but especially the directory preopen options) into Elixir structs. This makes it easier to upgrade them in the future because we will get proper compile-time errors whith missing or changed keys (e.g. for those future access permissions). This probably even improves usability.

@tessi
Copy link
Owner

tessi commented Dec 30, 2022

@brooksmtownsend did you have the time to try things out yet? No worries if not :)

I'll soon merge my work back into this repo (in some dev-branch, I guess) to prepare docs, polish, and release the new version

@tessi
Copy link
Owner

tessi commented Jan 2, 2023

The MR that brings the changes back is here and looks good. Docs are updated, changelog is written.

@tessi
Copy link
Owner

tessi commented Jan 3, 2023

Alright, merged the MR 🚢

@tessi tessi closed this as completed Jan 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants