Fix WASM executor without instance reuse; cleanups and refactoring #10313
Conversation
@koute should we expect some performance impact changing this?
Nope. It might change the performance of the executor without instance reuse, however:
I welcome the clean up
```rust
// Scan all imports, find the matching host functions, and create stubs that adapt arguments
// and results.
let imports = crate::imports::resolve_imports(
```
Nit: do you think it would be better to push `imports` to be a submodule of `instance_wrapper`?
Hmmm.... to be honest I don't think it's worth it in this case. We could either make it a proper submodule (as in, a dedicated directory with a dedicated file in it), but that seems silly for only a single 300-line file, or we could just inline it into `instance_wrapper.rs`, but it's kinda nice that `imports` is in its own dedicated file which only does one thing and exports only one thing (the `Imports` struct plus a function to create it).
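As a rough illustration of what the quoted `resolve_imports` call is about (a minimal sketch with made-up types and signatures, not the actual implementation), the idea is to walk the imports the module declares and look each one up in a table of host functions, failing if anything is missing:

```rust
use std::collections::HashMap;

// Stand-in for a real host-function stub that adapts arguments and results.
type HostFn = fn(&[u64]) -> u64;

// Hypothetical resolver: match every (module, name) import declared by the
// wasm blob against the host functions the executor actually provides.
fn resolve_imports(
    declared_imports: &[(String, String)],
    host_functions: &HashMap<(String, String), HostFn>,
) -> Result<Vec<HostFn>, String> {
    declared_imports
        .iter()
        .map(|import| {
            host_functions
                .get(import)
                .copied()
                .ok_or_else(|| format!("missing host function: {}::{}", import.0, import.1))
        })
        .collect()
}
```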
client/executor/wasmtime/src/host.rs
```rust
self.host_state_mut().sandbox_store.0 = Some(store);

let instance_idx_or_err_code =
    match result.expect("instantiating the sandbox does not panic") {
```
The thing is, the sandbox module or runtimes can actually panic!

When the state machine runs, it needs access to a storage backend. This storage backend can be trusted, i.e. the on-disk storage. We assume it never panics, and if it does, then that's an unrecoverable error.

However, when running a light client, a backend could be based on a witness - basically data, and a merkle proof that that data actually comes from where it is claimed to come from. The caveat is that we do not eagerly check this proof, as far as I understand, and there are situations when the runtime may request a storage entry that is not present in the witness. In such a case, a panic is emitted. You can get more details and my complaints here under the heading "panics". I assumed we could remove this code, but by now I am not sure → we will need at least some way of communicating such conditions through the sandbox layer (cc @athei) (assuming we will continue to support the sandbox API).

To tie it back to this specific case, I assume the following can happen:

- On a light client we execute some call with a proof. The proof is corrupted and the backend will panic if X is accessed.
- The runtime is spawned.
- The runtime instantiates a new sandbox with some wasm code.
- In the `start` function it calls back into the runtime and makes the runtime fetch X from storage, and thus panics.
- It unwinds here, and here we panic due to the `unwrap`, losing the context of the original panic.

So I am wondering whether it would be better to `resume_unwind` or not catch at all?

cc @bkchr
That's good to know! Okay, so since this is technically outside of the scope of this PR I'll just do a `resume_unwind` to keep the original behavior. (I don't know, this might not be necessary and maybe we could just ignore the panic altogether, however I think it's just simpler to catch it and restore the sandbox store anyway even if it might not be necessary now. One less thing to worry about whether it'll break something or not.)
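For reference, here's a minimal sketch of the catch-and-rethrow pattern being discussed (hypothetical names, not the actual Substrate code): the panic is caught only so the sandbox store can be put back, and then re-raised with `resume_unwind` so the original payload is preserved instead of being replaced by a second panic from an `expect`.

```rust
use std::panic::{self, AssertUnwindSafe};

// Placeholder for the real sandbox store type.
struct SandboxStore(Option<Vec<u8>>);

fn instantiate_with_restore(
    store_slot: &mut SandboxStore,
    run: impl FnOnce() -> u32,
) -> u32 {
    // Temporarily take the store out, as the surrounding code does.
    let taken = store_slot.0.take();

    // Catch a potential panic from the instantiation...
    let result = panic::catch_unwind(AssertUnwindSafe(run));

    // ...so the store can always be restored, whether or not `run` panicked.
    store_slot.0 = taken;

    match result {
        Ok(instance_idx_or_err_code) => instance_idx_or_err_code,
        // Re-raise the original panic instead of `expect`-ing on it, so the
        // caller (e.g. a light client hitting a missing witness entry) still
        // sees the original panic payload and context.
        Err(payload) => panic::resume_unwind(payload),
    }
}
```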
One more minor cleanup: I've moved the utility functions for reading/writing memory into `util.rs`.
bot merge
Fix WASM executor without instance reuse; cleanups and refactoring (paritytech#10313)

* Fix WASM executor without instance reuse; cleanups and refactoring
* Align to review comments
* Move the functions for reading/writing memory to `util.rs`
* Only `#[ignore]` the test in debug builds
* More review comments and minor extra comments
The current codepath for the WASM executor without instance reuse is broken. We're constantly reusing the same `Store`, which according to the `wasmtime` docs we're not supposed to do. This results in a memory leak, and after 10k calls into the runtime the executor will get permabricked and will panic on each call, since it won't be able to spawn any new instances anymore. (Fortunately this is not a problem in practice since the codepath which reuses the instances is the default, and that one is unaffected.)

So this PR fixes the problem by bundling the `Store` along with each instance of the executor.

Also, since this code was originally written, `wasmtime` grew a bunch of new APIs which we can use to clean up the code a little, so that's what I did. Multiple `Rc`s and `RefCell`s were removed, which makes it easier to reason about the lifetimes now. We also had a blanket `unsafe impl Send for WasmtimeInstance {}`, which was removed as well. Unfortunately we still need an `unsafe impl Send` for the sandbox store, but it's still an improvement since the `impl` now touches strictly less code. (It'd be nice to get rid of that one too, but unfortunately that would necessitate even bigger changes.)
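To illustrate the fix, here's a minimal standalone sketch against a recent `wasmtime` API (not the actual Substrate code). The point is that resources created within a `Store` are only reclaimed when the `Store` itself is dropped, so each instantiation gets a fresh `Store` bundled with its instance instead of one store being reused forever.

```rust
use wasmtime::{Engine, Instance, Module, Store};

// Hypothetical wrapper bundling a fresh `Store` with the instance created in
// it; dropping the wrapper frees everything the instantiation allocated.
struct FreshInstance {
    store: Store<()>,
    instance: Instance,
}

fn instantiate(engine: &Engine, module: &Module) -> anyhow::Result<FreshInstance> {
    // A brand-new `Store` per instantiation, instead of one shared forever.
    let mut store = Store::new(engine, ());
    let instance = Instance::new(&mut store, module, &[])?;
    Ok(FreshInstance { store, instance })
}

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    let module = Module::new(&engine, r#"(module (func (export "noop")))"#)?;

    // Each call gets its own store, so nothing accumulates across calls.
    for _ in 0..3 {
        let mut fresh = instantiate(&engine, &module)?;
        let noop = fresh
            .instance
            .get_typed_func::<(), ()>(&mut fresh.store, "noop")?;
        noop.call(&mut fresh.store, ())?;
    }
    Ok(())
}
```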