-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move function names out of Module
#3789
Move function names out of Module
#3789
Conversation
This commit moves function names in a module out of the `wasmtime_environ::Module` type and into separate sections stored in the final compiled artifact. Spurred on by bytecodealliance#3787 to look at module load times I noticed that a huge amount of time was spent in deserializing this map. The `spidermonkey.wasm` file, for example, has a 3MB name section which is a lot of unnecessary data to deserialize at module load time. The names of functions are now split out into their own dedicated section of the compiled artifact and metadata about them is stored in a more compact format at runtime by avoiding a `BTreeMap` and instead using a sorted array. Overall this improves deserialize times by up to 80% for modules with large name sections since the name section is no longer deserialized at load time and it's lazily paged in as names are actually referenced.
For reference, the 80% number comes from the benchmark in #3790 and looks like:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Another thing we could do to take this even further is have a header on this new section (or add another new section) that contains the function index -> (offset, length) pairs. This way the |
Subscribe to Label Actioncc @peterhuene
This issue or pull request has been labeled: "wasmtime:api"
Thus the following users have been cc'd because of the following labels:
To subscribe or unsubscribe from this label, edit the |
I was considering doing that but ended up deciding against it because we dont have much other laziness in
where here it's on the order of the size of the text section which was massive. This PR actually doesn't help this too too much and instead yields:
where I'll leave optimizing |
Need to not only sort afterwards but also first to ensure the data of the name section is consistent.
Wouldn't even need to be lazy inside #[repr(C)]
struct OffsetAndLength {
offset: usize,
length: usize,
}
let header_section_ptr: *const OffsetAndLengthPair = ...;
let header_section_len = ...;
let header_section = slice::from_raw_parts(...);
return header_section[i]; |
But yeah no need to hold up this PR or do that now, if we don't want to. |
Oh actually that's a great point. We already do that for the address map section where there's a special encoder followed by a specialized lookup routine for the data produced by that encoder. While we need to be somewhat careful about alignments and such it's not really the end of the world. Arguably almost everything in Anyway I think I'll probably go with this for now since it's relatively lightweight, but there's definitely still more fruit to pick on the accelerate-load-precompiled-modules tree! |
* Move function names out of `Module` This commit moves function names in a module out of the `wasmtime_environ::Module` type and into separate sections stored in the final compiled artifact. Spurred on by bytecodealliance#3787 to look at module load times I noticed that a huge amount of time was spent in deserializing this map. The `spidermonkey.wasm` file, for example, has a 3MB name section which is a lot of unnecessary data to deserialize at module load time. The names of functions are now split out into their own dedicated section of the compiled artifact and metadata about them is stored in a more compact format at runtime by avoiding a `BTreeMap` and instead using a sorted array. Overall this improves deserialize times by up to 80% for modules with large name sections since the name section is no longer deserialized at load time and it's lazily paged in as names are actually referenced. * Fix a typo * Fix compiled module determinism Need to not only sort afterwards but also first to ensure the data of the name section is consistent.
This commit moves function names in a module out of the
wasmtime_environ::Module
type and into separate sections stored in thefinal compiled artifact. Spurred on by #3787 to look at module load
times I noticed that a huge amount of time was spent in deserializing
this map. The
spidermonkey.wasm
file, for example, has a 3MB namesection which is a lot of unnecessary data to deserialize at module load
time.
The names of functions are now split out into their own dedicated
section of the compiled artifact and metadata about them is stored in a
more compact format at runtime by avoiding a
BTreeMap
and insteadusing a sorted array. Overall this improves deserialize times by up to
80% for modules with large name sections since the name section is no
longer deserialized at load time and it's lazily paged in as names are
actually referenced.