Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(onnxruntime-web): Allow the WASM backend to import the emscripten Module via a user-land defined loader #21430

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kyr0
Copy link

@kyr0 kyr0 commented Jul 20, 2024

Description

This minimal change will allow defining a custom loader for the WASM backend to be passed via flags.importWasmModule (aka env), allowing for Inversion of Control (IoC) and increased flexibility in loading the WASM backend. The change is minimally invasive.

Motivation and Context

As described at length in #20876, and confirmed by other users as well, specifically @asadm the current WASM backend implementation suffers from the inflexibility of not being able to load the WASM module in user-land code. This leads to issues in contexts like Web Extensions, specifically their background scripts, where dynamic loading isn't possible, as it is prohibited by W3C standards. The issue expands to other areas where complexities in downstream build processes lead to the loader code not being able to load the Module and, in effect, to the WASM backend not being available.

Considerations

Regarding this implementation, I'm concerned that the type import from onnxruntime-web to onnxruntime-common (for the Env interface) which imports the type OrtWasmModule from onnxruntime-web leads to a cyclic dependency and a package boundary issue. This isn't an issue at runtime, as it only affects the type system, but it's clearly not what I'd wish to see in my projects. Maybe it would be a good idea to move the specific Env-extending type definitions of the WASM backend implementation into the onnxruntime-web package, or the OrtWasmModule into common? Duplicate type definitions aren't a good idea either, nor is any a good idea.

As this is my first PR to this repository and as I'm not familiar with the processes, I'm kindly asking for advice. Please guide me in finishing this PR. Thank you in advance!

@kyr0
Copy link
Author

kyr0 commented Jul 20, 2024

@microsoft-github-policy-service agree

@guschmue
Copy link
Contributor

looking

@kyr0
Copy link
Author

kyr0 commented Jul 22, 2024

Alongside this PR and solution, I see potential to improve on the complexities that come with the current WASM runtime code:

The idea would be, to, as part of another, new PR, enhance the current build system for the emscripten based runtime files. I'm imagining additional runtime files that bundle the WASM binary using inline data URI's, so that there is no need to dynamically load these binary files at all, reducing complexities with runtime environments and bundlers downstream.

The file size of these additional runtime file variants would roughly be the size of the current JS runtime + (WASM runtime size * ~1,33). So the initial file size increases, but as the runtime JS files are usually dynamically loaded, this shouldn't be an issue. Additional benefits of this approach would be, that all code regarding WASM file loading can be removed entirely from the emscripten-generated JS runtime. All the imports from "module" for example, that lead to issues with bundlers downstream, when onnxruntime-web is bundled or when userland code bundles it all together, wouldn't be an issue anymore.

Finally, the top-level await issue downstream libraries and users suffer from, is only relevant for Node.js environments. Why not build dedicated runtimes for the web platform and server-side JS? I demonstrate a quite "hacky" and pragmatic approach here: https://github.com/kyr0/fast-dotproduct/blob/main/scripts/build.ts#L51

Proposal (for another PR):

  • Enhancing the build system to spit out 4 new variants of each WASM runtime:
    • *-inlined-web.${"mjs"|"cjs"}
    • *-inlined-node.${"mjs"|"cjs"}
      As explained above, these would not come with an additional wasm file, as these would be inlined.

One con of those new runtime variants would be that instantiateStreaming would make no sense at all, as the whole binary is loaded with the JS code that instantiates the WebAssembly module. However, I believe, that this wouldn't be a true con in real-life. Loading a model takes much longer initially than loading a small WASM binary runtime... and it's pretty fast.

@fs-eire
Copy link
Contributor

fs-eire commented Jul 28, 2024

The story of wasm loading for onnxruntime-web is a little bit complicated. Let's go to the conclusion (the current status) first:

For ort-wasm-*.mjs file:

  1. using import statement to import onnxruntime-web will resolve to the "bundle" version, which no longer use dynamic loading for the .mjs file.

    • "bundle" version means the final .mjs file generated in build steps:
      • ort.all.bundle.min.mjs, if import onnxruntime-web/all
      • ort.webgpu.bundle.min.mjs, if import onnxruntime-web/webgpu
      • ort.bundle.min.mjs, if import onnxruntime-web
  2. using the non-"bundle" version directly (eg. import from CDN), will load the ort-wasm-*.mjs dynamically (via import())

  3. using UMD will also load the ort-wasm-*.mjs dynamically (via import())

For ort-wasm-*.wasm file:

=========================

Now answers to the questions:

Q: Using a huge data URL for onnxruntime webassembly binary to avoid dynamic loading of .wasm

The idea is to make it working in environments where dynamic loading is not available. This is contradict to "but as the runtime JS files are usually dynamically loaded, this shouldn't be an issue.". And onnxruntime wasm is big: it's already > 20MB for the JSEP build.
There are also other considerations: wasm streaming loading, cache, memory usage (a huge JS file with a LONG string inside is not fun for the JS VM). If everything is bundled then it means user's main-bundle.min.js will also be huge, which is going to be a noticeable loading time and memory usage regression.

As a summary, the idea is very good solution for small wasm files but IMO onnxruntime-web is not small enough to turn the pros on cons.

Q: Why not build dedicated runtimes for the web platform and server-side JS?

In short, Emscripten does not allow to build multiple JS glue files for one webassembly output. See emscripten-core/emscripten#21899

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants