[Web] WebGPU and WASM Backends Unavailable within Service Worker #20876
Comments
Thank you for reporting this issue. I will try to figure out how to fix this problem. |
So it turns out that the cause is dynamic import. Currently, the WebAssembly factory (wasm-factory.ts) uses dynamic import to load the JS glue, which does not work in a service worker. A few potential solutions are also not available:
I am now trying to make a JS bundle that does not use dynamic import, specifically for use in service workers. Still working on it. |
Thanks, I appreciate your efforts around this. It does seem like some special-case bundle will need to be built after all; you might need |
I have considered this option. However, Emscripten does not offer an option to output both UMD (IIFE+CJS) and ESM for the JS glue (emscripten-core/emscripten#21899), so I have to choose one. I chose the ES6 output format for the JS glue because of a couple of problems when importing UMD from ESM, and I found a way to make ORT web work with it. Yes, this needs the build script to do some special handling. It will also only work for ESM, because the JS glue is ESM and there seems to be no way to import ESM from UMD in a service worker. |
### Description This PR allows building ORT web as `ort{.all|.webgpu}.bundle.min.mjs`, which does not contain any dynamic import. This makes it possible to use ORT web via static import in a service worker. Fixes #20876
@ggaabe Could you please help to try |
@fs-eire my project is dependent on transformersjs, which imports onnxruntime webgpu backend like this here: https://github.com/xenova/transformers.js/blob/v3/src/backends/onnx.js#L24 Is this the right usage? In my project I've added this to my package.json to resolve onnx-runtime to this new version though the issue is still occurring:
|
Maybe also important: The same error is still occurring in same spot in inference session in the onnx package and not from transformersjs. Do I need to add a resolver for onnxruntime-common as well? |
#20991 makes the default ESM import avoid dynamic import, and I hope this change fixes the problem. The PR is still in progress. |
Hi @fs-eire, is the newly-merged fix in a released build I can try? |
Please try 1.19.0-dev.20240612-94aa21c3dd |
@fs-eire EDIT: Nvm the comment I just deleted, that error was because I didn't set the webpack However, I'm getting a new error now (progress!):
|
Update: Found the error is happening in here: onnxruntime/js/common/lib/backend-impl.ts Lines 83 to 86 in fff68c3
For some reason the webgpu backend.init promise is rejecting due to the |
Could you share the steps to reproduce? |
@fs-eire You'll need to run the webGPU setup in a chrome extension.
|
@ggaabe I did some debug on my box and made some fixes -
|
Awesome, thank you for your thoroughness in explaining this and tackling this head on. Is there a dev channel version I can test out? |
Not yet. Will update here once it is ready. |
sorry to bug; is there any dev build number? wasn't sure how often a release runs |
Please try 1.19.0-dev.20240621-69d522f4e9 |
@fs-eire I'm getting one new error:
I pushed the code changes to my repo and fixed the call to the tokenizer. To reproduce, just type 1 letter in the chrome extension’s text input and wait |
Hey, I also need this. I am struggling with importing this version. So far I have been importing ONNX using |
just replace |
@kyr0 thank you a lot for your willingness to help. I am currently on vacation, but I will pick up this thread when I am back at the end of this month. |
@fs-eire Oh, I didn't mean to disturb you on vacation. Please enjoy, relax and have a lot of fun! |
@kyr0 do you by any chance have a fork that I can try with the custom I would like to try your workaround since I am stuck in a similar spot (.wasm needs to be imported in userland code) |
@asadm Well, I do have a solution for you to PoC if it would work, but I don't have a fork/PR yet. It's a bit messy, but I'll explain. If your code or a library that uses import { ... } from "onnxruntime-web/webgpu" I decided, just to try, to simply change the code of
{
  ".": {
    "node": {
      "import": "./dist/ort.mjs",
      "require": "./dist/ort.js"
    },
    "import": "./dist/ort.mjs",
    "require": "./dist/ort.js",
    "types": "./types.d.ts"
  },
  "./all": {
    "node": null,
    "import": "./dist/ort.all.bundle.min.mjs",
    "require": "./dist/ort.all.min.js",
    "types": "./types.d.ts"
  },
  "./wasm": {
    "node": null,
    "import": "./dist/ort.wasm.mjs",
    "require": "./dist/ort.wasm.js",
    "types": "./types.d.ts"
  },
  "./webgl": {
    "node": null,
    "import": "./dist/ort.webgl.min.mjs",
    "require": "./dist/ort.webgl.min.js",
    "types": "./types.d.ts"
  },
  "./webgpu": {
    "node": null,
    "import": "./dist/ort.webgpu.mjs",
    "require": "./dist/ort.webgpu.js",
    "types": "./types.d.ts"
  },
  "./training": {
    "node": null,
    "import": "./dist/ort.training.wasm.min.mjs",
    "require": "./dist/ort.training.wasm.min.js",
    "types": "./types.d.ts"
  }
},
If you do that change in your local filesystem in your project, your build process will now point to those files, no matter what your build system looks like. The next thing was to make I automated this process in a little NPM postinstall script, as I didn't want to spend much time on figuring out all the build processes of
.replace(/importWasmModule\(/g, '(typeof env.importWasmModule === "function" ? env.importWasmModule : importWasmModule)(')
I know, I know. Hacky.. but pragmatic. You will lose your changes each time you re-install your dependencies. But, after all, you can now simply assign the function to the
// @ts-ignore
import getModule from "./node_modules/onnxruntime-web/dist/ort-wasm-simd-threaded.jsep";
// you may need to copy this file and the WASM file into a folder so that the loader can fetch() it well:
// ./node_modules/onnxruntime-web/dist/ort-wasm-simd-threaded.wasm
// this is a working example in my project - it loads just fine now
env.backends.onnx.importWasmModule = async (
  mjsPathOverride: string,
  wasmPrefixOverride: string,
  threading: boolean,
) => {
  console.log(
    "importWasmModule",
    mjsPathOverride,
    wasmPrefixOverride,
    threading,
  );
  return [
    undefined,
    async (moduleArgs = {}) => {
      console.log("moduleArgs", moduleArgs);
      return await getModule(moduleArgs);
    },
  ];
};
My proposal would be just to change this one line of code in this project to allow for optional Inversion of Control, @fs-eire -- it could be documented with my example code. This would probably solve all issues regarding "user-land based WASM loading". Okay, I made a PR for that: #21430 |
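The postinstall patch described in the comment above can be pictured as a plain string transformation. This is a minimal sketch, assuming the bundle contains call sites of the form `importWasmModule(`. The helper name `patchImportWasmModule` and the sample input line are hypothetical; only the regular expression and the replacement text are taken from the comment.

```typescript
// Hypothetical helper: applies the replacement quoted above to the text of a bundle.
// Every call site `importWasmModule(` is redirected to an override on `env`
// when one has been assigned, and to the built-in loader otherwise.
function patchImportWasmModule(source: string): string {
  return source.replace(
    /importWasmModule\(/g,
    '(typeof env.importWasmModule === "function" ? env.importWasmModule : importWasmModule)(',
  );
}

// Made-up input standing in for one line of the unminified bundle.
const sampleBundleLine =
  "const [url, factory] = await importWasmModule(mjsPath, wasmPrefix, threads);";
console.log(patchImportWasmModule(sampleBundleLine));
```

Because the replacement text itself never contains `importWasmModule(`, running the patch twice leaves the output unchanged, which matters for a script that runs on every install.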
@kyr0 that is amazing! I was also hacking around with unminified bundle (don't want to rebuild from source etc). Thank you so much for detailed solution, can't wait to try this when I get home! |
@asadm You're welcome. No worries, we'll get this to work just fine, also for you :) Here's an impression from my background worker with the monkey patch applied. It's working, even through the Might take a while though, until downstream projects will adopt the new version. But once the maintainers here deploy a new version including my PR, we will be able to just override the |
@asadm https://github.com/kyr0/easy-embeddings demonstrates the whole process; I automated the monkey-patching for the moment... https://github.com/kyr0/easy-embeddings/blob/main/scripts/setup-transformers.ts |
I created #21534, which is a replacement of #21430:
|
### Description This PR adds a new option `ort.env.wasm.wasmBinary`, which allows the user to set a buffer containing the preloaded .wasm file content. This PR should resolve the problem from the latest discussion in #20876.
1.19.0-dev.20240801-4b8f6dcbb6 includes the change. |
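A minimal sketch of how the new option might be used, assuming the .wasm file is fetched by user code before any session is created. Only `ort.env.wasm.wasmBinary` comes from the PR description; the `WasmEnvLike` interface, the `preloadWasmBinary` helper, and the injected `fetchBytes` function are hypothetical stand-ins so the snippet does not depend on onnxruntime-web itself.

```typescript
// Minimal shape of the part of `ort.env` this sketch touches
// (an assumption, not the library's full type definition).
interface WasmEnvLike {
  wasm: { wasmBinary?: ArrayBufferLike | Uint8Array };
}

// Hypothetical helper: loads the .wasm bytes up front and hands them to the
// runtime through `env.wasm.wasmBinary`.
async function preloadWasmBinary(
  env: WasmEnvLike,
  wasmUrl: string,
  fetchBytes: (url: string) => Promise<ArrayBuffer>,
): Promise<void> {
  const bytes = await fetchBytes(wasmUrl);
  if (bytes.byteLength === 0) {
    throw new Error(`empty .wasm payload from ${wasmUrl}`);
  }
  env.wasm.wasmBinary = bytes;
}
```

In a real worker, `env` would be `ort.env` and `fetchBytes` could be something like `(url) => fetch(url).then((r) => r.arrayBuffer())`. The buffer presumably has to be assigned before the first inference session is created.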
To clarify, is the best way to go about running transformers.js with WebGPU for the onnxruntime to monkeypatch the package to make the necessary wasm stuff load in each service worker, a la @kyr0's Has anyone had luck / have tips for just running the v3 branch of Transformers.js? Or, maybe more precisely — do we know how something like Segment Anything WebGPU, which Xenova has in an HF Space, is working? Seems like there's been some official solution here but I can't find it documented / implemented well. |
I am working with Transformers.js to make the v3 branch compatible with the latest module system. This is one of the merged changes: huggingface/transformers.js#864. You probably need to use some workaround for now, but (hopefully) eventually you should be able to use it out of the box. |
@lucasgelfond Now that the new updates from @fs-eire are in place, I'm probably able to streamline the workaround. I'll have a look soon, but as I'm on vacation right now, I cannot give an ETA, unfortunately. |
Has anyone tried getting these imports working in Vite/other bundlers? When I try the classic:
(which works in create-react-app), Vite says:
Anyways, I tried importing from url, a la
which Vite also doesn't like
I disabled SSR in Svelte but still seemingly no luck/change. I tried manually downloading the files with CURL, where I got an error about the lack of source map, so, I also downloaded .min.js.map. When I run it now, this works, but I get back to the original error in the thread about unavailable backends:
I figured it might work to just import directly, so I also tried:
but then I got Anyone have ideas of how to handle? Happy to add more verbose error messages for any of the stuff above. |
Could you share a repo in which I can reproduce the issue? I will take a look. |
@fs-eire you are amazing! https://github.com/lucasgelfond/webgpu-sam2 I swapped over to Webpack (in the svelte-webpack directory) but the original Vite version is in there. No immediate rush because I solved temporarily with Webpack, but Webpack breaks some other imports so would be awesome to move back—thanks so much again! |
👋 Thank you @fs-eire ! I tried using |
Doesn't it work by just replacing |
It works indeed 😵 I tried doing |
So im still having this issue in 1.19.2. This is in the context of a chrome extension, mv3, This: onnxruntime/js/web/lib/wasm/wasm-factory.ts Line 119 in e91ff94
Calling this:
Seems to lead to this:
I realize the poster above me is running the same setup and has it working, but Im really not sure what to do differently. Using code from the test file, ive tried replicating it like so, but this doesnt seem to work:
|
If you are using 1.19.2 and still run into this error, it is probably because your bundler imports onnxruntime-web as UMD. Please verify the following:
|
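One way to reason about which file a bundler ends up with is to walk the package's conditional exports by hand. The sketch below is a deliberately simplified lookup, not Node's or any bundler's actual algorithm: it assumes keys are tried in insertion order and that a `null` target means "not exported". The `resolveExport` function and `ExportTarget` type are hypothetical; the sample entry mirrors the `./webgpu` entry of the exports map quoted earlier in this thread.

```typescript
type ExportTarget = string | null | { [condition: string]: ExportTarget };

// Simplified conditional-exports lookup (an illustration only).
// A key matches when it is "default" or is listed in `conditions`.
function resolveExport(
  target: ExportTarget,
  conditions: string[],
): string | null | undefined {
  if (typeof target === "string" || target === null) {
    return target;
  }
  for (const key of Object.keys(target)) {
    if (key === "default" || conditions.indexOf(key) !== -1) {
      const resolved = resolveExport(target[key], conditions);
      if (resolved !== undefined) {
        return resolved;
      }
    }
  }
  return undefined; // no condition matched
}

const webgpuEntry: ExportTarget = {
  node: null,
  import: "./dist/ort.webgpu.mjs",
  require: "./dist/ort.webgpu.js",
  types: "./types.d.ts",
};

// A consumer resolving with the "import" condition gets the ESM build.
console.log(resolveExport(webgpuEntry, ["import", "browser"]));
// A consumer resolving with "require" gets the UMD build instead.
console.log(resolveExport(webgpuEntry, ["require"]));
```

A bundler that ignores the exports map altogether and reads a legacy field such as `browser` or `main` would bypass this lookup entirely, which is one way to end up with the UMD file.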
Thanks so much for your help! The bundler was indeed the issue. For anyone reading this: I was using vite 4 and it was preferring the browser field in the package.json, which led to the wrong file. Switching to vite 5 solves that issue as you can change the order of fields, even though it will by default already prefer the exports field. I have another issue now though: Now that the correct file has made it, I am getting this error:
any ideas what that might be? edit: edit#2: web worker is not available in service workers which run through the background script, hence cpu does not work there edit#3 should multithreading be possible in a chrome extension? ive got this So yeah if anyone has successfully used cpu multithreading in a chrome extension, doesnt matter how, please let me know. |
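On the multithreading question, here is a rough sketch of the kind of capability check that could decide the thread count before a session is created. It assumes, as noted in the comment above, that `Worker` is not available in the extension's service worker, and additionally that threaded WASM needs `SharedArrayBuffer`. `ThreadCaps`, `pickNumThreads`, and the cap of 4 threads are hypothetical; assigning the result to a setting such as `ort.env.wasm.numThreads` is likewise an assumption, not something stated in this thread.

```typescript
// Capabilities are passed in explicitly so the function can be exercised
// outside a browser; in a worker they would come from `typeof Worker`,
// `typeof SharedArrayBuffer`, and `navigator.hardwareConcurrency`.
interface ThreadCaps {
  hasWorker: boolean;
  hasSharedArrayBuffer: boolean;
  hardwareConcurrency: number;
}

// Hypothetical policy: fall back to a single thread whenever either
// prerequisite is missing, otherwise use half the cores, capped at 4.
function pickNumThreads(caps: ThreadCaps): number {
  if (!caps.hasWorker || !caps.hasSharedArrayBuffer) {
    return 1;
  }
  return Math.max(1, Math.min(4, Math.floor(caps.hardwareConcurrency / 2)));
}
```

Under this policy a service worker without `Worker` always ends up single-threaded, regardless of how many cores the machine has.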
In my understanding, |
Describe the issue
I'm running into issues trying to use the WebGPU or WASM backends inside of a ServiceWorker (on a chrome extension). More specifically, I'm attempting to use Phi-3 with transformers.js v3
Every time I attempt this, I get the following error:
This is originating in the InferenceSession class in js/common/lib/inference-session-impl.ts. More specifically, it's happening in this method:
const [backend, optionsWithValidatedEPs] = await resolveBackendAndExecutionProviders(options);
where the implementation is in js/common/lib/backend-impl.ts and the tryResolveAndInitializeBackend fails to initialize any of the execution providers.
WebGPU is now supported in ServiceWorkers though; it is a recent change and it should be feasible. Here were the chrome release notes.
Additionally, here is an example browser extension from the mlc-ai/web-llm framework that implements WebGPU usage in service workers successfully:
https://github.com/mlc-ai/web-llm/tree/main/examples/chrome-extension-webgpu-service-worker
Here is some further discussion on this new support from Google itself:
https://groups.google.com/a/chromium.org/g/chromium-extensions/c/ZEcSLsjCw84/m/WkQa5LAHAQAJ
So technically I think it should be possible for this to be supported now? Unless I'm doing something else glaringly wrong. Is it possible to add support for this?
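For readers unfamiliar with how this failure surfaces, the behaviour described above can be pictured as a loop that tries each requested execution provider in turn and only throws once none of them initializes. The sketch below is a simplified model under that assumption, not the code in js/common/lib/backend-impl.ts; `FakeBackend`, `resolveFirstAvailable`, and the error text are hypothetical.

```typescript
interface FakeBackend {
  name: string;
  init: () => Promise<void>; // rejects when the backend cannot start
}

// Tries backends in the order given and returns the name of the first one
// whose init() resolves. Failures are collected so the final error can say
// why every candidate was rejected.
async function resolveFirstAvailable(backends: FakeBackend[]): Promise<string> {
  const errors: string[] = [];
  for (const backend of backends) {
    try {
      await backend.init();
      return backend.name;
    } catch (e) {
      const reason = e instanceof Error ? e.message : String(e);
      errors.push(`[${backend.name}] ${reason}`);
    }
  }
  throw new Error(`no backend could be initialized: ${errors.join(", ")}`);
}
```

In this model, a service worker where every backend's `init` rejects (for example because the loader relies on a feature the worker lacks) produces a single aggregated error, even though the underlying cause sits in each backend's initialization.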
To reproduce
Download and set up the transformers.js extension example and put this into the background.js file:
Urgency
this would help enable a new ecosystem to build up around locally intelligent browser extensions and tooling.
it's urgent for me because it would be fun to build and I want to build it and it would be fun to be building it rather than not be building it.
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
1.19.0-dev.20240509-69cfcba38a
Execution Provider
'webgpu' (WebGPU)