[browser] WASM sidecar - multi-threading proposal #91731

Closed

Conversation

pavelsavara
Member

@pavelsavara pavelsavara commented Sep 7, 2023

This is an (alternative 10-like) "sidecar" proposal on how to enable threads on WASM while allowing blocking .Wait, lock() { ... } and similar calls from user C# code.

Feedback and questions are welcome.

Contributes to #85592

Alternative to #91696

@pavelsavara pavelsavara added arch-wasm WebAssembly architecture area-VM-threading-mono os-browser Browser variant of arch-wasm labels Sep 7, 2023
@pavelsavara pavelsavara added this to the 9.0.0 milestone Sep 7, 2023
@pavelsavara pavelsavara self-assigned this Sep 7, 2023
@ghost

ghost commented Sep 7, 2023

Tagging subscribers to 'arch-wasm': @lewing
See info in area-owners.md if you want to be subscribed.


## C# Thread
- could block on synchronization primitives
- without JS interop; calling JSImport will throw PlatformNotSupportedException (PNSE).
Member

I think leaving out JSImport makes sense as a 1.0 version of this, but it's meaningful to support JSImport so that the "server" here can utilize browser APIs, to compensate for the lack of native APIs. Like for example if the "server" is able to use an offscreen canvas it would be able to access hardware accelerated rendering, which would enable developers to use stuff like browser machine learning APIs.

Member Author

@pavelsavara pavelsavara Sep 22, 2023

To be able to do that we need JSImport/JSExport interop on the UI thread.

There are 4 options for doing that:
A) dispatch via managed code, as described in the deputy-worker proposal

B) dispatch via just JS - double proxy

  • This means we would have a JS proxy in the UI thread of a C# proxy in the side-car worker.
  • Comlink style.
  • We could write our own and make it also spin-block the UI.
  • But this is a double dispatch on each call, hopping over 2 threads and their main loops. This applies when the caller was not on the side-car.
  • It will be slow and difficult to GC.
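
The double-proxy idea in option B can be pictured with a minimal sketch. This is not Comlink's actual implementation; `createCallProxy`, the `send` callback, and the message shape are hypothetical names used only for illustration. Each property access extends a path and each call becomes one message, which is where the per-call hop to the other thread comes from.

```javascript
// Hypothetical helper: builds a proxy whose calls are turned into messages.
// `send` stands in for the transport (postMessage, SharedArrayBuffer, ...).
function createCallProxy(send, path = []) {
    // The target is an arrow function so that the proxy itself is callable.
    return new Proxy(() => {}, {
        get(_target, prop) {
            // Extend the path lazily; nothing is sent until a call happens.
            return createCallProxy(send, [...path, String(prop)]);
        },
        apply(_target, _thisArg, args) {
            // One message per call; the reply has to travel back separately.
            return send({ path, args });
        },
    });
}
```

With a transport that returns a promise, `proxy.Sample.Test.displayMeaning(1, 2)` would send `{ path: ["Sample", "Test", "displayMeaning"], args: [1, 2] }` and hand back that promise, so every such call is asynchronous unless the caller also blocks or spins.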

C) dispatch just the mono_wasm_bind_js_function and mono_wasm_bind_cs_function and then do the JS side of the marshaling in the UI thread. But the implementation is heavily dependent on

  • memory (that's the easy one)
  • shared code invoke-cs.ts, invoke-js.ts etc
  • but that is dependent on Mono C methods and emscripten methods.
  • Most of them are synchronous and need to be fast. Because some of them are called per argument.
  • But this proposal assumes that emscripten is not on UI thread!
    • emscripten: stack allocation, memory views (growing)
    • GC and JS handles (I guess those should be UI thread local)
    • various JS helpers (logging, exception handling, asserts)
    • mono: string marshaling, gc roots
    • mono: call dispatch to managed code: instantiate TaskCompletionSource etc

Member Author

@pavelsavara pavelsavara Sep 22, 2023

D) we could further move the boundary and have it between full dotnet.runtime.js on the UI thread and emscripten+MONO on the side-car. That would narrow it down to

  • memory view update events
  • per assembly
    • mono_wasm_runtime_run_module_cctor
  • per method binding
    • BindJSFunction
    • BindCSFunction
    • mono_wasm_assembly_find_class
    • mono_wasm_assembly_find_method
    • mono_wasm_invoke_method_ref
    • free
  • per method invoke
    • InvokeJSFunction
    • InvokeImport
    • mono_wasm_invoke_method_bound
    • stackalloc
  • per parameter when array
    • mono_wasm_deregister_root/DeregisterGCRoot
    • mono_wasm_register_root/RegisterGCRoot
    • malloc
  • per parameter instance when proxy
    • release_js_owned_object_by_gc_handle for proxy of C# object
    • mono_wasm_release_cs_owned_object/ReleaseCSOwnedObject for JSObject proxy of C# object
    • get_managed_stack_trace_method
  • per parameter instance when string
    • mono_wasm_string_get_data_ref
    • mono_wasm_string_from_utf16_ref
    • mono_wasm_write_managed_pointer_unsafe
    • mono_wasm_deregister_root
  • per parameter instance when promise/task/function/delegate
    • create_task_callback_method
    • complete_task_method
    • call_delegate_method
    • MarshalPromise

Most of the C methods above need a GC boundary and a Mono-registered thread.

Perhaps we could marshal strings and other value types already in the side-car.

Member

I was proposing that the sidecar worker be able to access JS inside the sidecar context, to be clear. Not JS objects from the main thread. No remoting, so it can still be synchronous.

Member Author

@pavelsavara pavelsavara Sep 22, 2023

We need some JS interop for Blazor anyway. The current surface is:

This is about startup, loading and embedding

  • INTERNAL.loadLazyAssembly - async, string
  • INTERNAL.loadSatelliteAssemblies - async, string
  • Blazor._internal.getApplicationEnvironment, string
  • receiveHotReloadAsync - async void

This needs to hit UI thread (but the payload is string/bytes, not objects)

  • Blazor._internal.endInvokeDotNetFromJS
  • Blazor._internal.invokeJSJson
  • Blazor._internal.receiveByteArray

This needs to hit UI thread

  • Blazor._internal.getPersistedState -> sync string

Could be on sidecar

  • globalThis.console.debug
  • globalThis.console.error
  • globalThis.console.info
  • globalThis.console.warn
  • Blazor._internal.dotNetCriticalError

This is related to renderBatch (which we could do the same way as Blazor server does and skip this)

  • MONO.getI16
  • MONO.getI32
  • MONO.getF32
  • BINDING.js_string_to_mono_string
  • BINDING.conv_string
  • Blazor._internal.renderBatch
  • BINDING.unbox_mono_obj - this one could be tricky

I hope this is out of scope

  • ICall InvokeJS

@kg
Member

kg commented Sep 7, 2023

This is a robust proposal, I could see either this or the original one working really well. I think it ends up depending on the user scenarios each proposal is best at solving.

@pavelsavara pavelsavara changed the title [browser] WASM server for Blazor - multi-threading proposal [browser] WASM side-car - multi-threading proposal Sep 18, 2023
@pavelsavara
Member Author

pavelsavara commented Sep 20, 2023

@elringus thanks for chiming in, appreciated. I moved our conversation here.

To minimize breaking changes in the APIs, I've used comlink
except that they all become async, and blocking calls from JS to C# were no longer possible.

Something like that is what I'm considering in this "side-car" design.
This approach is dispatch to the UI thread via JS.

The other approach describes how to do the call dispatch via C#/managed/emscripten.

During this research, I realized that out of the 4 combinations [JSImport/JSExport] x [Sync/Async], the only thing which needs to be spin-waiting is sync JSExport (and internal variations). The other sync scenario, sync JSImport (and internal methods), could be truly blocking via Atomics.
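
A minimal sketch of the two waiting styles contrasted above, assuming a one-slot `Int32Array` over a `SharedArrayBuffer` as the completion flag. The names `blockingWait`, `spinWait`, and `signalDone` are hypothetical and not part of the runtime. Browsers reject `Atomics.wait` on the main thread, which is why the UI thread side of a sync JSExport would have to poll instead.

```javascript
// Hypothetical flag values for a single "reply ready" slot.
const PENDING = 0;
const DONE = 1;

// For worker threads: sleeps without burning CPU until the flag
// leaves PENDING or the timeout expires.
function blockingWait(flag, timeoutMs) {
    const result = Atomics.wait(flag, 0, PENDING, timeoutMs);
    // "not-equal" means the reply had already arrived before we slept.
    return result !== "timed-out";
}

// For the UI thread, where Atomics.wait is not allowed: poll the flag.
function spinWait(flag, timeoutMs) {
    const deadline = Date.now() + timeoutMs;
    while (Atomics.load(flag, 0) === PENDING) {
        if (Date.now() > deadline) return false; // gave up waiting
    }
    return true;
}

// Called by the thread that produced the reply.
function signalDone(flag) {
    Atomics.store(flag, 0, DONE);
    Atomics.notify(flag, 0); // wakes any thread parked in Atomics.wait
}
```

The spin-wait keeps one core busy for the whole duration of the call, which is the cost the comment above wants to confine to the sync JSExport case.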

There are hairy details of managing C# stack. Also JSObject/ManagedObject proxies which I'm thinking about now, because it would be better to have proxy directly created from managed call in the UI thread, rather than have comlink-like proxy of a proxy.

BTW: how does comlink deal with nested structures? Like

    setModuleImports("main.js", {
        Sample: {
            Test: {
                displayMeaning: () => { return something }
            }
        }
    });

In the end, however, I've rolled all that back to blocking. Main reason was debug experience.

This is valuable feedback.

With this design, I think that

  • All the legacy interop and memory functions will just throw.
  • There are a few sync APIs of dotnet.create() which need to keep working. I'm considering making them spin-blocking.
  • The sync JSExport would do spin-wait, and complain loudly into browser console, like emscripten does.
    Same for a callback passed to JSImport. There should be an MSBuild switch that makes it throw instead. I'm not sure if that will be opt-in or opt-out.
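
One way to picture the warn-or-throw switch from the last bullet. The function name and the `throwOnSyncCall` option are hypothetical; they do not correspond to an existing MSBuild property or runtime API.

```javascript
// Hypothetical guard that would run before a sync JSExport starts spin-waiting.
function reportSyncCallOnUIThread(exportName, options = {}) {
    const { throwOnSyncCall = false, log = console.warn } = options;
    const message =
        `Synchronous call to '${exportName}' spin-waits on the UI thread until the side-car replies.`;
    if (throwOnSyncCall) {
        // Strict mode: refuse the call instead of blocking the UI.
        throw new Error(message);
    }
    // Default mode: allow the call but complain loudly.
    log(message);
    return message;
}
```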

Another point is performance/responsiveness. While the UI was not blocking anymore (though it was a hardly noticeable improvement in itself), total time spent on interop became noticeably larger, especially in tight back-and-forth loops.

I hope that Atomics.wait would be faster than self.postMessage.
I'm not sure if that's reason enough to skip postMessage completely and go full SharedArrayBuffer for all messages.
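
For comparison, here is a sketch of what passing a message purely through a `SharedArrayBuffer` could look like. The layout (a two-slot `Int32Array` header followed by payload bytes) and the names `createMailbox`, `writeMessage`, and `tryReadMessage` are assumptions made for illustration, and it handles only one message in flight. Whether this is actually faster than `postMessage` would still need to be measured.

```javascript
// Hypothetical layout: Int32 header [state, byteLength], then payload bytes.
const HEADER_BYTES = 8;
const EMPTY = 0;
const FULL = 1;

function createMailbox(capacityBytes) {
    const buffer = new SharedArrayBuffer(HEADER_BYTES + capacityBytes);
    return {
        header: new Int32Array(buffer, 0, 2),
        payload: new Uint8Array(buffer, HEADER_BYTES, capacityBytes),
    };
}

function writeMessage(mailbox, message) {
    const bytes = new TextEncoder().encode(JSON.stringify(message));
    if (bytes.length > mailbox.payload.length) {
        throw new RangeError("message does not fit into the mailbox");
    }
    mailbox.payload.set(bytes);
    Atomics.store(mailbox.header, 1, bytes.length);
    Atomics.store(mailbox.header, 0, FULL); // publish only after the payload is written
    Atomics.notify(mailbox.header, 0);      // wake a reader parked in Atomics.wait
}

function tryReadMessage(mailbox) {
    if (Atomics.load(mailbox.header, 0) !== FULL) return undefined;
    const length = Atomics.load(mailbox.header, 1);
    // slice() copies into a non-shared buffer, since some environments
    // reject shared views in TextDecoder.decode.
    const bytes = mailbox.payload.slice(0, length);
    const message = JSON.parse(new TextDecoder().decode(bytes));
    Atomics.store(mailbox.header, 0, EMPTY); // hand the slot back to the writer
    return message;
}
```

A reader on a worker could park in `Atomics.wait(mailbox.header, 0, EMPTY)` before calling `tryReadMessage`, while a reader on the UI thread would have to poll.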

I have similar concerns about dispatch between C# threads. We need to measure it.

Imho, running the entire C# module on a web worker is not worth it

The main motivation for moving off the UI thread is to allow C# to have consistent and blocking .Wait, lock() etc. on all threads.

We tried to do threads in Net8, and one of the reasons we didn't deliver that is the implications of treating one of the threads differently. That's confusing to the developer and difficult in existing code-bases.

Allowing blocking everywhere would also allow us to finally implement C# crypto via browser's subtle.

However, in the case of computation-heavy tasks, it sure makes sense to off-load them to worker threads.

That's already possible with Net8 experimental workload. It just has some rough edges, like running out of emscripten's thread pool and maybe leaking threads.

@pavelsavara
Member Author

pavelsavara commented Sep 20, 2023

Partytown has a neat trick: it uses synchronous XHR + a service worker instead of spin-wait.
That's deprecated, but better because it doesn't eat battery.

@elringus

elringus commented Sep 20, 2023

BTW: how does comlink deal with nested structures?

Iirc, it uses deep clone, so yes. Alternatively, they also use transferable for types that support it and JS Proxies for callbacks (as it's not possible to clone or transfer functions to another context/thread in JS).
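
To illustrate why callbacks need proxies, here is a small sketch. Structured cloning, which `postMessage` relies on, throws on functions, so a nested import object like the one quoted above cannot be sent as-is. One possible workaround, shown with a hypothetical `collectFunctionPaths` helper, is to send only the dotted paths and keep the functions on the thread that owns them. This is an assumption-laden illustration, not how Comlink itself is implemented.

```javascript
// Hypothetical helper: walks a nested import object and returns the
// dotted path of every function found in it.
function collectFunctionPaths(node, prefix = []) {
    const paths = [];
    for (const [key, value] of Object.entries(node)) {
        const path = [...prefix, key];
        if (typeof value === "function") {
            paths.push(path.join("."));
        } else if (value !== null && typeof value === "object") {
            paths.push(...collectFunctionPaths(value, path));
        }
    }
    return paths;
}

const imports = {
    Sample: {
        Test: {
            displayMeaning: () => 42,
        },
    },
};

// Functions are not cloneable, so cloning the whole structure fails.
let cloneFailed = false;
try {
    structuredClone(imports);
} catch (error) {
    cloneFailed = true;
}

// The paths are plain strings and can cross the thread boundary.
const functionPaths = collectFunctionPaths(imports);
```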

One more thing I forgot to mention is the frontend behavior with this kind of setup, which was another reason I switched back to full blocking. Imagine a button with hover and active animations. With blocking, when the user clicks the button, the UI thread is blocked until the underlying code has fully executed, and the button remains in the active (pressed) visual state during this time. While in theory it may even be desirable not to block the UI here (to get rid of the slight stutter on button click), in reality the UX becomes weird: the user clicks the button, sees the active state, which then immediately returns to normal, and then nothing happens (while the code is executed async). This also opens the door to all kinds of unspecified behavior, as the user may accidentally click twice or interact with something else while the call is not finished. The solution here would be blocking the interaction and authoring special CSS styles for the time when the async call is executing, but that adds layers of complexity.

That's already possible with Net8 experimental workload. It just has some rough edges, like running out of emscripten's thread pool and maybe leaking threads.

Oh, so that's possible without all the limitations with sync/blocking calls from JS to C# (JSExport)? I wasn't able to test it, as it doesn't seem to support module injection I'm using when creating the runtime (#90392), but if it'll be possible in the future, this would completely cover all the needs I have for threading.

@pavelsavara
Member Author

One more thing I forgot to mention is the frontend behavior with this kind of setup, which was another reason I switched back to full blocking. Imagine a button with hover and active animations. With blocking, when the user clicks the button, the UI thread is blocked until the underlying code has fully executed, and the button remains in the active (pressed) visual state during this time.

Steve already answered that for Blazor in #91696 (comment).
I think that he is right and that this is not a dotnet runtime issue to solve.

Oh, so that's possible without all the limitations with sync/blocking calls from JS to C# (JSExport)?

You are on your own, we would be happy to hear about trouble when you try that, but we probably would not try to fix it.

it doesn't seem to support module injection I'm using when creating the runtime (#90392)

That's actually one of the difficult problems with the side-car: how to make it webpack-friendly.

@elringus

I think that he is right and that this is not dotnet runtime issue to solve.

Sure, but if there are no useful cases for this mode, no one will bother to solve it. Even worse, users will spend time and resources adapting to this mode only to later discard all the work. Maybe some kind of warning explaining all the inherent pitfalls and additional complexity would help here.

You are on your own, we would be happy to hear about trouble when you try that, but we probably would not try to fix it.

I mean, does it run the main/entry .NET runtime on webworker (in which case all the blocking interop limitations apply, which is not useful in my case) or it runs in the same way as ST, but has an API to dispatch tasks to worker threads?

@pavelsavara
Member Author

pavelsavara commented Sep 20, 2023

explaining ... additional complexity would help here.

This is for advanced users, such as authors of higher-level UI frameworks like Blazor and Uno.
But yeah, as I said earlier, I'm considering keeping the blocking of the UI thread as part of the design here.

I mean, does it run the main/entry .NET runtime on webworker

In Net8, no: emscripten doesn't work on a web worker with the MT build. I'm fixing that for Net9.

but has an API to dispatch tasks to worker threads?

You can use the C# thread pool.

JSImport needs thread affinity of the caller and we have not exposed API for that. JSExport has the blocking problem.

@pavelsavara pavelsavara changed the title [browser] WASM side-car - multi-threading proposal [browser] WASM sidecar - multi-threading proposal Sep 26, 2023
@pavelsavara
Member Author

Partytown has a neat trick: it uses synchronous XHR + a service worker instead of spin-wait. That's deprecated, but better because it doesn't eat battery.

But it only works on Firefox https://wpt.fyi/results/service-workers/service-worker/fetch-request-xhr-sync.https.html?label=experimental&label=master&aligned

@pavelsavara
Member Author

closing in favor of dotnet/designs#301

@ghost ghost locked as resolved and limited conversation to collaborators Oct 28, 2023
@pavelsavara pavelsavara deleted the browser_threads_in_the_box branch September 2, 2024 15:35