JS interop and glue code #119

Closed
kg opened this issue Jun 8, 2015 · 11 comments

@kg
Contributor

kg commented Jun 8, 2015

There's a gap in our current plans regarding JS interop and JS glue code.

At present, an emscripten-compiled module includes a bunch of JS interop and glue code that wraps the asm.js module. When you load an emscripten-compiled module, the glue code is what runs and does the work of turning that asm.js module into something that is exposed to javascript.

So far it seems that we are assuming that wasm will eliminate the need for most of this glue javascript. I agree with this. However, some of it is going to have to stick around no matter what, because it abstracts away the emscripten ABI and provides a surface the emscripten-compiled C++ can use to talk to the DOM. Eliminating that glue will not be feasible until we've specified an ABI and webassembly can interact with GC objects.

At present the solution to these issues would probably be something like this:
  • Compilers like emscripten generate a wrapper javascript file (foo.wrap.js) next to every wasm module they generate (foo.wasm). Consumers have to import foo.wrap.js, not foo.wasm. The ABI is abstracted by each generated .wrap.js file.
  • The .wrap.js file exposes its interop code to the webasm... somehow? Right now it'd be passed in to the asm.js module via that globals/imports object, but we're doing away with that and having webassembly pull things in via ES6 imports. So does the webasm module import from the glue JS while the glue JS imports the webasm module?
  • The glue JS is required for an asm.js/webasm module to be able to interop with some DOM APIs, for example ones that take strings or arrays. Presently this is simple since emscripten just bundles the necessary wrapper code in - when we take that away the glue code will be mandatory for those modules to work.

Aside from the complexity here, there are some real threats:

  • Once dynamic linking is introduced, webasm modules will need a way to import the unwrapped webasm module so they have direct access to the native entry points, instead of the wrapped JS ones being exposed by the wrapper JS module.
  • Developers may tire of the wrapper JS mess and decide to instead use a single common glue/wrapper library that imports all their modules directly. In this scenario, the emscripten ABI becomes a de-facto standard. (I consider this an extremely realistic threat, since that's the approach I went with for my compiler's emscripten interop.)

It's my opinion that we need to spec - even in the MVP - a basic mechanism for bundling a glue/wrapper JS file into a webasm module/executable. The glue/wrapper file will serve as an analogue to how COM type libraries (IDL descriptions) can be embedded into native win32 .dll and .exe files to allow a consumer to import directly from those executables without any supporting files. In native runtime environments (non-JS) and dynamic linking scenarios, the glue JS would be ignored since it serves no purpose.

My rough proposal for this would be:
Every webasm file has an optional 'glue' section that contains an ecmascript module. If provided, this module can be imported by the webasm code via a special name as if it were a regular ES6 module (i.e. import { supportFunction } from 'glue').

The glue module can expose a special function that is invoked after the webasm module has finished loading:
function doExport (nativeModule)
If provided, doExport is invoked and the native webasm module object is passed as a parameter. The return value of doExport is the actual object exported when JS imports the module. This allows you to add glue code as properties onto the asm module object, or return a special wrapper object instead (that hides methods, etc.)
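To make the proposed hook concrete, here is a minimal sketch of how `doExport` could behave. Nothing here is a real loader API: `nativeModule`, its `_vec_length` and `_internal_reset` entry points, and the `resolveExports` loader step are all hypothetical stand-ins, written in plain JavaScript so the idea can be tried without a wasm toolchain.

```javascript
// Hypothetical stand-in for the loaded native module: raw, ABI-level entry points.
const nativeModule = {
  _vec_length(x, y) { return Math.sqrt(x * x + y * y); },
  _internal_reset() { /* not meant for JS consumers */ },
};

// The proposed glue hook: receives the native module and returns
// whatever object JS importers should see.
function doExport(native) {
  return {
    // Friendlier wrapper over the raw entry point.
    length(v) { return native._vec_length(v.x, v.y); },
    // _internal_reset is deliberately not re-exported.
  };
}

// Hypothetical loader step (an assumption, not part of any spec): if the glue
// defines doExport, its return value replaces the native module as the
// imported object; otherwise the native module is exposed unchanged.
function resolveExports(native, glue) {
  return typeof glue.doExport === "function" ? glue.doExport(native) : native;
}

const wrapped = resolveExports(nativeModule, { doExport });
const unwrapped = resolveExports(nativeModule, {});
```

In this sketch a JS consumer only ever sees `wrapped`, while a dynamic-linking consumer could still be handed the raw `nativeModule`, which is the split the proposal is after.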

@lukewagner
Member

> Eliminating that glue will not be feasible until we've specified an ABI and webassembly can interact with GC objects.

Note that for simple signatures, there is no need for fancy cwrap or other things: if all you take is pointers, integers and doubles, you can import the wasm module and call it directly. I like simple interfaces like this and for the modular/library uses of wasm, I think people will have a much nicer time designing w/ these limitations in mind until we allow more expressive GC thing types. That is, I'm not a big fan of trying to present high-level abstractions like C++ classes to JS, but, rather, simple C-style interfaces that return opaque (integer) handles and have operations for operating on these handles.
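As an illustration of the "call it directly" case, the sketch below instantiates a tiny hand-assembled module whose only export takes and returns 32-bit integers, then calls it with no wrapper at all. It uses the `WebAssembly.Module` / `WebAssembly.Instance` JavaScript API as it later shipped, which postdates this thread; the `add` export and the byte array are made up for the example.

```javascript
// Hand-assembled binary, equivalent to the text format:
//   (module (func (export "add") (param i32 i32) (result i32)
//     local.get 0  local.get 1  i32.add))
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one func of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

// Synchronous compile + instantiate is fine for a module this small.
const addInstance = new WebAssembly.Instance(new WebAssembly.Module(addModuleBytes));

// Integers cross the boundary as plain JS numbers, so no glue is needed.
const sum = addInstance.exports.add(2, 3);
```

An opaque-handle interface would look the same from JS: the handle is just another integer that the caller passes back into later calls.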

> The .wrap.js file exposes its interop code to the webasm... somehow?

Two choices here: the wasm module can import the .wrap.js file (so a circular import, which is well-defined to resolve to the same module (by default; this can be overridden) in ES6 modules) or a second JS module can be defined which is imported by the wasm module. I think the latter is preferable as it separates the concerns of (1) wrapping Web APIs for wasm consumption, something that can be shared as part of a standard library for each Web API, (2) wrapping wasm for handwritten JS consumption, something that should be generated custom for each module.
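A rough sketch of the second option, a separate JS module that the wasm module imports from, is shown here using the imports object of the `WebAssembly.Instance` constructor as it later shipped (not the ES6-import mechanism discussed in this thread). The `env.log` binding, the `run` export, and the hand-assembled bytes are all assumptions made for illustration; `log` records values in an array instead of touching a real Web API.

```javascript
// (1) Hypothetical binding module that adapts a host API for wasm consumption.
const logged = [];
const hostBindings = {
  env: {
    log(value) { logged.push(value); },
  },
};

// Hand-assembled binary, equivalent to the text format:
//   (module (import "env" "log" (func (param i32)))
//           (func (export "run") i32.const 42 call 0))
const importerBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version 1
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // types: (i32) -> (), () -> ()
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // import section: module "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   field "log", func of type 0
  0x03, 0x02, 0x01, 0x01,                                     // function section: one func of type 1
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // export section: "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                               // code section: one body, no locals
  0x41, 0x2a, 0x10, 0x00, 0x0b,                               // i32.const 42, call 0, end
]);

// The wasm module pulls `log` from the binding module at instantiation time.
const importer = new WebAssembly.Instance(
  new WebAssembly.Module(importerBytes),
  hostBindings,
);
importer.exports.run();
```

The wrapper for handwritten JS consumers, concern (2), would be a separate piece of code layered on top of `importer.exports`.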

> It's my opinion that we need to spec - even in the MVP - a basic mechanism for bundling a glue/wrapper JS file into a webasm module/executable.

The only problem I see this addressing is how to put two modules into one file. But this is not a problem specific to wasm and one being addressed by HTTP/2 and the packaging on the web TAG work. I don't think we should define a redundant solution to this.

Other than this putting two files into one, I don't see this increasing/decreasing reliance on the ABI (properly defined) since the JS code still depends on the ABI. Also, generalized glue code wrapping libraries are less efficient and bloatier than per-signature generated wrappers (witness XPCOM vs. new-DOM-bindings in Gecko), so I wouldn't expect 1 library to end up being shared that fixes an ABI, but rather for developers to use per-module automatically-generated wrapping code which inherently ties the wrapping code to the wasm module.

@kg
Contributor Author

kg commented Jun 8, 2015

> The only problem I see this addressing is how to put two modules into one file. But this is not a problem specific to wasm and one being addressed by HTTP/2 and the packaging on the web TAG work. I don't think we should define a redundant solution to this.

The problem I'm referring to isn't the filesystem/networking one (which is theoretically solved by packaging, as you point out) but the identity one. A considerable (IMO) burden is imposed on the consumer when you tell them that they need to carry around a bunch of paired files and remember which one to import where. I am not aware of an equivalent to this obligation in any development environment I've ever used. At most, you have the header/lib/executable pairing for C/C++, and the executable library file for systems like COM and .NET. Making the webasm code and the wrapper JS a part of a single unit (with the appropriate interface being exposed based on who imports it) is far more significant than just bundling two files together into a package for reduced network traffic.

It's sensible to scope this out entirely if we think it won't matter, but for the foreseeable future a major use case for asm.js (and webasm) is embedding native code, not just embedding entire applications. In this sense webasm modules + paired stray glue js files plus the additional mental overhead is a regression versus current emscripten/asm.js. Mistakenly importing the native module instead of the wrapper module will be a source of bugs in cases where the glue only differs from the native module in a handful of cases.

The heap data could also have remained a separate file (.mem.js or whatever) as it is in emscripten now, but we pulled it in to webasm as a section alongside the others because it makes sense for it to be there. I think the same should apply here.

@lukewagner
Member

I'm not sure the burden of having to move around 2 files instead of 1 is a big enough problem to warrant a semantic extension (much less in the MVP). Also, IIUC, the packaging proposal lets you literally have a zip. Lastly, in the short term, if we're doing our own compression at layers 2 and 3, it seems like we could just as well cat files together if there was a significant use case (and then get experience and iterate while it was in user space before specifying anything).
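One way the "cat files together" idea could look in user space is sketched below. The container layout (a 4-byte little-endian length, then the glue JS as UTF-8, then the wasm bytes) and the names `packBundle` / `unpackBundle` are invented for this example; they are not part of any spec or of emscripten.

```javascript
// Hypothetical layout: [u32 LE glue length][glue JS as UTF-8][wasm bytes]
function packBundle(glueSource, wasmBytes) {
  const glueBytes = new TextEncoder().encode(glueSource);
  const bundle = new Uint8Array(4 + glueBytes.length + wasmBytes.length);
  // Length prefix lets the loader find where the glue ends.
  new DataView(bundle.buffer).setUint32(0, glueBytes.length, true);
  bundle.set(glueBytes, 4);
  bundle.set(wasmBytes, 4 + glueBytes.length);
  return bundle;
}

function unpackBundle(bundle) {
  // Respect byteOffset so this also works on a view into a larger buffer.
  const view = new DataView(bundle.buffer, bundle.byteOffset, bundle.byteLength);
  const glueLength = view.getUint32(0, true);
  const glueSource = new TextDecoder().decode(bundle.subarray(4, 4 + glueLength));
  const wasmBytes = bundle.subarray(4 + glueLength);
  return { glueSource, wasmBytes };
}
```

A user-space loader could fetch the single file, call `unpackBundle`, and then hand the two parts to whatever script-loading and module-instantiation steps it uses.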

@kg
Contributor Author

kg commented Jun 8, 2015

> The problem I'm referring to isn't the filesystem/networking one (which is theoretically solved by packaging, as you point out) but the identity one.

:-)

> Lastly, in the short term, if we're doing our own compression at layers 2 and 3, it seems like we could just as well cat files together if there was a significant use case (and then get experience and iterate while it was in user space before specifying anything).

The idea of doing this in userspace as a transform instead of speccing it is interesting. I suppose you'd do it with a custom loader, and then you'd bypass the custom loader somehow to get at the actual native module? I'm not clear on how this would work, since the last time I checked the Loader part of the module spec isn't finalized. If you're confident that all of these problems can be solved in userspace and we won't regret it later, the section I propose can be a de-facto user-space standard instead of something in the spec.

@kripken
Member

kripken commented Jun 8, 2015

> So far it seems that we are assuming that webasm will eliminate the need for most of this glue javascript. I agree with this.

In the long term, wasm with typed objects and GC might help. Is that what is meant here?

Otherwise, I'm not sure how any of that glue code can be removed in the short or mid term. It takes a lot of work to form a shell around C/C++ things that make them look like a JS thing. For example, the glue code must contain methods to convert a C string into a JS string, and vice versa. On a C++ level, supporting extending a C++ class in JS and implementing virtual methods in JS takes quite a bit of hackery. And in general, lots of glue will be necessary to access web APIs from wasm through JS.
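The string-conversion glue mentioned above can be sketched as follows. This is not emscripten's actual implementation: `writeCString` and `readCString` are hypothetical helper names, and `demoHeap` is a plain `Uint8Array` standing in for the module's linear memory, with strings assumed to be NUL-terminated UTF-8.

```javascript
// Stand-in for the module's linear memory (an assumption for this sketch).
const demoHeap = new Uint8Array(64);

// Copy a JS string into the heap at `ptr` as NUL-terminated UTF-8.
// Returns the number of bytes written, not counting the terminator.
function writeCString(heap, ptr, str) {
  const bytes = new TextEncoder().encode(str);
  if (ptr + bytes.length + 1 > heap.length) {
    throw new RangeError("string does not fit in heap");
  }
  heap.set(bytes, ptr);
  heap[ptr + bytes.length] = 0;
  return bytes.length;
}

// Read a NUL-terminated UTF-8 string starting at `ptr` back into a JS string.
function readCString(heap, ptr) {
  let end = ptr;
  while (end < heap.length && heap[end] !== 0) end++;
  return new TextDecoder().decode(heap.subarray(ptr, end));
}
```

Real glue also has to decide who allocates and frees the buffer at `ptr`, which is one reason this code tends to be tied to a particular ABI.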

@lukewagner
Member

@kg In the long term, I think these things can be done with loader hooks, but in the short-term, plain old fetch/XHR + IndexedDB (or Cache API if that is ready in time) + Blob + `script.src = URL.createObjectURL(blob)` should work just fine.

@luser

luser commented Jun 19, 2015

This is analogous to how a lot of Python modules are written--there's a module written in C that provides a simple API (to minimize the amount of C code that needs to be written) but does the low-level work and then a Python module that imports that C module and uses it to provide a nice Pythonic API.

In Python this is generally done by naming the C module _foomodule and then the Python module foomodule, and having foomodule.py import _foomodule.

Even when wasm gets to a state of having good DOM interop etc, it will still probably be nicer ergonomically to write JS-friendly API surface in JS. It's certainly not hard to say 'Use <script src="foo.js">' and have that load foo.wasm under the hood, and maybe packaging makes that easy to do, but having a way to embed some JS in a section in the .wasm and using that to provide the exports is pretty compelling.

@dead-claudia

This will have massive ramifications for Node as well, by the way.

@YurySolovyov

Correct me if I'm wrong, but it seems like ATM wasm is better suited for "pure" modules that encapsulate some compute-heavy workload and return some output. And this is actually OK, and a valid use case. Though porting/re-using existing software can be more complicated than that and may involve fs and networking. It would be nice to know how/if wasm infra addresses that.

@dead-claudia

dead-claudia commented Jul 23, 2016

@YurySolovyov

> Correct me if I'm wrong, but it seems like ATM wasm is better suited for "pure" modules that encapsulate some compute-heavy workload and return some output.

I believe so, but a JS interop story would be infinitely useful. For starters, WebAssembly can't interface with Node builtins yet, so that would be a pretty big problem solved by having some kind of JS interop.

@jfbastien
Member

We never got around to half of this, but we do have wasm -> wasm calls... Closing.
