How should casts/checks for structs passed into Wasm work in the JS API? #203

takikawa · 2021-03-19T19:31:01Z

I'd like to ask a design question about how checks for structs that are passed into Wasm functions by JS code should work. Here is a concrete example with a Wasm module and some JS code that interacts with it to motivate the discussion:

;; example.wat --> example.wasm
;; an example based off of MVP JS API docs
(module
  (type $pt (struct (field $x i32) (field $y i32)))

  (func (export "makePt") (param i32 i32) (result (ref $pt))
    (struct.new_with_rtt $pt (rtt.canon $pt) (local.get 0) (local.get 1)))

  (func (export "addXY") (param (ref $pt)) (result i32)
    (i32.add
      (struct.get $pt $x (local.get 0))
      (struct.get $pt $y (local.get 0)))))

WebAssembly.instantiateStreaming(fetch('example.wasm'))
.then(({instance}) => {
   let pt = instance.exports.makePt(1, 2);
   instance.exports.addXY(pt); // we'd like this call to work
});

The key thing here is that a struct originates in Wasm (via makePt), goes to JS, and then is supplied back to Wasm (via addXY). At the point that it goes back to Wasm, a check/cast is needed to ensure it's really a (ref $pt) as promised.

An obvious/naive approach to this might be to extend ToWebAssemblyValue in the JS API with a new case ToWebAssemblyValue(v, (ref $t)) that would evaluate the cast (ref.cast v (rtt.canon $t)). This will work for the exact example presented above, but unfortunately it is easy to construct situations in which the cast is more conservative than the type (ref $pt) due to inheritance chains in the RTT.

For example, suppose that there's more code in the original Wasm module that constructs points in a different way, using a different set of RTTs. And let's suppose that we still want to call addXY on these other kinds of points:

  ;; module fragment continuing example.wat

  ;; parent struct
  (type $mt (struct))

  ;; defining a sub-RTT for $pt structs instead of the canon one
  (global $rttMt (rtt 0 $mt) (rtt.canon $mt))
  (global $rttPt (rtt 1 $pt) (rtt.sub $pt (global.get $rttMt)))

  ;; note that the result is the *same* $pt type in the same module as before
  (func (export "makePt2") (param i32 i32) (result (ref $pt))
    (struct.new_with_rtt $pt (global.get $rttPt) (local.get 0) (local.get 1)))

Then the cast will fail on points created by makePt2() even though they are still inhabitants of (ref $pt). Because of this cast behavior, JS cannot execute addXY(makePt2()).
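To make the failure concrete, here is a toy model of the situation in plain JavaScript. It is only a sketch: the helpers `rttCanon`, `rttSub`, `structNew`, `refCast`, and `toWebAssemblyValueNaive` are made up for illustration and are not part of any engine or of the JS API. The model assumes that a cast succeeds only when the target RTT is the value's own RTT or one of its ancestors.

```javascript
// Toy model of MVP-era RTTs. Every helper here is hypothetical and only
// mirrors the Wasm snippets above; none of it is a real engine or JS API.

// An RTT records the type it describes and an optional parent RTT.
function rttCanon(type) { return { type, parent: null }; }
function rttSub(type, parent) { return { type, parent }; }

// Models struct.new_with_rtt: the struct remembers the RTT it was built with.
function structNew(rtt, fields) { return { rtt, fields }; }

// Models ref.cast: walk the value's RTT chain looking for the target RTT.
function refCast(value, target) {
  for (let r = value.rtt; r !== null; r = r.parent) {
    if (r === target) return value;
  }
  throw new TypeError("ref.cast failed");
}

const canonPt = rttCanon("$pt");    // (rtt.canon $pt)
const rttMt = rttCanon("$mt");      // global $rttMt
const rttPt = rttSub("$pt", rttMt); // global $rttPt

const makePt = (x, y) => structNew(canonPt, { x, y });
const makePt2 = (x, y) => structNew(rttPt, { x, y });

// The naive ToWebAssemblyValue(v, (ref $pt)): always cast against rtt.canon.
function toWebAssemblyValueNaive(v) { return refCast(v, canonPt); }

function addXY(v) {
  const pt = toWebAssemblyValueNaive(v);
  return pt.fields.x + pt.fields.y;
}

const viaCanon = addXY(makePt(1, 2)); // the cast finds canonPt
let viaSubRtt;
try {
  viaSubRtt = addXY(makePt2(1, 2));   // chain is rttPt -> rttMt, no canonPt
} catch (e) {
  viaSubRtt = e.message;
}
console.log(viaCanon, viaSubRtt);
```

In this model both structs have the same static type, yet only the one built with the canonical RTT passes the boundary check.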

In the current MVP JS API document, I didn't see a mention of how this specific situation might work out in terms of the casts. On the other hand, a while back there was an alternative proposal by the V8 team that involved extending the RTT.

In that design, it is assumed that structs are not automatically castable and you need to specify a concrete RTT to use for a cast if you want to access a field in JS. If I understand correctly, this only works for fields and not for function arguments in general as in the examples above (for addXY you could make it a method using that API, but you may have struct arguments that are not the method receiver in general).

This kind of JS/Wasm interaction with function calls and structs seems like a case that would be very desirable for the JS API and GC proposal design to support. Are there any changes we could make for RTTs or the type system that would enable this with low friction? Or will we need to use a design (like the V8 proposal) in which casts are somehow explicitly specified for struct fields and struct-typed function arguments? (either via the RTT, or perhaps some other mechanism like a custom section)


RossTate · 2021-03-19T21:19:04Z

Thanks for writing this up, @takikawa! I've had the same concern for a while. The JS API's design philosophy (beyond this proposal) has been that coercions can be automatically generated from type signatures. But the research in this area indicates that such an approach tends to cause coercions to be either inefficient (originally, the GC proposal expected coercions to perform structural equi-recursive type casts) or prohibitively lossy (as your example illustrates). My advice would be to find a way to move coercions into application space. The problem is that coercion systems tend to be specific to the two systems being coerced between, in this case wasm and JS, which is why I've been exploring ways to equip wasm modules with embedder-specific linking. I've been working out a high-level strategy for a JS-specific coercion system that bridges the ideas in @tebbi's #132 with the ideas in @tschneidereit's newer Typed Objects proposal and addresses the issues raised above, but it utilizes a nominal type system (which also makes it avoid another issue in the current JS API wherein equivalent type signatures have different semantics) and so currently is not a viable option.

jakobkummerow · 2021-03-20T14:08:50Z

I think the high-level comment is: yes, there are unsolved issues/questions with the JS API design, and what the JS API will end up looking like is still very much an open question.

One thing that seems pretty certain (to me at least) is that the current state of the "MVP-JS" document will not be the final state of things.

Regarding the "where does ToWebAssemblyValue get the right RTT from?" question: one approach would be to only allow export of functions that take anyref parameters, so modules would typically export a wrapper that performs checks/casts and then calls the actual typed implementation. That way, the cast and its RTT choice would be fully under the module's control; but it may be difficult to find a good balance of flexibility and performance, and the wrappers would increase module size.
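As a rough illustration of the wrapper idea, the following JavaScript toy model stands in for what would really be Wasm code. All names (`addXYTyped`, `addXYExport`, `acceptedRtts`, and the RTT helpers) are hypothetical, and the assumed cast rule is that a value matches an RTT if that RTT appears in the value's RTT chain.

```javascript
// Toy model only: in a real module the wrapper would be a Wasm function
// taking anyref. All names below are hypothetical.

const rttCanon = (type) => ({ type, parent: null });
const rttSub = (type, parent) => ({ type, parent });
const structNew = (rtt, fields) => ({ rtt, fields });

// True if `target` is the value's RTT or one of its ancestors.
function hasRtt(value, target) {
  for (let r = value.rtt; r !== null; r = r.parent) {
    if (r === target) return true;
  }
  return false;
}

const canonPt = rttCanon("$pt");
const rttMt = rttCanon("$mt");
const rttPt = rttSub("$pt", rttMt);

const makePt = (x, y) => structNew(canonPt, { x, y });
const makePt2 = (x, y) => structNew(rttPt, { x, y });

// Typed implementation: assumes its argument was already checked.
function addXYTyped(pt) { return pt.fields.x + pt.fields.y; }

// The module, not the JS API, decides which RTTs count as "a point".
const acceptedRtts = [canonPt, rttPt];

// Exported wrapper: accepts any value, checks it, then calls the typed code.
function addXYExport(value) {
  const isStruct = typeof value === "object" && value !== null && "rtt" in value;
  if (!isStruct || !acceptedRtts.some((rtt) => hasRtt(value, rtt))) {
    throw new TypeError("addXY: argument is not a $pt struct");
  }
  return addXYTyped(value);
}

console.log(addXYExport(makePt(1, 2)), addXYExport(makePt2(3, 4)));
```

Each RTT added to `acceptedRtts` makes the wrapper more flexible but adds another check per call, which is one way to picture the flexibility versus performance balance.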

Regarding the makePt2 issue: I think one way or another, it may end up being the case that modules have to consistently use the same RTTs if they want this kind of interop to work. In the current MVP design, that's intentional, even within a single module: the static types are pretty flexible, but for ref.cast to work, modules have to be consistent in their RTT usage. This is in order to keep type checks as efficient as possible, and make the cost that they do have (such as setting up RTT inheritance chains, rather than just using rtt.canon everywhere) opt-in. In case we end up moving the type system in a more nominal direction, or if we just make RTTs a cornerstone of the JS interop design, then this basic principle will very likely still apply.

Looking forward to hearing more about Ross' explorations!

takikawa · 2021-03-22T15:55:02Z

@RossTate Thanks, you make some good points regarding automatic coercions being potentially inefficient (I have some experience with this being a problem in the gradual typing world) or inexpressive. But this seems like it's partly an artifact of the type system that the coercions have to enforce, right? Hypothetically if we considered nominal types, wouldn't that have different tradeoffs even with automatic coercion?

Also I look forward to hearing more about embedder-specific linking & how that connects to coercions as well. :)

@jakobkummerow Thanks for bringing up the anyref wrapper approach, that's a good point that it's already controllable in module code. If the wrapper overhead is an issue: since engines already have to create a separate entry point for JS->Wasm calls to coerce arguments, it could make sense to let a module optionally tell the engine (via a custom section, for example) to perform this downcast in that entry point.
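For readers unfamiliar with custom sections: they are named byte blobs that a module carries alongside its code, so cast hints could in principle travel that way. The sketch below builds a minimal module containing one such section and reads it back with the existing `WebAssembly.Module.customSections` API. The section name `js-casts` and its two-byte payload are invented for this example; no such section is specified, and no engine interprets it.

```javascript
// Build a minimal module: the 8-byte header plus one custom section.
// The section name "js-casts" and its payload encoding are hypothetical.
const sectionName = new TextEncoder().encode("js-casts");
const payload = [0x00, 0x01]; // made-up meaning: "cast param 0 using RTT global 1"
const sectionBody = [sectionName.length, ...sectionName, ...payload];

const moduleBytes = Uint8Array.from([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x00,                   // section id 0 = custom section
  sectionBody.length,     // section size (small enough for one LEB128 byte)
  ...sectionBody,
]);

// Compile synchronously (fine for a tiny module) and read the section back.
const compiled = new WebAssembly.Module(moduleBytes);
const castSections = WebAssembly.Module.customSections(compiled, "js-casts");
const decoded = castSections.map((buf) => Array.from(new Uint8Array(buf)));
console.log(decoded);
```

How an engine would map such bytes to casts in the JS->Wasm entry point is exactly the open design question here.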

RossTate · 2021-03-22T21:57:59Z

Oh, this is Asumu @takikawa! Great to have ya 😄 For context for everyone else, Asumu is lead author on the paper that brought broad awareness to the performance problems in "sound" gradual typing. The relevance to the JS API is that sound gradual typing (more generally speaking) is all about mixing languages safely (more specifically where the languages differ only in being statically versus dynamically typed), which is generally done through some means of coercions at the boundary points between the languages. His paper identified major problems in the performance overhead caused by these coercions, and many research teams (including my own) have been pursuing a wide variety of ways to address those overheads. I believe there are many lessons for language interop to take away from these works. One is that there is a lot of choice in how to design these coercions, and those choices have substantial consequences and tradeoffs (w.r.t. performance, functionality, guarantees, and so on). This is why I was suggesting pushing as much coercion into the WebAssembly module (or embedder-specific portion thereof) rather than attempting to derive coercions automatically. @jakobkummerow's suggestion is also essentially doing this by using just anyref at the boundary and then having the module employ wasm casts internally.

The above is largely about bringing JS references into wasm, but that is only half of the picture. The other half is putting wasm references out into JS. It's one thing to put them out as black boxes, but ideally they can be made to be accessible in "natural" ways from JS, e.g. field accesses and method invocations and such. In the current JS API, the expectation is that such "decoration" is performed on the JS side. But there are problems with that, such as the issues identified in the slides in #107, plus the issue that these decorations would all be going through coercions. So decorating it inside (an embedder-specific portion of) wasm, along the lines of @tebbi's #132 (sorry, I linked the wrong related issue before) could be a big performance improvement. There are lots of loose ends to tie, but that's something I've been working through (though there are a number of points that would greatly benefit from more insights from y'all).

tlively · 2022-11-01T22:42:01Z

Closing this in favor of the more up-to-date discussion of casts on the JS-Wasm boundary: #279 (comment).

RossTate mentioned this issue Aug 23, 2021

Proposal: Async/Await JS API WebAssembly/design#1425

Closed

tlively closed this as completed Nov 1, 2022
