
Closures - MVP #66

Merged
merged 27 commits into master from closures
Jan 28, 2018

Conversation


@ballercat ballercat commented Jan 13, 2018

Early work to enable parsing and emitting of closures (#65). The goal of this POC is to figure out the practical challenges in enabling this type of functionality and to lay the groundwork for more "side_module" features of this kind to come.

To create a closure, arrow function syntax is used. This fits really well with existing JS syntax and creates a clear separation in logic:

// Toy example
export function getClosure(): i32 {
  let x: i32 = 0;
  return () : i32 => {
     const y: i32 = x;
     x += 1;
     return x;
  }
}

Limitations of POC

  • Multiple closures will not be supported (yet)
  • Object (memory) closures are post-POC
  • The side module will not be very clever and will most likely never clean up its own memory space.
  • The side closure module cannot be overwritten

High-Level Implementation Details

The basic idea behind the implementation is to not dynamically manage memory inside the main module, even with closures. This means that there will be no implicit runtime memory regions, maintaining the design direction of no surprises. A closure does need an object to represent the environment it's closing over; however, this will be accomplished with a side module that has its own memory object, completely independent of the main module. The two will be linked with a shared Table.
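To illustrate the intended wiring, here is a minimal, hedged JS sketch. The names mainBuffer and closureBuffer are placeholders for the compiled main and side modules, and the import names are assumptions rather than the actual ABI:

  // Sketch only: two instances, each with its own Memory, sharing one Table.
  const table = new WebAssembly.Table({ initial: 10, element: 'anyfunc' });

  Promise.all([
    // Side module: owns the memory that backs closure environments.
    WebAssembly.instantiate(closureBuffer, { env: { table } }),
    // Main module: shares only the table, never the side module's memory.
    WebAssembly.instantiate(mainBuffer, { env: { table } }),
  ]).then(([side, main]) => {
    // Calls through the shared table can now cross the module boundary.
    console.log(Object.keys(main.instance.exports));
  });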

Closure representation inside the binary

Declaration:

The closure is compiled to a regular old function declaration, with the additional side effect of its function index being encoded into the Element section (in the binary). Because function pointers are already supported, the compiler does this for free. The closure value will be an expression returning a 64-bit value. The original function declaration AST nodes will be moved to top-level module scope, with a 32-bit base memory pointer argument __ptr__ prepended to the argument list.

For example

(x: i32) : i32 => { ... };

becomes a top level function definition in the binary

function <parentFunctionName>--closure#(__ptr__: i32, x: i32) { ... }

The -- is used so that the name cannot be accidentally overwritten by the end-user. With this approach we only ever create a single function for every closure definition, and many potential __ptr__ environments at runtime.

Call-Site:

Closure call sites will compile to an indirect_call, although a regular call operation may be possible (unlikely) since we can statically analyze which real function is being called. Either option will work, but both require some creative encoding: the closure has to be encoded as a 64-bit value, with the low 32 bits being a table pointer (or function index) and the high 32 bits being the base environment pointer. Since wasm-MVP only supports a 32-bit address space in both tables and memory, this should work out of the box! This is a pretty neat way to get around the fact that WASM functions cannot return multiple values. The only downside is that a closure pointer cannot be directly exported to JS, so a wrapper function would need to be used.

The call sites will incur an additional instruction cost: shifting bits to call the correct function index with the correct memory offset, plus additional 32-bit truncations for the indirect_call.
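To make the encoding concrete, a small JS sketch of how such a 64-bit closure value could be packed and unpacked. BigInt stands in for the 64-bit shifts and 32-bit truncations the generated code would perform, and the helper names are made up for illustration:

  // Low 32 bits: table/function index. High 32 bits: environment base pointer.
  const packClosure = (funcIndex, basePtr) =>
    (BigInt(basePtr) << 32n) | BigInt(funcIndex);

  const unpackClosure = closure => ({
    funcIndex: Number(closure & 0xffffffffn), // fed to the indirect_call
    basePtr: Number(closure >> 32n),          // passed along as __ptr__
  });

  // Example: function index 3, environment stored at base offset 16.
  console.log(unpackClosure(packClosure(3, 16))); // { funcIndex: 3, basePtr: 16 }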

Closure body encoding:
Special closure helper functions, <type>closure[GS]et and makeClosure, will be implemented and will likely be emulated via JS for the first POC. Every closed-over variable access inside the closure will become a function call, which will provide the values stored inside a completely different memory space.

So a closure which looks like this:

  function getClosure(): ClosureType {
    let x: i32 = 0;
    return (z: i32): i32 => {
      const y: i32 = x;
      x += 1;
      return y;
    };
  }

becomes the following AST (subject to change)

    FunctionArguments FUNCTION_ARGUMENTS
      Pair :
        Identifier<i32> __ptr__ local/index(0)
        Type i32
      Identifier z
      Type<i32> i32
    FunctionResult<i32> i32
    Block {
      Declaration<i32> y local/index(1),type/const(true)
        FunctionCall<i32> i32closureGet__ function/index(1)
          BinaryExpression<i32> +
            Constant<i32> 0
            Identifier<i32> __ptr__ local/index(0)
      FunctionCall i32closureSet__ function/index(2)
        BinaryExpression<i32> +
          Constant<i32> 0
          Identifier<i32> __ptr__ local/index(0)
        BinaryExpression<i32> +
          FunctionCall<i32> i32closureGet__ function/index(1)
            BinaryExpression<i32> +
              Constant<i32> 0
              Identifier<i32> __ptr__ local/index(0)
          Constant<i32> 1
      ReturnStatement return
        Identifier<i32> y local/index(1),type/const(true)

It should be clear what is happening here: every use of x, the closed-over variable, becomes a lookup call, where the __ptr__ argument is used as an offset into the object store inside the side-module memory space. The Constant<i32> 0s are offsets from the base pointer. Tada! Closure 🎊
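For reference, a hedged JS sketch of what the emulated helpers might look like in the first POC. Only the names makeClosure, i32closureGet__, and i32closureSet__ come from the AST and task list; the bump allocator, the Int32Array backing store, and the exact signatures are assumptions:

  // A plain typed array standing in for the side module's separate memory.
  const heap = new Int32Array(1024);
  let nextOffset = 0;

  const closureImports = {
    // makeClosure(size): reserve `size` bytes and return the base pointer.
    makeClosure: size => {
      const base = nextOffset;
      nextOffset += size;
      return base;
    },
    // i32closureGet__(ptr): read a closed-over i32 at byte offset ptr.
    i32closureGet__: ptr => heap[ptr >> 2],
    // i32closureSet__(ptr, value): write a closed-over i32 at byte offset ptr.
    i32closureSet__: (ptr, value) => {
      heap[ptr >> 2] = value;
    },
  };

Supplying something like this on the imports object is also what should make it trivial for an end-user to swap in their own closure manager, as discussed further down the thread.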

Tasks

  • Arrow syntax parser
  • Closure parser
  • Closure hoisting
  • Implicit imports of <type>closure[GS]et callbacks and makeClosure(size)
  • Closure plugin
  • Closure plugin ABI (Element section encoding)
  • Explicit syntax for a closure Type
  • Binary encoding of 64-bit closure expressions
  • Call site implementation
  • Add example

@ballercat ballercat changed the title from [WIP]: Closures to [WIP, POC]: Closures on Jan 13, 2018

xtuc commented Jan 15, 2018

That sounds pretty solid!

This is a pretty neat way to get around the fact that WASM functions cannot return multiple values

I recall seeing someone emulating multiple return values (I believe it was a Forth to WASM compiler).

Closure call sites will compile to an indirect_call

Do you know if there is an additional call cost compared to a regular call? I guess it has to search/fetch by type.

The side module will not be very clever and will most likely never clean up its own memory space

I know it's just a POC, but this will be problematic in the future. I don't know how your memory management is done, but instead of being separate (and 👍 isolated) it could be part of your "memory chunk pool"?

side_module

What is meant by side_module exactly? Do you mean a WebAssembly module instance?

@ballercat
Owner Author

Hey @xtuc,

Thanks for the feedback! I wonder how Forth does it (probably by writing to memory).

Do you know if there is an additional call cost compared to a regular call? I guess it has to search/fetch by type.

indirect_call is almost guaranteed to incur a penalty because of the runtime type check. I'm pretty sure it's possible to optimize it into a regular call in a lot of cases, but that's not something I'm super concerned with at the moment.

RE: memory & side_module

Yes, I am talking about a separate module instance. The main module and the extension would share a table with two split memory spaces.

it could be part of your "memory chunk pool"?

I don't love the pattern of an implicit, sectioned-off chunk of memory just for the runtime. I know this is a pattern used by other languages compiled to WASM, but it does not fit the basic goals of the project. I'd like to stay as unopinionated as possible about the management of memory in the module.

I don't know how your memory management is done

It isn't 🙃 the user is in charge of the memory, just like with raw WASM. We only provide better syntax to do so.

All that being said, this was all theorycrafting, and the PR, once ready, should be interesting. I think POC-ing something like this will give me more data on how to move forward. I'm pretty close now, but there's no working wasm code just yet; I think I might emulate extension module calls via JS imports for the very first POC.

The potential benefit of using an extension mechanism, instead of building the functionality into a monolithic main module, is that it would be possible for the end-user to overwrite the behavior completely. The ABI is going to be dead simple, so it would be trivial for someone to swap in their own shared memory chunk pool closure manager instead.


xtuc commented Jan 15, 2018

It isn't 🙃 the user is in charge of the memory, just like with raw WASM. We only provide better syntax to do so.

Yes I know, sorry that wasn't clear (and not even English). I meant whether you need to allocate/free a module during runtime. It might not be a problem atm since you're not going to free it in this POC.

I thought the WASM format only allowed one module (itself)? This is interesting because the spec tests (in WAST this time) use sub-modules all the time.

The table you mentioned is a WebAssembly.Table? You would pass it from JS to multiple modules, right?

Edit: According to import { table: Table } from 'env'; yes. This is an incredibly clear way of doing an importObject lookup!

@@ -49,3 +49,23 @@ test("functions", t => {
t.is(exports.testFunctionPointers(), 4, "plain function pointers");
});
});

test.only("closures", t => {

@xtuc xtuc Jan 15, 2018


Just a reminder, you have an only here and a few skips later.

@ballercat
Owner Author

Hey, no worries. Your English is great!

I thought the WASM format only allowed one module (itself)? This is interesting because the spec tests (in WAST this time) use sub-modules all the time.

Yes, and the specs use a superset of the text format which supports this. I always thought this was confusing because it means .wast !== .wat, but yeah.

The table you mentioned is a WebAssembly.Table? You would pass it from JS to multiple modules, right?

Precisely! While the spec says you may only have a single module compiled into a binary, it has no limitations on multiple modules sharing a Table, at least none that I've seen, and the documentation on MDN seems to support the idea that this is possible. Tentatively, I'm planning on using JavaScript to pass the table around and using WebAssembly.Table.set to basically dynamically link the two together.
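A rough sketch of that tentative plan, assuming sideInstance and mainInstance are the two already-instantiated modules and someClosureFn is an export of the side module (all placeholder names):

  // Sketch only: JS creates the table, both modules import it via `env`,
  // and JS patches a side-module export into a known slot.
  const table = new WebAssembly.Table({ initial: 2, element: 'anyfunc' });

  // ...instantiate sideInstance and mainInstance with { env: { table } }...

  table.set(0, sideInstance.exports.someClosureFn);
  // The main module can now call indirectly through slot 0 even though the
  // function lives in the side module and touches only the side memory.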

@ballercat
Owner Author

Update:

The closure implementation is now working via a plugin system for closure handlers/callbacks. An interesting fact I discovered about the Chrome/V8 wasm compiler: passing WebAssembly functions as imports to another module kicks off type validation when the two modules are linked. This is not the case when the imported functions are JS wrappers, hinting that V8 handles wasm-to-wasm imports/exports differently. I have a feeling this means that imports bound in this manner don't suffer from the WASM -> JS call penalty, so I used this approach instead of the indirect_call approach through a shared table.

A consequence of this validation is that tests for type imports/exports can be verified by V8 itself:

const importWASM = getIR(imports);
const sourceWASM = getIR(source);
return WebAssembly.instantiate(importWASM.buffer()).then(deps => {
  return WebAssembly.instantiate(sourceWASM.buffer(), {
    env: { ...deps.instance.exports },
  });
});
Pretty neat.

So the closure plugin will work through the imports object and can easily be overwritten by the end-user.

@ballercat ballercat changed the title from [WIP, POC]: Closures to Closures - MVP on Jan 27, 2018

coveralls commented Jan 27, 2018


Coverage increased (+0.2%) to 95.867% when pulling 2a9d388 on closures into 7a703ff on master.

@ballercat ballercat merged commit 5465d72 into master Jan 28, 2018