
Commit

Finish refactoring. Address #49.
jfbastien committed Jun 9, 2015
1 parent bf2555a commit ad83b19
Showing 1 changed file, FutureFeatures.md, with 110 additions and 87 deletions.

## More expressive control flow

Some types of control flow (especially irreducible and indirect) cannot be
expressed with maximum efficiency in WebAssembly without patterned output by the
relooper and [jump-threading](http://en.wikipedia.org/wiki/Jump_threading)
optimizations in the engine.

Options under consideration:
* No action, `while` and `switch` combined with jump-threading are enough.
* Just add `goto` (direct and indirect).
* Add [signature-restricted Proper Tail Calls](FutureFeatures.md#signature-restricted-proper-tail-calls).
* Add new control-flow primitives that address common patterns.
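
The following is a hypothetical C sketch (the function and its names are
illustrative, not part of any proposal) of the irreducible control flow at
issue: a loop with two entry points has no single header, so a relooper must
either duplicate code or introduce a label variable driven by a
`while`+`switch` dispatch loop, which jump-threading in the engine then tries
to turn back into direct branches.

```c
/* Hypothetical sketch of irreducible control flow: the loop below has two
 * entry points, so it has no single header block. */
int sum_from(const int *data, int n, int start_inside) {
    int i = 0, sum = 0;

    if (start_inside)
        goto body;   /* second entry point into the loop (assumes n >= 1) */

top:
    if (i >= n)
        return sum;
body:
    sum += data[i];
    i++;
    goto top;        /* back edge shared by both entry points */
}
```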

## GC/DOM Integration


## Signature-restricted Proper Tail Calls

See the [asm.js RFC][] for a full description of signature-restricted Proper
Tail Calls (PTC).

Useful properties of signature-restricted PTCs:

* In most cases, can be compiled to a single jump.
* Can express indirect `goto` via function-pointer calls.
* Can be used as a compile target for languages with unrestricted PTCs; the code
generator can use a stack in the heap to effectively implement a custom call
ABI on top of signature-restricted PTCs.
* An engine that wishes to perform aggressive optimization can fuse a graph of
PTCs into a single function.
* To reduce compile time, a code generator can use PTCs to break up ultra-large
functions into smaller functions at low overhead.
* A compiler can exert some amount of control over register allocation via the
ordering of arguments in the PTC signature.

[asm.js RFC]: http://discourse.specifiction.org/t/request-for-comments-add-a-restricted-subset-of-proper-tail-calls-to-asm-js
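
As a rough illustration of the second and third bullets, the following
hypothetical C sketch (names and the use of C are illustrative assumptions)
implements a state machine in which every state shares one signature and
transfers control by returning a call to the next state through a function
pointer. An engine with signature-restricted PTCs could compile each transfer
to a single jump; plain C only behaves that way when the compiler happens to
perform tail-call optimization.

```c
#include <stdio.h>

/* All states share one signature, so a tail call through a pointer of this
 * type acts as an indirect `goto`. */
typedef long (*state_fn)(long counter, long limit);

static long state_done(long counter, long limit);
static long state_odd(long counter, long limit);

static long state_even(long counter, long limit) {
    if (counter >= limit) return state_done(counter, limit);
    return state_odd(counter + 1, limit);   /* tail call: jump to next state */
}

static long state_odd(long counter, long limit) {
    if (counter >= limit) return state_done(counter, limit);
    return state_even(counter + 1, limit);  /* tail call: jump back */
}

static long state_done(long counter, long limit) {
    (void)limit;
    return counter;
}

int main(void) {
    state_fn entry = state_even;  /* entry point chosen indirectly at run time */
    printf("%ld\n", entry(0, 10000));
    return 0;
}
```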

## Proper Tail Calls

Expands upon signature-restricted Proper Tail Calls, and makes it easier to
support other languages, especially functional programming languages.

## Asynchronous Signals

TODO

## "Long SIMD"

The initial SIMD API will be a "short SIMD" API, centered around fixed-width
128-bit types and explicit SIMD operations. This is quite portable and useful,
but it won't be able to deliver the full performance capabilities of some of
today's popular hardware. There is [a proposal in the SIMD.js repository][] for
a "long SIMD" model which generalizes to wider hardware vector lengths, making
more natural use of advanced features like vector lane predication,
gather/scatter, and so on. Interesting questions to ask of such a model will
include:

* How will this model map onto popular modern SIMD hardware architectures?
* What is this model's relationship to other hardware parallelism features, such
as GPUs and threads with shared memory?
* How will this model be used from higher-level programming languages? For
example, the C++ committee is considering a wide variety of possible
approaches; which of them might be supported by the model?
* What is the relationship to the "short SIMD" API? "None" may be an acceptable
answer, but it's something to think about.
* What non-determinism does this model introduce into the overall platform?
* What happens when code uses long SIMD on a hardware platform which doesn't
support it? Reasonable options may include emulating it without the benefit of
hardware acceleration, or indicating a lack of support through feature tests.

[a proposal in the SIMD.js repository]: https://github.com/johnmccutchan/ecmascript_simd/issues/180
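
The contrast can be sketched in plain C (a hypothetical illustration, not tied
to either API): a short-SIMD code generator bakes the fixed 128-bit width into
the loop and needs a scalar remainder, whereas a long-SIMD model would let the
engine pick the hardware vector length and predicate the final partial
iteration.

```c
#include <stddef.h>

void saxpy_short_simd_style(float *y, const float *x, float a, size_t n) {
    size_t i = 0;
    /* Main loop: the fixed 128-bit width (4 x f32 lanes) is explicit in the
       code; with real short-SIMD types these four statements would be one op. */
    for (; i + 4 <= n; i += 4) {
        y[i + 0] += a * x[i + 0];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    /* Scalar tail: in a long-SIMD model this would disappear, because a
       predicated (masked) vector op can handle the final n % VL elements. */
    for (; i < n; i++)
        y[i] += a * x[i];
}
```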

## Operations which may not be available or may not perform well on all platforms

* Fused multiply-add.
* Reciprocal square root approximate.
* 16-bit floating point.
* and more!
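
As an example of why fused multiply-add is on this list: the fused form rounds
once while separate multiply and add round twice, so results can differ, and on
hardware without an FMA unit `fma()` may fall back to a much slower software
path. A small C illustration (values chosen only to expose the double
rounding):

```c
#include <math.h>
#include <stdio.h>

int main(void) {
    double a = 1.0 + 0x1p-27;   /* chosen so the exact product needs more    */
    double b = 1.0 + 0x1p-27;   /* than 53 bits of significand               */
    double fused   = fma(a, b, -1.0);   /* one rounding: keeps the tiny bits */
    double unfused = a * b - 1.0;       /* two roundings: tiny bits are lost */
    printf("fused:   %a\nunfused: %a\n", fused, unfused);
    return 0;
}
```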

## Platform-independent Just-in-Time compilation

WebAssembly is a new virtual ISA, and as such applications won't be able to
simply reuse their existing JIT-compiler backends. Applications will instead
have to interface with WebAssembly's instructions as if they were a new ISA.

Applications expect a wide variety of JIT-compilation capabilities. WebAssembly
should support:

* Producing a dynamic library and loading it into the current WebAssembly
module.
* Defining lighter-weight mechanisms, such as the ability to add a function to
an existing module.
* Supporting explicitly patchable constructs within functions to allow for very
fine-grained JIT compilation. This includes:
* Code patching for polymorphic inline caching;
* Call patching to chain JIT-compiled functions together;
* Temporary halt-insertion within functions, to trap if a function starts
executing while a JIT-compiler's runtime is performing operations that are
dangerous to that function.
* Providing JITs access to profile feedback for their JIT-compiled code.
* Code unloading capabilities, especially in the context of code garbage
collection and defragmentation.
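
The sketch below is a deliberately simplified, hypothetical illustration in
plain C, not a proposed WebAssembly API, of the call-patching idea: calls go
through a slot the JIT owns, initially pointing at a stub, and the stub patches
the slot to freshly compiled code so later calls reach it directly. In
WebAssembly terms the slot might be an indirect-call table entry the runtime is
allowed to update.

```c
#include <stdio.h>

typedef int (*compiled_fn)(int);

static int jitted_body(int x);        /* stands in for freshly JITed code */
static int interpreter_stub(int x);

static compiled_fn call_slot = interpreter_stub;   /* the patchable construct */

static int interpreter_stub(int x) {
    /* First call: pretend to JIT-compile the function, then patch the slot
     * so that every later call goes straight to the compiled version. */
    printf("compiling...\n");
    call_slot = jitted_body;
    return jitted_body(x);
}

static int jitted_body(int x) {
    return x * 2;
}

int main(void) {
    printf("%d\n", call_slot(21));   /* goes through the stub, patches slot */
    printf("%d\n", call_slot(21));   /* now calls jitted_body directly */
    return 0;
}
```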

## Multiprocess support

* `vfork`.
* Inter-process communication.
* Inter-process `mmap`.

## Trapping or non-trapping strategies

Presently, when an instruction traps, the program is immediately terminated.
This suits C/C++ code, where trapping conditions indicate Undefined Behavior at
the source level, and it's also nice for handwritten code, where trapping
conditions typically indicate an instruction being asked to perform outside its
supported range. However, the current facilities do not cover some interesting
use cases:

* Not all likely-bug conditions are covered. For example, it would be very nice
to have a signed-integer add which traps on overflow. Such a construct would
add too much overhead on today's popular hardware architectures to be used in
general; however, it may still be useful in some contexts.
* Some higher-level languages define their own semantics for conditions like
division by zero and so on. It's possible for compilers to add explicit checks
and handle such cases manually, though more direct support from the platform
could have advantages:
* Non-trapping versions of some opcodes, such as an integer division
instruction that returns zero instead of trapping on division by zero, could
potentially run faster on some platforms.
* The ability to recover gracefully from traps in some way could make many
things possible. This could involve throwing an exception, or resuming
execution at the trapping instruction with the execution state altered, if
there is a reasonable way to specify how that should work.
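
As a rough C sketch of the explicit checks a compiler has to emit today (the
helper names are hypothetical): the first function emulates a signed add that
traps on overflow, the second an integer division with non-trapping semantics.
Dedicated opcodes could reduce each to a single instruction on platforms that
support them.

```c
#include <limits.h>
#include <stdlib.h>

/* Signed add that "traps" (here: aborts) on overflow instead of wrapping. */
int add_trap_on_overflow(int a, int b) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        abort();                 /* would be one trapping-add opcode */
    return a + b;
}

/* Division with the non-trapping semantics some languages want: a zero
 * divisor yields zero, and INT_MIN / -1 is pinned instead of overflowing. */
int div_no_trap(int a, int b) {
    if (b == 0)
        return 0;                /* would be one non-trapping divide opcode */
    if (a == INT_MIN && b == -1)
        return INT_MIN;
    return a / b;
}
```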

