Avoid reordering execution of module bodies before top-level await is reached #47

littledan · 2019-02-17T15:53:17Z

Let's continue discussion on #44 (comment) here. To summarize, @zenparsing noticed a potential reordering issue in #33 , and @bergus and I were chatting about possible ways to avoid it.

littledan · 2019-02-17T15:55:03Z

@bergus, What do you see as the problem with draining the job queue between siblings? I see it as a midpoint between inserting additional implicit dependencies and not controlling things at all.

littledan · 2019-02-17T15:56:55Z

About a "synchronous" Promise.all, as proposed in #44 (comment) , that would seem to go against the spirit of what many expressed in
#43

domenic · 2019-02-17T16:17:23Z

Can someone summarize the issue? Extracting it from the comment thread is hard, and I'd like to first get on the same page as to whether there's actually an issue.

bergus · 2019-02-17T18:02:42Z

@domenic: @zenparsing brought the example

// a-1.js
console.log('a-1');

// a-2.js
console.log('a-2');

// a.js
import './a-1.js';
import './a-2.js';
console.log('a');

// b.js
console.log('b');

// main.js
import './a.js';
import './b.js';
console.log('main');

with the current ES semantics this will result in

a-1
a-2
a
b
main

and we of course want to keep this ordering. However if we naively just add Promise.all to await all dependencies, we get to evaluate

const a1 = Promise.resolve()    .then(() => console.log('a-1'));
const a2 = Promise.resolve()    .then(() => console.log('a-2'));
const a  = Promise.all([a1, a2]).then(() => console.log('a'));
const b  = Promise.resolve()    .then(() => console.log('b'));
           Promise.all([a, b])  .then(() => console.log('main'));

and the ordering will become

a-1
a-2
b
a
main

which is undesirable.

littledan · 2019-02-17T18:05:23Z

Right, @bergus summarizes it well. To sum it up, the microtask queue may get gummed up just working through all the microtasks from module evaluation itself, leading a mid-level module in a heavier subtree on the "left" side have its body evaluated after a lighter "right" subtree, even in the absence of top-level await.

domenic · 2019-02-17T18:15:53Z

Thanks.

I'm not sure this is a problem in HTML, because every script evaluation (including module scripts) there is done by wrapping it in various things, the end of which will run a microtask checkpoint if the stack is empty (which it would be in this case, I believe). Concretely, https://html.spec.whatwg.org/multipage/webappapis.html#run-a-module-script step 9.

Note that there are a lot of important setup and teardown steps there; in general it's important for any synchronous block of code to be run via the host so that the host can maintain its invariants in these sorts of ways.

So as long as we respect that existing infrastructure, this will not be an issue in the browser. I haven't checked if any of the various spec PRs open do or do not respect the existing infrastructure.

domenic · 2019-02-17T18:26:35Z

Note that this is also taken care of if all script work is queued via EnqueueJob(); see https://html.spec.whatwg.org/multipage/webappapis.html#enqueuejob(queuename,-job,-arguments) step 7.8. So e.g. if you queued separate evaluation jobs for a and then b, the queue would be completely emptied by the time the b job came around, and thus presumably 'a' would have been logged to the console.

bergus · 2019-02-17T21:25:43Z

@littledan I just think it would be ugly when the module loading component would take control over the job queues, this doesn't look sound from an architectural point of view. But it's not my decision, you would have to discuss this with the TC editors.

@domenic Ah, good point that the HTML spec could solve this, but I'd rather have this behaviour mapped out in the ES spec alone, not relying on the host to run the job queue in the expected order.
However I don't understand what you mean by queuing script work via EnqueueJob. Script evaluation should not be a promise job, should it? And even if it was, those jobs are put on the microtask queue, and the "script clean up" step of the first de-queued job would then perform a microtask checkpoint which would run all the other jobs in the microtask queue - afaics not really a difference wrt. evaluation order from not doing cleanup and just continuing the dequeuing.

I'd rather see the weakly-deterministic-with-yielding strategy from your Determinism vs ASAP document, with dynamic import() featuring the same role as the opt-in second entry point for concurrent evaluation.

littledan · 2019-02-18T00:15:53Z

@domenic If there's a way that we can handle this logic via HTML, that could be a good solution. I don't understand how the current HTML spec handles this, however.

Step 7.2 of the "run a module script" algorithm notes, the Evaluate algorithm "will recursively evaluate all of the module's dependencies." It's exactly this recursive evaluation that we might need to intersperse microtask queue checkpoints into, but I don't see where the recursion in the ES spec jumps back to this HTML algorithm.

I'm not sure what you're suggesting by EnqueueJob. The only jobs for module evaluation in #33 are Promise reaction jobs due to waiting on dependencies. This hits the HTML EnqueueJob algorithm, but I don't see how that leads to the ordering we want. Is there some other way we should enqueue additional jobs to get another ordering?

domenic · 2019-02-18T17:13:24Z

Step 7.2 of the "run a module script" algorithm notes, the Evaluate algorithm "will recursively evaluate all of the module's dependencies." It's exactly this recursive evaluation that we might need to intersperse microtask queue checkpoints into, but I don't see where the recursion in the ES spec jumps back to this HTML algorithm.

Currently, Evaluate() only executes one big sync chunk of code. Every sync chunk of code needs to be wrapped in setup/cleanup steps, so in the current setup, nothing more is needed.

What I was alluding to above, was that if whatever changes the top-level await proposal makes also respect this invariant---that every sync chunk of code gets wrapped in a setup/cleanup pair---then I think everything will work as expected.

Similarly, I was pointing out that if sync chunks of code are executed by using EnqueueJob, then you'd get this wrapping for free.

Does this help?

I might still be missing something, as I haven't immersed myself in the problem fully like you have. I am just pattern matching: we need to run microtask checkpoints after each chunk, and HTML has well-established mechanisms for ensuring microtask checkpoints are run. I'm not sure if it solves the problem fully, but I was hopeful. Thus, for example:

if you queued separate evaluation jobs for a and then b, the queue would be completely emptied by the time the b job came around, and thus presumably 'a' would have been logged to the console.

I'm not sure how that lines up with #33, having not reviewed it in detail.

littledan · 2019-02-18T22:16:59Z

@domenic I see; this is an interesting frame. Maybe there's some way that we could give HTML a chance to clean up after launching a module load. My first shot at HTML integration with top-level await doesn't do that, but I'll think about host hooks that we might add for that purpose.

At the same time, given comments people are making elsewhere about job queue related topics, I imagine there would be interest in ensuring a cross-environment guarantee that this reordering doesn't occur.

Another solution to this problem is to not yield to the job queue when importing a module that does not have a top-level await, as I was suggesting in #43. I think that'd solve this problem as well.

Closes tc39#47, tc39#48, tc39#43

Closes tc39#47, tc39#43, tc39#48, tc39#50

littledan · 2019-02-27T15:09:42Z

Fundamentally, I see two approaches to solve this problem. Both of these approaches seem doable to me; I don't see fatal flaws with either, and I don't have an extremely strong preference. I'm wondering what you all think.

Normative: Avoid microtask queue delay when resolved #49: Synchronously load modules when a top-level await is not dynamically reached. Advantages:
- No behavior changes for what's currently shipping in browsers (related discussion: Execute module graph synchronously when no top-level await is reached? #43 )
- (minor) easier to polyfill/transpile precisely (@guybedford and I couldn't find a way in HTML to do a microtask checkpoint without yielding to the event loop)
- Avoids queue-flushing as a synchronization primitive, which isn't something JS does right now and doesn't feel right to some people.
Normative: Flush Job Queue between module imports #51: Flush the job queue after loading each module. Advantages:
- Modules with and without top-level await are treated the same: There's always a microtask queue tick when importing them. This feels closer to the recommendations of the W3C Promises guide.
- HTML cleans up after JS all the time with microtask queue checkpoints, so this is not unusual at all.

cc @GeorgNeis @domenic @MylesBorins @guybedford @mathiasbynens

domenic · 2019-02-27T22:36:16Z

HTML cleans up after JS all the time with microtask queue checkpoints, so this is not unusual at all.

To expand on this. Microtask checkpoints get run whenever HTML is done running "a block of script". For example, given two sync classic scripts

<script>
Promise.resolve().then(() => console.log("promise 1"));
console.log("script 1");
</script>
<script>
Promise.resolve().then(() => console.log("promise 2"));
console.log("script 2");
</script>

the result will be script 1 / promise 1 / script 2 / promise 2. (Try it: https://honeysuckle-ermine.glitch.me/. Note that iframe-based runners like jsbin will NOT show this effect, since the stack is not empty.)

So it makes perfect sense to me that we would flush it between modules as well.

Honestly, it is kind of an oversight that we haven't been doing that so far, I would say. We didn't do so because it was easier for the HTML spec to treat module graphs as atomic single-blocks-of-script. But I think that approach was against the spirit of microtask checkpoints, and is showing its cracks here.

zenparsing · 2019-02-27T22:46:03Z

Curious: if you flush microtasks after executing a module body (leaving TLA aside) does that make the module evaluation algorithm reentrant?

littledan · 2019-02-27T23:50:37Z

@zenparsing Are you asking, is it possible for some other module evaluation algorithm to start running? I believe the answer is, with the HTML host environment, no. import() always calls back into the ES algorithms after queueing a task, see HTML's definition of HostImportModuleDynamically. I was thinking, we could make a tweak to the requirements for HostImportModuleDynamically to require that all host environments behave, fundamentally, like this, in some way or other.

littledan · 2019-02-28T14:35:45Z

cc @jakearchibald

jakearchibald · 2019-02-28T15:45:44Z

The flushing model will change current behaviour right?

1.js

import './2.js';
console.log('1');

2.js

console.log('2');
Promise.resolve().then(() => console.log('2-promise'));

https://static-misc.glitch.me/module-microtask-test/

The above currently logs 2, 1, 2-promise, and we're talking about changing it to 2, 2-promise, 1? Is it ok to change something like that?

Fwiw, I tell devs that microtask checkpoints happen whenever the JS stack empties (plus a specific point in the event loop). I don't know how bulletproof that rule is supposed to be, but the current behaviour breaks that rule, whereas the flushing proposal 'fixes' it.

I guess I agree with @domenic that we should have been flushing the job queue between module evaluation all along.

littledan · 2019-02-28T16:48:15Z

@jakearchibald Yes, I agree that it'd be an observable change; see previous discussion at #43 .

OK, if folks are pretty positive on #51, maybe we should settle on that.

guybedford · 2019-02-28T19:35:46Z

I've been going back and fourth on this, but after some consideration, I'm actually strongly against #51.

The reason for this is that bundling tools like Rollup combine modules together in a way that assumes execution between successive modules in execution order will not ~~yield to the event loop~~ execute microtasks.

If we execute microtasks, we will create observable differences between the built version of a library, and the original source version.

This one change will severely impact Rollup and similar JS optimiziation tools, because there is now a very different execution semantic in play for the build vs the original source.

Even before I saw this argument, I was leaning towards #49 as well because , just as @jakearchibald mentions, JS developers think very differently to standards here, because we only see the event loop trigger as when we reach the end of the our code.

I've yet to fully understand the arguments against #49, but I hope these new arguments can be considered here.

guybedford · 2019-02-28T19:36:36Z

(corrected usage of "event loop" - see I'm still thinking like an early Node.js dev :P)

zenparsing · 2019-02-28T19:38:23Z

I'm trying to better understand the browser semantics at play here. Let's say I do this:

var script = document.createElement('script')
script.innerText = `
  console.log('script-a'); 
  Promise.resolve().then(() => console.log('script-promise'));
  console.log('script-b');
`;
(
  console.log('outside-a'), 
  document.head.appendChild(script),
  console.log('outside-b')
);

I get

outside-a
script-a
script-b
outside-b
script-promise

Is that correct?

littledan · 2019-02-28T19:44:02Z

@zenparsing See clean up after running script:

If the JavaScript execution context stack is now empty, perform a microtask checkpoint. (If this runs scripts, these algorithms will be invoked reentrantly.)

Because there is JS code running, the microtask queue checkpoint doesn't happen in your example.

domenic · 2019-02-28T19:46:42Z

@guybedford we're likely to end up yielding to the event loop anyway, regardless of top-level await, for the performance reasons discussed in this document. At least in browsers. Startup jank is an increasing problem and one modules are well-positioned to solve.

JS developers think very differently to standards here, because we only see the event loop trigger as when we reach the end of the our code.

I read @jakearchibald's point the opposite way, that JS developers expect reaching the end of a module script to behave the same as reaching the end of a sync script, and yield. I guess it depends on what you mean by "our code", but I interpreted that to be a single module, not an entire graph of modules.

zenparsing · 2019-02-28T19:55:41Z

Thanks @littledan. I wonder how this might play out for a hypothetical JS environment which allows for more direct access to module compilation and evaluation.

Strawman example:

const module = System.compileModule(`
  export const a = 1;
`);
let moduleGraph;
// Populate moduleGraph
System.linkAndEvaluate(moduleGraph);

Now we would presumably have JS on the call stack below module evaluation. Would we still flush jobs after every module evaluation in that graph?

guybedford · 2019-02-28T20:24:27Z

@guybedford we're likely to end up yielding to the event loop anyway, regardless of top-level await, for the performance reasons discussed in this document

Fascinating. I agree this is a huge problem today, and that would be a huge solution. If this is the case why not go all the way then and have this proposal yield the event loop for all modules so we can adopt evenly amongst platforms? It would be a win for all.
.

domenic · 2019-02-28T21:16:40Z

I've filed an issue to discuss the more general yielding-to-the-event-loop question at whatwg/html#4400. BTW if anyone wants to be added to the @whatwg/modules team, ping me on IRC or something. (Or maybe https://github.com/orgs/whatwg/teams/modules has a "ask to join" button? I dunno.)

littledan · 2019-02-28T22:58:38Z

@zenparsing I'd suggest that we include requirements in the specification that the module loading/execution algorithm not be used re-entrantly like that. I think some aspects of it may already be broken in such an embedding environment (e.g., the way errors are propagated).

littledan · 2019-02-28T23:01:46Z

OK, between the consistency in terms of using promises the same way all the time (as strongly encouraged by TC39) and the potential of reducing jank on page load, I think we have sufficient motivation to land #51 for now, with the hope that we'll be able to work through some further details in whatwg/html#4400 and whatwg/html#4352 . If we encounter issues with this approach, we can revisit #49.

zenparsing · 2019-03-01T00:08:34Z

@littledan I'm sure you can appreciate that I don't find that answer particularly satisfying. 😉

zenparsing · 2019-03-01T04:14:04Z

I described the example above as a "hypothetical" environment, but in fact you can already do something very similar in node with the --experimental-modules and --experimental-vm-modules flags.

Node VM docs

When you call module.evaluate() for a vm.SourceTextModule object, it runs evaluation synchronously, so in that case you do have JS on the call stack. Deeper still, in node the ES module loader itself is written in JS.

Do you think we can support those scenarios and also flush the promise job queue in the middle of evaluation?

littledan · 2019-03-01T10:33:51Z

OK, between #51 (comment) and #47 (comment) , sounds like I misunderstood the concerns here, which are concretely about compatibility with the web and Node.js. I will revert #51 for now and just land some parts of it that were common with #49 and no one raised concerns about.

I have to spend some time rethinking how these algorithms would work if invoked reentrantly. I was just looking at HTML and hoping that other environments would also refrain from this case.

To answer your question, no, I don't think we can re-entrantly flush the job queue/perform a microtask checkpoint (or re-entrantly queue a task and continue the following steps from the task).

For Node's module integration, I am wondering, do they need to be synchronous, or could they return a Promise?

Closes tc39#47, tc39#48, tc39#43

littledan · 2019-03-01T11:51:07Z

Apologies for the noise; let's keep discussing here. You can find the two (rebased) alternatives at #57 (which is the new #51) and #49.

domenic · 2019-03-01T14:50:42Z

Sad to hear the hard work here was reverted. I remain confident the best path for landing top-level await in browsers is via a microtask (or perhaps full event loop) flush.

As for the question about whether Node should allow sync module script execution, I strongly hope they do not. But if they do, they are welcome to either run the microtask checkpoint or not, as this is a host hook. For example, one could argue that vms are not like iframes, and so in Node, they might count "the topmost thing on the stack is the VM running script" as an "empty stack" for the purposes of performing a microtask checkpoint.

zenparsing · 2019-03-01T16:40:53Z

For Node's module integration, I am wondering, do they need to be synchronous, or could they return a Promise?

Node exposes a thin wrapper (ModuleWrap) around v8 Module objects and this wrapper is used by the vm module and the internal ESM module loader. So, currently, both in vm and the loader, JS is on the stack when evaluation occurs.

If there is JS on the call stack (even "internal" node JS) then flushing the job queue will mess with our run-to-completion semantics.

It may be possible to modify node's internals such that a (non-JS) job to call "evaluate" is enqueued and then evaluate is only called directly from c++, but I'm not sure.

@guybedford what are your thoughts?

littledan · 2019-03-02T08:57:20Z

Talking with @MylesBorins , it sounded like that API was experimental and due to be rethought and possibly replaced, which is why it's behind multiple flags. When that happens, I hope it's possible to consider making the API not synchronous, but instead return a Promise and wait an event loop tick before doing work. But, in particular, it doesn't sound like this is a concern about maintaining compatibility with the current Node ecosystem, or a new API that's about to ship.

Stepping back, about alignment between Node, tools, browsers, and other environments: I really want to find a common solution between all JavaScript implementations and environments for this problem if we can. I'm not sure if they are all in scope with respect to whatwg/html#4400 , but it's clear to me that there's broad interest in TC39 in helping unify things. So, @zenparsing , thank you for bringing background about Node concerns, and thank you to @guybedford , @Rich-Harris, @lukastaegert for bringing the context about bundlers.

littledan · 2019-03-15T23:24:46Z

Another alternative to avoid reordering is to differentiate loading modules which might contain await with an import await statement, as proposed in #60 . Then, ordinary import statements are always synchronous and never reordered. However, significant concerns are raised in that thread.

This patch is a variant on tc39#49 which determines which module subgraphs are to be executed synchronously based on syntax (whether the module contains a top-level await syntactically) and the dependency graph (whether it imports a module which contains a top-level await, recursively). This fixed check is designed to be more predictable and analyzable. Abstract module record changes: - The [[Async]] field stores whether this module or dependencies are async. It is expected to be initialized by the Linking phase. - The [[ExecutionPromise]] field stores the Promise related to the evaluation of a module whose [[Async]] field is *true*. - Evaluate() returns a Promise for [[Async]] modules, a completion record for sync modules which throw, or undefined otherwise. Cyclic Module Record changes: - A new [[ModuleAsync]] field stores whether this particular module is asynchronous (dependencies aside). - The ExecuteModule method on Cyclic Module Records takes an argument for the Promise capability, but only if that particular module is [[ModuleAsync]]. - The Link/Instantiate phase is used to propagate the [[Async]] field up the module graph to dependencies. - When there's a cycle, with some modules sync and some async, the whole cycle is considered async, with the Promise of each module set to the entrypoint of the cycle, although the cycle-closing edge will not actually be awaited (since this would be a deadlock). Source Text Module Record changes: - The check for whether a module contains a top-level await locally is in a ContainsAwait algorithm (TBD writing this out, but it should be static since await may not appear in a direct eval) - Module execution works as before if ContainsAwait is false, and works like an async function if ContainsAwait is true. Closes tc39#47, tc39#48, tc39#43

littledan mentioned this issue Feb 17, 2019

Largely rewrite the explainer #44

Merged

littledan mentioned this issue Feb 20, 2019

Variant B promise refactoring #33

Merged

littledan added a commit to littledan/proposal-top-level-await that referenced this issue Feb 22, 2019

Normative: Avoid microtask queue delay when resolved

516e05f

Closes tc39#47, tc39#48, tc39#43

littledan mentioned this issue Feb 22, 2019

Normative: Avoid microtask queue delay when resolved #49

Closed

littledan added a commit to littledan/proposal-top-level-await that referenced this issue Feb 22, 2019

Normative: Flush Job Queue between module imports

a449e37

Closes tc39#47, tc39#43, tc39#48, tc39#50

littledan added a commit to littledan/proposal-top-level-await that referenced this issue Feb 22, 2019

Normative: Flush Job Queue between module imports

3a68b79

Closes tc39#47, tc39#43, tc39#48, tc39#50

littledan mentioned this issue Feb 22, 2019

Normative: Flush Job Queue between module imports #51

Merged

littledan mentioned this issue Feb 28, 2019

JSON modules whatwg/html#4315

Closed

domenic mentioned this issue Feb 28, 2019

Yielding to the event loop (in some way) between module evaluations whatwg/html#4400

Closed

littledan closed this as completed in #51 Feb 28, 2019

littledan reopened this Mar 1, 2019

littledan added a commit to littledan/proposal-top-level-await that referenced this issue Mar 1, 2019

Normative: Avoid microtask queue delay when resolved

c1a5fe7

Closes tc39#47, tc39#48, tc39#43

littledan added a commit to littledan/proposal-top-level-await that referenced this issue Mar 1, 2019

Normative: Avoid microtask queue delay when resolved

11a4bff

Closes tc39#47, tc39#48, tc39#43

This was referenced Mar 19, 2019

Normative: Synchronous based on a syntax and module graph #61

Merged

Reland job queue flush #57

Closed

littledan closed this as completed in #61 Mar 26, 2019

Avoid reordering execution of module bodies before top-level await is reached #47

Avoid reordering execution of module bodies before top-level await is reached #47

Comments

littledan commented Feb 17, 2019

littledan commented Feb 17, 2019

littledan commented Feb 17, 2019

domenic commented Feb 17, 2019

bergus commented Feb 17, 2019

littledan commented Feb 17, 2019

domenic commented Feb 17, 2019

domenic commented Feb 17, 2019

bergus commented Feb 17, 2019 • edited Loading

littledan commented Feb 18, 2019

domenic commented Feb 18, 2019

littledan commented Feb 18, 2019

littledan commented Feb 27, 2019 • edited Loading

domenic commented Feb 27, 2019 • edited Loading

zenparsing commented Feb 27, 2019

littledan commented Feb 27, 2019 • edited Loading

littledan commented Feb 28, 2019

jakearchibald commented Feb 28, 2019

littledan commented Feb 28, 2019

guybedford commented Feb 28, 2019 • edited Loading

guybedford commented Feb 28, 2019

zenparsing commented Feb 28, 2019

littledan commented Feb 28, 2019

domenic commented Feb 28, 2019 • edited Loading

zenparsing commented Feb 28, 2019

guybedford commented Feb 28, 2019

domenic commented Feb 28, 2019

littledan commented Feb 28, 2019

littledan commented Feb 28, 2019 • edited Loading

zenparsing commented Mar 1, 2019

zenparsing commented Mar 1, 2019

littledan commented Mar 1, 2019 • edited Loading

littledan commented Mar 1, 2019 • edited Loading

domenic commented Mar 1, 2019

zenparsing commented Mar 1, 2019

littledan commented Mar 2, 2019

littledan commented Mar 15, 2019

bergus commented Feb 17, 2019 •

edited

Loading

littledan commented Feb 27, 2019 •

edited

Loading

domenic commented Feb 27, 2019 •

edited

Loading

littledan commented Feb 27, 2019 •

edited

Loading

guybedford commented Feb 28, 2019 •

edited

Loading

domenic commented Feb 28, 2019 •

edited

Loading

littledan commented Feb 28, 2019 •

edited

Loading

littledan commented Mar 1, 2019 •

edited

Loading

littledan commented Mar 1, 2019 •

edited

Loading