Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guest shares host by default #50

Closed
wants to merge 5 commits into from
Closed

Guest shares host by default #50

wants to merge 5 commits into from

Conversation

kriskowal
Copy link
Member

This proposal would change the default behavior of the Loader constructor such that it creates a useful guest Loader by default, by sharing the host’s global object, static memos, and hooks. Loader constructor options override these behaviors selectively.

The motivation for this case is to provide a direct answer to @guybedford and @lucacasonato who proposed that one of the requirements for WASM is the ability to defer execution of a WASM module such that it be linked in a new module graph, possibly in multiple module graphs. This aligns well with the needs for Hot Module Replacement.

I propose as a straw-man that this framing of the Loader API would make possible a very clear way to express the designed intent:

import static 'x.wasm';
for (let i = 0; i < poolSize; i++) {
  new Loader().import('x.wasm');
}

The import static declaration would defer execution but populate the static module record memo of the host’s realm loader with x.wasm and its transitive dependencies. The guest compartments would inherit these static module records by default.

Developing beyond this idea, host virtualization hooks on the loader can interpose to provide alternate linkage or fall through to the host. In this example, the new loader would inherit the instance of 'y.wasm', which we presume 'x.wasm' imports. Consequently, the new loader has its own instance of x.wasm and an instance of y.wasm shared with the host.

new Loader({
  loadHook(fullSpecifier) {
    if (fullSpecifier === 'y.wasm') {
      return { instance: fullSpecifier };
    } else {
      return { record: fullSpecifier };
    }
  },
}).import('x.wasm');

README.md Show resolved Hide resolved
// its host's.
// The constructor copies the own properties over its own globalThis by
// assignment.
globals: Object,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it resolve #38?

Copy link
Member Author

@kriskowal kriskowal Jun 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it expressly does not allow a virtualized global object. The globalThis of compartment with a dedicated global object is an ordinary object with a null prototype.

We have in the past received feedback that engines covet the definition of global environment records and that we should not allow that behavior to be virtualized. We would benefit from a fresh confirmation if that’s still the case, cc @codehag, @syg.

We certainly do not need globalThis to be virtualizable, even for Hardened JavaScript, so we have not asked. However, even if we could pass globalThis to a Loader (née Compartment) constructor, that doesn’t tell the entire story for global environment record and module environment records.

And, we would like to keep the door open if in the future, someone were to champion the use of Loader (or Compartment) for REPL-style evaluate, where each Script can add new const and let bindings to the global contour (or whatever that’s actually called).

Closer to home, for Hardened JavaScript, it may be useful to introduce a scope that overshadows globalThis for both Scripts and Modules that guest code can only access lexically and not discover by enumerating properties of globalThis. We used a technique like this to prototype metering, which is similar in spirit to adding an instrumentation/coverage collector in scope. (We’ve since left that prototype behind since XS added native support for metering).

@kriskowal
Copy link
Member Author

I think this change needs more care. We need to preserve a way for a Loader to borrow the host load hook and not borrow the static module record cache for Hot Module Replacement or other kinds of reloading, like one might see with a test watcher.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@guybedford
Copy link
Collaborator

Interesting considerations! Currently, import x from './x.wasm' as 'module' would in our proposal effectively populate the host registry with the compiled module such that the identity of that JS module object would be shared with other importers in the same host context:

reexport.js

import x2 from './x.wasm' as 'module';
export { x2 };

a.js

import x1 from './x.wasm' as 'module';
import { x2 } from './reexport.js';

x1 === x2; // true - module map retains identity

On the other hand, if passing the module to a worker over structured clone, the identity would be broken while the underlying internal compilation can be shared.

As for loaders inheriting the identity of compiled records of the parent host by default, could it not be possible to just explicitly do this passing of the compiled object instead, like permitting a static module reflection reference to be returned by the load hook? This seems like it would handle the same use case without needing to explicitly treat the registry systems as nested or iterated and copied?

@mhofman
Copy link
Member

mhofman commented Jun 9, 2022

We need to figure out the identity issue for modules blocks as well. Here is a long thread on the topic: tc39/proposal-module-expressions#58

@kriskowal
Copy link
Member Author

kriskowal commented Jun 10, 2022

This change is not ready to land because it fails to adequately account for the case where a guest compartment needs a fresh static record memo (so it can reload) and also reuse the host’s loader’s load hooks. (I’ve made the mistake of conflating the effects of loadHook and memo sharing.) The options need to be orthogonal and the design directions come in two flavors.

A: Share by default (which I suspect to be more common)
B: Detach by default (which is generally safer)

  • Adopt host globals or create new globals
    • A1
      • adopt: new Compartment()
      • create: new Compartment({ globals: {} })
    • A2
      • adopt: new Compartment()
      • create: new Compartment({ detachGlobals: true })
      • create: new Compartment({ detachGlobals: true, globals: {x, y z} })
    • B
      • adopt: new Compartment({ shareGlobal: true })
      • create: new Compartment()
      • create: new Compartment({ globals: {} })
      • create: new Compartment({ globals: {x, y z} })
  • Adopt, override, or omit resolveHook
    • A:
      • adopt: new Compartment()
      • override: new Compartment({ resolveHook })
      • omit: new Compartment({ resolveHook: null })
    • B
      • adopt: new Compartment({ shareResolveHook: true })
      • override: new Compartment({ resolveHook })
      • omit: new Compartment()
  • Adopt, override, or omit loadHook
    • A
      • adopt: new Compartment()
      • override: new Compartment({ loadHook })
      • omit: new Compartment({ loadHook: null })
    • B
      • adopt: `new Compartment({ shareLoadHook: true })
      • override: new Compartment({ loadHook })
      • omit: new Compartment()
  • Adopt or detach static module record memo
    • A
      • adopt: new Compartment()
      • override: new Compartment({ detach: true })
    • B
      • adopt: new Compartment({ shareRecords: true })
      • override: new Compartment()

I think I would like to convince this group that option B is generally better, even though detached global by default will be surprising the to majority of users. On the other hand, I suspect that attaching the static module record by default will also have surprising negative effects. We could pursue a hybrid, but I know that my mind at least has a hobgoblin.

Please let me know what you think and I’ll update this change to reflect the general favor of the champion group.

@kriskowal
Copy link
Member Author

It occurs to me, we could also reify Loader.loadHook and Loader.resolveHook so they can be passed from host to guest. This would mean revealing the intrinsic host-defined versions of these functions and allowing guest code to call them directly, which I suspect is undesirable. I can’t contemplate passing the memos in this way though.

@Jamesernator
Copy link

Jamesernator commented Jun 10, 2022

It occurs to me, we could also reify Loader.loadHook and Loader.resolveHook so they can be passed from host to guest.

This is basically how Node's experimental loader API works. In that it basically passes defaultLoad/defaultResolve to the load/resolve hooks so that you can call them to delegate to host behaviour.

This API could do something similar i.e.:

interface LoaderOptions {
    loadHook: (
        specifier: string,
        defaultLoadHook: (specifier: string) => Promise<ModuleDescriptor>
    ) => Promise<ModuleDescriptor>;
    // Ditto for resolveHook
}
const loader = new Loader({
    async loadHook(specifier, defaultLoadHook) {
        // Delegate to host when needed
        if (specifier.startsWith("node:")) {
            return await defaultLoadHook(specifier);
        }
        // Load it myself
    },
});

@kriskowal
Copy link
Member Author

By way of contrast, the same is possible without revealing the host’s loadHook to guest code, and it’s possible to distinguish the desire to create a new instance or share an instance.

const loader = new Loader({
    async loadHook(specifier) {
        // Delegate to host when needed
        if (specifier.startsWith("node:")) {
            return { instance: specifier };
        }
        // Load it myself
    },
});

@kriskowal
Copy link
Member Author

Then this looks strange to me. If the inner eval and the outer eval is the same function, how can you know the dynamic import performed should be in the child compartment or the incubator compartment?

My understanding is that the global environment record (for evaluated scripts) and module environment records (for evaluated modules) will need to keep a marker to the intrinsic eval and separately the intrinsic Loader. The intrinsic eval is used for distinguishing indirect from direct eval, and the loader is used for import syntax regardless of the intrinsic evaluator. We will of course have to take care to make this true in our proposed spec text.

@Jack-Works
Copy link
Member

Then this looks strange to me. If the inner eval and the outer eval is the same function, how can you know the dynamic import performed should be in the child compartment or the incubator compartment?

My understanding is that the global environment record (for evaluated scripts) and module environment records (for evaluated modules) will need to keep a marker to the intrinsic eval and separately the intrinsic Loader. The intrinsic eval is used for distinguishing indirect from direct eval, and the loader is used for import syntax regardless of the intrinsic evaluator. We will of course have to take care to make this true in our proposed spec text.

Sorry I still do not understand 😂 because you can pass the eval around, then how do you figure out if it should be run in the child compartment or the incubator compartment?

For example, I have an intrinsic eval from the Realm A, and in the Realm A there are compartments Root, A, and B. As you said, the eval keeps the same identity in the compartment Root, A, and B.

Then what will happen if I pass the eval to the Realm B and execute eval("import('path')")? Which loader will it trigger?

@kriskowal
Copy link
Member Author

Then this looks strange to me. If the inner eval and the outer eval is the same function, how can you know the dynamic import performed should be in the child compartment or the incubator compartment?

My understanding is that the global environment record (for evaluated scripts) and module environment records (for evaluated modules) will need to keep a marker to the intrinsic eval and separately the intrinsic Loader. The intrinsic eval is used for distinguishing indirect from direct eval, and the loader is used for import syntax regardless of the intrinsic evaluator. We will of course have to take care to make this true in our proposed spec text.

Sorry I still do not understand 😂 because you can pass the eval around, then how do you figure out if it should be run in the child compartment or the incubator compartment?

For example, I have an intrinsic eval from the Realm A, and in the Realm A there are compartments Root, A, and B. As you said, the eval keeps the same identity in the compartment Root, A, and B.

Then what will happen if I pass the eval to the Realm B and execute eval("import('path')")? Which loader will it trigger?

And now I understand the nature of the question! And I find that I do not have an answer, short of abandoning the notion of Compartments with shared globals. I hope the brain trust will join me in pondering a solution.

@erights
Copy link
Collaborator

erights commented Jun 14, 2022

Are we talking about direct or indirect eval? Indirect eval should unquestionably be associated with that eval function's context of origin.

If we're talking direct eval, I see only two issues:

  • Does code from compartment A executing an eval(src) expression with an eval function from compartment B recognize it as an original eval function, such that it interprets the expression as a direct eval.
    • If no, then that expression is interpreted as an indirect eval. To compartment A, this is just a function call to a function defined elsewhere. That eval function cares only about its compartment of origin.
    • If yes, then once we've decided to interpret it as a direct eval expression, the eval function in question no longer has any further influence whatsoever on the execution beyond this decision. The direct eval is a special form in the scope of the code in which it appears, just as an if statement is a special form in the scope of the code in which it appears.

@Jamesernator
Copy link

And now I understand the nature of the question! And I find that I do not have an answer, short of abandoning the notion of Compartments with shared globals. I hope the brain trust will join me in pondering a solution.

Is the idea that eval(...) always uses the same loader the evaluating code? If so then wouldn't we just add a similar notion to current realm but for the current loader.

i.e. More specifically we would have an operations like GetFunctionRealm except called GetFunctionLoader. eval would not have a [[Loader]] slot so effectively "step 4" would be executed delegating to essentially the caller.

@erights
Copy link
Collaborator

erights commented Jun 14, 2022

I'm jumping into the middle without having absorbed enough context first. I may be missing or misunderstanding the real question.

@kriskowal
Copy link
Member Author

It is perhaps not a fundamental hole in the notion of shared-globals, but at this moment I see no way that eval('import("x")') or compartment.evaluate('import("x")') would not fall through the floor to the compartment that first created the global environment. This of course poses no problem at all for module code, but would be at least a mild aberration for eval.

@kriskowal
Copy link
Member Author

As @erights points out, direct and indirect eval require separate consideration.

For direct eval, a sensible answer may be possible because the script is evaluated in the module, which in turn has the correct loader associated with it.

For indirect eval, my previous answer stands. The only answer I can imagine is that the dynamic import would invoke the loader associated with the compartment that first created the global environment.

@kriskowal
Copy link
Member Author

And now I understand the nature of the question! And I find that I do not have an answer, short of abandoning the notion of Compartments with shared globals. I hope the brain trust will join me in pondering a solution.

Is the idea that eval(...) always uses the same loader the evaluating code? If so then wouldn't we just add a similar notion to current realm but for the current loader.

i.e. More specifically we would have an operations like GetFunctionRealm except called GetFunctionLoader. eval would not have a [[Loader]] slot so effectively "step 4" would be executed delegating to essentially the caller.

What you propose is clearly necessary. The crux is that the eval object’s [[Loader]] is presumably set to the loader that created it, not one of the loaders that shares it.

@erights
Copy link
Collaborator

erights commented Jun 14, 2022

For direct eval, good. Is that settled? Are there remaining open questions? Well, one: which eval functions count as an original eval function when deciding whether an eval(src) expression should be treated as a direct eval expression. But once that is decided, if it is a direct eval, are there any further open questions?

For indirect eval, my previous answer stands. The only answer I can imagine is that the dynamic import would invoke the loader associated with the compartment that first created the global environment.

Where does the eval function in question come from? What created it?

@erights
Copy link
Collaborator

erights commented Jun 14, 2022

In any case, for indirect eval, it must not matter where it is called from. All that matters is where it was created.

@erights
Copy link
Collaborator

erights commented Jun 14, 2022

As a further clarification, a direct eval does not call the eval function bound to the eval variable. That eval function's identity is used to make this decision. Afterwards, that eval function is irrelevant and has no influence.

@kriskowal
Copy link
Member Author

kriskowal commented Jun 14, 2022

I think we’re in agreement about what the behavior must be, assuming that we make it possible to create a guest compartment that shares its global environment with its host (the topic of this change). The question is whether that is acceptable.

That is to say, assuming const guest = new host.globalThis.Compartment({ ... }) such that the guest compartment shares the host compartment's global environment, guest.globalThis === host.globalThis, guest.globalThis.eval === host.globalThis.eval, and the eval function’s [[Compartment]] slot is the host. With this arrangement, direct eval effects an import in the guest compartment and indirect eval imports in the host compartment.

  1. Consider the expression eval('import("x")'),
    1. If that expression exists within a program we evaluate that program with guest.evaluate(program),
      1. Look up eval in the lexical scope of the program. We find guest.globalThis.eval, which is originally host.globalThis.eval. Look up the intrinsic eval of the execution environment, which is host.globalThis.eval. These objects are identical, so we proceed to interpret the expression as a direct eval. Look up the intrinsic loader of the execution environment, which is guest, so we import x in guest.
    2. If that expression exists within a module 'y' and guest.import('y'). Same effect as previous. We import x in guest.
  2. Consider the expression (0, eval)('import("x")),
    1. If that expression exists within a program we evaluate that program with guest.evaluate(program),
      1. The LHS is not the identifier eval, so this is indirect eval. We look up eval in lexical scope, find host.globalThis.eval, look up its [[Compartment]], find that it’s host, and so we import x of host.
    2. If that expression exists within a module 'y' and guest.import('y'). Same effect as previous. We import x in host.

@kriskowal
Copy link
Member Author

My tentative opinion is that this is unfortunate but acceptable. The only known alternative is to not provide a shared-global mode. I do not speak for every champion, but I for one would be willing to pitch either option.

@Jack-Works
Copy link
Member

Does code from compartment A executing an eval(src) expression with an eval function from compartment B recognize it as an original eval function, such that it interprets the expression as a direct eval.

For Realm, the answer is No (it is an indirect eval when eval is from another Realm). So I think it should be the same for compartments.

In any case, for indirect eval, it must not matter where it is called from. All that matters is where it was created.

Yes, it should be bounded to where it was created, but the problem is we want to share globals, which means the child compartment will have the same eval as the parent one.

The only known alternative is to not provide a shared-global mode.

One of the possible solution is to give up the rootComp.globalThis === childComp.globalThis constraint.

Let's make the child compartment an exotic object, and fall through all queries besides eval Function Compartment to the parent compartment. In this way, each compartment will have its own evaluators and still "share" the global in some way.

@kriskowal
Copy link
Member Author

This change is not ready to land because it fails to adequately account for the case where a guest compartment needs a fresh static record memo (so it can reload) and also reuse the host’s loader’s load hooks. (I’ve made the mistake of conflating the effects of loadHook and memo sharing.) The options need to be orthogonal and the design directions come in two flavors.

A: Share by default (which I suspect to be more common) B: Detach by default (which is generally safer)

  • Adopt host globals or create new globals

    • A1

      • adopt: new Compartment()
      • create: new Compartment({ globals: {} })
    • A2

      • adopt: new Compartment()
      • create: new Compartment({ detachGlobals: true })
      • create: new Compartment({ detachGlobals: true, globals: {x, y z} })
    • B

      • adopt: new Compartment({ shareGlobal: true })
      • create: new Compartment()
      • create: new Compartment({ globals: {} })
      • create: new Compartment({ globals: {x, y z} })
  • Adopt, override, or omit resolveHook

    • A:

      • adopt: new Compartment()
      • override: new Compartment({ resolveHook })
      • omit: new Compartment({ resolveHook: null })
    • B

      • adopt: new Compartment({ shareResolveHook: true })
      • override: new Compartment({ resolveHook })
      • omit: new Compartment()
  • Adopt, override, or omit loadHook

    • A

      • adopt: new Compartment()
      • override: new Compartment({ loadHook })
      • omit: new Compartment({ loadHook: null })
    • B

      • adopt: `new Compartment({ shareLoadHook: true })
      • override: new Compartment({ loadHook })
      • omit: new Compartment()
  • Adopt or detach static module record memo

    • A

      • adopt: new Compartment()
      • override: new Compartment({ detach: true })
    • B

      • adopt: new Compartment({ shareRecords: true })
      • override: new Compartment()

I think I would like to convince this group that option B is generally better, even though detached global by default will be surprising the to majority of users. On the other hand, I suspect that attaching the static module record by default will also have surprising negative effects. We could pursue a hybrid, but I know that my mind at least has a hobgoblin.

Please let me know what you think and I’ll update this change to reflect the general favor of the champion group.

I’ve moved this question to a topic #65

@kriskowal
Copy link
Member Author

I am closing this to signal my intention to follow-up with a proposal for ExecutionContext that in combination with more atomic Module and ModuleSource largely obviates the need to specify this API.

@kriskowal kriskowal closed this Jul 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants