Skip to content
This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

Feature: Named exports when importing CJS #81

Closed
giltayar opened this issue May 14, 2018 · 108 comments
Closed

Feature: Named exports when importing CJS #81

giltayar opened this issue May 14, 2018 · 108 comments

Comments

@giltayar
Copy link

giltayar commented May 14, 2018

Currently in NodeJS, if I import from a CJS module,

import ns from `./cjs-module.js`

It will allow this, but won’t allow named exports, i.e. the following is not allowed:

import {namedSomething} from `./cjs-module.js`.

This is called transparent interop.
The reason that NodeJS doesn’t allow named exports is that determining what those exports are MUST happen in the parsing phase of the ESM Spec (according to some, although there is contention about that too), and executing code is not allowed at or prior to that phase. But a CJS module’s exports can only be determined by evaluating the code in it (static analysis of JS is not an option as it is not determined to always give the correct results).

This would maybe have been cool if NodeJS was the first ESM implementation, but people are used to the babel way of doing modules, and in the babel world, named exports from CJS modules are allowed. This is because babel does not conform to the ESM spec to the letter (it can’t, because it just transpiles ESM to CJS).

And, good or bad, developers expect to use named exports when importing CJS.

I see these options:

  1. Continue the existing way (no named exports for CJS)
  2. Don’t conform to the spec when importing CJS
  3. Do late-linking/shaping of named modules based on late evaluation of CJS module
  4. Disallow transparent interop, and enable import.meta.require (or equivalent) to enable importing CJS from ES modules
  5. Enable metadata in the CJS module that can statically describe the exports for ESM, e.g. something like //export default; export foo, bar; at the head of the CJS file, thus enabling named exports when importing the file.

I am sure there are others options, so if you have another option besides those four, please add it here.

It would be great if for each options you specify pros and cons, or at least if you don’t like the option, specify a clear and simple use case that would be problematic if the option was chosen.

Edit by @GeoffreyBooth: Use case 12.

@giltayar
Copy link
Author

This issue was raised in #80

@GeoffreyBooth
Copy link
Member

It’s worth reading the comments in #80 too, especially concerning the observability (or not) of what gets done in the parsing phase.

I would encourage us to try to find a solution that threads the needle if at all possible. This probably isn’t a stark decision between honoring the spec or not. I bet we can find a solution that enables this while still adhering to spec, or adhering to the spec as far as any user-executable code could ever know, or adheres mostly to spec. I think it’s worth exploring the gray area as much as possible.

@devsnek
Copy link
Member

devsnek commented May 14, 2018

personally I'm not a fan of importing cjs at all. (as a bonus we could drop the mjs extension if we didn't support it.) I would prefer something like import { makeRequire } from 'module'; const require = makeRequire(import.meta.url);

@ljharb
Copy link
Member

ljharb commented May 14, 2018

A very large number of the use cases we've documented require being able to import 'cjs' and require('esm') - that's probably one of the more contentious issues overall, of which "should import 'cjs' support named imports" is a small subset.

I think spec compliance is paramount, and I'd love to have import { Children } from 'react', for example, work (where "react" is a CJS module) - but I suspect we'd have to come up with a way to get consensus to change the spec for that to become tenable.

@GeoffreyBooth
Copy link
Member

Some of the discussion in #80 revolved around how to define “spec compliance.” If external code can’t see what Node is doing under the hood, and if Node is compliant as far as the external code can detect, than can that be considered “spec compliance”? Because if so, then that’s a way to have it both ways: do whatever parsing or evaluating that needs to be done in the parsing phase, in a non-observable way that has no side effects, and you get the ability to import CommonJS named exports without violating the spec. The trick is figuring out that “non-observable way that has no side effects.” But I think that’s a straightforward engineering problem that can be solved.

@mcollina
Copy link
Member

I think we should bring this question to TC39.

@ljharb
Copy link
Member

ljharb commented May 14, 2018

@GeoffreyBooth i believe that it's always possible to write code that observes the ordering of evaluation and linking, which is why it would require a spec change in EcmaScript to have node do that and be compliant.

@mcollina I believe @bmeck has already done so, but if there's new information it might be worth another shot.

@MylesBorins
Copy link
Contributor

MylesBorins commented May 14, 2018 via email

@jdalton
Copy link
Member

jdalton commented May 14, 2018

I'm cool having CJS evaluate before resolving named exports. It's what we're doing for builtins modules today (with the assumption errors/side-effects won't happen during evaluation). For the broader case, of more than just builtin modules, if errors do happen during evaluation I don't find the difference that ghastly, since that is something CJS users expect. The concern is scoped to CJS interop, so isn't something that bleeds into browser interop scenario, and can be something that is opted-in-or-out of as needed without broader approval from others like the TC39.

@GeoffreyBooth
Copy link
Member

That’s another good point—could we do this evaluation during the parsing phase for CommonJS imports only? Because the spec doesn’t concern itself with CommonJS, right? So however Node wants to handle CommonJS is up to us, so long as we follow spec with regard to true ES modules?

@devsnek
Copy link
Member

devsnek commented May 14, 2018

i actually proposed this way back in my first pr (nodejs/node#16675) and TC39 discussed it here: https://github.com/rwaldron/tc39-notes/blob/master/es8/2017-11/nov-28.md#9iie-discuss-module-order-instantiationevaluation-guarantees

its worth noting that (afaik, please correct me if i'm wrong) source text refers to whatever the original thing is that rules what is exposed from the import, which is in this case the cjs

@bmeck
Copy link
Member

bmeck commented May 14, 2018

@GeoffreyBooth there have been a variety of approaches implemented and or looked at but they seem to fall into 3 real areas of investigation.

@jdalton
Copy link
Member

jdalton commented May 14, 2018

As far as I know the user-land esm loader works with named exports of CJS modules in ESM without order of evaluation issues or pre-parsing CJS export pragmas (named-export-core). This means that the root issue may be something deeper like being able to perform creation, instantiation, and evaluation phases synchronously.

@bmeck
Copy link
Member

bmeck commented May 14, 2018

@jdalton it relies on the late linking in the last link I provided to my knowledge.

@devsnek
Copy link
Member

devsnek commented May 14, 2018

std/esm i think falls into the third category of having an unset shape until evaluation time.

@jdalton
Copy link
Member

jdalton commented May 14, 2018

Ah, the third category could be closest. It seems like our current --experimental-modules named exports support is close to that as well since it creates the ESM projections (named exports, namespace objects) based on the fully evaluated CJS module. (Synchronous phases may make it mesh better though). This is something doable, with the lower level bits available, today.

It will likely come down to if enough folks think the order nit for CJS interop is worth not having it, opting-in-or-out of it, or worth creating more complicated workarounds for it.

@ljharb
Copy link
Member

ljharb commented May 14, 2018

It's not just scoped to CJS interop; it's something that applies to anything that can be imported, including JSON - although wasm and HTML imports are likely to have a form of static named imports, the ES spec still requires that the names be statically available. It's not just about errors - the order of side effects is impacted as well.

If we can get CJS interop for named imports (transparent or otherwise), that would be excellent, but I don't think spec compliance on evaluation ordering is something node core can or should lightly toss aside.

@jdalton
Copy link
Member

jdalton commented May 14, 2018

It's not just scoped to CJS interop;

Interop, and any affordances made, can be scoped as needed.
WASM and HTML imports don't have an existing rich Node interop history (as Babel+CJS+ESM does).

If we can get CJS interop for named imports (transparent or otherwise), that would be excellent, but I don't think spec compliance on evaluation ordering is something node core can or should lightly toss aside.

CJS interop, however it's handled, will likely need to make affordances for some things (like the order concern). Being open to them in some form or another is a good thing.

@bmeck
Copy link
Member

bmeck commented May 14, 2018

evaluation ordering prevents polyfills and other things from producing side effects before out of order evaluation that are hoisted before them. it is not a light allowance or affordance to be treated as good by premise alone, please see the PR above by @devsnek

@jdalton
Copy link
Member

jdalton commented May 14, 2018

Completely wonky module order as with #16675 isn't desirable but it's not all-#16675 or nothing. As mentioned, there are ways to approach things so that level of mismatch is avoided. It's a balance between expected CJS and ESM behavior for sure. Digging into working implementations, like that of the user-land esm loader, to suss out requirements, limitations, and considerations may be a good first step to see what all would be needed to enable named exports to more than just builtin CJS modules.

@bmeck
Copy link
Member

bmeck commented May 15, 2018

@jdalton it isn't as if your loader is doing something completely undiscussed, we can look at it but need to discuss the topics at hand like we laid out above in PRs and links rather than example implementations that require spec changes for that late binding behavior if we are unable to even agree on if spec violation is desirable.

@weswigham
Copy link
Contributor

weswigham commented May 15, 2018

re: @MylesBorins AFAIK, There's nothing concrete to ask for because as long as you don't consider cjs modules as having esm semantics (they do not), the only place where the spec (as written) is potentially broken is including cjs in the graph at all (because it might create circularities, which are expressly forbidden - see the spec edit needed for wasm modules). I remember reading some notes from one of talks where they talked about execution order - esm execution order is supposedly well-defined by the design of the dependency graph traversal. However for non-es modules which do not participate within the graph in the same way, well, they aren't actually be specified, since they're not es modules (they just have to provide a list of exported names, by whatever means necessary). The final spec requirements for the runtime semantics of resolution, in the end, are simply:

  • The normal return value must be an instance of a concrete subclass of Module Record.
  • If a Module Record corresponding to the pair referencingModule, specifier does not exist or cannot be created, an exception must be thrown.
  • This operation must be idempotent if it completes normally. Each time it is called with a specific referencingModule, specifier pair as arguments it must return the same Module Record instance.

There's nothing that actually states how any non-esm source-text should or should not be executed. The idea that no execution of any js at all should happen during resolution is seems like a fabrication that entered the group's zeitgeist at some point which I cannot find a concrete source for (most all restrictions are placed on SourceTextModuleRecords, which a cjs module should not be, as it is not esm). Someone who believes that is the case may be able to cite the relevant spec points (at which point they can be campaigned to be changed)? But AFAIK there are none. How's the spec to enforce that a host doesn't execute anything for a native module upfront? As far as I can tell, it really doesn't.

@bmeck
Copy link
Member

bmeck commented May 15, 2018

@weswigham Abstract Module Records have specified semantics in the spec but we also discussed exactly what you are trying to avoid in TC39 as @devsnek also mentioned in relation to his PR https://github.com/rwaldron/tc39-notes/blob/master/es8/2017-11/nov-28.md#9iie-discuss-module-order-instantiationevaluation-guarantees . There is agreement that this guarantee is intended but not easily phrased in concrete language, we can create the concrete language if someone seeks to violate this without going to TC39 was my understanding from that meeting. The effort of creating such language is non-trivial and the idea that someone is going to treat Abstract Module Records out of order was expected to be somewhat out of expectations.

There have been other discussions in the spec and related bodies around ordering and why it should act a specific way as I aggregated for a slide deck in November 2017 https://docs.google.com/presentation/d/1RXvvScD8ce2FyLY2aYhbas83WCiBqzIOqdMt4OpkCJM/edit#slide=id.g2a9c910b7b_0_10

Per your points above:

However for non-es modules which do not participate within the graph in the same way, well, they aren't actually be specified, since they're not es modules (they just have to provide a list of exported names, by whatever means necessary).

They are leaf nodes that still call evaluate at a specific time as Abstract Module Records.

Someone who believes that is the case may be able to cite the relevant spec points (at which point they can be campaigned to be changed)?

See the presentation about this topic from November / how https://tc39.github.io/ecma262/#sec-innermoduleevaluation calls module.Evaluate() on non-SourceTextModule records at a specified time.

There's nothing that actually states how any non-esm source-text should or should not be executed.

That is outside of the scope of the JS spec for CJS, but the intention is to have a specific behavior as I described in my preamble above.

How's the spec to enforce that a host doesn't execute anything for a native module upfront? As far as I can tell, it really doesn't.

See the conclusion about it being hard to concretely phrase even if that is the intent.

Beyond this claim that there is no explicit ban on the behavior, we could add the ban at TC39 if required or we could change the timing of when module.Evaluate() happens.

@jdalton
Copy link
Member

jdalton commented May 15, 2018

@bmeck

it isn't as if your loader is doing something completely undiscussed,

Bits and bobs have been discussed in isolation though the lens of the limitations/confines of the existing Node WIP implementation and PRs. However, I don't think the combination of approaches that create a working implementation has been dug into. Since there are working implementations, they can be used to suss out requirements (whatever they may be). Once we identify what it would take we can bucket what we think would or wouldn't need specification changes.

@bmeck
Copy link
Member

bmeck commented May 15, 2018

@jdalton we have pointed out a specification change above that is needed to match your loader's semantics. We haven't even talked about other loaders that I can tell.

@jdalton
Copy link
Member

jdalton commented May 15, 2018

@bmeck while I appreciate the at-a-glance takes I think a deeper look would be a good thing. The jump to required spec changes also seems like it's skipping a few steps and assumes everyone is on your same page regarding what is and what isn't acceptable for CJS interop without requiring a change to the entire language specification.

@bmeck
Copy link
Member

bmeck commented May 15, 2018

@jdalton I disagree as these have been discussed, I have read esm several times, and I am pointing exactly to what esm is doing and related existing discussions for people to read. I have noted that your statements tend to avoid discussing the links I provide and the points contained within; as well as avoiding stating that the discussions are either irrelevant, incorrect, or relate to your implementation. Hence me consistently bringing up the topic of interest related to esm and relevant materials.

@MylesBorins
Copy link
Contributor

@jdalton I would say that perhaps discussing named exports itself is jumping the gun. We haven't reached consensus that importing common.js is something we want to support as "an official default loader".

I think everyone is on board with having named exports for a common.js loader, the requirements change depending on if it is official and default

May I suggest that we hold off getting too deep into the implementation details before we have the requirements?

@GeoffreyBooth GeoffreyBooth changed the title Named exports when importing CJS Feature: Named exports when importing CJS May 20, 2018
@guybedford
Copy link
Contributor

@GeoffreyBooth The spec potentially covers importing of anything that is importable. The fact that it's CJS, or ESM, or "third format" is irrelevant, because it still interacts with the spec-governed (and also engine-governed) JS runtime.

I'm not sure this is quite that clear - WASM will integrate at the module record level so can be a well-defined close integration with JS, while third-party loaders in NodeJS can treat defining a module as simply "returning a module namespace shell" regardless of when execution happens. We have an execute callback in the dynamicInstantiate hook, but users have no obligation to use it - they can define things that execute whenever they want, while providing named exports.

So if we can see this as a gradient from tight integration with ES modules, to more loose integration, I think the question for CommonJS here is simply where we see it on that gradient.

By definition, CommonJS does not execute like any other module record - it defers to its own loader which does its own execution. So it's more like tying two module execution trees together than the WASM integration as "just another module record".

Personally I tend to agree with @weswigham that NodeJS can interpret the spec as it wants here, and that slight execution order differences can be a very worthy tradeoff for the feature of named exports.

@ljharb
Copy link
Member

ljharb commented May 23, 2018

@guybedford any module that shares the JS realm/environment - and thus would have its effects observable - would indeed be governed by the spec. I agree that a wasm or c++ module, for example, could avoid these effects.

I do not see it as a gradient, and as has been indicated, the committee felt that whatever spec changes were needed to ensure that evaluation ordering was not altered would be made if possible. I don't think it's productive to pursue paths that are against the explicit intention of the spec, even if there is a loophole interpretation from the current wording.

@guybedford
Copy link
Contributor

@guybedford any module that shares the JS realm/environment - and thus would have its effects observable - would indeed be governed by the spec. I agree that a wasm or c++ module, for example, could avoid these effects.

To repeat the usual argument it is possible for execution to happen through the resolver, or other out-of-band mechanisms during the module pipeline processing. The modules part of the spec does not have a monopoly on all JS execution, and it's really about where you draw the line on what counts as a "module execution" from the perspective of the spec. Just because I happened to evaluate something that provides named bindings through an ES module interface, doesn't mean its execution has to be treated as an ES module, and if this is wrong then we already invalidate the spec by providing a dynamicInstantiate hook and that hook should be removed, strictly speaking, by such an argument.

The spec is designed to work for users and their needs, not as some dogma to set random walls just because Allen Wirfs-Brock happened to write the wording a certain way one weekend back in 2015.

I do not see it as a gradient, and as has been indicated, the committee felt that whatever spec changes were needed to ensure that evaluation ordering was not altered would be made if possible. I don't think it's productive to pursue paths that are against the explicit intention of the spec, even if there is a loophole interpretation from the current wording.

I'm 100% on the side of following the spec to the letter. There may well be a wording change that we need to propose to TC39 to ensure that this is not a loophole but indeed something that is supported by the specification.

There have been various discussions around this. Allowing namespaces to late-define export names was taken to TC39 and proven dead end at this point. Also previous discussion centered on cross format circular references ES / CJS problems that requires very carful execution considerations. Since we no longer have this problem anymore those previous tight concerns on execution that have been the topic of much TC39 debate over many years (remembers zebra striping!) no longer apply.

I really don't see that it will be an issue for the spec to clearly state that custom module boundaries (like the boundary between the NodeJS CommonJS loader and ES modules, which is a completely different thing to WASM boundary - to distinguish cases here, ask if both module systems can load modules from the other in a cycle) module systems may perform their own execution at whatever execution ordering they deem suitable for themselves, like the NodeJS CommonJS loader, and that their execution semantics do not need to adhere to the exact ES modules specification execution ordering at all as they are simply exposed through a module acting as a boundary, just like our dynamicInstantiate hook, where the execution callback is provided as a hint, but not a mandate.

@ljharb
Copy link
Member

ljharb commented May 23, 2018

Pursuing TC39 changes in parallel is great and I encourage and endorse it; but until those changes are moving through the process, we shouldn't let it impact our decisions here.

@guybedford
Copy link
Contributor

I understand you feel the spec is clear but you do not speak for all of TC39 - I would suggest we first seek TC39 clarification on this one more time, with an agenda item at the next meeting, and then craft a proposal from there if it is deemed necessary.

@giltayar
Copy link
Author

Allowing namespaces to late-define export names was taken to TC39 and proven dead end at this point

I was not aware of this, but if TC39 ruled it out, then it's dead for me too, which basically kills named exports when import-ing CJS, IMO (which begs the question of why we need transparent interop if the backwards compatibility to the "babel" way of doing things is dead).

@guybedford: do you remember the reason late-defining exports were rejected?

@guybedford
Copy link
Contributor

guybedford commented May 23, 2018

@giltayar it would be a large intractable spec change, due to all spec text being written around the concept of export names and export validations being known at instantiate completion. So between define the exports late, or allow earlier CJS execution, we are left with the early CJS execution option.

This is why I'm advocating CJS execution during the instantiate phase as a side-loader alongside the module pipeline.

To maintain more well-defined semantics, we could possibly only resolve CJS instantiation promises after all ES module instantiation promises have resolved. In this way, one could picture the pipeline as:

  1. Instantiate All (pausing on CJS execution points)
  2. ES Module Instantiate completion
  3. Start All CJS Execution triggering CJS instantate competion
  4. ES module execution (ES module spec execution order)

CJS modules could also still be designed to execute in tree order so that the following:

import 'cjs-1';
import 'esm-1';
import 'cjs-2';
import 'esm-2';

would execute the two CJS modules exactly in spec order, during phase (3), and then the ES modules would execute in order during phase (4). The difference then just being that 'cjs-2' executes before 'esm-1', but within the respective module systems we retain post-order tree execution.

@ljharb
Copy link
Member

ljharb commented May 23, 2018

@giltayar transparent interop by default isn’t just about matching babel; imo it’s necessary with or without named imports of CJS for adoption and ecosystem migration.

@bmeck
Copy link
Member

bmeck commented May 23, 2018

Even if we disregard the specification talk for a bit; the problem with out of order execution is that you could no longer order/refactor your imports if they have side effects:

import 'a';
import 'b';

Changing b from ESM to CJS would change the ordering from a, b to b, a. Which would be problematic if b relied on something from a being executed first.

Likewise, changing b from CJS to ESM would change the ordering from b, a to a, b. Which would be problematic if a relied on something from b being executed first.

These problems would be particularly hard to debug because a lot of the ordering is guaranteed and may change as you upgrade your dependencies, and we don't know the ordering just by reading the specifier in this new world either.

@guybedford
Copy link
Contributor

@bmeck yes that is the cost of the approach, that exact execution ordering only breaks down between imports of different module formats, but weighed against an entire ecosystem of expectations it seems a worthwhile one to me.

@bmeck
Copy link
Member

bmeck commented May 23, 2018

@guybedford I'm not sure how this is only a specific minority of imports, it affects any import that may change formats over time. And, when that import changes formats it will be hard to debug.

@ljharb
Copy link
Member

ljharb commented May 23, 2018

The ecosystem also has an expectation - for both require and import - that everything evaluates in lexical order. I think that’s much more important than named imports from CJS (as long as there’s at least default imports of CJS)

@guybedford
Copy link
Contributor

I've yet to see even two CommonJS packages on npm that require exact execution order between them that aren't polyfills - I have personally never written a NodeJS app that had to carefully order require statements apart from dealing with circular references.

Predictable semantics are incredibly important yes, but two-phase execution like this was the original plan for CommonJS laid out by TC39 to begin with through zebra striping. It was never deemed a spec violation then, so I'm not sure why it should be so now.

@devsnek
Copy link
Member

devsnek commented May 23, 2018

@guybedford polyfills and circular requires aren't nothing. if your polyfill is written in esm and we do out-of-band you're screwed.

@guybedford
Copy link
Contributor

If your npm package requires a polyfill to be used and is written in CommonJS, then yes, you would need your polyfill to be written in CommonJS as well.

@bmeck
Copy link
Member

bmeck commented May 23, 2018

I'm not convinced that removing ordering predictability is worth getting named exports when we can just read the properties off a default value. That costs the ecosystem just like not having named exports does. However, I value predictability over named exports since I cannot recreate the feature of predictability/debugging as easily as I can recreate the feature of named exports.

@ljharb
Copy link
Member

ljharb commented May 23, 2018

It's not just polyfills; it's anything that sets up state. Bootstrapping react flux stores, connecting to databases, configuring stateful libraries, etc.

@targos
Copy link
Member

targos commented May 23, 2018

In the internal loader, do we have access to the names that are being imported?
For example, with import { default as x, y, z } from 'cjs', that would be ['default', 'y', 'z']

@bmeck
Copy link
Member

bmeck commented May 23, 2018

@targos we do not, and even if we did there are problems with creating module records from them. Particularly if 2 modules have

import {a} from 'cjs';

and

import {b} from 'cjs';

What is the shape of the module that we create? Or what if they don't list the names:

import * as cjs from 'cjs';
const cjs = import('cjs');

@guybedford
Copy link
Contributor

@ljharb most packages in the JS ecosystem require a top-level call to start these things. I've never once used a library that did any of that work on top-level execution - please by all means find a counter example though.

@targos
Copy link
Member

targos commented May 23, 2018

For the first case, my idea was to create two different module records. I didn't think about the namespace import. That's a dead-end...

@dandv
Copy link

dandv commented Sep 26, 2018

Neophyte chiming in.

I know much less than I wish about modules, but I did manage to publish an ESModule that's backwards-compatible with CJS, by setting main to index (no extension) in package.json. Would this method be worth advocating to module authors, as (obviously) an interim and partial solution until this issue reaches consensus and resolution?

@devsnek
Copy link
Member

devsnek commented Sep 26, 2018

@danbev thanks for chiming in. i believe extensionless main has indeed been brought up before.

the real meat of this thread, however, is dealing with modules that have no esm source (the vast majority of the npm registry).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests