This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

Deep imports for dual packages presents a problem for transpiling for dual-packages #352

Closed
AlexanderOMara opened this issue Jul 19, 2019 · 22 comments

Comments

@AlexanderOMara

AlexanderOMara commented Jul 19, 2019

It was recommended that I open a separate issue for the problem discussed here.

Currently the documentation suggests that dual packages which offer CJS and ESM modules could have a deep import like pkg/module.mjs which users could import if they want to use ESM.

If the consuming package is ESM only, that's not the worst thing ever (although I wouldn't call it ideal), but if the consuming package is also a dual package, that presents a pain-point for transpiling.

Somehow, you would need to output something like the following two files (although the actual CJS code would be more complicated):

module.mjs

import {something} from 'pkg/module.mjs';

index.js

const {something} = require('pkg');

Essentially the transpiler would need to be smart enough to be able to rewrite import paths for 3rd party modules, with knowledge of the file structure. Additionally, that file structure needs to remain consistent across versions, as the transpiler will only see the version installed at time of transpiling.

Use case 1 (just transforming the CJS, using ESM as-is):
How should the transpiler know that import 'pkg/module.mjs' needs to be rewritten to require('pkg')?

Use case 2 (TypeScript, etc.):
How should the transpiler know that import 'pkg' needs to be rewritten to import 'pkg/module.mjs'?

Presumably both of those use-cases would need to be supported.

I think it would be much simpler if there were a package.json property (like "main" for CJS) and/or a default file (like index.js for CJS) which would be loaded based on whether the CJS or ESM module loader was being used.

Prior to the addition of the --es-module-specifier-resolution switch, it was possible to have an extension-less "main" field, and the loader would pick either the .js or .mjs file based on the module system in use (this behavior is still available with --es-module-specifier-resolution=node).
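For illustration, that older layout looked roughly like this (file names hypothetical): an extension-less "main" next to two sibling entry files, with the loader choosing the extension:

```json
{
  "name": "pkg",
  "main": "./index"
}
```

With index.js and index.mjs side by side, require('pkg') would load index.js and import 'pkg' would load index.mjs.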

@devsnek
Member

devsnek commented Jul 19, 2019

I don't have an immediate solution, but for the record, we are purposely avoiding the cases where require('x') and import 'x' can lead to different resources, which seems to be what you're suggesting in your last couple of paragraphs.

Something we were considering before was allowing require('x') to resolve to an ES module and return that module's namespace, and (@bmeck, correct me if I'm wrong) as long as loaders run in separate threads, I believe that concept should still work.

@GeoffreyBooth
Member

So to restate the request (please correct me if I’m missing something), you want to:

  1. Publish a public dual-ESM/CommonJS package, where
  2. The CommonJS version of your package requires only CommonJS dependencies, and
  3. The ESM version of your package imports ESM dependencies when available (for packages that are like yours and provide both ESM and CommonJS versions) and CommonJS dependencies otherwise.

Potential use cases for 1 and 2 are obvious: you want to publish a package that can be used as ESM natively in Node 12+, without dropping support for Node <12. But what exactly is the use case for 3? Why couldn’t your package have only CommonJS dependencies?

From the other thread I think the answer for why you want your package to itself be importing the ESM versions of dual packages is because you want to provide maximum support for tree shaking. Correct me if I’m wrong, but isn’t that only relevant for creating a bundle for browsers? And if so, couldn’t bundlers do this now? If all of your package’s dual-ESM/CommonJS dependencies each have a "module" field, as they probably already do, your bundler will be able to produce a tree-shaken minimal bundle for browsers.

We’re not advocating for the deprecation of the "module" field. It already has a purpose, in that it tells build tools where to find the ESM entry point. As this use case shows, that will continue to be relevant even once Node supports ESM unflagged.

So I think the overall answer is, you would write your package as ESM, and all your import statements to your dependencies would be to those dependencies’ bare specifiers (import 'pkg', not import 'pkg/module.mjs'). Then you have a transpiler create two output versions of your package: dist-commonjs, where everything is converted to require statements; and dist-module, where import 'pkg' is converted to import 'pkg/module.mjs' based on pkg having a "module" field with a value of ./module.mjs. Your own package’s "module" field would point to your original source files, that use import 'pkg'.

This all gets simpler once you can drop CommonJS support. Then your original source files can use import 'pkg/module.mjs' and there’s no need to transpile at all (at least for this purpose, aside from ES-Next features you may want to use). Bundlers creating browser builds would be able to use import 'pkg/module.mjs' without a need to look up a "module" field.

@AlexanderOMara
Author

AlexanderOMara commented Jul 20, 2019

... Why couldn’t your package have only CommonJS dependencies? ...

I've considered this, and I don't think it's that simple.

If I'm not mistaken, to get the benefits of tree shaking, you have to name your imports.

import {foo} from 'bar';

But if I try to import a CJS module in this way, it doesn't seem to work.

$ node --experimental-modules test.mjs
(node:1210) ExperimentalWarning: The ESM module loader is experimental.
file:///.../test-esm-cjs/test.mjs:1
import {foo} from 'bar';
        ^^^
SyntaxError: The requested module 'bar' does not provide an export named 'foo'
    at ModuleJob._instantiate (internal/modules/esm/module_job.js:93:21)
    at async ModuleJob.run (internal/modules/esm/module_job.js:108:20)
    at async Loader.import (internal/modules/esm/loader.js:134:24)

For reference, this is the CJS code I tried to import (babel output).

"use strict";

Object.defineProperty(exports, "__esModule", {
  value: true
});
exports.foo = void 0;
var foo = 'bar';
exports.foo = foo;

So either I name my imports and it only works with browser bundlers, or I support node and kill off tree shaking.

@GeoffreyBooth
Member

I've considered this, and I don't think it's that simple.

Yes, and if you continue reading the rest of my comment, I describe how to support both Node and bundlers that will do tree shaking.

@AlexanderOMara
Author

Sorry, I thought you were suggesting I could just sidestep the issue with using the CommonJS module in Node.

I'm not sure I follow your suggestion. Are you saying I should publish 3 versions of the code? CommonJS, ESM with deep imports, and ESM without deep imports? If so, and the "module" field points to the ESM without deep imports, how should the transpiler know which deep import to insert?

@GeoffreyBooth
Member

First, are my assumptions (the numbers 1, 2 and 3 in my comment above) correct? You’re trying to support the following consumers:

  • Node <12 CommonJS
  • Node 12+ CommonJS
  • Node 12+ ESM
  • Bundlers creating an output for browsers, with tree shaking

All correct so far? If so, it’s the CommonJS mixed into this that makes things complicated. Consider if you only needed to support Node ESM and bundlers, and all of your dependencies were Node-compatible ESM packages. All of your import statements could be like import { foo } from 'pkg/module.mjs' and both Node and the bundler’s tree shaking would be fine with it. No transpilation or "module" field necessary.

If so, and the "module" field points to the ESM without deep imports, how should the transpiler know which deep import to insert?

You raise a good point, I think this is an oversight in my suggestion above. Since you aim to support CommonJS consumers, "main" needs to continue to point to your CommonJS entry point; and since you aim to support bundlers as consumers, "module" needs to continue to point to an entry point with bundler-suitable ESM. So if the bundler-suitable ESM and the Node-suitable ESM aren’t the same code, in order to generate all three outputs we would need a third field.

But that’s only necessary if it’s impossible for the same ESM code to be usable by both Node and bundlers. If we can write ESM code that’s usable by both, then it’s possible to make do with just "main" and "module". So I guess the question is, can we?

Let’s say that pkg is a package like yours, with a CommonJS entry point defined in "main" and the ESM entry point defined in "module":

// pkg/package.json
{
  "name": "pkg",
  "type": "module",
  "main": "index.cjs",
  "module": "module.js"
}

The Node runtime itself ignores "module"; that field just tells bundlers and other tools where to find the ESM entry point. Your human consumers would read the README to learn the same information. Your package would look something like this:

// package.json
{
  "name": "alexs-package",
  "type": "module",
  "main": "./dist-commonjs/index.js",
  "module": "./dist-module/index.js"
}

// README
Use `import stuff from 'alexs-package/dist-module/index.js';` to use in ESM in Node!

// src/index.ts
import { foo } from 'pkg/module.js';

// dist-module/index.js
import { foo } from 'pkg/module.js';

// dist-commonjs/index.js
const { foo } = require('pkg');

To generate dist-module/index.js, your transpiler would see the specifier pkg/module.js and find the nearest parent package.json file, in this case pkg/package.json; that file contains "type": "module", so therefore pkg/module.js is an ESM file. Since we’re outputting ESM, we’re done; the specifier can be written as is.

To generate dist-commonjs/index.js, your transpiler would do the same lookup as above, but once it discovers that pkg/module.js is an ESM file, it would then have more work to do. In this case, it would look in pkg/package.json and see "module": "module.js", which matches 'pkg/module.js'; therefore pkg/module.js is a reference to the ESM entry point of the package. That’s wonderful news, as the transpiler can exchange an ESM entry point for a CommonJS one, which is defined in "main". So the transpiler could convert 'pkg/module.js' to 'pkg/index.js', but that’s equivalent to just 'pkg', so the transpiler writes require('pkg').
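The lookup described above could be sketched as a small helper (the function name and the pkgJson parameter are hypothetical; a real transpiler would locate and parse pkg/package.json from node_modules and handle many more cases):

```javascript
// Hedged sketch of the specifier rewrite described above.
// pkgJson stands in for the parsed contents of pkg/package.json.
function rewriteForCommonJS(specifier, pkgName, pkgJson) {
  // Only deep imports into this package are candidates for rewriting.
  if (!specifier.startsWith(pkgName + '/')) return specifier;

  // 'pkg/module.js' -> './module.js'
  const subpath = './' + specifier.slice(pkgName.length + 1);

  // If the deep import matches the package's "module" entry point,
  // swap it for the bare specifier, which resolves via "main" under CJS.
  if (pkgJson.module && subpath === './' + pkgJson.module.replace(/^\.\//, '')) {
    return pkgName;
  }

  // Otherwise leave the specifier untouched.
  return specifier;
}
```

So with pkg's manifest from above, 'pkg/module.js' becomes 'pkg' (and is emitted as require('pkg')), while any other deep import passes through unchanged.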

This answers your “Use case 1 (just transforming the CJS, using ESM as-is)”. As for your use case 2, which is essentially the reverse of this (transforming CommonJS to ESM), when would that ever need to be supported? As far as I know no one’s yet created a CommonJS-to-ESM transpiler, so it seems theoretical; and I don’t think it’s too much to require that if you want to support both ESM and CommonJS, you need to write your original source as ESM.

@GeoffreyBooth
Member

If I’m not mistaken, to get the benefits of tree shaking, you have to name your imports. . . . But if I try to import a CJS module in this way, it doesn’t seem to work.

There’s no tree shaking for CommonJS. It’s theoretically possible but no one has implemented a bundler that does it, as far as I know. import { foo } from 'commonjs-only-package' will cause all of commonjs-only-package to be included in your generated output, from any bundler that I’m aware of. This is a prime reason that people are switching to ESM.

In the example from my last comment, you would write such an import like this:

import commonjsOnlyPackage from 'commonjs-only-package';
const { foo } = commonjsOnlyPackage;

This would work in both Node ESM and in your bundler as is, and for CommonJS the transpiler would convert the first line into const commonjsOnlyPackage = require('commonjs-only-package');. Yes you’re importing all of 'commonjs-only-package', but again, that will happen no matter what you do, both in Node and in your bundle.

@AlexanderOMara
Author

AlexanderOMara commented Jul 21, 2019

I think you follow.

As for your use case 2, which is essentially the reverse of this (transforming CommonJS to ESM), when would that ever need to be supported?

As of now, TypeScript would not play nice with such a deep import.

Maybe some logic could be cooked up to figure out where the appropriate .d.ts file is from there, but I can't help thinking this is all a lot of extra complexity that is being passed onto transpilers. A transpiler didn't used to have to know anything about 3rd party modules at all.

@GeoffreyBooth
Member

Pinging @weswigham as I’m no expert on TypeScript. I would think that TypeScript should be able to handle any type of deep import, whether it’s CommonJS or ESM (and especially if it’s ESM).

I can’t help thinking this is all a lot of extra complexity that is being passed onto transpilers.

There’s already a tremendous amount of complexity heaped onto transpilers. 😄 Like I wrote above, this is only so complex because you’re trying to support so many consumers, and because you’re trying to support tree shaking while also supporting CommonJS. It’s not easy to support both of those last two at once; and that complexity is exactly the kind of thing that’s best bottled up into a transpiler, rather than requiring package authors to need to get it right. The goal for the transpiler author should be that package authors only need to write source that evaluates correctly as ESM in Node, and the transpiler should be able to handle all the rest (which in my example here, I think is achieved).

@AlexanderOMara
Author

AlexanderOMara commented Jul 21, 2019

I thought something like this was a simpler option all around: #273

I'm not sure why it's desirable to have require('x') and import 'x' both go to only one module system, as it makes ESM seem like a second-class citizen in the transitional phase.

@ljharb
Member

ljharb commented Jul 21, 2019

They are both first class module systems, neither is legacy, and both will be around for the foreseeable future.

@GeoffreyBooth
Member

#273 is basically rejected because of #273 (comment) / #273 (comment).

I assume you mean that ESM feels second-class because of the suggestion that "main" is the CommonJS entry point and ESM is just a deep import without its own privileged field in package.json. But remember that’s only the case if your package aims to support both CommonJS and ESM consumers and you’ve already been publishing your package as CommonJS for awhile, so therefore making "main" point at an ESM file would be a breaking change. For a new package published today, or for a semver major upgrade to your existing package, you could make "main" point to the ESM entry point and have your README say “and for CommonJS, use require('pkg/commonjs').”

@AlexanderOMara
Author

I'm not sure what the exact reasoning is behind those comments.

I really don't see how that's any better, to be honest. One of the module systems gets second-rate support, when really both will be in use for the foreseeable future. Some packages will be one way, some the other, and developers are going to have to keep track of it all.

@GeoffreyBooth
Member

GeoffreyBooth commented Jul 21, 2019

I’m not sure what the exact reasoning is behind those comments.

You mean the reasoning of why we can’t just do what was proposed in #273? Basically, it’s because the ESM and CommonJS versions of a dual package aren’t actually interchangeable; and they’re definitely not the same objects.

Say we merged in #273; or you’re using --experimental-modules in Node 11, with a package pkg that has "main": "./index" where you have index.js and index.mjs side by side. Either way, now you can require('pkg') and that pulls in pkg/index.js; and you can import 'pkg' and that references pkg/index.mjs.

Because those statements actually reference different files on disk, they’re not the same variables in memory. In other words:

import esmPkg from 'pkg';

import { createRequire } from 'module';
const require = createRequire(import.meta.url);
const cjsPkg = require('pkg');

esmPkg === cjsPkg; // false

cjsPkg.foo = 3;
esmPkg.foo; // undefined

This becomes a major problem for packages that are singletons. Imagine if your package is a store like Redux or Vuex. You don’t want require statements referencing one store and import statements referencing a different store. Or consider plugins that are attached to globals, like how Vuex is attached to Vue. If there are two Vues, Vuex might get attached to one and not the other.

This isn’t a theoretical concern. It’s actually already happened in the real world, in Node 11 --experimental-modules: graphql/graphql-js#1479 (comment)

Also there’s no guarantee that the CommonJS and ESM versions of packages are interchangeable. Transpilers aren’t perfect, and there are subtle differences introduced between source ESM and output CommonJS JavaScript, and those differences might matter. There’s not even any guarantee that the CommonJS version was generated via a transpiler; it could be completely different code.

@AlexanderOMara
Author

Because those statements actually reference different files on disk, they’re not the same variables in memory. ... This becomes a major problem for packages that are singletons.

Is that a real concern? Are there any packages that actually use both ESM and CJS in this way in a single module? Or do you mean cross-module? Because if the versions listed in the package.json file aren't compatible then there was really never a guarantee of a single version of any given module (NPM will just give one module a different version to load).

Of course there could be some subtle differences, since you'd be loading the version built for the module system you're using, but I'm not sure I would call that a bad thing.

Also, all of these will still exist with deep imports, right?

@weswigham
Contributor

weswigham commented Jul 21, 2019

Transpilers have no possible way to smooth over a dependency whose entry point may or may not be ESM at runtime under the current model - it's impossible unless you're like webpack, packing the entire world of JS at a single point in time into a single file (or known bundles) and locking everything down. Anything based on compile-time package traversal is ultimately untenable, since the runtime may be entirely different. The best we can do is allow people to choose how they import/require things and leave the problem to the ecosystem. Right now a given specifier can only resolve under one module system - that's a good first step, but to make it much better, we need a specifier to resolve to the exact same module under either resolver entry point. That way a user doesn't need to worry about which (of require or import) is "right", which is what @devsnek alluded to in his comment.

@AlexanderOMara AlexanderOMara changed the title Deep imports for dual packages presents a problem for transpiling Deep imports for dual packages presents a problem for transpiling and dual-packages Jul 23, 2019
@AlexanderOMara AlexanderOMara changed the title Deep imports for dual packages presents a problem for transpiling and dual-packages Deep imports for dual packages presents a problem for transpiling for dual-packages Jul 23, 2019
@GeoffreyBooth
Member

@AlexanderOMara sorry for the delayed reply.

Is that a real concern? Are there any packages that actually use both ESM and CJS in this way in a single module?

Yes, this has come up in the real world: graphql/graphql-js#1479 (comment) and https://github.com/Pokute/graphql-esm-bug. @jkrems and I created a repo to illustrate the issue and explain it through both text and examples: https://github.com/jkrems/singleton-issue

Also, all of these will still exist with deep imports, right?

The same issue would still exist with deep imports if extensions were optional. In other words, just as 'pkg' resolving to different files in CommonJS versus ESM presents an issue, 'pkg/foo' resolving to different files (foo.js and foo.mjs, say) presents the same issue. This is essentially how the hazard was discovered: in the Node 7-11 --experimental-modules, the package.json main could have a value like ./index that resolved to index.mjs for ESM and index.js for CommonJS.

Hence this is why file extensions are required in the current --experimental-modules implementation, and why a package can have only one entry point. The current implementation was designed to prevent this hazard from happening for users. Since 'pkg' can only point to one file regardless of environment and 'pkg/module.mjs' can also only point to exactly one file regardless of environment, the hazard is not present.

@AlexanderOMara
Author

Yes, this has come up in the real world: graphql/graphql-js#1479 (comment) and https://github.com/Pokute/graphql-esm-bug. @jkrems and I created a repo to illustrate the issue and explain it through both text and examples: https://github.com/jkrems/singleton-issue

In the case of the dual-esm-commonjs-package sample:

That's not exactly a new issue. That's a known problem with trying to use instanceof on instances that are cross-module or cross-realm, and it can definitely happen when only using CJS (or ESM). With NPM, if a package pkg-b you are using requires SomeClass from pkg-a version 1.0.0, but you are using pkg-a version 2.0.0 to create SomeClass and pass it into pkg-b, and it does an instanceof check, it will also fail.

Older versions of NPM which didn't maximize flatness made this happen even more often.

For the extensionless-imports example, I'm guessing nobody has actually done that. If you do strange things like that in your own package, you should probably expect strange results.

Also, all of these will still exist with deep imports, right?

The same issue would still exist with deep imports if extensions were optional. In other words, just as 'pkg' resolving to different files in CommonJS versus ESM presents an issue, 'pkg/foo' resolving to different files (foo.js and foo.mjs, say) presents the same issue. This is essentially how the hazard was discovered: in the Node 7-11 --experimental-modules, the package.json main could have a value like ./index that resolved to index.mjs for ESM and index.js for CommonJS.

Hence this is why file extensions are required in the current --experimental-modules implementation, and why a package can have only one entry point. The current implementation was designed to prevent this hazard from happening for users. Since 'pkg' can only point to one file regardless of environment and 'pkg/module.mjs' can also only point to exactly one file regardless of environment, the hazard is not present.

But what if I do that magic where my CJS modules require('pkg') and my ESM modules import 'pkg/module.mjs'? Can I not still run into all of these edge-cases? It's just now more difficult to make a dual-package in the first place for some reason?

@GeoffreyBooth
Member

That’s not exactly a new issue. That’s a known problem with trying to use instanceof on instances that are cross-module or cross-realm, and it can definitely happen when only using CJS (or ESM). With NPM, if a package pkg-b you are using requires SomeClass from pkg-a version 1.0.0, but you are using pkg-a version 2.0.0 to create SomeClass and pass it into pkg-b, and it does an instanceof check, it will also fail.

True. Though while I’d imagine that instanceof checks aren’t that common, attaching something to a parent (like Vuex attaching to Vue) is a pretty common pattern. As is a package depending on a parent, like graphql-yoga depending on the dual package graphql in the original issue.

For the extensionless-imports example, I’m guessing nobody has actually done that. If you do strange things like that in your own package, you should probably expect strange results.

Within one’s own app or package, yes, it’s contrived. But I think as an exported path within a package, it’s very plausible. I can easily imagine a package like 'aws-sdk/s3' mapping to ./s3/index.mjs and ./s3/index.js, for example, which is the same as extensionless-imports.

But what if I do that magic where my CJS modules require('pkg') and my ESM modules import 'pkg/module.mjs'? Can I not still run into all of these edge-cases?

No, the issues aren’t present when the specifiers differ between CommonJS and ESM. In this example, if your ESM code had import 'pkg', that would be the same singleton as your CommonJS code’s require('pkg'). Using pkg/module.mjs instead makes it clear that you’re not expecting the same singleton. If you’re going to do some build transpilation magic to use different specifiers in your CommonJS versus your ESM code, you’ll also have to account for this.

@ljharb
Member

ljharb commented Aug 15, 2019

It may be a common pattern; the ecosystem solution for this (and any duplication in the graph) is peerDependencies - which would prevent the problem from occurring with ESM as well.
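As a sketch of that ecosystem solution (package names hypothetical), a plugin-style package declares the shared package as a peer rather than a direct dependency, so the consumer's single installed copy is the one that gets used:

```json
{
  "name": "my-vuex-like-plugin",
  "peerDependencies": {
    "vue": "^2.6.0"
  }
}
```

Because the plugin never installs its own nested copy of vue, there is only one instance in the graph for both the app and the plugin to attach to.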

@bmeck
Member

bmeck commented Aug 4, 2020

this appears to be solved now thanks to "exports", closing but feel free to re-open.

@bmeck bmeck closed this as completed Aug 4, 2020
@AlexanderOMara
Author

Yep, conditional exports solves this issue very nicely!
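For readers landing here later, the conditional "exports" shape referred to above looks roughly like this (file paths are placeholders): both require('pkg') and import 'pkg' resolve through the same "." entry, so the package stays a single logical entry point while still serving each module system the appropriate file:

```json
{
  "name": "pkg",
  "main": "./index.cjs",
  "exports": {
    ".": {
      "import": "./module.mjs",
      "require": "./index.cjs"
    }
  }
}
```

Note the singleton hazard discussed in this thread still applies to dual packages: the "import" and "require" conditions point at different files on disk, so package authors are advised to keep shared state out of them or isolate it (e.g. in a single CJS file both entry points wrap).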
