Rethinking module for the present and the future #55221

Open
andrewbranch opened this issue Jul 31, 2023 · 3 comments
Labels
In Discussion Not yet reached consensus Needs Proposal This issue needs a plan that clarifies the finer details of how it could be implemented. Suggestion An idea for TypeScript

Comments

@andrewbranch
Member

andrewbranch commented Jul 31, 2023

What does --module actually mean?

Which of these is a better definition for the flag as it exists today? Which is a better fit for the future?

  1. The output module syntax you want to be emitted
  2. A declarative description of the module system that will process your emitted code at bundle-time or runtime

When the possible values of module were limited to amd, umd, commonjs, system, and es2015, the former definition was perfectly fine. When es2020 and es2022 were added, bringing syntax features like import.meta and top-level await that couldn’t be transformed into any module emit target other than system, it started to feel like module described not just an output format, but the intrinsic capabilities of some external system. With node16 and nodenext, the scope of the module flag suddenly expanded to include a new module format detection algorithm used by the target module system and special interop rules between module formats, while it stopped directly controlling the output format, since the format of every output file would be fully determined by Node.js’s format detection algorithm.

The latter interpretation of module, one that fully describes the target module system, works well for node16/nodenext, but trying to project that definition onto the other, older module values makes them feel kind of incoherent.

All the values except node16/nodenext are kind of weird

Some of the important characteristics of the module system described by --module nodenext are:

  • Multiple module formats are supported (CJS/ESM)
  • The module format of each file is determined via some algorithm
  • Modules of different formats interact with each other in specific ways
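
To make the second point concrete, the per-file detection that node16/nodenext model can be roughly sketched as a pure function. This is a simplified illustration, not the actual TypeScript or Node.js implementation; `detectFormat` and its `packageType` parameter (standing in for the "type" field of the nearest ancestor package.json) are hypothetical names.

```typescript
type ModuleFormat = "esm" | "cjs";

// Hypothetical helper. `packageType` stands in for the "type" field of the
// nearest ancestor package.json (undefined when the field is absent).
function detectFormat(fileName: string, packageType?: string): ModuleFormat {
  // These extensions fix the format regardless of package.json.
  if (/\.(mjs|mts)$/.test(fileName)) return "esm";
  if (/\.(cjs|cts)$/.test(fileName)) return "cjs";
  // Everything else (.js, .ts, .jsx, .tsx, .d.ts) defers to package.json
  // "type"; any value other than "module" means CommonJS.
  return packageType === "module" ? "esm" : "cjs";
}
```

Note that every file comes out as exactly one of the two formats; there is no "unknown" result in this model.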

If we try to infer from existing behavior what the other module values say about these characteristics, the result is confusing. For example, you might expect that --module esnext means an ESM-only module system that must reject CommonJS/AMD/System modules—after all, you’re not allowed to write import foo = require("./mod") in that mode. But you are allowed to import a dependency that declares CommonJS constructs like that.

None of these module modes have any restriction on the kinds of modules that can be imported, nor do they particularly make any effort to detect what kind of module a dependency is. Essentially, type checking between modules proceeds as if everything is CommonJS, even when we’re explicitly emitting esnext. This can be observed directly by writing a default import of a .d.ts file that only declares named exports:

// @module: esnext
// @esModuleInterop: true

// @Filename: /esm.d.ts
export const x: string;

// @Filename: /main.ts
import esm from "./esm";
esm.x; // string, no error, what??

This behavior is enabled by esModuleInterop/allowSyntheticDefaultImports, but those settings should only affect how the exports of CommonJS modules appear (and arguably only to imports written in other CommonJS modules, since esModuleInterop is an emit setting that only emits code into CommonJS outputs). There’s no attempt to distinguish between what happens when two ES modules interact, two CJS modules interact, or an ES module imports a CJS module. This is perhaps, historically, because we had no idea what the actual module format of the JS file described by the declaration file is. (It would have been really nice for declaration emit to have always encoded the output module format, but here we are.)

Even if we had perfect information about the module format of every file, the distinction between “I want to emit ESM” and “My module system can only handle ESM” is potentially useful, and these old module modes can only describe the former. Essentially, they all describe the same hypothetical module system, where any module format can be loaded interchangeably.

Supporting bundlers

Webpack and esbuild vary their handling of ESM→CJS imports based on whether the importing file would be recognized as ESM according to Node.js’s module format detection algorithm. According to the node16/nodenext prior art, the module flag is the trigger that should enable this behavior.[1] Unlike in Node.js, files in these bundlers’ module systems are not always unambiguously ESM or CJS. When a file has a .ts/.js extension and the ancestor package.json has no "type" field at all, it isn’t treated as CJS; it just doesn’t get the aforementioned special Node.js-compatible import behavior.
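
For readers less familiar with the difference, here is a minimal sketch of what a default import of a CJS module binds to in each case. It is a simplified model based on my understanding of these bundlers, not their actual code; `resolveDefaultImport`, `CjsExports`, and `importerIsNodeEsm` are hypothetical names.

```typescript
// The shape of a CJS module's `module.exports`, possibly produced by a
// transpiler that sets the `__esModule` marker.
type CjsExports = { __esModule?: boolean; default?: unknown; [key: string]: unknown };

// Hypothetical model of what `import x from "./cjs"` binds `x` to.
// `importerIsNodeEsm` is true when the importing file would be detected as
// ESM by Node.js rules (.mjs/.mts, or package.json "type": "module").
function resolveDefaultImport(cjsExports: CjsExports, importerIsNodeEsm: boolean): unknown {
  if (importerIsNodeEsm) {
    // Node.js-compatible behavior: the default import is always the whole
    // `module.exports` object, and the `__esModule` marker is ignored.
    return cjsExports;
  }
  // Otherwise the marker is honored: transpiled-from-ESM modules expose
  // their `default` property, plain CJS modules expose `module.exports`.
  return cjsExports.__esModule ? cjsExports.default : cjsExports;
}
```

The same dependency can therefore yield two different values for `x` depending only on the importing file's extension or package.json, which is why the type checker needs to know which rule applies.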

Other bundlers don’t implement this Node.js compatibility behavior (at least by default). They’re already fairly well served by --module esnext, with the exception of the bug described in the previous section (#54752). It seems like we could improve on all the older module modes by including file extension and package.json "type" fields as a heuristic for when a default export should be synthesized, and to avoid emitting syntax into .mjs or .cjs files that would be invalid in Node.js. (#50647, #54573)

Options

Decisions I think are on the table:

  • Should we make at least one new module value for bundlers?
  • Should we make two new module values for bundlers, one for Webpack/esbuild (Node.js-style interop) and one for all others? Or, should the Node.js-style interop behavior be triggered by a separate flag?
  • Should we add module format detection heuristics to the old module values to fix #54752, #50647, and #54573, or deprecate them in favor of new ones?
  • If we deprecate old module values, what new ones do we actually need?
  • Instead of encoding everything significant into module, would it be better to have a more granular set of flags describing what formats are supported, how they interoperate, what output format to emit (when ambiguous via detection), and what ECMAScript spec version is supported?

My proposed minimal change:

  • Add (completely bikesheddable names, I hate them) --module bundler and --module bundler-node-compatible, or --module bundler and another flag enabling Node.js-compatible interop. Ignore everything else.

Why I’d rather rethink module as a whole than do the minimal change:

  • I want to be able to give a single coherent explanation of what module means for documentation purposes.
  • Emitting conflicting syntax into .mjs and .cjs files is a pretty bad behavior, and we should fix or deprecate every mode that does it.
  • #54752
  • In the future, we may want to make a true ESM-only module mode to represent the browser or another future runtime, and it’s annoying that --module esnext is a poor fit for that.

Footnotes

  1. Today, the module format detection (the setting of impliedNodeFormat) is actually triggered by moduleResolution, not module, but I think this doesn’t make sense. #54788 (“Make module control impliedNodeFormat and moduleResolution control just module resolution”) swaps the trigger, and that change can go unnoticed since we already made moduleResolution: nodenext and module: nodenext inseparable in #54567 (“Require module/moduleResolution to match when either is node16/nodenext”).

@andrewbranch andrewbranch added Suggestion An idea for TypeScript Needs Proposal This issue needs a plan that clarifies the finer details of how it could be implemented. In Discussion Not yet reached consensus labels Jul 31, 2023
@andrewbranch andrewbranch self-assigned this Jul 31, 2023
@andrewbranch
Member Author

I want to draw out two things that were discussed in the design meeting #55271.

First, there was broad agreement that it would be worth updating the old esnext/commonjs/etc. modes to fix issues like #50647 by having them refuse to emit CJS syntax into .mjs files or ESM syntax into .cjs files, and that we don’t need to wait until 6.0 to at least experiment and see how breaky that kind of change would be. This could be done either by issuing a program error, or updating those modes to take file extension (and perhaps package.json "type") into account and really treat those files as the module format their extension implies, even though it may disagree in name with the module value. I’m leaning toward trying the latter, because as I discussed in the issue body, you can look at the semantics of how imports and exports in declaration files work in these modes and notice that they are not actually intended to limit program files to just ESM or just CJS. For example, in --module esnext, you’re allowed to import a declaration file that uses import x = require("...") and export = ..., so it seems to acknowledge that CJS files exist and can be imported, so there doesn’t seem to be a strong reason to refuse to emit CJS syntax into a .cjs file in that program.
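
A minimal sketch of what the latter option could look like, assuming only file extensions are consulted (package.json "type" is left out for brevity). `emitFormatFor` is a hypothetical name, and the `moduleOption` parameter is limited to two values to keep the example small.

```typescript
type EmitFormat = "esm" | "cjs";

// Hypothetical: pick the emit syntax for one file. `moduleOption` is the
// --module value, restricted here to two of the older modes.
function emitFormatFor(fileName: string, moduleOption: "commonjs" | "esnext"): EmitFormat {
  // An unambiguous extension wins, even if it disagrees with --module.
  if (/\.(mjs|mts)$/.test(fileName)) return "esm";
  if (/\.(cjs|cts)$/.test(fileName)) return "cjs";
  // For all other files, --module keeps directly controlling the syntax.
  return moduleOption === "commonjs" ? "cjs" : "esm";
}
```

Under this sketch, `--module esnext` would still emit ESM into ordinary .ts/.js outputs but would stop emitting `import`/`export` into a .cjs file.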

Secondly, @weswigham floated the idea of creating several granular, advanced-usage flags that control individual aspects of the module system, and rolling them up into named presets reflecting real known runtimes, e.g. esbuild, webpack, etc. The individual controls that might be relevant are:

  • Module format detection algorithm: There are two algorithms in use today: what node16/nodenext do, and no detection at all. We may need a variation of the node16/nodenext algorithm that leaves files unaffected by file extension or package.json "type" in an indeterminate format for use with bundlers, since non-mjs files aren’t properly CJS, they just have different interop behavior in Webpack/esbuild.
  • Supported module formats: We discussed the potential utility of a true ESM-only mode that would error when attempting to import unambiguously CJS dependencies. That said, no existing mode and no planned bundler mode would use this.
  • Module format interop controls: This is a big one that is currently just baked into node16/nodenext. To support bundlers, we need one switch that can control whether require(esm) is allowed, and one switch for controlling the shape of CJS modules when imported.
  • Emit format for indeterminate/ambiguous format files: I don’t know that this really needs to be configurable for bundler usage, since you usually don’t want to emit at all, and any place where TS needs to reason about the runtime behavior should assume that no module transformations took place (new --module none?). But this seems like the most plausible way to represent the behavior we might want out of a fixed/improved --module esnext or --module commonjs, where we always emit the “right” syntax into .mjs and .cjs files, but we have direct control over the emit format of other files—which ones count as indeterminate would depend on the module format detection algorithm mentioned earlier. This also could be the right option for picking a module format for ts.transpileModule, where you truly are just forcing an output syntax.
  • ESM spec version or feature list: whereas we currently have --module es2015, es2020, es2022, and esnext, which differ only in whether language features like import.meta and top-level await are available, we might want to move this to a separate option that can be applied to any module mode that supports emitting any ESM-format files (potentially all the existing ones, if we update every existing mode to emit ESM into .mjs files).
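
To give a feel for how these levers might fit together, here is a rough TypeScript model of the option set with two example presets. Every name in it is hypothetical; none of these are real tsconfig options, and the preset values are illustrative rather than a statement of how any tool behaves.

```typescript
type Format = "esm" | "cjs";

// Hypothetical option bag, one field per lever in the list above.
interface ModuleSystemOptions {
  formatDetection: "none" | "node" | "node-with-indeterminate";
  supportedFormats: readonly Format[];
  allowRequireOfEsm: boolean;
  cjsImportShape: "module-exports-as-default" | "respect-esModule-marker";
  // Only meaningful when detection can leave a file indeterminate.
  indeterminateEmit?: Format | "preserve";
  esmFeatures: "es2015" | "es2020" | "es2022" | "esnext";
}

// Illustrative preset approximating what node16/nodenext bake in today.
const nodenextLike: ModuleSystemOptions = {
  formatDetection: "node",
  supportedFormats: ["esm", "cjs"],
  allowRequireOfEsm: false,
  cjsImportShape: "module-exports-as-default",
  esmFeatures: "esnext",
};

// Illustrative preset for the "true ESM-only" idea.
const esmOnly: ModuleSystemOptions = {
  formatDetection: "node",
  supportedFormats: ["esm"],
  allowRequireOfEsm: false,
  cjsImportShape: "module-exports-as-default", // unused: CJS is rejected
  esmFeatures: "esnext",
};

// Would importing a dependency of the given detected format be an error?
function canImport(options: ModuleSystemOptions, dependencyFormat: Format): boolean {
  return options.supportedFormats.indexOf(dependencyFormat) !== -1;
}
```

A named preset would then just be a frozen value of this shape, and the versioning question becomes whether those values may change between TS releases.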

I feel fairly confident that this set of levers would let us model everything we currently have and everything that we’d like to add in the near term. It still makes me a bit uncomfortable to expose all these as public API though, as they would really take the “advanced” section of the tsconfig options to a new level. On the other hand, if we could use these to dramatically lower the barrier to giving users named presets that are a really good fit for their runtime/bundler, that might be a good tradeoff. (That does necessitate another decision about preset versioning—do we need a bun2023 and a bunnext? Or are we ok with changing presets as needed in each new TS version?)

@fatcerberus

fatcerberus commented Sep 15, 2023

> Today, the module format detection (the setting of impliedNodeFormat) is actually triggered by moduleResolution, not module, but I think this doesn’t make sense.

I could go either way on this. If you say that "resolution" is only the process of "resolve a module specifier to a file on disk", then that's fair, but I think it could be argued that the process of module resolution also covers what kind of module it resolves to (in particular imagine a world in which you could write esm:foo or commonjs:foo where the module loader sees the exact same resolved filename for both). The other thing is that module, intuitively, is the answer to the question "What environment do you want to emit modules for?", and under that interpretation it doesn't make much sense for that to control what kind of module an import resolves to, as that's a (mostly) orthogonal concern.

FWIW, I myself considered module type detection to be an inherent part of resolution when I implemented neoSphere's current module loader:
https://github.com/spheredev/neosphere/blob/main/src/neosphere/module.c#L398
I think I originally tried a solution wherein module type was handled later, during module loading, but that caused more problems than it solved; for example there was no way to say I wanted Node-like semantics for require but not for import, since all the loader saw was a filename and not, e.g., the contents of package.json.

@jonkoops

> We discussed the potential utility of a true ESM-only mode that would error when attempting to import unambiguously CJS dependencies. That said, no existing mode and no planned bundler mode would use this.

I would just like to express my support for this option. I increasingly have (mostly) standards compliant pure ESM codebases and dependencies, and CommonJS interoperability is steadily becoming more of a burden than a boon to productivity.

3 participants