Compare ESM named exports to types #166

laverdet · 2024-03-25T22:04:15Z

This implements the idea in #35. The patch is incomplete in some ways:

Some type-only exports are not correctly filtered by my check. For example chalk reports ModifierName (and others) are missing, but these are type aliases. I can't figure out the correct predicate to only raise runtime types.
Some packages reexport * namespaces from other packages, for example @reduxjs/toolkit (see here). If we want to check these then the module's dependencies will need to be downloaded too. This would be a major change, or we can just bail the check in this case.
Some packages (react, many others) list package.json in their entries. The TypeScript compiler advertises that the top-level keys of this JSON file are importable when they are not. import { name } from "react/package.json" assert { type: "json" }; will pass compilation but fail to evaluate under nodejs. I'm surprised that even under the strictest modern settings (target: esnext, verbatimModuleSyntax: true, module: nodenext, moduleResolution: nodenext) that TypeScript has this behavior. Filtering JSON entries would probably be fine.
The acorn part could be rewritten to use the TypeScript AST instead. This thought eluded me until I had already finished the implementation.

The result of getEsmModuleNamespace should be the same as the result of Object.keys(await import(moduleName)). This information is then compared against the advertised types and an error is raised if the types are incorrect under node16-esm resolution mode (there are many, many problems in the npm ecosystem).

The static analysis had to be done outside of TypeScript since its behavior does not match the reality of what nodejs sees. TypeScript would have you believe that the following program is well-formed but it cannot be evaluated under nodejs:

// commonjs.cjs
module.exports = {
	name: 1,
};

// module.mjs
import { name } from "./common.cjs";

Example including diagnostic output from request and lodash, which do not actually export any of the names that their types would have you believe:

-> % node dist/index.js --from-npm request                             
🚨 /node_modules/request/index.js [
  'defaults',   'defaults', 'get',
  'get',        'get',      'post',
  'post',       'post',     'put',
  'put',        'put',      'head',
  'head',       'head',     'patch',
  'patch',      'patch',    'del',
  'del',        'del',      'delete',
  'delete',     'delete',   'initParams',
  'initParams', 'forever',  'jar',
  'cookie',     'debug'
]

┌───────────────────┬──────────────────────┐
│                   │ "request"            │
├───────────────────┼──────────────────────┤
│ node10            │ 🟢                   │
├───────────────────┼──────────────────────┤
│ node16 (from CJS) │ 🟢 (CJS)             │
├───────────────────┼──────────────────────┤
│ node16 (from ESM) │ 🕵️ Named ESM exports │
├───────────────────┼──────────────────────┤
│ bundler           │ 🟢                   │
└───────────────────┴──────────────────────┘

-> % node dist/index.js --from-npm lodash 
🚨 /node_modules/lodash/lodash.js [
  'VERSION',           'templateSettings',  'chunk',
  'compact',           'concat',            'difference',
  'differenceBy',      'differenceBy',      'differenceBy',
  'differenceBy',      'differenceBy',      'differenceBy',
  'differenceBy',      'differenceWith',    'differenceWith',
  'differenceWith',    'differenceWith',    'drop',
  'dropRight',         'dropRightWhile',    'dropWhile',
  'fill',              'fill',              'fill',
  'fill',              'findIndex',         'findLastIndex',
  'first',             'flatten',           'flattenDeep',
  'flattenDepth',      'fromPairs',         'fromPairs',
  'head',              'indexOf',           'initial',
  'intersection',      'intersectionBy',    'intersectionBy',
  'intersectionBy',    'intersectionBy',    'intersectionBy',
  'intersectionWith',  'intersectionWith',  'intersectionWith',
  'intersectionWith',  'join',              'last',
  'lastIndexOf',       'nth',               'pull',
  'pull',              'pullAll',           'pullAll',
  'pullAllBy',         'pullAllBy',         'pullAllBy',
  'pullAllBy',         'pullAllWith',       'pullAllWith',
  'pullAllWith',       'pullAllWith',       'pullAt',
  'pullAt',            'remove',            'reverse',
  'slice',             'sortedIndex',       'sortedIndex',
  'sortedIndexBy',     'sortedIndexOf',     'sortedLastIndex',
  'sortedLastIndexBy', 'sortedLastIndexOf', 'sortedUniq',
  'sortedUniqBy',      'tail',              'take',
  'takeRight',         'takeRightWhile',    'takeWhile',
  'union',             'unionBy',           'unionBy',
  'unionBy',           'unionBy',           'unionBy',
  'unionWith',         'unionWith',         'unionWith',
  'uniq',              'uniqBy',            'uniqWith',
  'unzip',             'unzipWith',         'unzipWith',
  'without',           'xor',               'xorBy',
  'xorBy',
  ... 491 more items
]

lodash v4.17.21
@types/lodash v4.17.0

🕵️ docs docs


┌───────────────────┬──────────────────────┐
│                   │ "lodash"             │
├───────────────────┼──────────────────────┤
│ node10            │ 🟢                   │
├───────────────────┼──────────────────────┤
│ node16 (from CJS) │ 🟢 (CJS)             │
├───────────────────┼──────────────────────┤
│ node16 (from ESM) │ 🕵️ Named ESM exports │
├───────────────────┼──────────────────────┤
│ bundler           │ 🟢                   │
└───────────────────┴──────────────────────┘

changeset-bot · 2024-03-25T22:04:18Z

⚠️ No Changeset found

Latest commit: 9a62a49

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

andrewbranch · 2024-06-13T17:51:12Z

Hey @laverdet, sorry for the extremely late acknowledgment here. Thanks for putting this together! Both work and life have been very busy in the last few months. Thanks for your patience. If you’re interested in continuing on this, I have some feedback. Otherwise, feel free to ignore, and I may pick up where you left off at some future date.

Overall, I’d like to present this check to the user as more of an all-or-nothing thing, like “TS thinks there are named exports but Node.js sees none, probably due to cjs-module-lexer being unable to statically analyze them” instead of “These N specific named exports seem to be missing at runtime.” That said, I’m not against storing the diff of named exports in the problem objects (or at least being set up to store them) so we can do more with it later.

Some type-only exports are not correctly filtered by my check. For example chalk reports ModifierName (and others) are missing, but these are type aliases. I can't figure out the correct predicate to only raise runtime types.

I can help with this when the rest appears to be in good shape.

Some packages reexport * namespaces from other packages, for example @reduxjs/toolkit (see here). If we want to check these then the module's dependencies will need to be downloaded too. This would be a major change, or we can just bail the check in this case.

There’s a lot already that’s incomplete because we don’t download dependencies. It’s fine. This also is less relevant if the CLI/UI isn’t trying to report specifically which named exports are missing at runtime.

Filtering JSON entries would probably be fine.

Agreed 👍

The acorn part could be rewritten to use the TypeScript AST instead. This thought eluded me until I had already finished the implementation.

Yeah, it would be nice not to depend on two different parsers, especially when the TS one is already running on everything.

laverdet · 2024-06-17T23:55:05Z

With the updated code I believe the only known outstanding issues are:

Incomplete expected TypeScript runtime type detection in namedExports.ts
Documentation needed. Common cases of cjs-module-lexer bailouts: "default" function with bag of fake named exports (see: request), entirely unlexable (see: body-parser), partial bailout (see: semver). These all affect handwritten CommonJS modules. This rule would also catch non-bailout conditions where the handwritten TypeScript declarations are simply incorrect.

Overall, I’d like to present this check to the user as more of an all-or-nothing thing

Maybe I'm misunderstanding your statement but it's not always that simple, especially for handwritten CJS. semver is a very good case study. They have a cjs-module-lexer bailout in their "index" file. This causes all named exports after that line to be missing. The nuanced diagnostic information here is helpful.

$ git diff
diff --git a/packages/core/src/internal/checks/namedExports.ts b/packages/core/src/internal/checks/namedExports.ts
index 5b0dbc1..35b6cdf 100644
--- a/packages/core/src/internal/checks/namedExports.ts
+++ b/packages/core/src/internal/checks/namedExports.ts
@@ -50,6 +50,7 @@ export default defineCheck({
     })();
     if (exports) {
       const missing = expectedNames.filter((name) => !exports.includes(String(name))).map(String);
+      console.log(missing);
       if (missing.length > 0) {
         return {
           kind: "NamedExports",

$ node ./packages/cli/dist/index.js -p semver
[
  'compareIdentifiers',
  'rcompareIdentifiers',
  'SEMVER_SPEC_VERSION',
  'RELEASE_TYPES'
]

semver v7.6.2
@types/semver v7.5.8

🕵️ Module advertises named ESM exports which will not exist at runtime. https://github.com/arethetypeswrong/arethetypeswrong.github.io/blob/main/docs/problems/NamedExports.md


┌───────────────────┬──────────────────────┐
│                   │ "semver"             │
├───────────────────┼──────────────────────┤
│ node10            │ 🟢                   │
├───────────────────┼──────────────────────┤
│ node16 (from CJS) │ 🟢 (CJS)             │
├───────────────────┼──────────────────────┤
│ node16 (from ESM) │ 🕵️ Named ESM exports │
├───────────────────┼──────────────────────┤
│ bundler           │ 🟢                   │
└───────────────────┴──────────────────────┘

andrewbranch

Good example with semver. The main thing I was trying to avoid was a bunch of errors of the opposite problem: runtime exports that don’t exist in the types. But since we’re only reporting types exports that don’t exist at runtime, I take your point that it can be useful to list them, to a point. I do think the CLI / web UI shouldn’t vomit a list of hundreds of exports (even where console.log(missing) truncates the lodash list looks excessive), and it is sometimes the case that all named exports are missing, in which case it doesn’t seem useful to list anything.

andrewbranch · 2024-06-18T15:59:05Z

packages/core/src/internal/esm/cjsResolve.ts

Currently, we model Node.js/bundler resolution with ts.resolveModuleName ignoring declaration files, which is not fully accurate. It’s been on my mental list for a while to replace that with enhanced-resolve, which would fix a few edge cases as well as letting me make the bundler JS resolution be a little more bundler-like, e.g. resolving the top-level module field in package.json. If you had access to enhanced-resolve here, would you still need cjsResolve.ts and esmResolve.ts for any reason?

I wasn't aware of enhanced-resolve, that's good to know. It has some inaccuracies (require('./file.js?') should not resolve since require accepts filesystem paths, not url fragments). I don't think there's any issues which would be showstoppers though. I'll take a look at integrating that module instead.

My instinct is to avoid "bundler-like" behavior because that's gotten us into this anarchy where we have a dozen different resolution algorithms. I found out a few weeks ago that bun consults paths from tsconfig.json during runtime to guide its resolution algorithm and it ruined my day.

that's gotten us into this anarchy where we have a dozen different resolution algorithms

100%. But the point of modeling it in this tool would be to point out the cases where that causes concrete problems. TS itself doesn’t have any of those behaviors, and since I’m currently using TS to try to show where type resolution and bundler resolution disagree, there are false negatives. If we were to switch the bundler resolution to be more bundler-y, new problems could be exposed where we’re pretty sure the bundler is going to resolve to one thing and TS is going to resolve to something that doesn’t match. At any rate, I don’t expect that to be part of this PR, just pointing out that the dependency might be able to save on a lot of local code.

I took a look at swapping over to enhanced-resolve. A big problem is that it doesn't report module format (esm, cjs, wasm, etc) from the resolution, it only reports the resolved path. So we would need to perform an additional step to infer the module format on our own. Webpack obviously wouldn't care about this information because they can operate in a cjs + esm superset mode but this is the exact condition we're trying to report.

Also they require a fully contravariant implementation of basically the entire node:fs surface area (see: types.d.ts). Obviously the correct way to approach these types would be what you've done in this package and define an expected subset of functionality. So it's any-town, U.S.A. trying to shim Package into enhanced-resolve.

If I hadn't already gone through the legwork of implementing the specified algorithms then I'd probably just go with enhanced-resolve and hold my nose through the ick. But integrating it now I think makes the PR strictly worse. If you insist I can drop it in though. It would reduce our LOC so that is an advantage.

packages/core/src/internal/esm/cjsBindings.ts

laverdet · 2024-06-18T16:51:21Z

But since we’re only reporting types exports that don’t exist at runtime, I take your point that it can be useful to list them, to a point.

Correct, a runtime symbol which isn't typed is not a problem. This is fairly common as well, react's __SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED comes to mind. Or any project which uses stripInternal: true. I would not recommend raising an issue for this condition.

it is sometimes the case that all named exports are missing, in which case it doesn’t seem useful to list anything

This is probably by far the more common case and worth calling out specifically. Perhaps problemKindInfo could be a Record of functions which accept a Problem parameter and in turn let you enhance the reported diagnostics with information from the problem?

andrewbranch · 2024-06-18T17:00:54Z

I have some ideas for future UI that would let you drill into individual problem occurrences and see details, generated from problem object fields other than kind. The summary text is intended to be unspecific so that each problem summary appears at most one time. For the sake of ensuring your work gets merged, I think the best strategy is not to try to implement something like that now, but to expose the list of missing named exports in the --json output (which is also viewable in the web UI) as you’ve already done.

andrewbranch · 2024-08-27T15:12:32Z

I think this is pretty much ready to go, but I screwed up the conflict resolution in the GitHub UI and can’t seem to push a merge commit from the CLI. I’m going to merge this into a staging branch, fix up the lockfile, and figure out the test failures. I’ll open a new PR into main and tag you in it when it’s ready to go. Thanks so much for all your work! ❤️

laverdet added 2 commits March 25, 2024 16:32

WIP: cjs named exports

7010f60

CJS & ESM static analysis

bf2827a

laverdet added 7 commits June 17, 2024 17:19

Remove acorn parser for ESM exports

ab65eef

Don't report duplicate named exports

0cc1392

Catch ESM named export errors

dcaabc6

Skip missing JSON keys

ef329fa

Scratch document for NamedExports

caf8bff

Remove demonstrative console.log

f66ec66

Fix broken CJS reexport

e906b1a

Remove unused module type information

89c71ac

andrewbranch reviewed Jun 18, 2024

View reviewed changes

laverdet and others added 3 commits June 18, 2024 14:56

Add isMissingAllNamed

4c184a7

Remove top-level await

e051d9f

Merge branch 'main' into main

9a62a49

andrewbranch changed the base branch from main to named-exports August 27, 2024 15:08

andrewbranch merged commit c9b8035 into arethetypeswrong:named-exports Aug 27, 2024
1 check failed

andrewbranch mentioned this pull request Aug 27, 2024

Named exports #197

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compare ESM named exports to types #166

Compare ESM named exports to types #166

laverdet commented Mar 25, 2024

changeset-bot bot commented Mar 25, 2024 •

edited

Loading

andrewbranch commented Jun 13, 2024

laverdet commented Jun 17, 2024

andrewbranch left a comment

andrewbranch Jun 18, 2024

laverdet Jun 18, 2024

andrewbranch Jun 18, 2024

laverdet Jun 18, 2024

laverdet commented Jun 18, 2024

andrewbranch commented Jun 18, 2024

andrewbranch commented Aug 27, 2024

Compare ESM named exports to types #166

Compare ESM named exports to types #166

Conversation

laverdet commented Mar 25, 2024

changeset-bot bot commented Mar 25, 2024 • edited Loading

⚠️ No Changeset found

andrewbranch commented Jun 13, 2024

laverdet commented Jun 17, 2024

andrewbranch left a comment

Choose a reason for hiding this comment

andrewbranch Jun 18, 2024

Choose a reason for hiding this comment

laverdet Jun 18, 2024

Choose a reason for hiding this comment

andrewbranch Jun 18, 2024

Choose a reason for hiding this comment

laverdet Jun 18, 2024

Choose a reason for hiding this comment

laverdet commented Jun 18, 2024

andrewbranch commented Jun 18, 2024

andrewbranch commented Aug 27, 2024

changeset-bot bot commented Mar 25, 2024 •

edited

Loading