Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Compare ESM named exports to types #166

Merged
merged 13 commits into from
Aug 27, 2024

Conversation

laverdet
Copy link
Contributor

This implements the idea in #35. The patch is incomplete in some ways:

  • Some type-only exports are not correctly filtered by my check. For example chalk reports ModifierName (and others) are missing, but these are type aliases. I can't figure out the correct predicate to only raise runtime types.
  • Some packages reexport * namespaces from other packages, for example @reduxjs/toolkit (see here). If we want to check these then the module's dependencies will need to be downloaded too. This would be a major change, or we can just bail the check in this case.
  • Some packages (react, many others) list package.json in their entries. The TypeScript compiler advertises that the top-level keys of this JSON file are importable when they are not. import { name } from "react/package.json" assert { type: "json" }; will pass compilation but fail to evaluate under nodejs. I'm surprised that even under the strictest modern settings (target: esnext, verbatimModuleSyntax: true, module: nodenext, moduleResolution: nodenext) that TypeScript has this behavior. Filtering JSON entries would probably be fine.
  • The acorn part could be rewritten to use the TypeScript AST instead. This thought eluded me until I had already finished the implementation.

The result of getEsmModuleNamespace should be the same as the result of Object.keys(await import(moduleName)). This information is then compared against the advertised types and an error is raised if the types are incorrect under node16-esm resolution mode (there are many, many problems in the npm ecosystem).

The static analysis had to be done outside of TypeScript since its behavior does not match the reality of what nodejs sees. TypeScript would have you believe that the following program is well-formed but it cannot be evaluated under nodejs:

// commonjs.cjs
module.exports = {
	name: 1,
};

// module.mjs
import { name } from "./common.cjs";

Example including diagnostic output from request and lodash, which do not actually export any of the names that their types would have you believe:

-> % node dist/index.js --from-npm request                             
🚨 /node_modules/request/index.js [
  'defaults',   'defaults', 'get',
  'get',        'get',      'post',
  'post',       'post',     'put',
  'put',        'put',      'head',
  'head',       'head',     'patch',
  'patch',      'patch',    'del',
  'del',        'del',      'delete',
  'delete',     'delete',   'initParams',
  'initParams', 'forever',  'jar',
  'cookie',     'debug'
]

┌───────────────────┬──────────────────────┐
│                   │ "request"            │
├───────────────────┼──────────────────────┤
│ node10            │ 🟢                   │
├───────────────────┼──────────────────────┤
│ node16 (from CJS) │ 🟢 (CJS)             │
├───────────────────┼──────────────────────┤
│ node16 (from ESM) │ 🕵️ Named ESM exports │
├───────────────────┼──────────────────────┤
│ bundler           │ 🟢                   │
└───────────────────┴──────────────────────┘
-> % node dist/index.js --from-npm lodash 
🚨 /node_modules/lodash/lodash.js [
  'VERSION',           'templateSettings',  'chunk',
  'compact',           'concat',            'difference',
  'differenceBy',      'differenceBy',      'differenceBy',
  'differenceBy',      'differenceBy',      'differenceBy',
  'differenceBy',      'differenceWith',    'differenceWith',
  'differenceWith',    'differenceWith',    'drop',
  'dropRight',         'dropRightWhile',    'dropWhile',
  'fill',              'fill',              'fill',
  'fill',              'findIndex',         'findLastIndex',
  'first',             'flatten',           'flattenDeep',
  'flattenDepth',      'fromPairs',         'fromPairs',
  'head',              'indexOf',           'initial',
  'intersection',      'intersectionBy',    'intersectionBy',
  'intersectionBy',    'intersectionBy',    'intersectionBy',
  'intersectionWith',  'intersectionWith',  'intersectionWith',
  'intersectionWith',  'join',              'last',
  'lastIndexOf',       'nth',               'pull',
  'pull',              'pullAll',           'pullAll',
  'pullAllBy',         'pullAllBy',         'pullAllBy',
  'pullAllBy',         'pullAllWith',       'pullAllWith',
  'pullAllWith',       'pullAllWith',       'pullAt',
  'pullAt',            'remove',            'reverse',
  'slice',             'sortedIndex',       'sortedIndex',
  'sortedIndexBy',     'sortedIndexOf',     'sortedLastIndex',
  'sortedLastIndexBy', 'sortedLastIndexOf', 'sortedUniq',
  'sortedUniqBy',      'tail',              'take',
  'takeRight',         'takeRightWhile',    'takeWhile',
  'union',             'unionBy',           'unionBy',
  'unionBy',           'unionBy',           'unionBy',
  'unionWith',         'unionWith',         'unionWith',
  'uniq',              'uniqBy',            'uniqWith',
  'unzip',             'unzipWith',         'unzipWith',
  'without',           'xor',               'xorBy',
  'xorBy',
  ... 491 more items
]

lodash v4.17.21
@types/lodash v4.17.0

🕵️ docs docs


┌───────────────────┬──────────────────────┐
│                   │ "lodash"             │
├───────────────────┼──────────────────────┤
│ node10            │ 🟢                   │
├───────────────────┼──────────────────────┤
│ node16 (from CJS) │ 🟢 (CJS)             │
├───────────────────┼──────────────────────┤
│ node16 (from ESM) │ 🕵️ Named ESM exports │
├───────────────────┼──────────────────────┤
│ bundler           │ 🟢                   │
└───────────────────┴──────────────────────┘

Copy link

changeset-bot bot commented Mar 25, 2024

⚠️ No Changeset found

Latest commit: 9a62a49

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@andrewbranch
Copy link
Collaborator

Hey @laverdet, sorry for the extremely late acknowledgment here. Thanks for putting this together! Both work and life have been very busy in the last few months. Thanks for your patience. If you’re interested in continuing on this, I have some feedback. Otherwise, feel free to ignore, and I may pick up where you left off at some future date.

Overall, I’d like to present this check to the user as more of an all-or-nothing thing, like “TS thinks there are named exports but Node.js sees none, probably due to cjs-module-lexer being unable to statically analyze them” instead of “These N specific named exports seem to be missing at runtime.” That said, I’m not against storing the diff of named exports in the problem objects (or at least being set up to store them) so we can do more with it later.

Some type-only exports are not correctly filtered by my check. For example chalk reports ModifierName (and others) are missing, but these are type aliases. I can't figure out the correct predicate to only raise runtime types.

I can help with this when the rest appears to be in good shape.

Some packages reexport * namespaces from other packages, for example @reduxjs/toolkit (see here). If we want to check these then the module's dependencies will need to be downloaded too. This would be a major change, or we can just bail the check in this case.

There’s a lot already that’s incomplete because we don’t download dependencies. It’s fine. This also is less relevant if the CLI/UI isn’t trying to report specifically which named exports are missing at runtime.

Filtering JSON entries would probably be fine.

Agreed 👍

The acorn part could be rewritten to use the TypeScript AST instead. This thought eluded me until I had already finished the implementation.

Yeah, it would be nice not to depend on two different parsers, especially when the TS one is already running on everything.

@laverdet
Copy link
Contributor Author

With the updated code I believe the only known outstanding issues are:

  • Incomplete expected TypeScript runtime type detection in namedExports.ts
  • Documentation needed. Common cases of cjs-module-lexer bailouts: "default" function with bag of fake named exports (see: request), entirely unlexable (see: body-parser), partial bailout (see: semver). These all affect handwritten CommonJS modules. This rule would also catch non-bailout conditions where the handwritten TypeScript declarations are simply incorrect.

Overall, I’d like to present this check to the user as more of an all-or-nothing thing

Maybe I'm misunderstanding your statement but it's not always that simple, especially for handwritten CJS. semver is a very good case study. They have a cjs-module-lexer bailout in their "index" file. This causes all named exports after that line to be missing. The nuanced diagnostic information here is helpful.

$ git diff
diff --git a/packages/core/src/internal/checks/namedExports.ts b/packages/core/src/internal/checks/namedExports.ts
index 5b0dbc1..35b6cdf 100644
--- a/packages/core/src/internal/checks/namedExports.ts
+++ b/packages/core/src/internal/checks/namedExports.ts
@@ -50,6 +50,7 @@ export default defineCheck({
     })();
     if (exports) {
       const missing = expectedNames.filter((name) => !exports.includes(String(name))).map(String);
+      console.log(missing);
       if (missing.length > 0) {
         return {
           kind: "NamedExports",

$ node ./packages/cli/dist/index.js -p semver
[
  'compareIdentifiers',
  'rcompareIdentifiers',
  'SEMVER_SPEC_VERSION',
  'RELEASE_TYPES'
]

semver v7.6.2
@types/semver v7.5.8

🕵️ Module advertises named ESM exports which will not exist at runtime. https://github.com/arethetypeswrong/arethetypeswrong.github.io/blob/main/docs/problems/NamedExports.md


┌───────────────────┬──────────────────────┐
│                   │ "semver"             │
├───────────────────┼──────────────────────┤
│ node10            │ 🟢                   │
├───────────────────┼──────────────────────┤
│ node16 (from CJS) │ 🟢 (CJS)             │
├───────────────────┼──────────────────────┤
│ node16 (from ESM) │ 🕵️ Named ESM exports │
├───────────────────┼──────────────────────┤
│ bundler           │ 🟢                   │
└───────────────────┴──────────────────────┘

Copy link
Collaborator

@andrewbranch andrewbranch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good example with semver. The main thing I was trying to avoid was a bunch of errors of the opposite problem: runtime exports that don’t exist in the types. But since we’re only reporting types exports that don’t exist at runtime, I take your point that it can be useful to list them, to a point. I do think the CLI / web UI shouldn’t vomit a list of hundreds of exports (even where console.log(missing) truncates the lodash list looks excessive), and it is sometimes the case that all named exports are missing, in which case it doesn’t seem useful to list anything.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, we model Node.js/bundler resolution with ts.resolveModuleName ignoring declaration files, which is not fully accurate. It’s been on my mental list for a while to replace that with enhanced-resolve, which would fix a few edge cases as well as letting me make the bundler JS resolution be a little more bundler-like, e.g. resolving the top-level module field in package.json. If you had access to enhanced-resolve here, would you still need cjsResolve.ts and esmResolve.ts for any reason?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of enhanced-resolve, that's good to know. It has some inaccuracies (require('./file.js?') should not resolve since require accepts filesystem paths, not url fragments). I don't think there's any issues which would be showstoppers though. I'll take a look at integrating that module instead.

My instinct is to avoid "bundler-like" behavior because that's gotten us into this anarchy where we have a dozen different resolution algorithms. I found out a few weeks ago that bun consults paths from tsconfig.json during runtime to guide its resolution algorithm and it ruined my day.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's gotten us into this anarchy where we have a dozen different resolution algorithms

100%. But the point of modeling it in this tool would be to point out the cases where that causes concrete problems. TS itself doesn’t have any of those behaviors, and since I’m currently using TS to try to show where type resolution and bundler resolution disagree, there are false negatives. If we were to switch the bundler resolution to be more bundler-y, new problems could be exposed where we’re pretty sure the bundler is going to resolve to one thing and TS is going to resolve to something that doesn’t match. At any rate, I don’t expect that to be part of this PR, just pointing out that the dependency might be able to save on a lot of local code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a look at swapping over to enhanced-resolve. A big problem is that it doesn't report module format (esm, cjs, wasm, etc) from the resolution, it only reports the resolved path. So we would need to perform an additional step to infer the module format on our own. Webpack obviously wouldn't care about this information because they can operate in a cjs + esm superset mode but this is the exact condition we're trying to report.

Also they require a fully contravariant implementation of basically the entire node:fs surface area (see: types.d.ts). Obviously the correct way to approach these types would be what you've done in this package and define an expected subset of functionality. So it's any-town, U.S.A. trying to shim Package into enhanced-resolve.

If I hadn't already gone through the legwork of implementing the specified algorithms then I'd probably just go with enhanced-resolve and hold my nose through the ick. But integrating it now I think makes the PR strictly worse. If you insist I can drop it in though. It would reduce our LOC so that is an advantage.

packages/core/src/internal/esm/cjsBindings.ts Outdated Show resolved Hide resolved
@laverdet
Copy link
Contributor Author

But since we’re only reporting types exports that don’t exist at runtime, I take your point that it can be useful to list them, to a point.

Correct, a runtime symbol which isn't typed is not a problem. This is fairly common as well, react's __SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED comes to mind. Or any project which uses stripInternal: true. I would not recommend raising an issue for this condition.

it is sometimes the case that all named exports are missing, in which case it doesn’t seem useful to list anything

This is probably by far the more common case and worth calling out specifically. Perhaps problemKindInfo could be a Record of functions which accept a Problem parameter and in turn let you enhance the reported diagnostics with information from the problem?

@andrewbranch
Copy link
Collaborator

I have some ideas for future UI that would let you drill into individual problem occurrences and see details, generated from problem object fields other than kind. The summary text is intended to be unspecific so that each problem summary appears at most one time. For the sake of ensuring your work gets merged, I think the best strategy is not to try to implement something like that now, but to expose the list of missing named exports in the --json output (which is also viewable in the web UI) as you’ve already done.

@andrewbranch andrewbranch changed the base branch from main to named-exports August 27, 2024 15:08
@andrewbranch
Copy link
Collaborator

I think this is pretty much ready to go, but I screwed up the conflict resolution in the GitHub UI and can’t seem to push a merge commit from the CLI. I’m going to merge this into a staging branch, fix up the lockfile, and figure out the test failures. I’ll open a new PR into main and tag you in it when it’s ready to go. Thanks so much for all your work! ❤️

@andrewbranch andrewbranch merged commit c9b8035 into arethetypeswrong:named-exports Aug 27, 2024
1 check failed
@andrewbranch andrewbranch mentioned this pull request Aug 27, 2024
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants