Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tree-shaking d.ts files #47137

Closed
5 tasks done
steveetm opened this issue Dec 13, 2021 · 13 comments
Closed
5 tasks done

Tree-shaking d.ts files #47137

steveetm opened this issue Dec 13, 2021 · 13 comments
Labels
Declined The issue was declined as something which matches the TypeScript vision Suggestion An idea for TypeScript

Comments

@steveetm
Copy link

Tree-shaking with d.ts files

Importing part of modules using the root index.d.ts file.

πŸ” Search Terms

d.ts tree-shaking type import destructuring

βœ… Viability Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.

⭐ Suggestion

Parse only the required d.ts file.

πŸ’» Use Cases

We have several modules which uses larger modules with extensive typings, e.g aws-sdk.
When we import only S3

import { S3 } from 'aws-sdk';

aws-sdk/index.d.ts got parsed. These bigger modules can add up, increasing required resources for compilation with increased compile time.

If I import S3 directly, then the matching d.ts file parsed only:

import S3 from 'aws-sdk/clients/s3';

I did some experiment changing few of these imports and examined a few tsc traces, parse part went down for 10 seconds to 4, obviously the parsed declaration number went down accordingly.

Tried to look for various config combinations, but found nothing about this.

@RyanCavanaugh
Copy link
Member

This isn't possible to do in a way that preserves correctness. Any source file might have a declare block which augments another module's definitions, and an import can appear anywhere in a file.

@RyanCavanaugh RyanCavanaugh added Declined The issue was declined as something which matches the TypeScript vision Suggestion An idea for TypeScript labels Dec 13, 2021
@steveetm
Copy link
Author

@RyanCavanaugh can you help me to understand this, as I am not quite sure why can't do this in a correct way.
If somehow only the required declarations got imported, then later the compiler run into a declaration block which tries to augment a non-existing namespace, then what happens? imo it should create that namespace and go on, as I don't use it, I don't care. Also don't see why it is important that import can appear anywhere in a file.

@RyanCavanaugh
Copy link
Member

If somehow only the required declarations got imported

How does it know which declarations are required?

@steveetm
Copy link
Author

steveetm commented Dec 13, 2021

This is how index.d.ts looks like:

import {GlobalConfigInstance} from './lib/config';

export * from './lib/core';
export * from './clients/all';
export var config: GlobalConfigInstance

export as namespace AWS;

when I say import { S3 } from 'aws-sdk'; the compiler must know, that I need no other exported member. Obviously it must parse to the point where it runs into another export import, but when it runs into client/all.d.ts then it will see exports like this:

export import Route53Domains = require('./route53domains');
export import S3 = require('./s3');
export import S3Control = require('./s3control');

as we still know we only interested in S3 import, it is safe to totally ignore everything else.
I am perfectly aware that I am looking at this as totally naive, but I think it should work and definitely doable. If it is really not, then I am interested in technical limitations. I am pretty sure, even if it is doable it is not necessary trivial.

If the problem is, that S3 namespace augmentation could happen export import Route53Domains = require('./route53domains'); then it may true, but these are only generated d.ts file, it can be solved at genaration time. Also I think if that could happen it kinda anti-pattern somewhere.

@steveetm
Copy link
Author

Also thinking about to introduce some kind of 'pure' type exports where tsc can mark namespaces where it is safe to assume that no other d.ts in that module will modify anything else.

@RyanCavanaugh
Copy link
Member

Consider this example:

// a.ts
import { foo } from "bar";

// bar's entry point, wherever that is
export import foo from "./stuff"
(hundreds of thousands of more lines)
export import baz from "./types";

// ./stuff
export const a: SOME_TYPE;

// ./types
declare global {
  interface SOME_TYPE { }
}
export baz: string;

@steveetm
Copy link
Author

I think you miss the point. When editing bar compiler still have to do a lot of stuff, thats ok. But I am speaking about optimising the consumed index.d.ts from bar. In your example declare global should simply disallow any kind of dead type elimination. But when we generate d.ts files for module baz, tsc can do some extra work to check if it safe to import types like that, (it can even expand start for easier lookup?)
So imagine there is no 'declare global'
baz/index.d.ts:

/// <treeshakeable />
export { a } from "./stuff";
export { baz } from "./types";

Consuming that d.ts tsc can see it is fine to do like that, and only follow import/exports for the required key.

When creating baz/*.d.ts and compiler run into something which would break that, simply skip that directive.

@RyanCavanaugh
Copy link
Member

You said

Parse only the required d.ts file

Did you mean emit only the "required" .d.ts file?

@steveetm
Copy link
Author

Absolutely no. Or depends on the implementation of this feature, as if it requires slightly modified emitted d.ts, then yes, the corresponding @types/module has to be recompiled.

Then my module can see that it is safe or not to import only the required type from baz, because when we initially emitted baz (which is a separate, third party module), we examined, and decided it is totally safe to use it that way.

Here is some comparison, initial build times:

S3 from 'aws-sdk/clients/s3' { S3 } from 'aws-sdk'
Files: 2486 2816
Lines of Library: 7918 7918
Lines of Definitions: 201784 806500
Lines of TypeScript: 116557 116557
Lines of JavaScript: 0 0
Lines of JSON: 170 170
Lines of Other: 0 0
Nodes of Library: 21726 21726
Nodes of Definitions: 614544 2349462
Nodes of TypeScript: 455155 455162
Nodes of JavaScript: 0 0
Nodes of JSON: 496 496
Nodes of Other: 0 0
Identifiers: 400199 1036550
Symbols: 653745 1022188
Types: 221019 221019
Instantiations: 5217282 5217282
Memory used: 678216K 1175930K
Assignability cache size: 112104 112109
Identity cache size: 14695 14695
Subtype cache size: 8490 8487
Strict subtype cache size: 190061 190061
I/O Read time: 0.24s 0.48s
Parse time: 0.62s 2.31s
ResolveModule time: 0.20s 0.27s
ResolveTypeReference time: 0.00s 0.00s
Program time: 1.20s 3.28s
Bind time: 0.28s 1.37s
Check time: 18.12s 16.34s
I/O Write time: 0.00s 0.00s
printTime time: 0.04s 0.05s
Emit time: 0.04s 0.05s
Total time: 19.64s 21.03s

This is not much, but the memory usage already a lot higher, but not compare the incremental build performance during watch:

S3 from 'aws-sdk/clients/s3' { S3 } from 'aws-sdk'
Files: 2486 2816
Lines of Library: 7918 7918
Lines of Definitions: 201784 806500
Lines of TypeScript: 116557 116557
Lines of JavaScript: 0 0
Lines of JSON: 170 170
Lines of Other: 0 0
Nodes of Library: 21726 21726
Nodes of Definitions: 614544 2349462
Nodes of TypeScript: 455155 455162
Nodes of JavaScript: 0 0
Nodes of JSON: 496 496
Nodes of Other: 0 0
Identifiers: 400199 1036550
Symbols: 218894 587286
Types: 78 78
Instantiations: 0 0
Memory used: 415063K 2687265K
Assignability cache size: 0 0
Identity cache size: 0 0
Subtype cache size: 0 0
Strict subtype cache size: 0 0
I/O Read time: 0.06s 0.45s
Parse time: 0.77s 2.39s
ResolveModule time: 0.22s 0.23s
ResolveTypeReference time: 0.00s 0.00s
Program time: 1.20s 3.29s
Bind time: 0.27s 1.09s
Total time: 1.47s 4.37s

This is now insane. Check the memory usage, the nodes count, and the build time. Nonsense.
You can say, lets simply force myself to use 'aws-sdk/clients/module'. Thats ok, as long as I don't run into another @types/module which imports a part of aws-sdk in its d.ts file, but using only a part of it. My build time is screwed again. And this is only one big module where tons of unused types got parsed during rebuilds

@RyanCavanaugh
Copy link
Member

It seems like the only place where this could be done (i.e. places where there are "islands" of no out-reaching dependencies) is where the modules already have the structure in place such that you could deep-import e.g. aws-sdk/clients/s3. aws-sdk is by far the largest definition file setup we're aware of so I think measuring in terms of the absolute worst-case scenario overstates the value that could be had here.

In the general case, this is going to mean that features like auto-import fail to offer up symbols that people expect to find. It's also very unclear how libraries would figure out that they are structured in a way (no globals, no augmentations, no top-level module decls, etc) that would make this a safe optimization.

You can actually achieve this behavior today by specifying --noResolve and manually providing a file list.

@steveetm
Copy link
Author

steveetm commented Dec 15, 2021

Yes, aws-sdk is the largest, but there are several not so large modules and these penalties add up.
I don't really understand your argument about auto-import feature, that should be done once, on initial build, after that the IDE should have all the information of available symbols. It is pretty unlikely that node_modules change during incremental builds, but if they do so, lets check all available symbols once again.

I absolutely don't understand this difference with incremental builds, as aws-sdk should be cached already, tsc must see it is not modified, why bother checking/parsing or doing anything with it ever again? It could even throw unused declarations after first run tho.

About figuring out if libraries structured in a good way I thought there is already an affectsGlobalScope flag in tsbuildinfo, if flag is exists then it is a bailout from optimisation. Or add another flag to see if it is a side-effect free ES module. Yea, probably still unclear, but honestly I don't see how one module with 99% of its code being used add 3 seconds of incremental build time.

And saying it is related to aws-sdk only isn't quite right, lets check material-ui
import AlarmAdd from "@material-ui/icons/AlarmAdd";
vs.
import {AlarmAdd} from "@material-ui/icons";

Nodes of Definitions: 7
vs.
Nodes of Definitions: 44450

This can not be normal. As I said, these are add up.

@craigphicks
Copy link

@steveetm
Just an observation - with libraries moving towards ESM format and that AWS library being CommonJS, it wouldn't make sense to develop tree shaking methods for CommonJS libraries.
Moreover, with an ESM lib, you could make a proxy intermediate lib using rollup - although I'm not sure that would be less work than manually tree "picking" (as you demonstrated) which is the only current way to select from CommonJS libs.

@typescript-bot
Copy link
Collaborator

This issue has been marked as "Declined" and has seen no recent activity. It has been automatically closed for house-keeping purposes.

@typescript-bot typescript-bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Declined The issue was declined as something which matches the TypeScript vision Suggestion An idea for TypeScript
Projects
None yet
Development

No branches or pull requests

4 participants