Please support loading node_modules from an archive #27501

guilt · 2019-04-30T20:34:39Z

node_modules is the folder where Node typically picks up libraries for a runtime/application. Often, the number of files distributed with a stock Node.js distribution is very large, and they are very slow to load with random I/O on some flash/external storage. Assuming a decent amount of RAM, it should be possible to partially extract an archive to RAM and load the relevant JS files on demand. I would like this support for the stock distribution as well.

Is your feature request related to a problem? Please describe.
I have trouble loading some Node projects from a slow SD card on a Raspberry Pi. Buying a faster disk with a better queue is not an option: I can buy one for myself, but not for the many users out there.

Describe the solution you'd like
Please allow node to alternatively use a node_modules.zip, as a start.

Describe alternatives you've considered

- Delete many .js files in node_modules/lib
- Write a bin2cpp based v8 eval (increases RAM usage if many modules are unused)
- Accept fate


devsnek · 2019-04-30T20:39:03Z

maybe duplicate of #1278

devsnek · 2019-04-30T22:51:22Z

@guilt what kind of performance hit are we talking about here? can you get around this problem by reusing running node instances? node is sorta designed to be long-lived.

bnoordhuis · 2019-05-01T03:22:14Z

Is "slow random I/O" seek time? What latencies are we talking?

If seek time is your problem then reading from an archive helps only a little because you'll be jumping back and forth just as much. Compression/locality means you'll read fewer blocks in total but since those blocks need not be contiguous on disk, seek times will still be the limiting factor.

Sequentially reading and decompressing the archive to memory alleviates that problem but then you're forced to keep a lot of files around on the off chance that the application needs them.

If you sample a few node_modules directories, I think you'll find that > 50% of files will never be imported (think tests, fixtures, docs, etc.), making aggressive caching a poor trade-off.
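One rough way to sample a tree is sketched below. It only counts files that sit under directories with test- or docs-like names, which is a heuristic and not a measurement of what actually gets imported. The function name `sampleTree` and the directory-name list are assumptions made for this illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Directory names treated as "probably never imported at runtime".
// This list is an assumption for illustration, not a Node.js convention.
const NON_RUNTIME_DIRS = new Set([
  'test', 'tests', '__tests__', 'fixtures', 'docs', 'doc', 'example', 'examples',
]);

// Recursively count regular files and how many of them live under a
// directory whose name is in NON_RUNTIME_DIRS. Symlinks are skipped.
function sampleTree(root) {
  const counts = { total: 0, nonRuntime: 0 };
  const walk = (dir, flagged) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(full, flagged || NON_RUNTIME_DIRS.has(entry.name));
      } else if (entry.isFile()) {
        counts.total += 1;
        if (flagged) counts.nonRuntime += 1;
      }
    }
  };
  walk(root, false);
  return counts;
}
```

Calling `sampleTree('node_modules')` in a project directory returns the two counts, from which a percentage can be derived.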

You could evict such files from the cache LRU-style but then you're halfway to reinventing the kernel's disk cache.

Speaking of which, there are several FUSE zipfs file systems. Maybe you can experiment with those and check what performance improvements (if any) you get?
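For reference, the LRU-style eviction mentioned above could look roughly like the following. This is a minimal sketch, assuming file contents are held as Buffers and the budget is expressed in bytes; the class name `LruFileCache` is hypothetical and nothing like it exists in Node.js core.

```javascript
// Minimal LRU cache for file contents, bounded by total bytes.
// Relies on Map preserving insertion order: the first key is the
// least recently used one.
class LruFileCache {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.bytes = 0;
    this.entries = new Map(); // file path -> Buffer
  }

  get(filePath) {
    const buf = this.entries.get(filePath);
    if (buf === undefined) return undefined;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(filePath);
    this.entries.set(filePath, buf);
    return buf;
  }

  set(filePath, buf) {
    const old = this.entries.get(filePath);
    if (old !== undefined) {
      this.bytes -= old.length;
      this.entries.delete(filePath);
    }
    this.entries.set(filePath, buf);
    this.bytes += buf.length;
    // Evict from the oldest end until the cache fits its budget again.
    // An entry larger than the whole budget ends up evicting itself.
    for (const [key, value] of this.entries) {
      if (this.bytes <= this.maxBytes) break;
      this.entries.delete(key);
      this.bytes -= value.length;
    }
  }
}
```

Even this small version has to track sizes, recency, and eviction, which is the bookkeeping the kernel's page cache already does.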

guilt · 2019-05-01T04:28:08Z

@devsnek agree. It affects scripts (toolchains) more than services. But if node was not meant for writing scripts, running NPM on cold boot may not be optimal, by that logic. I did not want to separate these workloads logically when filing this.

@bnoordhuis yes, it is equivalent to seek time, except it has to do with the flash controllers as opposed to moving heads on an HDD.

There may be a combination of problems here:

- node_modules may store files that do not matter to the end user; this is a distribution issue.
- There are definitely random read / I/O controller related problems.

FUSE introduces its own syscall overhead in addition to the archive seek overhead. I'm open to other suggestions for benchmarking too.

bnoordhuis · 2019-05-01T11:00:06Z

squashfs? It's a mainline kernel module so there's a pretty good chance your kernel comes with pre-built support for it; and if not, building it from source is quite easy. It supports gzip, lzo, xz and, with recent kernels, zstd.

addaleax · 2019-05-01T22:06:27Z

Fwiw, I think Electron already supports something along these lines? Maybe they have some input here? @codebytere @ryzokuken

Fishrock123 · 2019-05-10T22:17:33Z

See also #11903, which was "backlogged". CC also @bmeck

Antonius-S · 2019-05-21T07:58:37Z

Consider webpack; I tried it with some big projects and it seems to work.

GrosSacASac · 2019-06-13T15:36:44Z

I had a project some time ago that used a lot of npm scripts, each starting node and doing something. It also had a big npm script that ran all of them one by one. I refactored it so that independent tasks could run in parallel, and I also made sure that node itself was only started once. This combined effort reduced the time of the full build by 80%.
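A minimal sketch of that kind of refactoring, assuming the tasks are independent and mostly I/O-bound. The task names and bodies are hypothetical placeholders for whatever the separate npm scripts used to do.

```javascript
// Hypothetical tasks standing in for what separate npm scripts used to do.
const tasks = {
  lint: async () => 'lint ok',
  compile: async () => 'compile ok',
  copyAssets: async () => 'assets ok',
};

// Run every task concurrently inside the current Node process,
// so node and its modules are only loaded once.
async function runAll(taskMap) {
  const names = Object.keys(taskMap);
  const results = await Promise.all(names.map((name) => taskMap[name]()));
  return Object.fromEntries(names.map((name, i) => [name, results[i]]));
}
```

`runAll(tasks)` resolves to an object keyed by task name. CPU-bound tasks would not overlap this way on a single thread and would need something like worker threads instead.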

GrosSacASac · 2019-06-13T15:40:44Z

It could also help to use a tool to bundle your files, so each task would only open one big self-contained file.

arcanis · 2019-08-15T10:17:18Z

> Fwiw, I think Electron already supports something along these lines? Maybe they have some input here? @codebytere @ryzokuken

Electron uses ASAR archives, which are supported through a partial fs implementation. Yarn 2 does the same thing, but we use regular Zip archives and we cover a wider area of the fs interface.

Fwiw, some of the problems I've identified in those approaches, the way they're currently implemented:

- It works well in a context where all modules operate in the same domain, but I'm not sure it's still the case if they all live in separate ones.
- It's very hard to test the fs module. Splitting the FS testsuite into its own package that both Node and third-party FS could use would be very interesting.

On the other hand, some things work really well:

- We don't only add support for Zip archives, we also have a virtual FS that we use to avoid having to create thousands of symlinks just to disambiguate peer dependencies. A simple support for Zip archives wouldn't solve that, so we'd still need to use our FS layer for that (note: we might be able to eventually get rid of this depending how the new module API turns out).
- Some packages tend to use the fs methods on themselves, and adding this Zip support at the fs layer gives a better compatibility with those than if it was a dedicated "Resource API".

Overall, I think allowing modules to officially extend the regular fs interface would be extremely valuable - but the API should be standardized to prevent potential incompatibilities as much as possible.
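To make the idea of archive support at the fs layer concrete, here is a minimal sketch that serves some paths from memory and defers everything else to the real `fs`. It is not how Electron or Yarn implement it: the in-memory `archive` map stands in for a real Zip or ASAR index, the virtual path is made up, and only `readFileSync` is covered.

```javascript
const fs = require('fs');
const path = require('path');

// Stand-in for an opened archive: absolute virtual path -> file contents.
// A real implementation would look entries up in an archive index instead.
const archive = new Map([
  [
    path.resolve('/virtual/node_modules/left/index.js'),
    Buffer.from('module.exports = "left";'),
  ],
]);

const realReadFileSync = fs.readFileSync;

// Serve archived paths from memory; fall back to the real fs otherwise.
function patchedReadFileSync(file, options) {
  const key = typeof file === 'string' ? path.resolve(file) : null;
  if (key !== null && archive.has(key)) {
    const data = archive.get(key);
    const encoding =
      typeof options === 'string' ? options : options && options.encoding;
    // Return a copy so callers cannot mutate the cached contents.
    return encoding ? data.toString(encoding) : Buffer.from(data);
  }
  return realReadFileSync.call(fs, file, options);
}

fs.readFileSync = patchedReadFileSync;
```

A usable layer would also have to cover `stat`, `readdir`, `existsSync`, the async and promise variants, and more, which is why a shared fs test suite and a standardized extension API would matter.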

devsnek · 2019-08-15T12:22:40Z

@bmeck

bmeck · 2019-08-15T13:46:49Z

@arcanis I think virtualizing fs is likely a separate issue to be discussed.

> Some packages tend to use the fs methods on themselves, and adding this Zip support at the fs layer gives a better compatibility with those than if it was a dedicated "Resource API".

I have concerns but think loading from different virtual systems should be supported. Not all systems are compatible with all features of fs, and sometimes things like case sensitivity can be incorrectly implemented or detected at runtime. I do not think loading https using fs.readFileSync, for example, is desirable, and I tend to find similar things incompatible. Archive-based loading is already proposed for the web using HTTP signed exchanges. Even though it may be tempting to support only fs-compatible systems of loading, that is not within the scope of this issue, and providing a custom fs utility is unrelated to loading archives and the formats of those archives. A higher-order runtime (such as electron/yarn/tink/meteor) can provide custom shims for backwards compatibility if it desires, but a resource loading API would be preferred for the general use case and as such is its own issue.

github-actions · 2022-02-25T15:59:42Z

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

github-actions · 2022-03-29T13:24:48Z

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

ChALkeR added feature request Issues that request new features to be added to Node.js. module Issues and PRs related to the module subsystem. labels Apr 30, 2019

bnoordhuis mentioned this issue Aug 15, 2019

worker_threads and NODE_OPTIONS #29117

Closed

github-actions bot added the stale label Feb 25, 2022

targos added this to Node.js feature requests Feb 25, 2022

targos moved this to Pending Triage in Node.js feature requests Feb 25, 2022

targos moved this from Pending Triage to Stale in Node.js feature requests Feb 25, 2022

github-actions bot closed this as completed Mar 29, 2022

avivkeller removed this from Node.js feature requests Jun 26, 2024
