Replaced readdirp with walk-filtered #412

kmalakoff · 2015-12-07T19:33:13Z

See #410

es128 · 2015-12-07T20:09:28Z

Forgot to add graceful-fs as a dep. Odd that the one Travis job passed.

Will try it locally when I get some time to focus on it later.

kmalakoff · 2015-12-07T20:29:42Z

I've removed graceful-fs so no longer a dependency.

es128 · 2015-12-08T22:03:45Z

@kmalakoff this is looking very promising. Check it out:

Comparing Walk walk-filtered/node_modules
Serial (fs) x 15.42 ops/sec ±1.65% (75 runs sampled)
Parallel (fs) x 32.94 ops/sec ±1.11% (79 runs sampled)
Parallel (gfs) x 32.70 ops/sec ±0.91% (79 runs sampled)
Parallel limit (fs, 10) x 32.25 ops/sec ±0.71% (78 runs sampled)
Parallel limit (fs, 50) x 33.28 ops/sec ±0.99% (80 runs sampled)
Parallel limit (fs, 100) x 33.49 ops/sec ±0.64% (80 runs sampled)
Parallel limit (gfs, 10) x 31.15 ops/sec ±1.11% (75 runs sampled)
Parallel limit (gfs, 50) x 32.88 ops/sec ±0.91% (79 runs sampled)
Parallel limit (gfs, 100) x 32.70 ops/sec ±0.96% (79 runs sampled)
readdirp x 30.64 ops/sec ±2.10% (75 runs sampled)
glob x 21.28 ops/sec ±2.49% (55 runs sampled)
Fastest is Parallel limit (fs, 100),Parallel limit (fs, 50),Parallel (fs)

Comparing Walk jquery
Serial (fs) x 1.76 ops/sec ±1.26% (13 runs sampled)
Parallel (fs) x 3.72 ops/sec ±2.01% (23 runs sampled)
Parallel (gfs) x 3.59 ops/sec ±3.14% (22 runs sampled)
Parallel limit (fs, 10) x 3.64 ops/sec ±1.21% (22 runs sampled)
Parallel limit (fs, 50) x 3.78 ops/sec ±0.97% (23 runs sampled)
Parallel limit (fs, 100) x 3.66 ops/sec ±2.22% (23 runs sampled)
Parallel limit (gfs, 10) x 3.35 ops/sec ±3.05% (21 runs sampled)
Parallel limit (gfs, 50) x 3.71 ops/sec ±1.20% (23 runs sampled)
Parallel limit (gfs, 100) x 3.62 ops/sec ±3.44% (22 runs sampled)
readdirp x 3.28 ops/sec ±1.91% (21 runs sampled)
glob x 2.25 ops/sec ±1.47% (16 runs sampled)
Fastest is Parallel limit (fs, 50),Parallel (fs)

💯 Congrats

If you add me to the repo, I can push some branches so you can play along with the experiments I'm running. I want to try with some actual filtering, look for a perf cliff on too much parallelism, use randomly named dirs to negate effects of caching at the filesystem level, etc.

After all that, I have some other ideas to try out on walk-filtered itself.

kmalakoff · 2015-12-09T03:00:20Z

@es128 awesome. I've added you to the repo.

Just let me know when you want to merge into master so I can review the changes.

kmalakoff · 2015-12-09T04:42:42Z

@es128 besides this inner loop optimization approach, I'd also make a perf suite for chokidar to get a more integrated view so you can experiment at a higher level.

Plus, I'd take a look at the less reductionist and more holistic approaches like bounding of performance through a global concurrency option like:

function eachArrays(limit) {
  this.limit = limit;
  this.inFlight = 0;
  this.parallelArrays = [];
}

// queue an array to be iterated under the shared limit
eachArrays.prototype.each = function (array, fn, callback) { this.parallelArrays.push({array, fn, callback}); };
eachArrays.prototype.end = function (callback) { /* all done */ };
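
A minimal sketch of how a global concurrency bound like the one outlined above might behave. The `createLimiter` helper and its `run`/`inFlight` members are hypothetical names for illustration and are not part of walk-filtered or chokidar; tasks are assumed to take a Node-style callback.

```javascript
// Hypothetical sketch: run callback-style tasks with at most `limit` in flight.
function createLimiter(limit) {
  let inFlight = 0;
  const queue = [];

  // Start queued tasks until the limit is reached or the queue is empty.
  function next() {
    while (inFlight < limit && queue.length > 0) {
      const { task, callback } = queue.shift();
      inFlight++;
      task((err, result) => {
        inFlight--;
        callback(err, result);
        next(); // a slot was freed, so try to start another task
      });
    }
  }

  return {
    // `task` receives a Node-style callback and must call it exactly once.
    run(task, callback) {
      queue.push({ task, callback });
      next();
    },
    inFlight: () => inFlight,
  };
}
```

With something like this shared across the whole walk, every `fs` call could be routed through `run`, so the total number of outstanding operations stays under one limit no matter how many directories are being read.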

And you might want to drop readdirp and try your ideas of late stat by doing some library refactoring to make it easier.

I just did the low-hanging fruit by conforming to the readdirp API, but there really are so many angles to improving this.

I might prioritize like:

bounding performance
high-level perf tests and minimal refactoring for optimizations, e.g. dropping the readdirp API and using walk directly
inner loop optimizations since you've confirmed that it is already pretty good
deeper refactoring (which I think may have the side effects of shaking out some of the bugs in the chokidar issues)

paulmillr · 2015-12-09T13:56:01Z

@kmalakoff could you add me there too please?

kmalakoff · 2015-12-09T15:48:51Z

@paulmillr done

kmalakoff · 2015-12-09T16:23:52Z

@es128 I was thinking one optimization could be to get rid of the event emitter in walk-filtered. Because filter gets called with each element, it is possible to process them on the spot.

Originally, I had in readdirp-walk:

    walk(realRoot, filter, callback)
     .on('file', /* emit */ )
     .on('directory', /* emit */ )

but then simplified it to the following and emitted in the filter:

    walk(realRoot, filter, callback);

es128 · 2015-12-09T16:57:27Z

I like the ideas of using walk-filtered directly and not staying married to readdirp's API as well as passing results directly to callback/handler functions since it's simple enough that we should not need EventEmitter or any other abstraction.

Agree with your point that filter can be refactored to serve both purposes. The only time the result of filtering needs to be provided back to walk-filtered is to tell it whether to traverse a dir.

paulmillr · 2015-12-09T22:48:09Z

♨️

kmalakoff · 2015-12-10T06:50:26Z

I've updated the API to get rid of event emitters and also changed the API a bit to make it fit better into this new paradigm. The signature is now:

walk(rootPath, function(path, stat) { return undefined | true | false }, [true /* includeStat */ | options,] done)

It means that if you are doing perf tests, you'll need to add a filter function now that it is mandatory.
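
To illustrate the idea of a filter that both handles each entry on the spot and decides whether a directory is traversed, here is a rough sketch. It is not walk-filtered's implementation: `walkSync` is a hypothetical stand-in, and it is synchronous only to keep the example short, whereas the signature above takes a `done` callback.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical synchronous stand-in for a filter-driven walk.
// filter(fullPath, stat) is called once for every entry. Returning false
// for a directory prevents descending into it; any other return value
// (undefined or true) lets the walk continue.
function walkSync(rootPath, filter) {
  for (const name of fs.readdirSync(rootPath)) {
    const fullPath = path.join(rootPath, name);
    const stat = fs.lstatSync(fullPath);
    const keep = filter(fullPath, stat);
    if (stat.isDirectory() && keep !== false) walkSync(fullPath, filter);
  }
}

// Example usage: process entries inside the filter and skip node_modules.
// walkSync('.', (fullPath, stat) => {
//   console.log(stat.isDirectory() ? 'dir ' : 'file', fullPath);
//   return path.basename(fullPath) !== 'node_modules';
// });
```

Because the filter sees every entry anyway, no event emitter is needed just to report results; the return value only matters for directories.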

Also, I've created issues for the parallelism bounding and for choosing a sensible default for parallelism:

kmalakoff/readdirp-walk#1
kmalakoff/readdirp-walk#3

paulmillr · 2016-03-15T20:02:35Z

I'd love to get the ball rolling on this.

@kmalakoff could you add me to the repo / NPM as a maintainer?

paulmillr · 2016-03-15T20:42:49Z

I don't see how the pull request improves the performance. Just tried the readdirp-walk version with useFsEvents: false on the TypeScript repo, still pretty terrible (130% load for >40s).

es128 · 2016-03-15T20:44:07Z

It doesn't impact watch performance for the most part, you should just be looking for time until the ready event. Only really matters with large/deep file trees.

paulmillr · 2016-03-15T20:45:36Z

> you should just be looking for time until the ready event

That's exactly what I'm looking at. CPU usage before the ready event is very high, just like it was before. Nothing changes.

es128 · 2016-03-15T20:54:55Z

Note from the benchmark results that this only slightly outperformed readdirp. But it does offer more control that could be exposed to users who cannot tolerate the CPU spike and can trade it off for extra walk time by tweaking the parallelism settings. Also, I was hoping to find opportunities to squeeze out even more performance by stripping out stat calls, although @bpasero has mentioned that eagerly calling fs.readdir on every entry and handling the errors wasn't going to work out. I haven't tried it for myself so far.

paulmillr · 2016-03-15T21:00:00Z

Reducing concurrency to 1 or 5 still shows the same %CPU usage. No changes.

es128 · 2016-03-15T21:06:25Z

How interesting. I'll see if I can take some time to explore this as well.

bpasero · 2016-03-16T06:00:58Z

FYI, in VS Code we chose fs.lstat() over fs.readdir() because I found it to be faster. However, faster also means that the CPU and memory hit will be higher, because the faster you walk the FS, the more resources are needed, especially if you do this all in parallel.

No matter how well you optimise, if chokidar needs to know the full list of files and folders upfront, you will always either pay a high CPU/memory price or it will just take a lot of time.

At some point I was hoping Node.js would pick this issue up and provide a better way of traversing the file system, very similar to how Python chose to provide an os.scandir() function that optimises exactly for this case: https://www.python.org/dev/peps/pep-0471/
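
For reference, later Node.js releases (10.10 and up) added a `withFileTypes` option to `fs.readdir` that returns `fs.Dirent` objects carrying type information, in a similar spirit to `os.scandir()`. A minimal sketch of a walk that relies on it instead of a separate `lstat` per entry; the `listTree` helper name is made up for illustration, and the sync API is used only for brevity.

```javascript
const fs = require('fs');
const path = require('path');

// Collect every path under `dir`, using the Dirent type info returned by
// readdir rather than issuing an lstat call for each entry.
function listTree(dir, out = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    out.push(fullPath);
    // Symlinks report isDirectory() === false here, so they are not followed.
    if (entry.isDirectory()) listTree(fullPath, out);
  }
  return out;
}
```

This only helps when the walk needs entry types and not sizes or timestamps; anything that needs full stat data still has to call `lstat`.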

paulmillr · 2016-03-16T14:08:04Z

@bpasero although this is an issue only when polling mode is enabled. Should we just target everyone to a "fast" mode?

bpasero · 2016-03-16T14:14:25Z

@paulmillr but my understanding is that even for native fsevents watching, the disk scan is done on startup to get the watcher ready. So polling might be worse, but for native watching you pay the price of scanning one potentially large folder right on startup. My measurements show many hundreds of MB of memory being allocated doing so for larger folders.

I have even seen an issue on Linux where I was running out of file handles (ENOSPC error) because the recursive walking installs file watchers on each folder recursively.

This brings me to another question I had: I realize VS Code is using chokidar in a way that is not ideal: We install a watcher on the root of the folder being opened and just receive all events. A better approach for us might be to just add more folders to watch as soon as we know we need to watch (e.g. some UI element becomes active and needs to watch on a file or folder). And also dispose those watchers once we know we do not need them anymore.

Now my question: Is it possible at all to watch on a folder and prevent chokidar from doing the file scan on that folder? Because if there was such a way, VS Code could use chokidar in such an efficient way that the full disk scan is not needed after all.

paulmillr · 2016-03-16T14:32:29Z

Comparing two chokidars here (with and without the PR):

TypeScript repo
fsevents are enabled
CPU hits 80% for 3 seconds and then it's 0% for both cases
Memory hits 83MB for both cases

So, I still don't see how the PR brings any difference here. It's possible that the PR would mitigate the ENOSPC though.

> Is it possible at all to watch on a folder and prevent chokidar from doing the file scan on that folder

Not possible. We are doing the minimal thing here already.

es128 · 2016-03-16T14:45:58Z

> Is it possible at all to watch on a folder and prevent chokidar from doing the file scan on that folder

We'd lose a lot of event accuracy... I guess we could toy with the idea that you pass in an object representing the file tree upfront if you already have it so chokidar doesn't have to do one itself. But at first glance that doesn't seem like a really feasible path.

bpasero · 2016-03-16T14:52:56Z

I am not saying it makes our life much easier because I actually like the idea that you have just one watcher on root receiving the events and you can do the filtering on top. The real issue is having to do the initial file scan. Can you refresh my mind again why this is needed?

paulmillr · 2016-03-16T14:54:51Z

Actually I don't think it's needed for non-polling cases.

es128 · 2016-03-16T14:57:11Z

It is for double-checking misreported events which come from all of our watch methods. But clearly some are worse than others, and we can experiment with reducing our dependency on tracking the file tree.

In the fsevents case, events often hit the directory in a not specific enough way, so comparing a new readdir to what we know about the tree is necessary to know what was added or removed.
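
The comparison described here can be sketched roughly as follows. This is an illustration of the idea only, not chokidar's actual code; `diffListing` is a hypothetical helper that takes the names already known for a directory and the names returned by a fresh readdir.

```javascript
// Compare a previously known directory listing with a fresh readdir result
// and report which entry names appeared or disappeared.
function diffListing(known, current) {
  const knownSet = new Set(known);
  const currentSet = new Set(current);
  return {
    added: current.filter((name) => !knownSet.has(name)),
    removed: known.filter((name) => !currentSet.has(name)),
  };
}

// Example usage:
// diffListing(['a.js', 'b.js'], ['b.js', 'c.js'])
// yields { added: ['c.js'], removed: ['a.js'] }
```

This is why some record of the tree has to exist before events arrive: without the `known` side there is nothing to compare a coarse directory-level event against.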

kmalakoff · 2016-03-16T15:02:49Z

@paulmillr I'm not sure if it is a good option to only compare with and without the PR until this approach has fully realized its value. This PR had two purposes:

get rid of the massive parallel scanning blast on initial scan - I needed the initial scan to play nicely in the background so my electron app could remain responsive during the scan on large folders
lay the foundations for a simpler scanning process and remove readdirp so that optimizations and/or finer control could be implemented - readdirp, for example, stores all results, always lstats, has crazy control flow that makes changes difficult, etc., and @es128 wanted to try a less optimistic stat approach

What I recommend is that you guys take this over and optimize, since I mainly focussed on my use case of providing an option to serialize the initial scan (which you guys said was uncommon), but provided a nice, clean, compatible implementation to start from. Note: this initial submission uses walk-filtered through a readdirp emulation layer that passes the readdirp tests, meaning further optimizations are possible by removing readdirp features that are not needed by chokidar. This is why I would recommend using walk-filtered directly (without readdirp emulation) as a starting place.

@bpasero have you looked at webpack's use of chokidar? They control the registering of directory scanning on a directory-by-directory basis with a depth of 0. Also, depending on your use case around displayed UI, you could maybe use pathwatcher from the Atom editor, which has a good example in https://github.com/atom/tree-view.

bpasero · 2016-03-16T15:05:17Z

@kmalakoff depth: 0 sounds like exactly what I could use to avoid recursion. @paulmillr correct?

es128 · 2016-03-16T15:06:31Z

@bpasero yes. It'll still do a readdir for the top level, but not recursively.

bpasero · 2016-03-16T15:39:26Z

This might be an option for VS Code to explore then 👍

Replaced readdirp with walk-filtered

cbad0c8

Remove graceful-fs

ea3e4d9

kmalakoff added 2 commits December 7, 2015 15:40

Moved readdirp into separate module readdirp-walk

e378452

Added a concurrency option for controlling walk parallelism

4ccb313

es128 mentioned this pull request Jan 6, 2016

Windows: Cannot delete watched folder with not-empty subfolder #422

Closed

es128 mentioned this pull request Mar 9, 2016

Mac: High CPU usage for large folders #447

Closed

es128 mentioned this pull request Apr 17, 2017

High CPU usage on OSX with no polling #597

Closed

This was referenced Sep 1, 2017

Replaced watch task with less CPU consuming native alternative (fixes #51) SqrTT/prophet#47

Merged

Reduce CPU consumption of the file watching SqrTT/prophet#51

Closed

es128 mentioned this pull request Jan 9, 2018

Network drives on win32: 24x slower #665

Closed

paulmillr closed this Mar 22, 2019
