Replaced readdirp with walk-filtered #412

Closed
wants to merge 4 commits into from

Conversation

kmalakoff
Contributor

See #410

@es128
Contributor

es128 commented Dec 7, 2015

Forgot to add graceful-fs as a dep. Odd that the one Travis job passed.

Will try it locally when I get some time to focus on it later.

@kmalakoff
Contributor Author

I've removed graceful-fs, so it is no longer a dependency.

@es128
Contributor

es128 commented Dec 8, 2015

@kmalakoff this is looking very promising. Check it out:

Comparing Walk walk-filtered/node_modules
Serial (fs) x 15.42 ops/sec ±1.65% (75 runs sampled)
Parallel (fs) x 32.94 ops/sec ±1.11% (79 runs sampled)
Parallel (gfs) x 32.70 ops/sec ±0.91% (79 runs sampled)
Parallel limit (fs, 10) x 32.25 ops/sec ±0.71% (78 runs sampled)
Parallel limit (fs, 50) x 33.28 ops/sec ±0.99% (80 runs sampled)
Parallel limit (fs, 100) x 33.49 ops/sec ±0.64% (80 runs sampled)
Parallel limit (gfs, 10) x 31.15 ops/sec ±1.11% (75 runs sampled)
Parallel limit (gfs, 50) x 32.88 ops/sec ±0.91% (79 runs sampled)
Parallel limit (gfs, 100) x 32.70 ops/sec ±0.96% (79 runs sampled)
readdirp x 30.64 ops/sec ±2.10% (75 runs sampled)
glob x 21.28 ops/sec ±2.49% (55 runs sampled)
Fastest is Parallel limit (fs, 100),Parallel limit (fs, 50),Parallel (fs)
Comparing Walk jquery
Serial (fs) x 1.76 ops/sec ±1.26% (13 runs sampled)
Parallel (fs) x 3.72 ops/sec ±2.01% (23 runs sampled)
Parallel (gfs) x 3.59 ops/sec ±3.14% (22 runs sampled)
Parallel limit (fs, 10) x 3.64 ops/sec ±1.21% (22 runs sampled)
Parallel limit (fs, 50) x 3.78 ops/sec ±0.97% (23 runs sampled)
Parallel limit (fs, 100) x 3.66 ops/sec ±2.22% (23 runs sampled)
Parallel limit (gfs, 10) x 3.35 ops/sec ±3.05% (21 runs sampled)
Parallel limit (gfs, 50) x 3.71 ops/sec ±1.20% (23 runs sampled)
Parallel limit (gfs, 100) x 3.62 ops/sec ±3.44% (22 runs sampled)
readdirp x 3.28 ops/sec ±1.91% (21 runs sampled)
glob x 2.25 ops/sec ±1.47% (16 runs sampled)
Fastest is Parallel limit (fs, 50),Parallel (fs)

💯 Congrats

If you add me to the repo, I can push some branches so you can play along with the experiments I'm running. I want to try with some actual filtering, look for a perf cliff on too much parallelism, use randomly named dirs to negate effects of caching at the filesystem level, etc.

After all that, I have some other ideas to try out on walk-filtered itself.

@kmalakoff
Contributor Author

@es128 awesome. I've added you to the repo.

Just let me know when you want to merge into master so I can review the changes.

@kmalakoff
Contributor Author

@es128 besides this inner loop optimization approach, I'd also make a perf suite for chokidar to get a more integrated view so you can experiment at a higher level.

Plus, I'd take a look at less reductionist, more holistic approaches, such as bounding performance through a global concurrency option, for example:

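function eachArrays(limit) {
  this.limit = limit;          // global cap on concurrent operations
  this.inFlight = 0;           // operations currently running
  this.parallelArrays = [];    // queued {array, fn, callback} entries
}

// Queue an array to be processed under the shared limit.
eachArrays.prototype.each = function (array, fn, callback) {
  this.parallelArrays.push({array, fn, callback});
};
// Register a callback to run once all queued arrays have finished.
eachArrays.prototype.end = function (callback) { /* all done */ };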
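
To make the outline above concrete, here is a minimal runnable sketch of one way a shared limiter could work. The name `BoundedEach`, the `_drain` helper and the `maxSeen` counter are hypothetical and exist only for illustration; error handling is left out, and this is not code from walk-filtered or chokidar.

```javascript
// Hypothetical sketch: a shared limiter that bounds how many async tasks
// run at once across every array queued on it.
function BoundedEach(limit) {
  this.limit = limit;   // maximum tasks in flight at any time
  this.inFlight = 0;    // tasks currently running
  this.queue = [];      // pending tasks, each a function taking a done callback
  this.maxSeen = 0;     // highest concurrency observed (demonstration only)
  this.onIdle = null;   // set by end(); runs once when all work has finished
}

// Queue fn(item, cb) for every item; callback fires when this array is done.
BoundedEach.prototype.each = function (array, fn, callback) {
  var self = this;
  var remaining = array.length;
  if (remaining === 0) { callback(); return; }
  array.forEach(function (item) {
    self.queue.push(function (done) {
      fn(item, function () {
        if (--remaining === 0) callback();
        done();
      });
    });
  });
  this._drain();
};

// Register a callback for when the queue is empty and nothing is running.
BoundedEach.prototype.end = function (callback) {
  this.onIdle = callback;
  this._drain();
};

// Start queued tasks until the limit is reached; report idleness if done.
BoundedEach.prototype._drain = function () {
  var self = this;
  while (this.inFlight < this.limit && this.queue.length > 0) {
    var task = this.queue.shift();
    this.inFlight++;
    this.maxSeen = Math.max(this.maxSeen, this.inFlight);
    task(function () {
      self.inFlight--;
      self._drain();
    });
  }
  if (this.inFlight === 0 && this.queue.length === 0 && this.onIdle) {
    var idle = this.onIdle;
    this.onIdle = null;
    idle();
  }
};
```

With something like this, every directory's entries could be pushed through one limiter instance, so the cap would apply to the whole walk rather than to each directory separately.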

And you might want to drop readdirp and try your ideas of late stat by doing some library refactoring to make it easier.

I just did the low-hanging fruit by conforming to the readdirp API, but there really are so many angles to improving this.

I might prioritize like:

  1. bounding performance
  2. high-level perf tests and minimal refactoring for optimizations, e.g. dropping the readdirp API and using walk directly
  3. inner loop optimizations since you've confirmed that it is already pretty good
  4. deeper refactoring (which I think may have the side effects of shaking out some of the bugs in the chokidar issues)

@paulmillr
Owner

@kmalakoff could you add me there too please?

@kmalakoff
Contributor Author

@paulmillr done

@kmalakoff
Contributor Author

@es128 I was thinking one optimization could be to get rid of the event emitter in walk-filtered. Because filter gets called with each element, it is possible to process them on the spot.

Originally, I had in readdirp-walk:

    walk(realRoot, filter, callback)
      .on('file', /* emit */ )
      .on('directory', /* emit */ );

but then simplified it to the following and emitted in the filter:

    walk(realRoot, filter, callback);

@es128
Contributor

es128 commented Dec 9, 2015

I like the ideas of using walk-filtered directly and not staying married to readdirp's API as well as passing results directly to callback/handler functions since it's simple enough that we should not need EventEmitter or any other abstraction.

Agree with your point that filter can be refactored to serve both purposes. The only time the result of filtering needs to be provided back to walk-filtered is to tell it whether to traverse a dir.

@paulmillr
Owner

♨️

@kmalakoff
Contributor Author

I've updated the API to get rid of event emitters and also changed the API a bit to make it fit better into this new paradigm. The signature is now:

walk(rootPath, function(path, stat) { return undefined | true | false }, [true /* includeStat */ | options,] done)

This means that if you are running perf tests, you'll need to add a filter function, since it is now mandatory.
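
For readers who want to see the shape of a filter-driven walk without event emitters, here is a rough sketch. `walkSketch` is a hypothetical stand-in written for this illustration, not walk-filtered's implementation. It assumes the filter receives a path relative to the root plus an lstat result, and that only an explicit `false` return prevents descending into a directory.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical stand-in (not walk-filtered itself): call filter(relPath, stat)
// for every entry under rootPath and skip a directory's contents only when
// filter returns false. done(err) fires once, after the whole walk.
function walkSketch(rootPath, filter, done) {
  var pending = 1;      // outstanding async operations (starts with the root)
  var failed = false;   // ensures done is called at most once on error

  function finish(err) {
    if (failed) return;
    if (err) { failed = true; return done(err); }
    if (--pending === 0) done(null);
  }

  function visitDir(dir) {
    fs.readdir(dir, function (err, names) {
      if (err) return finish(err);
      pending += names.length;          // one lstat per entry
      names.forEach(function (name) {
        var full = path.join(dir, name);
        fs.lstat(full, function (err, stat) {
          if (err) return finish(err);
          // Results are handled on the spot inside the filter.
          var keep = filter(path.relative(rootPath, full), stat);
          if (stat.isDirectory() && keep !== false) {
            pending++;                  // account for the nested readdir
            visitDir(full);
          }
          finish();
        });
      });
      finish();                         // this directory's readdir is done
    });
  }

  visitDir(rootPath);
}
```

A perf test against a walker of this shape could pass a filter that simply returns `true`, so every entry is visited and every directory is traversed.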

Also, I've created issues for the parallelism bounding and for choosing a sensible default for parallelism:

kmalakoff/readdirp-walk#1
kmalakoff/readdirp-walk#3

@paulmillr
Owner

I'd love to get the ball rolling on this.

@kmalakoff could you add me to the repo / NPM as a maintainer?

@paulmillr
Owner

I don't see how the pull request improves performance. I just tried the readdirp-walk version with useFsEvents: false on the TypeScript repo, and it is still pretty terrible (130% load for >40s).

@es128
Contributor

es128 commented Mar 15, 2016

It doesn't impact watch performance for the most part, you should just be looking for time until the ready event. Only really matters with large/deep file trees.

@paulmillr
Owner

> you should just be looking for time until the ready event

That's exactly what I'm looking at. CPU usage before the ready event is very high, just like it was before. Nothing changes.

@es128
Contributor

es128 commented Mar 15, 2016

Note from the benchmark results that this only slightly outperformed readdirp. But it does offer more control that could be exposed to users who cannot tolerate the CPU spike and can trade it off for extra walk time by tweaking the parallelism settings. Also, I was hoping to find opportunities to squeeze out even more performance by stripping out stat calls, although @bpasero has mentioned that eagerly calling fs.readdir on every entry and handling the errors wasn't going to work out. I haven't tried it for myself so far.

@paulmillr
Owner

Reducing concurrency to 1 or 5 still shows the same %CPU usage. No changes.

@es128
Contributor

es128 commented Mar 15, 2016

How interesting. I'll see if I can take some time to explore this as well.

@bpasero

bpasero commented Mar 16, 2016

FYI, in VS Code we chose fs.lstat() over fs.readdir() because I found it to be faster. However, faster also means that the CPU and memory hit will get higher, because the faster you walk the FS, the more resources are needed, especially if you do this all in parallel.

No matter how well you optimise, if chokidar needs to know the full list of files and folders upfront, you will always either pay a high CPU/memory price or wait a long time.

At some point I was hoping node.js would pick this issue up and provide a better way of traversing the file system. Very similar to how python chose to provide a os.scandir() function that optimises exactly for this case: https://www.python.org/dev/peps/pep-0471/

@paulmillr
Owner

@bpasero although this is an issue only when polling mode is enabled. Should we just steer everyone toward a "fast" mode?

@bpasero

bpasero commented Mar 16, 2016

@paulmillr but my understanding is that even for native fsevents watching, the disk scan is done on startup to get the watcher ready. So polling might be worse, but for native watching you still pay the price of scanning one potentially large folder right on startup. My measurements show many hundreds of MB of memory being allocated when doing so for larger folders.

I have even seen an issue on Linux where I was running out of file handles (ENOSPC error) because the recursive walking installs file watchers on each folder recursively.

This brings me to another question I had: I realize VS Code is using chokidar in a way that is not ideal: We install a watcher on the root of the folder being opened and just receive all events. A better approach for us might be to just add more folders to watch as soon as we know we need to watch (e.g. some UI element becomes active and needs to watch on a file or folder). And also dispose those watchers once we know we do not need them anymore.

Now my question: Is it possible at all to watch on a folder and prevent chokidar from doing the file scan on that folder? Because if there was such a way, VS Code could use chokidar in such an efficient way that the full disk scan is not needed after all.

@paulmillr
Owner

Comparing two chokidars here (with and without the PR):

  • TypeScript repo
  • fsevents are enabled
  • CPU hits 80% for 3 seconds and then it's 0% for both cases
  • Memory hits 83MB for both cases

So, I still don't see how the PR brings any difference here. It's possible that the PR would mitigate the ENOSPC though.

> Is it possible at all to watch on a folder and prevent chokidar from doing the file scan on that folder

Not possible. We are doing the minimal thing here already.

@es128
Contributor

es128 commented Mar 16, 2016

> Is it possible at all to watch on a folder and prevent chokidar from doing the file scan on that folder

We'd lose a lot of event accuracy... I guess we could toy with the idea that you pass in an object representing the file tree upfront if you already have it so chokidar doesn't have to do one itself. But at first glance that doesn't seem like a really feasible path.

@bpasero

bpasero commented Mar 16, 2016

I am not saying it makes our life much easier, because I actually like the idea that you have just one watcher on root receiving the events and you can do the filtering on top. The real issue is having to do the initial file scan. Can you refresh my memory as to why this is needed?

@paulmillr
Owner

Actually I don't think it's needed for non-polling cases.

@es128
Contributor

es128 commented Mar 16, 2016

It is for double-checking misreported events which come from all of our watch methods. But clearly some are worse than others, and we can experiment with reducing our dependency on tracking the file tree.

In the fsevents case, events often hit the directory in a not specific enough way, so comparing a new readdir to what we know about the tree is necessary to know what was added or removed.
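
As a rough illustration of that comparison step, the sketch below diffs the names previously known for a directory against a fresh listing. `diffListing` is a hypothetical helper written for this example; chokidar's actual bookkeeping may work quite differently.

```javascript
// Hypothetical helper: given the entry names remembered for a directory and
// the names returned by a fresh readdir, work out what was added or removed.
function diffListing(knownNames, freshNames) {
  var known = new Set(knownNames);
  var fresh = new Set(freshNames);
  return {
    added: freshNames.filter(function (name) { return !known.has(name); }),
    removed: knownNames.filter(function (name) { return !fresh.has(name); })
  };
}

// Example: 'a.js' disappeared and 'c.js' appeared since the last scan.
var exampleDiff = diffListing(['a.js', 'b.js'], ['b.js', 'c.js']);
console.log(exampleDiff.added, exampleDiff.removed);
```

Without a remembered listing there is nothing to diff against, which is one reason the initial scan is hard to drop entirely.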

@kmalakoff
Contributor Author

@paulmillr I'm not sure if it is a good option to only compare with and without the PR until this approach has fully realized its value. This PR had two purposes:

  1. get rid of the massive parallel scanning blast on initial scan - I needed the initial scan to play nicely in the background so my electron app could remain responsive during the scan on large folders

  2. lay the foundations for a simpler scanning process and to remove readdirp so that optimizations and/or finer control could be implemented - readdirp for example stores all results, always lstats, has crazy control flow making changes difficult, etc. and @es128 wanted to try a less optimistic stat approach, etc.

What I recommend is that you guys take this over and optimize, since I mainly focused on my use case of providing an option to serialize the initial scan (which you guys said was uncommon), but provided a nice, clean, compatible implementation to start from. Note: this initial submission uses walk-filtered through a readdirp emulation layer that passes the readdirp tests, meaning further optimizations are possible by removing readdirp features that are not needed by chokidar. This is why I would recommend using walk-filtered directly (without readdirp emulation) as a starting place.

@bpasero have you looked at webpack's use of chokidar? They control the registering of directory scanning on a directory-by-directory basis with a depth of 0. Also, depending on your use case around displayed UI, you could maybe use pathwatcher from the Atom editor, which has a good example in https://github.com/atom/tree-view.

@bpasero

bpasero commented Mar 16, 2016

@kmalakoff depth: 0 sounds like exactly what I could use to avoid recursion. @paulmillr correct?

@es128
Contributor

es128 commented Mar 16, 2016

@bpasero yes. It'll still do a readdir for the top level, but not recursively.
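
To show what a depth limit means for the scan itself, here is a small synchronous sketch. `scanToDepth` is a hypothetical function for illustration only and is not chokidar's code. It assumes depth 0 means "read the top-level directory once and go no further".

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical illustration of a depth limit (not chokidar's implementation):
// depth 0 lists only the entries directly inside root; each extra level of
// depth allows one more level of subdirectories to be read.
function scanToDepth(root, depth) {
  var results = [];
  (function scan(dir, level) {
    fs.readdirSync(dir).forEach(function (name) {
      var full = path.join(dir, name);
      results.push(path.relative(root, full));
      // Only stat and descend while we are still above the depth limit.
      if (level < depth && fs.lstatSync(full).isDirectory()) {
        scan(full, level + 1);
      }
    });
  })(root, 0);
  return results;
}
```

At depth 0 this performs a single readdir and no stat calls, which is why watching folders one at a time can avoid the cost of a full recursive scan.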

@bpasero

bpasero commented Mar 16, 2016

This might be an option for VS Code to explore then 👍

4 participants