-
-
Notifications
You must be signed in to change notification settings - Fork 359
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Caching #101
Caching #101
Conversation
@jamestalmage |
@@ -1,4 +1,5 @@ | |||
.nyc_output | |||
.nyc_cache |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be put in the home directory instead? It's annoying having to .gitignore this in every repo using nyc. ESLint does this by default: https://github.com/eslint/eslint/blob/master/docs/user-guide/command-line-interface.md#--cache-location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ESLint does this by default
It looks to me like ESLint's default behavior is similar to this one.
Path to the cache location. Can be a file or a directory. If none specified .eslintcache will be used. The file will be created in the directory where the eslint command is executed.
I agree, it is annoying to have to ignore it every time. I am not sure defaulting to the home directory is the best choice though. One advantage of the annoying folder in your project's base directory is that you notice it, and it becomes obvious what it is for. Hiding it away in your home directory puts it out of mind when you are trying to troubleshoot problems. Maybe we could put it in ./node_modules/nyc/.nyc_cache
, that way they end up clearing the cache if they rimraf node_modules
while troubleshooting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or better yet, maybe we should put it in:
./node_modules/.cache/nyc
, and encourage other modules to use ./node_modules/.cache/<module-name>
as well. Then trash node_modules/.cache
becomes the de facto way to clear caches when troubleshooting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could work too.
Moved this comment to here, since it was a big wall of text and not entirely applicable to this issue: |
@@ -161,7 +161,7 @@ describe('nyc', function () { | |||
}) | |||
|
|||
describe('custom require hooks are installed', function () { | |||
it('wraps modules with coverage counters when the custom require hook compiles them', function () { | |||
/* it('wraps modules with coverage counters when the custom require hook compiles them', function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@novemberborn - we still have a few tests that test index.js
, including this one. I believe you had a PR that fixed some of these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I need to go back and see what's still needed…
994dd67
to
84fce89
Compare
This provides a minor speed improvement (~10% on the AVA test suite). It defers source-map processing, saving sourceMap data alongside instrumented sources in the cache. SourceMapCache is only instantiated for `report` runs.
Just pushed another commit that squeezes out an additional 2 or 3 seconds from the AVA test suite. It works by caching source-map information as well. Saved only once when it sees a cache miss for the source. If there is a cache hit, it does not even extract the source map (source map extraction can be expensive for large files). It does not apply source map information when exiting the process, but instead adds a The only downside to this is that you can no longer use just the contents of |
Some stats from running the AVA test suite:
|
Dropped AVA test suite coverage penalty from 5 seconds to 4 seconds
The latest commit drops That dropped the coverage penalty from
|
See: jamestalmage@f14f98f It lazy-loads any dependency that we are not guaranteed to need in a forked process, that includes:
That appears to shave of some time. The AVA test runs go from a shaky 29s/30s to a solid 29s on every run. It breaks a bunch of tests so I am going to hold off on it until this is merged and we develop some more accurate benchmarking. I think I am reaching the limits of using AVA's test suite and |
Shaved off another second. if (!this.include || micromatch.any(filename, this.include)... |
|
||
this.exclude = ['**/node_modules/**'].concat(config.exclude || ['test/**', 'test{,-*}.js']) | ||
if (!Array.isArray(this.exclude)) this.exclude = [this.exclude] | ||
this.exclude = ['**/node_modules/**'].concat(arrify(config.exclude || ['test/**', 'test{,-*}.js'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could call this._prepGlobPatterns()
here rather than reassigning this.exclude
?
}) | ||
} | ||
|
||
NYC.prototype.cleanup = function () { | ||
if (!process.env.NYC_CWD) rimraf.sync(this.tmpDirectory()) | ||
if (!process.env.NYC_CWD) rimraf.sync(this.tempDirectory()) | ||
} | ||
|
||
NYC.prototype._createOutputDirectory = function () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit of a misnomer now.
Cool.
I'm just guessing here but that code path looks like it might be optimized quite well by V8. IIRC it also caches regex compilation.
That's nicer though if the performance penalty isn't too high I'd still rather require them normally. There's now extra weirdness in using
Nothing jumps out at me other than istanbuljs/append-transform@73f7aba#commitcomment-15067878. I'll try and push up some commits removing |
Source maps are applied to a report that was just read from disk. Cloning is needed for the tests but otherwise unnecessary. Change path-related tests to look for the path property on the report, without comparing the report contents. That's silly now that applySourceMaps() doesn't do the cloning. No need to copy coverage.b values when mapping branches.
Use 'report' and fileReport' to be more consistent with index.js.
It was only used by the tests, which can use addMap.
This is no longer used. This was also the last runtime dependency on lodash, which has now been moved to the dev dependencies.
@jamestalmage see jamestalmage#1 :) |
source-map-cache improvements
b08651a
to
afedbfb
Compare
OK, I merged @novemberborn's commits simplifying I think this PR is in a good place and ready for a merge. |
👍 I'm not familiar enough with the code base to be able to review this properly, but from a quick skim of the changes it looks pretty good to me. Excited about the perf improvements! Super nice work @jamestalmage :) |
Just added a test to create cache collisions. It spawns 4 processes (all wrapped by I added some logging statements to our cache hook (not included in the commit), and I can verify that all 4 processes see a cache miss (the coordination worked). All 4 processes also successfully write back to the cache, no error is thrown even though they are all trying to write to the same file. It does not appear we have to worry about write collisions, and that competing |
That's not my experience. I don't think it locks the file unless you do |
I have tried a few more iterations of my test with various changes trying to cause a problem, but I can't. Even in Node 0.10. I don't know how
We don't care about the previous contents of the file, we just clobber with data we know to be good. There is no read-mutate-write sequence
It seems crazy to me that |
Good point. Guess it's a non-issue in this case then.
That's just what many people have told me. No idea if it's actually accurate.
Can you open an issue on Node.js about better docs regarding that? I would definitely like to see that too. |
@jamestalmage I like the idea of caching for large projects that benefit from it. It does, however, introduce quite a few more moving parts. I've learned making nyc run in as many setups and environments as possible (cough Windows) that every moving part is going to be more edge-cases for us to debug on various outlier platforms.... So, my ask for caching is that we make it a configuration option, and my temptation in Would love to hear other people's thoughts. |
The difference is noticeable even with the 50 line module I am working on now.
I think your concerns are reasonable, especially before we get AppVeyor up and running. I think we should eventually turn it on by default once we see green from AppVeyor, and we have dogfooded it for a while. A feature toggle is certainly a good idea as it gives us a way to eliminate the cache implementation as the source of reported bugs. |
@jamestalmage cool, I think we're on the same page 👍 turning this on by default just feels a fairly big change for a |
👍 |
This incorporates feedback on the istanbuljs#101 PR from @novemberborn and @sindresorhus - rename: `createOutputDirectory()` -> `createDatastoreDirectories()` - rename: `_addSource()` -> `_maybeInstrumentSource()` - drop `tmpDirectory()` entirely, just use `tempDirectory()` - make `hashCache` and `loadedMaps` instance members instead of closure references - use `lazy-req` instead of trying to defer requires inline in the code. Future Changes: - I eventually revert out the `lazy-req` stuff once the `source-map-cache` no longer uses `lodash`.
The codebase shifted significantly while I was working on PR istanbuljs#101. This is a commit after rebasing to fix a few errors that surfaced, and incorporate the new changes in support of Windows. Includes: slight tweak to get tests passing in Windows (cherry picked from commit 25eb6a7)
So...
Caching makes a pretty big difference.
The AVA test suite went from
42
s to32
s on my machine, even with this relatively trivial implementation. It generates a new directory:cwd/.nyc_cache
and writes cached files there based on themd5
hash of their pre-instrumented content.It is probably worth checking out
cacha
for this.@floatdrop, my impression of avajs/ava#189 was that it was still a WIP. Where is
cacha
currently? Is it ready for us to push into production? If not, I am happy to help any way I can.