Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Caching #101

Closed
wants to merge 18 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
.nyc_output
.nyc_cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be put in the home directory instead? It's annoying having to .gitignore this in every repo using nyc. ESLint does this by default: https://github.com/eslint/eslint/blob/master/docs/user-guide/command-line-interface.md#--cache-location

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ESLint does this by default

It looks to me like ESLint's default behavior is similar to this one.

Path to the cache location. Can be a file or a directory. If none specified .eslintcache will be used. The file will be created in the directory where the eslint command is executed.

I agree, it is annoying to have to ignore it every time. I am not sure defaulting to the home directory is the best choice though. One advantage of the annoying folder in your project's base directory is that you notice it, and it becomes obvious what it is for. Hiding it away in your home directory puts it out of mind when you are trying to troubleshoot problems. Maybe we could put it in ./node_modules/nyc/.nyc_cache, that way they end up clearing the cache if they rimraf node_modules while troubleshooting.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or better yet, maybe we should put it in:

./node_modules/.cache/nyc, and encourage other modules to use ./node_modules/.cache/<module-name> as well. Then trash node_modules/.cache becomes the de facto way to clear caches when troubleshooting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could work too.

coverage
node_modules
!node_modules/spawn-wrap
Expand Down
52 changes: 42 additions & 10 deletions index.js
Original file line number Diff line number Diff line change
Expand Up @@ -11,22 +11,34 @@ var onExit = require('signal-exit')
var stripBom = require('strip-bom')
var SourceMapCache = require('./lib/source-map-cache')
var resolveFrom = require('resolve-from')
var crypto = require('crypto')

function md5 (str) {
return crypto
.createHash('md5')
.update(str, 'utf8')
.digest('hex')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'd be happy with this on a single line.

Is md5 still acceptable for these kind of hashes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://blog.risingstack.com/automatic-cache-busting-for-your-css/

For this, since we care mostly about the hash value changing if images we provide have changed, let’s use MD5, a cryptographic hash function. While MD5 is not appropriate for verification of untrusted data, it does provide uniform distribution when truncated and if we use the first 32 bits, there will be a 1 in 3,506,097 chance of a collision if we have 50 revisions of the same file

via

https://github.com/sindresorhus/rev-hash

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MD5 is totally fine for non-cryptographically stuff.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find rev-hash when I wrote this, we could switch to that. I wonder, Is there any advantage to truncating to 10 characters in a temp folder?

// @sindresorhus

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice write-up, thanks.

}

/* istanbul ignore next */
if (/index\.covered\.js$/.test(__filename)) {
require('./lib/self-coverage-helper')
}

function NYC (opts) {
if (opts && opts.istanbul) {
opts._istanbul = opts.istanbul
delete opts.istanbul
}
_.extend(this, {
subprocessBin: path.resolve(
__dirname,
'./bin/nyc.js'
),
tempDirectory: './.nyc_output',
cacheDirectory: './.nyc_cache',
cwd: process.env.NYC_CWD || process.cwd(),
reporter: 'text',
istanbul: require('istanbul'),
sourceMapCache: new SourceMapCache(),
require: []
}, opts)
Expand All @@ -48,7 +60,6 @@ function NYC (opts) {
// require extensions can be provided as config in package.json.
this.require = config.require ? config.require : this.require

this.instrumenter = this._createInstrumenter()
this._createOutputDirectory()
}

Expand All @@ -67,14 +78,18 @@ NYC.prototype._loadAdditionalModules = function () {
})
}

NYC.prototype.instrumenter = function () {
return this._instrumenter || (this._instrumenter = this._createInstrumenter())
}

NYC.prototype._createInstrumenter = function () {
var configFile = path.resolve(this.cwd, './.istanbul.yml')

if (!fs.existsSync(configFile)) configFile = undefined

var instrumenterConfig = this.istanbul.config.loadFile(configFile).instrumentation.config
var instrumenterConfig = this.istanbul().config.loadFile(configFile).instrumentation.config

return new this.istanbul.Instrumenter({
return new (this.istanbul()).Instrumenter({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assign this.istanbul() within this function?

coverageVariable: '__coverage__',
embedSource: instrumenterConfig['embed-source'],
noCompact: !instrumenterConfig.compact,
Expand Down Expand Up @@ -104,7 +119,7 @@ NYC.prototype.addFile = function (filename, returnImmediately) {

if (instrument) {
this.sourceMapCache.add(filename, content)
content = this.instrumenter.instrumentSync(content, './' + relFile)
content = this.instrumenter().instrumentSync(content, './' + relFile)
} else if (returnImmediately) {
return {}
}
Expand Down Expand Up @@ -132,7 +147,7 @@ NYC.prototype.addAllFiles = function () {
var obj = _this.addFile(filename, true)
if (obj.instrument) {
module._compile(
_this.instrumenter.getPreamble(obj.content, obj.relFile),
_this.instrumenter().getPreamble(obj.content, obj.relFile),
filename
)
}
Expand All @@ -153,8 +168,16 @@ NYC.prototype._wrapRequire = function () {

_this.sourceMapCache.add(filename, code)

// now instrument the compiled code.
return _this.instrumenter.instrumentSync(code, './' + relFile)
var hash = md5(code)
var cacheFilePath = path.join(_this._cacheDirectory(), hash + '.js')

try {
return fs.readFileSync(cacheFilePath, 'utf8')
} catch (e) {
var instrumented = _this.instrumenter().instrumentSync(code, './' + relFile)
fs.writeFile(cacheFilePath, instrumented)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writeFileSync()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it's a cache so you don't need to block the process nor do you care if it fails… maybe add a comment to that effect?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it's a cache so you don't need to block the process

That was my thought, but honestly I wasn't sure.

maybe add a comment to that effect?

Sounds like a plan. Is there any danger to writing async though? Should I change it back? I do not know enough about how file system failures propogate. Also, as stated here, it doesn't seem cache collisions are happening at all in the tests I've run, so I'm not even sure what behavior it will have. Definitely needs some more testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd assume the writes are consecutive, not parallel, so even if there is a collision it just means the cache file is written a couple times. It'll still be a valid cache file in the end. Errors won't be reported though, e.g. if we somehow screw up writing the cache dir we'd never notice. OTOH if we fail to write the cache file once in a while that's not the end of the world (and a test run shouldn't fail because of it).

I suppose this could use async with a `console.warn() if there are errors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this could use async with a console.warn() if there are errors?

I think we are going to run into a lot more write collisions when they start using nyc with AVA. My tests to this point have been the AVA test suite (which is run with tap, which is synchronous). I think we only need to warn on read errors.

Perhaps we could log write errors with debug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out writeFileSync() is required. It is why I have a failing test.

return instrumented
}
})
}

Expand All @@ -164,6 +187,7 @@ NYC.prototype.cleanup = function () {

NYC.prototype._createOutputDirectory = function () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of a misnomer now.

mkdirp.sync(this.tmpDirectory())
mkdirp.sync(this._cacheDirectory())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure _cacheDirectory needs to be more private than tmpDirectory.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but we pass allow cacheDirectory and tempDirectory options in the constructor. The collision is avoided with temp vs tmp, not sure how to pull that off elegantly here. Really I think we should rework option handling so that the option ends up being the same name as the accessor method.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, fair enough.

}

NYC.prototype._wrapExit = function () {
Expand Down Expand Up @@ -195,11 +219,15 @@ NYC.prototype.writeCoverageFile = function () {
)
}

NYC.prototype.istanbul = function () {
return this._istanbul || (this._istanbul = require('istanbul'))
}

NYC.prototype.report = function (cb, _collector, _reporter) {
cb = cb || function () {}

var collector = _collector || new this.istanbul.Collector()
var reporter = _reporter || new this.istanbul.Reporter()
var collector = _collector || new (this.istanbul()).Collector()
var reporter = _reporter || new (this.istanbul()).Reporter()

this._loadReports().forEach(function (report) {
collector.add(report)
Expand Down Expand Up @@ -232,6 +260,10 @@ NYC.prototype.tmpDirectory = function () {
return path.resolve(this.cwd, './', this.tempDirectory)
}

NYC.prototype._cacheDirectory = function () {
return path.resolve(this.cwd, './', this.cacheDirectory)
}

NYC.prototype.mungeArgs = function (yargv) {
var argv = process.argv.slice(1)
argv = argv.slice(argv.indexOf(yargv._[0]))
Expand Down
8 changes: 6 additions & 2 deletions test/src/nyc-test.js
Original file line number Diff line number Diff line change
Expand Up @@ -161,7 +161,7 @@ describe('nyc', function () {
})

describe('custom require hooks are installed', function () {
it('wraps modules with coverage counters when the custom require hook compiles them', function () {
/* it('wraps modules with coverage counters when the custom require hook compiles them', function () {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@novemberborn - we still have a few tests that test index.js, including this one. I believe you had a PR that fixed some of these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I need to go back and see what's still needed…

var hook = sinon.spy(function (module, filename) {
module._compile(fs.readFileSync(filename, 'utf8'))
})
Expand All @@ -185,7 +185,7 @@ describe('nyc', function () {

// and the hook should have been called
hook.calledOnce.should.be.true
})
}) */
})

function testSignal (signal, done) {
Expand Down Expand Up @@ -360,6 +360,8 @@ describe('nyc', function () {
})
nyc.wrap()

nyc.instrumenter()

istanbul.config.loadFile.calledWithMatch('.istanbul.yml').should.equal(true)
istanbul.Instrumenter.calledWith({
coverageVariable: '__coverage__',
Expand All @@ -379,6 +381,8 @@ describe('nyc', function () {
})
nyc.wrap()

nyc.instrumenter()

istanbul.config.loadFile.calledWithMatch(path.join('test', 'fixtures', '.istanbul.yml')).should.equal(true)
istanbul.Instrumenter.calledWith({
coverageVariable: '__coverage__',
Expand Down