Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Create less IO errors to improve efficiency of require #8189

Closed

Conversation

errendir
Copy link

In some environment (especially when coffescript is involved) module.js
will spend a lot of time looking up nonexisting files. This is caused by the
lengthy error creation. Those errors are then immediately caught.

This change makes the lookup much faster, by avoiding the unnecessary error
creation.

module::statPath now uses the new second parameter of fs.statSync function of the
fs.js. There is a similar change for the fs.stat, adding the third parameter.

@errendir
Copy link
Author

I added int throw_safe; to the uv_fs_s struct in order to keep track of whether one particular call needs to throw an error or not. If there is a better way of doing this, please tell me.

Also - uv.h is part of the libuv, so this change will have to be propagated there.

As for the motivation for this change: While profiling the server startup of the application I work on, I have noticed that a lot of time is spent on URIError. All of those errors are then immediately caught in the module.js. This seems like an inefficiency that can be avoided. In my particular case, applying this change make the server startup 30% faster.

@trevnorris
Copy link

Going to close this for now until the libuv deps change has propagated. Once that's been done ping back and I'll reopen.

@trevnorris trevnorris closed this Aug 18, 2014
errendir pushed a commit to errendir/libuv that referenced this pull request Aug 19, 2014
This variable is meant to control whether IO errors are created or not.
This enables: nodejs/node-v0.x-archive#8189

The general idea is to throw less errors that are immediately caught.
errendir pushed a commit to errendir/libuv that referenced this pull request Aug 19, 2014
This variable is meant to control whether IO errors are created or not.
This enables: nodejs/node-v0.x-archive#8189

The general idea is to throw less errors that are immediately caught.
@errendir
Copy link
Author

OK, I have made the pull request for libuv: joyent/libuv#1428
Please tell me what else can I do to make this change happen.

errendir pushed a commit to errendir/libuv that referenced this pull request Aug 19, 2014
This variable is meant to control whether IO errors are created or not.
This enables: nodejs/node-v0.x-archive#8189

The general idea is to throw less errors that are immediately caught.
@errendir
Copy link
Author

@trevnorris As it was explained to me in the libuv pull request (joyent/libuv#1428) this change should be node.js internal.

Here is the commit errendir@f25c4db. It removes the libuv changes and makes this modification completely independent from libuv.

Could you please reopen this PR?

@trevnorris
Copy link

I'll reopen. Go ahead and force update the commit on this PR to the code you pointed to.

This will definitely need revision, and I'm sure @indutny will have something to say. Also the words "especially when coffescript is involved" in the commit message don't exactly help the viability of the commit or why it should land. But more importantly you're changing public API without any doc or test updates. It'll never be merged w/o those.

@trevnorris trevnorris reopened this Aug 19, 2014
@errendir
Copy link
Author

OK, I understand. I mentioned coffeescript, but this applies in general to all attempts made by module.js to load a file - tryExtensions might try loading files that don't exist. This is magnified by the use of coffeescript, as it registers its extensions after the standard extensions (.js, .json, .node). This means that each require of a coffeescript file hits three nonexisting files. Each of those attempts triggers creation, throw and catch of an error.

As for the tests and docs - I will look into it.

@errendir
Copy link
Author

The tests are added and the documentation is updated. As for the naming conventions - is it a rule to use camel case? Do you have a style guide?

@errendir errendir force-pushed the feature/remove-redundant-throws branch from 0a9c02b to 40ab368 Compare August 20, 2014 00:28
@trevnorris
Copy link

I appreciate the work you've done here. Also see that you have it based on the v0.10 branch. API changes are only allowed into stable releases if they are to fix critical bugs or security vulnerabilities. Maybe could make it into v0.12.

/cc @indutny How do you feel about this?

@errendir
Copy link
Author

There is a way to make this change happen with no API modifications - we can make module.js use the fs binding directly instead of using fs.js. This way the change could be limited only to the binding and module.js and still benefit the efficiency.

@trevnorris
Copy link

/cc @indutny @tjfontaine Feedback on all this?

@medikoo
Copy link

medikoo commented Aug 26, 2014

I believe throw_safe must not be defined as last argument after callback, but as an optional argument before callback. Asynchronous functions always take callback as last argument, so this proposal breaks this consistency seriously (taking that in, would also mean this function will no longer work with various generic promisification utilities, which are based on that contract).

Best if throwSafe setting is passed with options object as optional second argument, so it's fs.stat(filename, [options], callback) it would be also consistent with other fs functions, e.g. fs.readFile

@@ -153,12 +153,15 @@ Only available on Mac OS X.

Synchronous lchmod(2).

## fs.stat(path, callback)
## fs.stat(path, callback, [throw_safe])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't land this in v0.10, at least not the docs. If everything else looks fine here, we should probably land docs in v0.12 after the merge.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'd like it if we start properly documenting optional arguments. e.g.:

## fs.stat(path, callback[, throw_safe])

I've been having the urge to go through all the documentation an change this. Drives me nuts.

@indutny
Copy link
Member

indutny commented Aug 27, 2014

@errendir I haven't finished the "Code review" yet, but I have a question: could you please paste here a benchmark that demonstrates the performance improvement after your patch? The changes appears to be quite vast, and I'm not sure what exactly do we gain with it.

@errendir
Copy link
Author

errendir commented Sep 2, 2014

@indutny Sorry for the late response. Here is the toy test I wrote to see what kind of improvement we can see:

bin = process.binding('fs')

var t1;
function s() {
  t1 = new Date();
}

function e() {
  t2 = new Date()
  console.log(t2.getTime() - t1.getTime() + 'ms')
}

var names = [];
for (var i=0; i<100000; i++) {
  names.push((Math.random()*1000000).toFixed(2));
}

/* Measure the time it takes the throwSafe bin.stat to run */
console.log('throwSafe = true');
s();
for (var i=0; i<100000; i++) {
  bin.stat(names[i], undefined, true);
}
e();

/* Measure the time it takes the standard bin.stat to run */
console.log('throwSafe = false');
s();
for (var i=0; i<100000; i++) {
  try {
    bin.stat(names[i]);
  } catch(e) {}
}
e();

The results I get are:

throwSafe = true
298ms
throwSafe = false
1385ms

Thanks for the code review. I can make this change not modify the public interface of the fs module by using the fs binding directly in the module module. What do you think about this solution?

@trevnorris
Copy link

@medikoo

Asynchronous functions always take callback as last argument [...]

Untrue. Take the following:

setTimeout(callback, time[, arg1[, arg2[, ...]]]);

The same pertains to setInterval() and setImmediate(). The "convention" of placing the callback at the end of the parameters list is simply because it is easier to read when the function is declared inline. I personally prefer optional arguments to be placed at the end of the parameters list.


Asynchronous stat(2). The callback gets two arguments `(err, stats)` where
`stats` is a [fs.Stats](#fs_class_fs_stats) object. See the [fs.Stats](#fs_class_fs_stats)
section below for more information.

The throw_safe parameter if set to true prevents the error creation. If the call

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw_safe => throw_safe

true => true

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. use backticks.

@trevnorris
Copy link

Ah. Before I continue with the review, mind rebasing this on v0.12 or master?

@medikoo
Copy link

medikoo commented Sep 3, 2014

@trevnorris I meant asynchronous function pattern as introduced with Node.js. It would be a very first function that breaks that contract, which makes it both a breaking change and dangerous precedence

@medikoo
Copy link

medikoo commented Sep 3, 2014

@trevnorris to put more light on this: there are many decorators in user land that rely on that contract, and fs.stat with throwSafe = true as proposed here, will not work with them.

See e.g.
https://github.com/petkaantonov/bluebird/blob/master/API.md#promisification
https://github.com/twistdigital/es6-promisify#es6-promisify
https://github.com/medikoo/deferred#promisifyfn-length

@trevnorris
Copy link

@medikoo Just curious, then how are those supposed to work with functions like setTimeout()?

@medikoo
Copy link

medikoo commented Sep 3, 2014

@trevnorris setTimeout origins from very different API, it was never considered as a node style asynchronous function.
Nobody expects that it will work with those decorators, it's a different thing.

@trevnorris
Copy link

@medikoo I really couldn't give a rat poop about the concerns of Promises, but I do see what you mean by "node style callbacks". Fair enough.

@errendir
Copy link
Author

errendir commented Sep 3, 2014

@trevnorris Here is the PR into master: #8312

@medikoo I agree that the interface change is not usually a good idea. However the efficiency improvement is significant, so we should find the correct place for this change. I was considering different ways to avoid the interface change:

  • Make the module.js use the fs binding directly, bypassing fs.js. This is bad because bypassing of the interface of fs.js forces us to keep two contracts - one for fs.js and the other one for the binding (node_file.cc)
  • Make the fs.js module configurable. We can't make it one global state as that would make different users of fs.js interfere. We could add getContext(options) or something like that to the fs.js module. Then the user would be able to call getContext(...) and obtain the configured version of fs behaving differently. We could make it with a class pattern.

In some environment (especially when coffescript is involved) module.js
will spend a lot of time looking up nonexisting files. This is caused by the
lengthy error creation. Those errors are then immediately caught.

This change makes the lookup much faster, by avoiding the unnecessary error
creation.

module::statPath now uses the new second parameter of fs.statSync function of the
fs.js. There is a similar change for the fs.stat, adding the third parameter.

Apply suggestions from libuv pull request

Thanks saghul: joyent/libuv#1428

Fix the datastructures for async

Documentation update

Add tests and simplify Stat function

Apply the code review comments
@errendir errendir force-pushed the feature/remove-redundant-throws branch from dd15813 to dbd9afb Compare September 3, 2014 20:18
binding.stat(pathModule._makeLong(path), cb);
function cb(err, stats) {
if (callback) callback(err ? false : true);
if (!nullCheck(path, undefined, true)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you setting cb as undefined? fs.exists() is async, and if the callback isn't passed then it should throw that the callback wasn't passed.

@trevnorris
Copy link

ah wait. nm. i'm going to place all comments on the new PR.

@jasnell
Copy link
Member

jasnell commented Aug 13, 2015

@trevnorris ... any further thoughts on this?

@jasnell
Copy link
Member

jasnell commented Aug 13, 2015

Actually... closing this in favor of #8312. Can reopen if necessary.

@jasnell jasnell closed this Aug 13, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants