-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fs: Optimizations 3: The Revenge #12105
Conversation
82e924e
to
972740f
Compare
Looking at the CITGM results, I'm not sure why |
/cc @nodejs/collaborators |
Amazing work, @mscdex. I'm thrilled to see this progress. I skimmed through it and nothing stuck out to me as worth discussing. I'll give it another run later, and hope some other @nodejs/collaborators will give it a pass too. |
@mscdex WRT |
@lpinca Yep, but who knows when it will get switched over to TurboFan since it seems it currently doesn't? Perhaps it is switched over in a later version of V8... There still might be some parts where we won't expect there to be many properties, so using a |
Yes it makes sense. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work, lots of changes but they're very regular (in that you applied the same optimization to a lot of places).
Left some comments but those are mainly questions. LGTM
@@ -43,6 +43,18 @@ const internalUtil = require('internal/util'); | |||
const assertEncoding = internalFS.assertEncoding; | |||
const stringToFlags = internalFS.stringToFlags; | |||
const getPathFromURL = internalURL.getPathFromURL; | |||
const O_RDONLY = constants.O_RDONLY | 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the | 0
here adn not in constants
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just copied these verbatim from lib/internal/fs.js. It implicitly converts the value to a 32-bit integer, so there may be a reason for that, performance or otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mscdex it gives a hint to the compiler that it's a SMI.
I'm just wondering why it isn't done on the constant itself. I guess that since this is copied over it shouldn't be a big deal here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I wasn't sure of the upper ceiling for SMIs offhand, but yes that's probably what it would be.
if (typeof options === 'string') { | ||
defaultOptions = util._extend({}, defaultOptions); | ||
defaultOptions = shallowClone(defaultOptions); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any benchmarks comparing util._extend
Object.assign
and shallowClone
?
Would it be faster to do defaultOptions = Object.create(defaultOptions)
and use the fact it's on the prototype?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested Object.assign()
with V8 5.7 but I would guess it's still slow. shallowClone
is really fast because it literally copies everything as-is (including frozen/sealed state, property attributes, etc.). From my own separate benchmarks, shallowClone()
was something like at least twice as fast as for (... in ...)
and util._extend()
uses Object.keys()
to iterate instead which the last time I tested is even slower than for (... in ...)
in many/most cases.
lib/fs.js
Outdated
@@ -167,7 +167,7 @@ function nullCheck(path, callback) { | |||
} | |||
|
|||
function isFd(path) { | |||
return (path >>> 0) === path; | |||
return (typeof path === 'number' && ~~path === path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change faster? I'm not a fan of the old code but doing ~~path === path
should at least be motivated IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the conversion by itself seems to be slightly faster in V8 5.7.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe leave a comment on that? I'm not sure how many people understand what ~~
does here.
@@ -255,11 +255,15 @@ Object.defineProperties(fs, { | |||
}); | |||
|
|||
function handleError(val, callback) { | |||
if (typeof val === 'string') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind explaining this change?
It looks like it means wrapping a lot of other calls with if(typeof x !== 'string')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just optimizing for the case where URL
instances are passed to fs
methods. handleError()
is passed the result of a function which converts URL
instances to string paths. By checking for string path results first, we can avoid the instanceof
, which is probably a less common case.
}; | ||
|
||
const kReadFileBufferLength = 8 * 1024; | ||
function ReadFileContext(callback, encoding, isUserFd) { | ||
this.req = new FSReqWrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, moving things here looks a lot cleaner. I think any slots saved here could be a big win
@@ -470,7 +480,7 @@ function readFileAfterRead(err, bytesRead) { | |||
|
|||
function readFileAfterClose(err) { | |||
var context = this.context; | |||
var buffer = null; | |||
var buffer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason a default null isn't assigned anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unnecessary because it's re-assigned some kind of value below every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mscdex I'm guessing that it was put there as an optimization (so there is always a value here) - I don't see why it would make things faster to assign to null in this particular case though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without looking, my guess is that a default case/branch was missing at some point in the past, but that is no longer the case in this function.
fs.write(fd, buffer, offset, length, position, function(writeErr, written) { | ||
if (writeErr) { | ||
function writeAll(fd, isUserFd, buffer, offset, length, position, callback) { | ||
var req = new FSReqWrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice.
@@ -0,0 +1,50 @@ | |||
// test the throughput of the fs.ReadStream class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: capital T
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I just copied from the other benchmark. TBH I don't think the comment even needs to exist.
972740f
to
0fe387d
Compare
@benjamingr I've changed the |
|
@@ -319,39 +329,46 @@ fs.existsSync = function(path) { | |||
} | |||
}; | |||
|
|||
const defaultReadFileOpts = Object.create(null, { | |||
flag: { value: O_RDONLY, writable: true } | |||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain how much performance impact this may have, but because of the Object.create(null, ...)
:
const { O_RDONLY } = fs.constants;
const defaultReadFileOpts = Object.create(null, {
flag: { value: O_RDONLY, writable: true }
});
console.log(%HasFastProperties(defaultReadFileOpts));
// Prints false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's expected because of the changes made in V8 to Object.create(null)
, but at least it's only created once during startup, so the performance cost of creating the object is amortized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually thinking about property access, not object creation. Shallow-cloning an object created with Object.create(StorageObject.prototype, ...)
(so that it has fast properties) is much faster (~40%), since slow properties are only faster for map-like usage:
benchmark
const { shallowClone } = process.binding('util');
const { constants: { O_RDONLY } } = require('fs');
const n = 1e7;
function StorageObject() {}
StorageObject.prototype = Object.create(null);
const direct = Object.create(null, {
flag: { value: O_RDONLY, writable: true }
});
const proxied = Object.create(StorageObject.prototype, {
flag: { value: O_RDONLY, writable: true }
});
console.log(%HasFastProperties(direct)); // false
console.log(%HasFastProperties(proxied)); // true
function cloneDirect() { return shallowClone(direct); }
function cloneProxied() { return shallowClone(proxied); }
// warmup
for (var i = 0; i < n; i++)
cloneDirect();
for (var i = 0; i < n; i++)
cloneProxied();
console.time('direct');
for (var i = 0; i < n; i++)
cloneDirect();
console.timeEnd('direct');
console.time('proxied');
for (var i = 0; i < n; i++)
cloneProxied();
console.timeEnd('proxied');
false
true
direct: 624.346ms
proxied: 369.772ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mscdex ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping this issue.
No, I haven't bothered checking because in that issue it's pretty clear that there is quite a negative impact across the board with it enabled. I'm guessing (or my hope is) that the V8 team is still working on optimizing the new pipeline.
As mentioned, the only place where this really matters as this PR currently stands is |
@MylesBorins ... any thoughts on this one re:graceful-fs and the CITGM results? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on a quick review LGTM given the data showing the perf benefits and assuming a green CI.
0fe387d
to
73acfea
Compare
Another CITGM run FWIW: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/713/ |
The main focus of this commit is to: * Avoid options object copying where possible * Use V8's object cloning only for default options objects where we are in control of the object's configuration since the object clone will also copy properties' configuration, frozen/sealed state, etc. * Refactor writeFile*()/appendFile*() to reuse shared logic without the redundant argument checking and to reuse FSReqWrap instances more
73acfea
to
53a7974
Compare
@nodejs/v8 Would it be possible for us to float our own V8 patch to add an |
@mscdex ... it is too late to get this in to 8.0.0 at this point. |
This needs a rebase. Does this need anything else to be ready to land? |
@BridgeAR my last comment still stands, we need either an Also, I'd need to re-benchmark these changes anyway. |
I suppose it's too late to get this into 9.0.0. Changed the milestone to 10.0.0. |
@mscdex Are you still interested in getting this in? This would need a rebase but we now have a CI job to run the benchmarks. |
@joyeecheung Yes, but it's blocked on changes needed in V8. |
I am confident that the performance gain is not a lot anymore with newer V8 versions. @mscdex would you be so kind and rebase this and rerun the benchmarks? |
var flag; | ||
var encoding; | ||
if (options != null) { | ||
options = getOptions(options, defaultReadFileOpts); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of duplicates, you have O_RDONLY in defaultReadFileOpts and in 552 and 555 line. Better to remove defaultReadFileOpts
and add defaltReadFileFlag
. stringToFlags
should be in getOptions.
const defaltReadFileFlag = O_RDONLY;
if (options != null) {
options = getOptions(options);
flag = options.flag || defaultReadFileOpts;
encoding = options.encoding;
} else {
flag = defaultReadFileOpts;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And getOptions
should be like:
function getOptions(options) {
const type = typeof options;
if (type === 'string') {
return { flag: stringToFlags(options) };
} else if (type !== 'object') {
throw new TypeError('"options" must be a string or an object, got ' +
typeof options + ' instead.');
}
if (options.flag)
options.flag = stringToFlags(options);
if (options.encoding !== 'buffer')
assertEncoding(options.encoding);
return options;
}
Ping @mscdex |
Closing due to long inactivity. @mscdex please feel free to reopen if you want to continue working on it! |
I removed this from the node 10 milestone. Feel free to reopen and assign a milestone when appropriate. |
As part of the continuing
fs
optimization saga, I offer the latest from our courageous adventurer, node.js core, and their continuing crusade against slow code!These commits bring improvements to just about every method in
fs
, some are affected more greatly than others. I'm tentatively marking this as semver-major mostly for two reasons: the switch to using binding methods directly instead of using exportedfs
methods and secondly due to the stricter (or what I consider to be more correct) check when usinggetOptions()
in sync methods, namely that anoptions
parameter should not be checked for a function type.Here are results for some of the relevant, existing benchmarks:
These benchmarks compare binaries that both include V8 5.7 and have cfc8422 reverted (see my comments on that commit's PR for more discussion on that -- although as of this PR it would only affect
fs.realpath*()
, which was not really touched in this PR).I did add a few new benchmark files, but adding new benchmarks for every single
fs
method and running each one would take forever and would most likely either extend past the semver-major landing deadline for node v8 (which I am hoping this will make it into yet) or would be too close to the deadline, not giving enough time for review.Will our brave and mighty hero, node.js core, survive the onslaught of V8 optimization changes? Will they finally accrue enough frequent flier miles to get that vacation they've always wanted? Tune in next time....!
CI: https://ci.nodejs.org/job/node-test-pull-request/7080/
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/676/
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)