-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: remove spread opts and toString of integrity #71
Conversation
$ npm view ssri engines
{ node: '^14.17.0 || ^16.13.0 || >=18.0.0' } we can use opts?.algorithms et al |
@@ -161,7 +162,7 @@ class Hash { | |||
if (!match) { | |||
return | |||
} | |||
if (strict && !SPEC_ALGORITHMS.some(a => a === match[1])) { | |||
if (strict && SPEC_ALGORITHMS[match[1]] !== true) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep SPEC_ALGORITHMS
as an array and use .includes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using includes is little bit slower than just the property index, do you have some reason to keep it as an array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a balance against performance and developer experience. I think "a little bit" slower is ok here given that the vast majority of an npm install is disk and io bound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a mistake, using includes is faster if we don't know the value:
const Benchmark = require('benchmark');
const suite = new Benchmark.Suite();
const SPEC_ALGORITHMS = {
sha256: true,
sha384: true,
sha512: true,
};
const SPEC_ALGORITHMS_ARRAY = Object.keys(SPEC_ALGORITHMS);
const randomAndUnkown = [...SPEC_ALGORITHMS_ARRAY, 'test'];
suite
.add('includes', function () {
const random = randomAndUnkown[Math.floor(Math.random() * randomAndUnkown.length)];
const r = SPEC_ALGORITHMS_ARRAY.includes(random);
})
.add('index access', function () {
const random = randomAndUnkown[Math.floor(Math.random() * randomAndUnkown.length)];
const r = SPEC_ALGORITHMS[random] === true;
})
.on('cycle', function (event) {
console.log(String(event.target));
})
.run({ async: false });
Perf:
includes x 83,638,547 ops/sec ±1.96% (92 runs sampled)
index access x 28,349,129 ops/sec ±2.07% (90 runs sampled)
My assumptions not always are good, I forgot that random access for an object is slower.
I will turn back to includes.
I really appreciate the time you put into this, can we maybe break this up so that the changes aren't so huge in one PR? Specifically I'm worried about the default opts handling. It'd be nice to isolate those changes against the other tweaks. |
@wraithgar What about 3 PRs:
It will be better? |
Yes I think 3 PRs would be ok. |
@wraithgar First one created at #72, when it was merged, I create the next one. |
I was looking the CPU profiler of pnpm and I saw this call:
https://github.com/pnpm/pnpm/blob/ef6c22e129dc3d76998cee33647b70a66d1f36bf/store/cafs/src/getFilePathInCafs.ts#L29-L30
I thought about what I could do to optimize and then I found good performance improvements.
Removing spread of opts
The first thing I notice was the spread of
defaultOpts
in every method, sometimes being called twice without needing.So remove all the calls, before this change:
With the deletion of
opts
:benchmark.js
Faster toString of Integrity
I look at a bunch of maps and filters and I just rewrite everything to perform the same operation without a bunch of loops.
With this optimization, we gain a little bit more performance:
But with this change, I introduce a little breaking change, before, when calling
toString
of Integrity, the order of the hashes was defined by the order of the hashes during the insert/parsing.Now, to optimize and avoid calling
Object.keys
onstrict
mode, I just call the properties directly, so the order will always be deterministic as:sha512
,sha384
, andsha256
. If I change the order of these calls, the tests break.If you think this is a problem, I can call
Object.keys
even in strict mode (-40k/ops).Faster integrity check when is stream
I also take a look at streams mode because PNPM also verify the integrity of the files using streams.
The initial version was already fast compare to the main:
I also saw that
checkStream
doesn't support the optionsingle
and almost all verifications that are done by PNPM only verify a single hash, so I see an opportunity to push the performance a little bit further.But I did an experiment, If we ignore all the checkStream codes and jump to the final verification, we can achieve this performance:
I put the code in the file above, the assumption is: if we verify only one hash, we can skip a lot of verifications.
So I think I could be good to
ssri
to export single hash verifications, what do you think?benchmark-stream.js
In general, with these optimizations, we had a bump of more than 2x in performance.