
Benchmarking: Forge is much slower than SJCL? #470

Open
kspearrin opened this issue Jan 15, 2017 · 16 comments
@kspearrin commented Jan 15, 2017

I'm writing a simple benchmark to test the speed of Forge vs. our current solution, SJCL: encrypting a short UTF-8 string 10,000 times with AES-256 in CBC mode. With SJCL this takes 195 ms. With Forge, the same test takes 2,500 ms. Am I doing something wrong? Your benchmarks seem to indicate that Forge is much faster than SJCL.

Forge: https://jsfiddle.net/nfwpuhdx/4/
SJCL: https://jsfiddle.net/fjmr2abq/1/

(see console for output)

@mattcollier (Contributor)

A quick look at the Chrome profiler indicates that Forge is spending a lot of time in ctx.generateSync:

ctx.generateSync = function(count) {

Which is called by forge.random.getBytesSync:

forge/lib/random.js

Lines 95 to 110 in 57ebd54

/**
* Gets random bytes asynchronously. If a native secure crypto API is
* unavailable, this method tries to make the bytes more unpredictable by
* drawing from data that can be collected from the user of the browser,
* eg: mouse movement.
*
* @param count the number of random bytes to get.
*
* @return the random bytes in a string.
*/
ctx.getBytesSync = function(count) {
return ctx.generate(count);
};
return ctx;
}

And indeed, when the following line from your demo is moved in front of the for loop, the encryption performs as expected.

var ivBytes = forge.random.getBytesSync(16);

So, the performance difference seems to be related to the PRNG implementations.

@kspearrin (Author) commented Jan 15, 2017

Updated the benchmark with decryption added as well. Forge seems to do better with decryption, presumably because no PRNG is involved:

Forge: https://jsfiddle.net/nfwpuhdx/8/
SJCL: https://jsfiddle.net/fjmr2abq/7/

@mattcollier (Contributor)

I implemented a Web Crypto RNG which should isolate the encryption functions.

Forge: https://jsfiddle.net/nfwpuhdx/9/
SJCL: https://jsfiddle.net/fjmr2abq/8/

Forge appears to be ~20% faster.

@kspearrin (Author)

Great! Any reason that's not just baked into forge.random.getBytesSync already?

@mattcollier (Contributor)

Forge does appear to use the Web Crypto API under certain circumstances. Not sure how this fits in with the demo you've created.

if(typeof window !== 'undefined') {

@mattcollier (Contributor)

Here's a further refinement of the Web Crypto RNG for Forge, which is also closer to the SJCL implementation.

https://jsfiddle.net/nfwpuhdx/10/

@kspearrin (Author)

@mattcollier I didn't really see any difference between version 9 and version 10 as far as performance. Were you expecting there to be?

@mattcollier (Contributor)

No, no performance benefit, but it is a better match to what is implemented in the SJCL demo for comparison purposes.

So, after studying the PRNG implementation in forge I have found that on each call to forge.random.getBytesSync() forge is reseeding itself which involves a lot of overhead, including generating 1K of entropy and distributing it into pools. This process does actually utilize the Web Crypto API, when available, to fill the entropy pool.

On my system, 10K iterations of just forge.random.getBytesSync(16) takes 3000ms.

I found that the reseeding on every request is occurring because of this line: https://github.com/digitalbazaar/forge/blob/master/lib/prng.js#L144

Which was part of this commit: 88beb1f

I have found that removing that one line reduces the runtime for 10K iterations of forge.random.getBytesSync(16) to 75ms.
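The measurement described above can be sketched with a tiny timing helper. `timeRng` is a made-up name; it times any function that takes a byte count, so `forge.random.getBytesSync` can be dropped in to reproduce the 10K-iteration numbers.

```javascript
// Micro-benchmark sketch: time `iterations` calls to an RNG in isolation.
// `rng` is any function taking a byte count, e.g. forge.random.getBytesSync
// or a Web Crypto wrapper; returns elapsed wall-clock milliseconds.
function timeRng(rng, iterations = 10000, count = 16) {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    rng(count);
  }
  return Date.now() - start;
}
```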

@dlongley (Member) commented Jan 16, 2017

By default, forge follows what the Fortuna algorithm calls for: the key is reset for every request, however small. That may be a bit too paranoid from a performance standpoint. Forge implements Fortuna at the JS level even when a secure native PRNG is available; it only uses the native PRNG to provide the seed entropy. Fortuna ensures we always have a sufficient quantity of pseudorandom data to draw from.

We could perhaps change the underlying implementation to rely more directly on secure native PRNGs, but as I recall, some browsers quickly start throwing QuotaExceeded exceptions as they don't always implement something like Fortuna to provide sufficient pseudorandom data.

@dlongley (Member)

In any event, thanks @mattcollier for looking into this and determining the difference in this particular benchmark was actually in the PRNG implementation. We do need to build more benchmarking tests into the test suite to keep an eye on performance.

@kspearrin (Author)

I'm trying to understand why SJCL behaves differently, then, since their implementation seems to be based on Fortuna as well. They do mention some changes in the documentation at the top.

Would it be safe to use window.crypto.getRandomValues() for IVs like in @mattcollier's examples for this demo implementation?

@dlongley (Member)

@kspearrin, yeah, I think that would be safe. Of course, every browser is different so each PRNG may have different quality, but if a browser implements that API, it is supposed to be a secure PRNG.

@dlongley (Member)

@kspearrin,

Looks like SJCL changes the key by using its own generated pseudorandom bytes instead of doing a full reseed:

https://github.com/bitwiseshiftleft/sjcl/blob/00bd42dae9ebc3a28708db2053ffa2e88f99ad1e/core/random.js#L326-L343

I imagine that if we did the same it would eliminate the large difference.

@kspearrin (Author)

Is that a reasonable optimization that could be added to forge?

@dlongley (Member)

@kspearrin,

I think so. When rekeying, we need to produce the key material from a source that is assumed to behave like a random source, so we should be able to reuse the output of the block cipher itself, provided that we still reseed every 1 MiB of data. I believe reseeding even on that boundary only closes a theoretical attack, but we should continue to do it unless we can get input from a reputable cryptographer saying it's not necessary.

So the optimization we would add would move the slowdown from every request to 1MiB boundaries.

@kspearrin (Author) commented Jan 16, 2017

Great. Please let me know when/if this makes its way into the base Forge library. Until then I'll continue to evaluate directly using window.crypto.getRandomValues() in its place.
