-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
very slow on long (base64 encoded) strings #28
Comments
it seems better if this truncation was done in this library, but I'm not sure where's best. when running on large inputs it seems to spend the majority of its time running the I'm wondering whether it's enough to just assume the start is representative. i.e. just run it on the first 100 bytes, then multiply the entropy up by the remaining characters, e.g.: diff --git a/zxcvbn.c b/zxcvbn.c
index ebe9e31..d929ff6 100644
--- a/zxcvbn.c
+++ b/zxcvbn.c
@@ -52,6 +52,9 @@
/* Minimum number of characters in a incrementing/decrementing sequence match */
#define MIN_SEQUENCE_LEN 3
+/* Maximum number of characters to perform full entropy calculation over */
+#define MAX_DETAIL_LEN 100
+
/* Year range for data matching */
#define MIN_YEAR 1901
#define MAX_YEAR 2050
@@ -1574,7 +1577,8 @@ double ZxcvbnMatch(const char *Pwd, const char *UserDict[], ZxcMatch_t **Info)
ZxcMatch_t *Zp;
Node_t *Np;
double e;
- int Len = strlen(Pwd);
+ const int StrLen = strlen(Pwd);
+ const int Len = StrLen < MAX_DETAIL_LEN ? StrLen : MAX_DETAIL_LEN;
const uint8_t *Passwd = (const uint8_t *)Pwd;
uint8_t *RevPwd;
/* Create the paths */
@@ -1704,6 +1708,11 @@ double ZxcvbnMatch(const char *Pwd, const char *UserDict[], ZxcMatch_t **Info)
/* Make e hold entropy result and adjust to log base 2 */
e = Nodes[Len].Dist / log(2.0);
+ /* guesstimate remainder by assuming the start is representative */
+ if (Len < StrLen) {
+ e *= StrLen / (double)Len;
+ }
+
if (Info)
{
/* Construct info on password parts */
|
I have implemented a character limit of 100 on the entropy calculation as you suggested, but using a different method. There doesn't seem to be one correct way of handling the excess length, I chose to include an entropy figure for them based on (password length) - 100. This allowed the test cases to pass without warning of incorrect lengths and recognizes that a 500 character password has more entropy than its first 100 characters (although the figure for the last 400 characters would be low). |
0745d23 looks like you're putting more effort into giving an estimate than I'd bother with. any secret with more than a few hundred bits of entry is beyond any feasible brute forcing, but I guess it's nice to give some sort of number back thanks! |
Concur, can you shift your PR to just incorporate this latest version of zxcvbn. Although we do "prefer" the system's version of the library... which will backfire on Linux distros (the prime target for the fix). So more thought needs to be put into place on whether we decide to fully embed the code or allow the user to choose system or internal. I prefer the former. |
Glad to see this was fixed already in your implementation. I was issuing warnings to other implementations due to some reports of crafted password attacks from users of my library. There is even an exploit script written to target libraries like this: https://github.com/twosixlabs/acsploit/blob/fd5602adf9f312482b8010abf6b4691f08929bc4/acsploit/exploits/passwords/zxcvbn.py The length limitation seems to do alright in stopping things from getting out of hand, even with specially crafted passwords you can still only get on the order of hundreds of milliseconds to make a guess (with my implementation at least). |
this algorithm gets very slow on long strings hence the original upstream version recommends truncating long inputs to 100 characters, for example see dropbox/zxcvbn#69 for downstream projects running into issues
the specific issue I have is zxcvbn being used in keepassxc, and recently running into an issue that causes the UI to become unresponsive for a minute or so. this was due to another program instructing keepass to store multiple JSON Web Tokens and keepass using zxcvbn to calculate password quality for each one
these secrets (not really "passwords") are large JWTs; which are basically base64 encoded strings which take a long time for this package to check. for example, you can create a string via the following Python code:
running
ZxcvbnMatch
on this takes multiple seconds when run withnum_bytes > 2000
, with the reported entropy being some suitably large numberThe text was updated successfully, but these errors were encountered: