Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fast parser (FDP) for large BigDecimals (500+ chars) #1157

Merged
merged 1 commit into from
Dec 8, 2023
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
130 changes: 1 addition & 129 deletions src/main/java/com/fasterxml/jackson/core/io/BigDecimalParser.java
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ public static BigDecimal parse(final char[] chars, final int off, final int len)
if (len < 500) {
return new BigDecimal(chars, off, len);
}
return parseBigDecimal(chars, off, len, len / 10);
return JavaBigDecimalParser.parseBigDecimal(chars, off, len);
Copy link
Member

@cowtowncoder cowtowncoder Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't this force use of "fast" BigDecimal parser for all cases? Looking at NumberInput.parseBigDecimal() we call parse() when StreamReadFeature.USE_FAST_BIG_NUMBER_PARSER is disabled (default), and parseWithFastParser() if enabled.

So I think this method should just become:

  return new BigDecimal(chars, off, len);

(and possibly existing try-catch block, although that is sort of questionable... except I think some calls are made from jackson-databind, from TokenBuffer f.ex so maybe that is legit).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line is preceded by an if block that says when there are less than 500 chars, return new BigDecimal(chars, off, len)

The reason why we don't want to use return new BigDecimal(chars, off, len) for 500+ chars is that this call gets very slow when you have lots of chars. JavaBigDecimalParser is faster. The custom code with the problems is not quite as fast but is still faster than return new BigDecimal(chars, off, len).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to try to keep the perf improvement that #677 brought in but just to do it with better and safer code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. So, I think what you are saying is that:

  1. Yes, now >500 character default (USE_FAST_BIG_NUMBER_PARSER = false) would go to same code as (USE_FAST_BIG_NUMBER_PARSER = true) BUT
  2. That's ok because we want to protect users against non-linear slow-down for very long legal numbers?

and I guess that makes sense even if pedantically speaking we shouldn't do that (wrt setting USE_FAST_BIG_NUMBER_PARSER). Especially since case of 500 or more characters really is very niche usage, most likely only used by malicious users.

With that, I can merge this for 2.17.


// 20-Aug-2022, tatu: Although "new BigDecimal(...)" only throws NumberFormatException
// operations by "parseBigDecimal()" can throw "ArithmeticException", so handle both:
Expand Down Expand Up @@ -83,132 +83,4 @@ public static BigDecimal parseWithFastParser(final char[] ch, final int off, fin
+ "\" can not be represented as `java.math.BigDecimal`, reason: " + nfe.getMessage());
}
}

private static BigDecimal parseBigDecimal(final char[] chars, final int off, final int len, final int splitLen) {
boolean numHasSign = false;
boolean expHasSign = false;
boolean neg = false;
int numIdx = off;
int expIdx = -1;
int dotIdx = -1;
int scale = 0;
final int endIdx = off + len;

for (int i = off; i < endIdx; i++) {
char c = chars[i];
switch (c) {
case '+':
if (expIdx >= 0) {
if (expHasSign) {
throw new NumberFormatException("Multiple signs in exponent");
}
expHasSign = true;
} else {
if (numHasSign) {
throw new NumberFormatException("Multiple signs in number");
}
numHasSign = true;
numIdx = i + 1;
}
break;
case '-':
if (expIdx >= 0) {
if (expHasSign) {
throw new NumberFormatException("Multiple signs in exponent");
}
expHasSign = true;
} else {
if (numHasSign) {
throw new NumberFormatException("Multiple signs in number");
}
numHasSign = true;
neg = true;
numIdx = i + 1;
}
break;
case 'e':
case 'E':
if (expIdx >= 0) {
throw new NumberFormatException("Multiple exponent markers");
}
expIdx = i;
break;
case '.':
if (dotIdx >= 0) {
throw new NumberFormatException("Multiple decimal points");
}
dotIdx = i;
break;
default:
if (dotIdx >= 0 && expIdx == -1) {
scale++;
}
}
}

int numEndIdx;
int exp = 0;
if (expIdx >= 0) {
numEndIdx = expIdx;
String expStr = new String(chars, expIdx + 1, endIdx - expIdx - 1);
exp = Integer.parseInt(expStr);
scale = adjustScale(scale, exp);
} else {
numEndIdx = endIdx;
}

BigDecimal res;

if (dotIdx >= 0) {
int leftLen = dotIdx - numIdx;
BigDecimal left = toBigDecimalRec(chars, numIdx, leftLen, exp, splitLen);

int rightLen = numEndIdx - dotIdx - 1;
BigDecimal right = toBigDecimalRec(chars, dotIdx + 1, rightLen, exp - rightLen, splitLen);

res = left.add(right);
} else {
res = toBigDecimalRec(chars, numIdx, numEndIdx - numIdx, exp, splitLen);
}

if (scale != 0) {
res = res.setScale(scale);
}

if (neg) {
res = res.negate();
}

return res;
}

private static int adjustScale(int scale, long exp) {
long adjScale = scale - exp;
if (adjScale > Integer.MAX_VALUE || adjScale < Integer.MIN_VALUE) {
throw new NumberFormatException(
"Scale out of range: " + adjScale + " while adjusting scale " + scale + " to exponent " + exp);
}

return (int) adjScale;
}

private static BigDecimal toBigDecimalRec(final char[] chars, final int off, final int len,
final int scale, final int splitLen) {
if (len > splitLen) {
int mid = len / 2;
BigDecimal left = toBigDecimalRec(chars, off, mid, scale + len - mid, splitLen);
BigDecimal right = toBigDecimalRec(chars, off + mid, len - mid, scale, splitLen);

return left.add(right);
}

if (len == 0) {
return BigDecimal.ZERO;
}
// 02-Apr-2023, tatu: [core#967] Looks like "scaleByPowerOfThen" avoids performance issue
// there would be with "movePointRight" (both doing about same thing), so)
return new BigDecimal(chars, off, len)
// .movePointRight(scale);
.scaleByPowerOfTen(scale);
}
}