
Feeding non-numeric data into a long field may consume excessive CPU #40323

Closed
DaveCTurner opened this issue Mar 21, 2019 · 1 comment · Fixed by #40325
Assignees
DaveCTurner
Labels
>bug · :Search Foundations/Mapping · Team:Search Foundations · v6.3.1 · v6.4.0 · v6.6.2

Comments

@DaveCTurner
Contributor

DaveCTurner commented Mar 21, 2019

I've seen a couple of cases where all the write threads are heroically trying to parse some very large numbers on the way into a long field, consuming far too much CPU in the process. The hot threads output reported many threads stuck doing BigInteger shenanigans such as this:

       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2053)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2051)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2051)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.squareToomCook3(BigInteger.java:2053)
       java.math.BigInteger.square(BigInteger.java:1899)
       java.math.BigInteger.pow(BigInteger.java:2306)
       java.math.BigDecimal.bigTenToThe(BigDecimal.java:3543)
       java.math.BigDecimal.bigMultiplyPowerTen(BigDecimal.java:3676)
       java.math.BigDecimal.setScale(BigDecimal.java:2445)
       java.math.BigDecimal.toBigInteger(BigDecimal.java:3025)
       org.elasticsearch.common.xcontent.support.AbstractXContentParser.toLong(AbstractXContentParser.java:195)
       org.elasticsearch.common.xcontent.support.AbstractXContentParser.longValue(AbstractXContentParser.java:220)
       org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:679)
       org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$7.parse(NumberFieldMapper.java:655)
       org.elasticsearch.index.mapper.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:996)
       org.elasticsearch.index.mapper.FieldMapper.parse(FieldMapper.java:297)

One example was at https://discuss.elastic.co/t/high-cpu-usage-in-elasticsearch-nodes/161504 and another was with a customer.

We fall back to BigInteger if a naive conversion to long fails, for instance when the value uses scientific notation, but then reject the result if it doesn't fit into a long:

private static long toLong(String stringValue, boolean coerce) {
    try {
        return Long.parseLong(stringValue);
    } catch (NumberFormatException e) {
        // we will try again with BigDecimal
    }
    final BigInteger bigIntegerValue;
    try {
        BigDecimal bigDecimalValue = new BigDecimal(stringValue);
        bigIntegerValue = coerce ? bigDecimalValue.toBigInteger() : bigDecimalValue.toBigIntegerExact();
    } catch (ArithmeticException e) {
        throw new IllegalArgumentException("Value [" + stringValue + "] has a decimal part");
    } catch (NumberFormatException e) {
        throw new IllegalArgumentException("For input string: \"" + stringValue + "\"");
    }
    if (bigIntegerValue.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0 ||
            bigIntegerValue.compareTo(BigInteger.valueOf(Long.MIN_VALUE)) < 0) {
        throw new IllegalArgumentException("Value [" + stringValue + "] is out of range for a long");
    }
    return bigIntegerValue.longValue();
}

This means that if you try to insert a short string such as "1e99999999" into a long field then we spend a lot of resources converting it into a very large BigInteger before deciding it's too big. I think we should not try so hard to parse values like "1e99999999" as a long.
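
For illustration, here is a minimal standalone repro of the cost (my sketch, not part of the original report; the class name is made up and timings will vary by JDK and hardware):

import java.math.BigDecimal;

// Hypothetical repro class, not from the issue: demonstrates that a
// ten-character input forces an enormous BigInteger to be built before
// the out-of-range check can reject it.
public class LongParseCost {
    public static void main(String[] args) {
        String input = "1e99999999";
        long start = System.nanoTime();
        // toBigInteger() materializes 10^99999999, an integer with
        // roughly 100 million decimal digits -- exactly the
        // BigInteger.pow/square work visible in the hot threads above.
        new BigDecimal(input).toBigInteger();
        System.out.printf("converting %s took %d ms%n",
                input, (System.nanoTime() - start) / 1_000_000);
    }
}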

@DaveCTurner DaveCTurner added >bug, :Search Foundations/Mapping, v6.4.0, v6.3.1, v6.6.2 labels Mar 21, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-search

@DaveCTurner DaveCTurner self-assigned this Mar 21, 2019
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this issue Mar 21, 2019
Today if you try and insert a very large number like `1e9999999` into a long
field we first construct this number as a `BigDecimal`, convert this to a
`BigInteger` and then reject it because it is out of range. Unfortunately
making such a large `BigInteger` is rather expensive.

We can avoid this expense by performing a (weaker) range check on the
`BigDecimal` representation of incoming `long`s too.

Relates elastic#26137
Closes elastic#40323
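
As a rough sketch of the idea in that commit message (the method name and exact bound handling below are my assumptions, not the code from #40325):

import java.math.BigDecimal;

// Sketch only, not the actual patch: reject values that are clearly
// outside the long range while they are still BigDecimals, before
// paying for a BigInteger conversion.
static boolean clearlyOutOfLongRange(BigDecimal bigDecimalValue) {
    // BigDecimal.compareTo first compares magnitudes via scale and
    // precision, so it can reject 1e99999999 without ever expanding it
    // into a hundred-million-digit integer the way toBigInteger() does.
    return bigDecimalValue.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) > 0
            || bigDecimalValue.compareTo(BigDecimal.valueOf(Long.MIN_VALUE)) < 0;
}

Calling such a guard before toBigInteger()/toBigIntegerExact() in the toLong() shown above turns the pathological case into a cheap comparison, while the existing BigInteger range check can remain as a backstop.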
DaveCTurner added a commit that referenced this issue Mar 28, 2019 (same commit message as above)
DaveCTurner added a commit that referenced this issue Mar 28, 2019 (same commit message as above)
DaveCTurner added a commit that referenced this issue Mar 28, 2019 (same commit message as above)
DaveCTurner added a commit that referenced this issue Apr 5, 2019 (same commit message as above)
@javanna javanna added the Team:Search Foundations label Jul 16, 2024