ESQL: Comparing float/half_float to number fails if out of range #100130

Closed
alex-spies opened this issue Oct 2, 2023 · 11 comments
Labels: :Analytics/ES|QL, >bug, Team:Analytics

@alex-spies
Contributor

alex-spies commented Oct 2, 2023

Elasticsearch Version

main@c24cc0f54c216a5bff8e6b0e3ad2d09f7a3eb956

Installed Plugins

No response

Java Version

openjdk version "17.0.8" 2023-07-18

OS Version

Ubuntu 23.04

Problem Description

For an index with a float or half_float (probably also scaled_float) field, the following query fails:

from index | where field < to_double(-1.797693134862315) * pow(10.0, 307)

This throws a QueryShardException: failed to create query: [float] supports only finite values, but got [-Infinity].

The culprit is the local physical plan optimization PushFiltersToSource. We should not push filters down to Lucene if the right hand side is out of range.

Unfortunately, we cannot decide during optimization whether this is the case at the moment, because float and half_float datatypes are both interpreted as double by ESQL. We need to either keep the initial data type around in FieldAttributes or stop widening small numeric types.
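
The overflow behind the error message can be reproduced in plain Java. This is only a minimal sketch: `fitsInFloat` is a hypothetical helper (it is not part of ESQL or Elasticsearch) illustrating the kind of range check the optimizer could apply if the original field type were still known at that point.

```java
public class OutOfRangeLiteral {
    // Hypothetical helper: true if the double stays finite after narrowing to float.
    static boolean fitsInFloat(double value) {
        return Float.isFinite((float) value);
    }

    public static void main(String[] args) {
        // Same arithmetic as the right-hand side of the failing query.
        double rhs = -1.797693134862315 * Math.pow(10.0, 307);

        System.out.println(Double.isFinite(rhs)); // true: representable as a double
        System.out.println((float) rhs);          // -Infinity: narrowing overflows
        System.out.println(fitsInFloat(rhs));     // false: pushing this down would fail
    }
}
```

Since ESQL currently sees the field as double, a check like this has nothing to compare against, which is why the filter is pushed down unchanged.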

Steps to Reproduce

Run the query from above against any index that has a field named field with data type float or half_float.

Logs (if relevant)

    org.elasticsearch.client.ResponseException: method [POST], host [http://127.0.0.1:36267], URI [/_query?format=cbor&pretty=true&error_trace=true], status line [HTTP/1.1 400 Bad Request]
    ¿eerror¿jroot_cause[rest-esql-test/5_LldMDxQ6OeNFCjiIBidg] org.elasticsearch.index.query.QueryShardException: failed to create query: [float] supports only finite values, but got [-Infinity]
        at [email protected]/org.elasticsearch.index.query.SearchExecutionContext.toQuery(SearchExecutionContext.java:454)
        at org.elasticsearch.xpack.esql.planner.EsPhysicalOperationProviders.lambda$sourcePhysicalOperation$0(EsPhysicalOperationProviders.java:93)
        at org.elasticsearch.compute.lucene.LuceneOperator.lambda$weightFunction$0(LuceneOperator.java:284)
        at org.elasticsearch.compute.lucene.LuceneSliceQueue.lambda$create$0(LuceneSliceQueue.java:68)
        at org.elasticsearch.compute.lucene.LuceneOperator.getCurrentOrLoadNextScorer(LuceneOperator.java:81)
        at org.elasticsearch.compute.lucene.LuceneSourceOperator.getOutput(LuceneSourceOperator.java:131)
        at org.elasticsearch.compute.operator.Driver.runSingleLoopIteration(Driver.java:183)
        at org.elasticsearch.compute.operator.Driver.run(Driver.java:129)
        at org.elasticsearch.compute.operator.Driver$1.doRun(Driver.java:279)
        at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
        at [email protected]/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
        at [email protected]/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983)
        at [email protected]/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1583)
    Caused by: java.lang.IllegalArgumentException: [float] supports only finite values, but got [-Infinity]
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$2.validateParsed(NumberFieldMapper.java:580)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$2.parse(NumberFieldMapper.java:455)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberType$2.termQuery(NumberFieldMapper.java:478)
        at [email protected]/org.elasticsearch.index.mapper.NumberFieldMapper$NumberFieldType.termQuery(NumberFieldMapper.java:1549)
        at [email protected]/org.elasticsearch.index.query.TermQueryBuilder.doToQuery(TermQueryBuilder.java:207)
        at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:116)
        at org.elasticsearch.xpack.esql.querydsl.query.SingleValueQuery$Builder.doToQuery(SingleValueQuery.java:179)
        at [email protected]/org.elasticsearch.index.query.AbstractQueryBuilder.toQuery(AbstractQueryBuilder.java:116)
        at [email protected]/org.elasticsearch.index.query.SearchExecutionContext.toQuery(SearchExecutionContext.java:446)
        ... 15 more
@elasticsearchmachine elasticsearchmachine added the Team:QL (Deprecated) Meta label for query languages team label Oct 2, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@alex-spies alex-spies self-assigned this Oct 2, 2023
alex-spies added a commit to alex-spies/elasticsearch that referenced this issue Oct 2, 2023
Prevent pushing down filters for binary comparisons that would
implicitly compare a byte/short/int with an out of range value. This
leads to exceptions thrown by Lucene - instead, evaluate the filter in
ESQL only.

This does not cover the same problem for half_float/scaled_float/float,
see elastic#100130.

Closes elastic#99960.
@jpountz
Contributor

jpountz commented Oct 2, 2023

In my opinion this is a bug in the field mapper, not in ES|QL. I would prefer fixing the field mapper over introducing a workaround in ES|QL (which unfortunately makes things a bit more complicated, because we'll want to check whether it's a breaking change).

@alex-spies
Contributor Author

@jpountz, I think the field mapper is fine. I debugged the issue and confirmed that the data type ESQL first sees is indeed half_float, float, etc. Here in the ESQL code we deliberately interpret such fields as double and seem to discard the information that the field is actually half_float (or float, etc.).

@alex-spies
Contributor Author

Here's the relevant piece of code that turns the mapping into field attributes and "widens" data types.

@jpountz
Contributor

jpountz commented Oct 2, 2023

OK. If ES|QL has a natural way of handling this because it needs to check types anyway, I'm good with this. But if we end up working around bad behaviors in mappings, let's fix the root cause directly. :)

@alex-spies alex-spies modified the milestone: 8.12 Nov 21, 2023
@alex-spies alex-spies added this to the 8.13 milestone Dec 5, 2023
@wchaparro wchaparro added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 2, 2024
@elasticsearchmachine elasticsearchmachine removed the Team:QL (Deprecated) Meta label for query languages team label Jan 2, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@astefan
Contributor

astefan commented Feb 1, 2024

Some clarifications, from taking a look at this:

  • For an index with a float or half_float (probably also scaled_float) field, the following query fails:

This doesn't actually fail for scaled_float, so ES itself behaves inconsistently across the float family of fields in this specific case. I am not sure whether scaled_float should fail as well, or whether the other two should not fail and should simply not match any documents.

To be clearer, take a field mapped as

            "sf": {
                "type": "scaled_float",
                "scaling_factor": 10
            }

With the value "sf": 1.234567891234568, the query from test | where sf < to_double(-1.797693134862315) * pow(10.0, 307) doesn't fail as described in this issue.

  • the inconsistency in ES is also visible when trying to emulate what ESQL is doing in this query with Painless:
{
  "runtime_mappings": {
    "infinity": {
      "type": "boolean",
      "script": {
        "source": "emit(doc['sf'].value < (Double.valueOf(-1.797693134862315).doubleValue() * Math.pow(10.0, 307)))"
      }
    }
  },
  "query": {
    "match_all":{}
  },
  "fields":["infinity"],
  "_source": false
}

with a more extended mapping covering several other floating-point data types:

            "f": {
                "type":"float"
            },
            "sf": {
                "type": "scaled_float",
                "scaling_factor": 10
            },
            "hf": {
                "type":"half_float"
            },
            "d": {
                "type": "double"
            }

None of the DSL queries fail with exceptions. They all return

        "hits": [
            {
                "_index": "test",
                "_id": "87oUZI0Bm0edNwIev0__",
                "_score": 1.0,
                "fields": {
                    "infinity": [
                        false
                    ]
                }
            }
        ]
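
The expression emitted by the runtime field can be replayed in plain Java to see why it returns false without any error. This is a sketch under the assumption that the script reads the field value as a double, so the comparison never narrows the literal to float or half_float; the class and method names are illustrative only.

```java
public class PainlessComparison {
    // Mirrors the runtime field script: compares a field value against
    // the large negative literal entirely in double arithmetic.
    static boolean lessThanLiteral(double fieldValue) {
        return fieldValue < (Double.valueOf(-1.797693134862315).doubleValue() * Math.pow(10.0, 307));
    }

    public static void main(String[] args) {
        // The value indexed in the example document.
        System.out.println(lessThanLiteral(1.234567891234568)); // false
    }
}
```

The literal stays finite as a double, so the comparison is well defined and simply evaluates to false for any ordinary field value.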

At this point, I am not sure what needs fixing, or where. Maybe it would be good to show ESQL users a warning telling them that a math operation doesn't make sense, but at the same time Elasticsearch itself is inconsistent in how it approaches such impossible math operations.

@jpountz thank you for initially chiming in. Would you mind having a second look at my statement here to see whether it makes sense from the ES side of things?

@alex-spies
Contributor Author

Heya, I need to go back on my statement that I think the field mapper is fine. Thanks again @jpountz for bringing up this angle as well.

The behavior can be reproduced in ES directly, without going through ES|QL, by simply running a range query on e.g. a half_float field (which is what ES|QL does behind the scenes).

POST /_search
{"query": {"range": {"half_float_field": {"lte": 1E300}}}}

->
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to create query: [half_float] supports only finite values, but got [Infinity]" ...

This originates here, while parsing the range query for HALF_FLOAT.

So, rather than working around this issue in ES|QL, we could indeed make range queries more permissive, although that would change the behavior of _search.
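
To make "more permissive" concrete, here is one possible shape such a check could take, sketched in plain Java. Everything in it is an assumption for illustration: `classifyLte`, the `Outcome` enum and the class name are hypothetical and are not taken from NumberFieldMapper. The only external fact used is that the largest finite half_float (IEEE 754 binary16) value is 65504.

```java
public class PermissiveBound {
    // Largest finite half_float (IEEE 754 binary16) value.
    static final double HALF_FLOAT_MAX = 65504.0;

    enum Outcome { MATCH_ALL, MATCH_NONE, IN_RANGE }

    // Hypothetical: decide how an "lte" bound relates to the finite range
    // [-typeMax, typeMax] of the field type instead of rejecting it.
    static Outcome classifyLte(double bound, double typeMax) {
        if (Double.isNaN(bound)) {
            throw new IllegalArgumentException("bound must not be NaN");
        }
        if (bound > typeMax) {
            return Outcome.MATCH_ALL;  // every stored value is <= bound
        }
        if (bound < -typeMax) {
            return Outcome.MATCH_NONE; // no stored value can be <= bound
        }
        return Outcome.IN_RANGE;       // build the usual range query
    }

    public static void main(String[] args) {
        // The bound from the failing _search request above.
        System.out.println(classifyLte(1E300, HALF_FLOAT_MAX)); // MATCH_ALL
    }
}
```

Classifying the bound rather than clamping it avoids a subtle error: clamping -1E300 to -65504 would wrongly match a document storing exactly -65504.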

@costin
Member

costin commented Feb 2, 2024

@alex-spies please raise an issue in ES replicating the problem and let's close this issue for now as it's not clear whether there's anything in ESQL to fix to begin with.

@alex-spies
Contributor Author

Superseded by #105079

@alex-spies alex-spies reopened this Feb 2, 2024
@alex-spies alex-spies closed this as not planned Feb 2, 2024