Numbers of the original document lose precision if the query contains excludes #46261

Schwaller · 2019-09-03T11:40:42Z

Elasticsearch version:
Any up to 7.3.1 (tested with ES 7.0.1 and ES 7.3.1)

Plugins installed:
None

JVM version:
Any JVM

OS version:
Any OS (tested on Windows 10 and macOS Mojave)

Description of the problem including expected versus actual behavior:
When a document is indexed with floating-point numbers that exceed the precision of a double, the precision of the numbers in the returned document varies. If no excludes are provided, the document is returned with the original numbers (full precision). If excludes are provided, the document's numbers are re-encoded with double precision (IEEE 754). This causes problems when comparing messages, and it is confusing that the original data is only available under certain circumstances.

Steps to reproduce:
Step 1: Add a document to the ES index with a floating-point value such as 2.00000000000000000000000000000001.
Step 2: Retrieve that document through the REST interface (the full precision is returned).
Step 3: Retrieve that document through the REST interface with an exclude filter on an existing or non-existent field (the number field in the returned document will be 2 instead of 2.00000000000000000000000000000001).

Tested with ES 7.0.1 and ES 7.3.1.

Cause:
XContentParser reads floating-point values as Double instead of BigDecimal and offers no way to change this behaviour through Jackson configuration.
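To illustrate the difference outside of Elasticsearch, here is a minimal sketch using only the JDK. The class and method names (`PrecisionDemo`, `viaDouble`, `viaBigDecimal`) are made up for this example and are not part of XContentParser; the sketch only shows what happens to the literal from the reproduction steps when it goes through a double versus a BigDecimal.

```java
import java.math.BigDecimal;

public class PrecisionDemo {
    // Parse a JSON number literal as an IEEE 754 double and print it back.
    // Digits beyond double precision are lost in this round trip.
    static String viaDouble(String literal) {
        return Double.toString(Double.parseDouble(literal));
    }

    // Parse the same literal as an arbitrary-precision decimal.
    // All digits of the literal are retained.
    static String viaBigDecimal(String literal) {
        return new BigDecimal(literal).toPlainString();
    }

    public static void main(String[] args) {
        String literal = "2.00000000000000000000000000000001";
        System.out.println(viaDouble(literal));     // prints 2.0
        System.out.println(viaBigDecimal(literal)); // prints the literal unchanged
    }
}
```

The `2.0` printed by the first call matches what the filtered response returns, which is consistent with the value passing through a double somewhere during re-encoding.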

Schwaller · 2019-09-03T11:45:08Z

Similar issues dating back to 2012 exist, and it seems to be known that the XContentParser "needs some love". Sadly, nothing has happened so far, and I don't see an easy way to provide my own XContentParser implementation (the interface returns double, which practically rules out any workaround). Also, wouldn't it be much faster to keep the original JSON elements and just drop the unwanted ones instead of fully re-encoding everything?
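A rough sketch of what "keep the original elements and drop the unwanted ones" could look like, using only the JDK. This is not how Elasticsearch filters `_source`; `RawFieldFilter` and `excludeTopLevel` are hypothetical names, and the sketch assumes well-formed JSON, handles top-level fields only, and assumes keys contain no escape sequences. The point is that values are copied as raw text, so number literals are never parsed.

```java
import java.util.Set;

public class RawFieldFilter {
    // Skip whitespace starting at index i.
    static int skipWs(String s, int i) {
        while (i < s.length() && Character.isWhitespace(s.charAt(i))) i++;
        return i;
    }

    // Return the index just past the string whose opening quote is at i.
    static int skipString(String s, int i) {
        i++; // opening quote
        while (s.charAt(i) != '"') {
            if (s.charAt(i) == '\\') i++; // skip the escaped character
            i++;
        }
        return i + 1;
    }

    // Return the index just past the JSON value starting at i.
    static int skipValue(String s, int i) {
        char c = s.charAt(i);
        if (c == '"') return skipString(s, i);
        if (c == '{' || c == '[') {
            int depth = 0;
            while (i < s.length()) {
                c = s.charAt(i);
                if (c == '"') { i = skipString(s, i); continue; }
                if (c == '{' || c == '[') depth++;
                if (c == '}' || c == ']') { depth--; if (depth == 0) return i + 1; }
                i++;
            }
            throw new IllegalArgumentException("unterminated object or array");
        }
        // Number, true, false or null: runs until a delimiter.
        while (i < s.length() && ",}] \t\r\n".indexOf(s.charAt(i)) < 0) i++;
        return i;
    }

    // Copy a top-level JSON object, omitting the excluded keys.
    // Values are copied verbatim, so number literals keep every digit.
    static String excludeTopLevel(String json, Set<String> excludes) {
        StringBuilder out = new StringBuilder("{");
        int i = skipWs(json, 0);
        if (json.charAt(i) != '{') throw new IllegalArgumentException("expected an object");
        i = skipWs(json, i + 1);
        boolean first = true;
        while (json.charAt(i) != '}') {
            int keyEnd = skipString(json, i);
            String key = json.substring(i + 1, keyEnd - 1);
            i = skipWs(json, keyEnd);
            if (json.charAt(i) != ':') throw new IllegalArgumentException("expected ':'");
            int valStart = skipWs(json, i + 1);
            int valEnd = skipValue(json, valStart);
            if (!excludes.contains(key)) {
                if (!first) out.append(',');
                out.append('"').append(key).append("\":").append(json, valStart, valEnd);
                first = false;
            }
            i = skipWs(json, valEnd);
            if (json.charAt(i) == ',') i = skipWs(json, i + 1);
        }
        return out.append('}').toString();
    }

    public static void main(String[] args) {
        String doc = "{\"number\": 2.00000000000000000000000000000001, \"other\": \"x\"}";
        System.out.println(excludeTopLevel(doc, Set.of("other")));
    }
}
```

Whether this would actually be faster than re-encoding is an open question; the sketch only shows that filtering does not require interpreting the numbers at all.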

martijnvg · 2019-09-03T13:14:52Z

Thanks for reporting this bug, @Schwaller. XContent filtering should not change the precision of floating-point values.

elasticmachine · 2019-09-03T13:18:55Z

Pinging @elastic/es-core-infra

martijnvg · 2019-09-03T13:31:06Z

Quick reproduction:

PUT test/_doc/1
{
    "number": 2.00000000000000000000000000000001
}

curl 'http://localhost:9200/test/_doc/1?filter_path=_source.number'

Returns:

{"_source":{"number":2.0}}

Whereas:

curl 'http://localhost:9200/test/_doc/1'

Returns:

{"_index":"test","_type":"_doc","_id":"1","_version":1,"_seq_no":0,"_primary_term":1,"found":true,"_source":{
    "number": 2.00000000000000000000000000000001
}}

Pretty printing also changes the precision of floating-point numbers:

curl 'http://localhost:9200/test/_doc/1?pretty'

Returns:

{
  "_index" : "test",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "number" : 2.0
  }
}

This is why the Dev Console and other tools will never show the original floating-point number.

Schwaller · 2019-09-03T14:09:41Z

Thank you for taking this seriously! It would be great to get a solution for this :)

martijnvg added the >bug and :Core/Infra/REST API labels Sep 3, 2019

Schwaller changed the title ~~Numbers of the original document loose precision if the query contains excludes~~ Numbers of the original document lose precision if the query contains excludes Sep 3, 2019

jpountz linked a pull request Sep 10, 2019 that will close this issue

XContentParser shouldn't lose data from floating-point numbers. #46531

Open

rjernst added the Team:Core/Infra label May 4, 2020

rjernst added the needs:triage label Dec 3, 2020

jaymode assigned jpountz Dec 14, 2020

jaymode removed the needs:triage label Dec 14, 2020
