Sorting descending on datefield with nanosecond-precision fails for missing values #73763
Comments
Hi, can I work on this issue? I have good hands-on experience as a backend developer in Java, expertise in Spring Boot, and would love to work on this. As a first-time contributor to open source projects, I'm looking for something interesting to start with, and this seems suitable since I have previously worked on REST APIs involving sorting, search, pagination, etc.
@priyamounica You would have my gratitude :-)
Pinging @elastic/es-search (Team:Search)
I was able to reproduce on 7.13. Note that the "format" in the query seems to cause the error, not the
This is right: for dates (which are internally represented as long) we use Long.MIN_VALUE, which is negative. Nanosecond dates cannot be negative though, so we should probably use another default in this case. In the meantime using
Note that both
@priyamounica if you want to take this one up, I'd suggest you look at the resolution of the "missing" field in
Thank you so much @EmilBode
@cbuescher Sure, can I start working on this?
Sure, please let me know if you need more pointers on what needs to be done. I already pointed out some of the cases that should be addressed and a potential idea of where a fix might be possible. I haven't checked the details yet though. Also, please include tests for the above scenarios in a PR. Let me know when you need more info.
@priyamounica I just wanted to check if you had any chance to look at this issue. It would be nice to get this fixed in the next minor release, I think. If you cannot work on it in the next few days, please let me know; I think we'd like to take it over then.
@cbuescher Please take it over
For missing values on date fields we use Long.MIN_VALUE by default. This is okay when the resolution of the field is milliseconds. For nanoseconds though, negative values can lead to IllegalArgumentExceptions when we try to format them internally. This change fixes this by explicitly setting the minimum value to 0L (which corresponds to 1970-01-01T00:00:00.000 for nanosecond resolution) when no other explicit missing value is defined and the target numeric type is a nanosecond type (this is true for nanosecond fields and when "numeric_type" is explicitly set). This way we correct the behaviour for single-typed indices and for cases where we sort across multiple indices with mixed "date" and "date_nanos" types and "numeric_type" is set in the sort definition. Closes elastic#73763
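The rule described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Elasticsearch code: the class `MissingDefaultSketch`, the enum `Resolution`, and the method `minimumMissing` are hypothetical names, and only the "minimum value" case from the message is modelled.

```java
public class MissingDefaultSketch {

    /** Hypothetical stand-in for the resolution of the target numeric type. */
    enum Resolution { MILLISECONDS, NANOSECONDS }

    /**
     * Picks the "smallest possible" placeholder for a missing date value.
     * An explicitly configured missing value always takes precedence.
     */
    static long minimumMissing(Resolution resolution, Long explicitMissing) {
        if (explicitMissing != null) {
            return explicitMissing;
        }
        // Nanosecond dates cannot be negative, so the minimum is the epoch (0L)
        // instead of Long.MIN_VALUE.
        return resolution == Resolution.NANOSECONDS ? 0L : Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(minimumMissing(Resolution.MILLISECONDS, null));
        System.out.println(minimumMissing(Resolution.NANOSECONDS, null));
        System.out.println(minimumMissing(Resolution.NANOSECONDS, 42L));
    }
}
```

With this rule, a millisecond field keeps Long.MIN_VALUE as its default, while a nanosecond target falls back to 0L unless the sort defines its own missing value.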
Elasticsearch version: 7.13.1 (newest)
Plugins installed: None
JVM version: ES builtin,
openjdk 16 2021-03-16
OpenJDK Runtime Environment AdoptOpenJDK (build 16+36)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 16+36, mixed mode, sharing)
OS version: Windows 10, 64bit, version 2004, build 19041.985
Steps to reproduce:
And the query
Expected behaviour
I'd expect both docs to be returned without errors (first doc1, then doc2 with a missing timestamp)
Actual behaviour
Observations
"missing": "_first" in the sort
"missing": "_first" in the sort

My guess would be that missing values are filled in with a value "before any possible value" (to be first for ascending order, or last in descending order), but that this subsequently fails when showing as a formatted date.
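This guess is consistent with the explanation given in the comments: the placeholder Long.MIN_VALUE is negative, and a nanosecond timestamp cannot be negative. The sketch below illustrates that failure mode using only `java.time`. It is an assumption-laden illustration: `formatNanos` is a hypothetical helper, not Elasticsearch's formatter, and the exception message is invented for the example.

```java
import java.time.Instant;

public class NanosFormatSketch {

    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    /**
     * Hypothetical helper: formats nanoseconds since the epoch as an ISO-8601 instant.
     * Negative inputs are rejected, mirroring the constraint that nanosecond dates
     * cannot lie before 1970.
     */
    static String formatNanos(long nanosSinceEpoch) {
        if (nanosSinceEpoch < 0) {
            throw new IllegalArgumentException(
                "nanoseconds [" + nanosSinceEpoch + "] are before the epoch");
        }
        return Instant.ofEpochSecond(
            nanosSinceEpoch / NANOS_PER_SECOND,
            nanosSinceEpoch % NANOS_PER_SECOND).toString();
    }

    public static void main(String[] args) {
        // The epoch itself formats fine.
        System.out.println(formatNanos(0L));
        try {
            // A "before any possible value" placeholder cannot be formatted.
            formatNanos(Long.MIN_VALUE);
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

A placeholder of 0L sorts before every valid nanosecond timestamp except the epoch itself and can still be formatted, which is why it works as a replacement default.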