-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
date_histogram parse_exception in time_zone America/Phoenix #50265
Labels
Comments
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
1 task
We've reproduced this with timezone Changing to Changing to zone
|
nik9000
added a commit
to nik9000/elasticsearch
that referenced
this issue
Feb 6, 2020
When `date_histogram` attempts to optimize itself it for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zone transition once every size months or thereabouts so the optimization can usually kicks in. *But* it crashes when you attempt feed it a time zone who's last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are *then* parsed back to `Instant`s which are then convertd to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long`s directly into the `DateFieldType`. Sadly, this means that `MappedFieldType` gets a `long` specialized implementation of `isFieldWithinQuery` which is a little lame, but given that we always call it with a long in this context it feels like the lesser of two evils. Not because it is more efficient. It is a little. No. This is the less evil way to solve this problem because it is less confusing to trace the execution. The parsing code is fairly complex and we can just skip it entirely because we already *have* longs. Closes elastic#50265
nik9000
added a commit
that referenced
this issue
Feb 11, 2020
When `date_histogram` attempts to optimize itself it for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zone transition once every size months or thereabouts so the optimization can usually kicks in. *But* it crashes when you attempt feed it a time zone who's last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are *then* parsed back to `Instant`s which are then convertd to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long` -> `Instant` -> `long` which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`. Closes #50265
nik9000
added a commit
to nik9000/elasticsearch
that referenced
this issue
Feb 11, 2020
When `date_histogram` attempts to optimize itself it for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zone transition once every size months or thereabouts so the optimization can usually kicks in. *But* it crashes when you attempt feed it a time zone who's last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are *then* parsed back to `Instant`s which are then convertd to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long` -> `Instant` -> `long` which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`. Closes elastic#50265
nik9000
added a commit
that referenced
this issue
Feb 12, 2020
When `date_histogram` attempts to optimize itself it for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zone transition once every size months or thereabouts so the optimization can usually kicks in. *But* it crashes when you attempt feed it a time zone who's last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are *then* parsed back to `Instant`s which are then convertd to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long` -> `Instant` -> `long` which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`. Closes #50265
nik9000
added a commit
to nik9000/elasticsearch
that referenced
this issue
Feb 12, 2020
When `date_histogram` attempts to optimize itself it for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zone transition once every size months or thereabouts so the optimization can usually kicks in. *But* it crashes when you attempt feed it a time zone who's last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are *then* parsed back to `Instant`s which are then convertd to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long` -> `Instant` -> `long` which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`. Closes elastic#50265
nik9000
added a commit
that referenced
this issue
Feb 13, 2020
When `date_histogram` attempts to optimize itself it for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zone transition once every size months or thereabouts so the optimization can usually kicks in. *But* it crashes when you attempt feed it a time zone who's last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are *then* parsed back to `Instant`s which are then convertd to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long` -> `Instant` -> `long` which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`. Closes #50265
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Elasticsearch version (
bin/elasticsearch --version
): 8.0.0-SNAPSHOTPlugins installed: []
JVM version (
java -version
): 13.0.1OS version (
uname -a
if on a Unix-like system): Darwin Kernel Version 18.7.0Description of the problem including expected versus actual behavior:
A
date_histogram
with afixed_interval
range of1ms
to999ms
in the timezoneAmerica/Phoenix
causes a "parse_exception". In other timezones e.g.Europe/Berlin
it worksSteps to reproduce:
Here's the error message of the result:
Provide logs (if relevant):
The text was updated successfully, but these errors were encountered: