Fix a DST error in date_histogram #52016
Conversation
When `date_histogram` attempts to optimize itself for a particular time zone it checks whether the entire shard is within the same "transition". Most time zones transition once every six months or thereabouts, so the optimization usually kicks in. *But* it crashes when you attempt to feed it a time zone whose last DST transition was before epoch.

The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is that they are then converted to `String`s which are *then* parsed back to `Instant`s which are then converted to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative.

This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long`s directly into the `DateFieldType`. Sadly, this means that `MappedFieldType` gets a `long`-specialized implementation of `isFieldWithinQuery`, which is a little lame, but given that we always call it with a `long` in this context it feels like the lesser of two evils. Not because it is more efficient. It is, a little. No. This is the less evil way to solve this problem because it is less confusing to trace the execution. The parsing code is fairly complex and we can just skip it entirely because we already *have* longs.

Closes elastic#50265
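For the curious, here is a minimal stand-in for the broken round trip. This is illustrative `java.time` code, not Elasticsearch's actual parsing path (which lives in `DateFieldType` and its date parsers), and the pre-epoch timestamp is made up:

```java
import java.time.Instant;

// Illustrative sketch only: a strict date parser standing in for the real one.
public class CoercionChainSketch {
    // Old path (simplified): long -> Long -> String -> Instant -> long.
    static long roundTripViaString(long millis) {
        String asString = Long.toString(millis); // e.g. "-1577923200000"
        // A parser that expects a formatted date rejects a raw negative epoch
        // string outright; this is the shape of the crash described above.
        Instant parsed = Instant.parse(asString); // throws DateTimeParseException
        return parsed.toEpochMilli();
    }

    public static void main(String[] args) {
        long preEpochMillis = -1577923200000L; // a timestamp around 1920, before epoch
        System.out.println(Instant.ofEpochMilli(preEpochMillis)); // fine, no parsing
        roundTripViaString(preEpochMillis); // blows up in the parser
    }
}
```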
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
LGTM. Nice work finding this one. I like the solution; not going through some weird coercion chain seems like a win.
* Use this instead of
* {@link #isFieldWithinQuery(IndexReader, Object, Object, boolean, boolean, ZoneId, DateMathParser, QueryRewriteContext)}
* when you *know* the {@code fromInclusive} and {@code toInclusive} are always
* floats.
I think you mean they're always longs?
+1
I'd really like to avoid adding the method to MappedFieldType; can we have it only live in DateFieldType and do an instanceof check in the aggregator, and disable the optimization if the instanceof check fails? I think we also need javadocs to clarify that the longs are a number of millis, which is not obvious at all if the field is a date_millis. By the way, we might want to have an explicit test for date_millis to make sure it does the right thing with this new method?
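(A boiled-down sketch of the pattern being suggested here; the class shapes are illustrative stand-ins, not Elasticsearch source:)

```java
// Stand-in types: the real MappedFieldType and DateFieldType are far richer.
class MappedFieldType {
    // No long-specialized isFieldWithinQuery here.
}

class DateFieldType extends MappedFieldType {
    /** Bounds are epoch millis, even for nanosecond-resolution fields. */
    boolean isFieldWithinQuery(long fromMillis, long toMillis) {
        return fromMillis <= toMillis; // stand-in for the real shard-bounds check
    }
}

class DateHistogramSketch {
    static boolean canOptimize(MappedFieldType fieldType, long from, long to) {
        if (fieldType instanceof DateFieldType) {
            return ((DateFieldType) fieldType).isFieldWithinQuery(from, to);
        }
        return false; // unknown field type: skip the optimization entirely
    }
}
```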
I can do all those things. I'm not a fan of
Good call on the
@jpountz, I've added something to properly handle nanos. It feels properly paranoid to me now.
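(A guess at the kind of nanos paranoia meant here, sketched with made-up names rather than the PR's actual code: a `long` of nanoseconds only spans roughly 1677–2262, so millisecond bounds have to be clamped rather than multiplied blindly:)

```java
// Hypothetical sketch, not the PR's code: widening millisecond bounds for a
// nanosecond-resolution field without overflowing the nanosecond long.
final class NanosBoundsSketch {
    static long millisToNanosSaturated(long millis) {
        if (millis > Long.MAX_VALUE / 1_000_000L) {
            return Long.MAX_VALUE; // after ~2262: clamp instead of overflowing
        }
        if (millis < Long.MIN_VALUE / 1_000_000L) {
            return Long.MIN_VALUE; // before ~1677: clamp instead of overflowing
        }
        return millis * 1_000_000L;
    }

    public static void main(String[] args) {
        System.out.println(millisToNanosSaturated(-1577923200000L)); // pre-epoch, exact
        System.out.println(millisToNanosSaturated(Long.MAX_VALUE));  // clamps to MAX
    }
}
```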
@elasticmachine run elasticsearch-ci/packaging-sample-matrix-unix
Thanks Nik, this looks good to me now. As you guessed, I meant date_nanos, not date_millis. :) I also like the way you refactored the rewrite of the time zone, I find it easier to read now.
Thanks @not-napoleon and @jpountz!
When `date_histogram` attempts to optimize itself for a particular time zone it checks whether the entire shard is within the same "transition". Most time zones transition once every six months or thereabouts, so the optimization usually kicks in. *But* it crashes when you attempt to feed it a time zone whose last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is that they are then converted to `String`s which are *then* parsed back to `Instant`s which are then converted to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change replaces the `long` -> `Long` -> `String` -> `Instant` -> `long` chain with a direct `long` -> `Instant` -> `long` conversion, which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`.

Closes #50265
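The replacement round trip is easy to demonstrate with a standalone `java.time` sketch (the timestamp is made up; this mirrors the idea, not the PR's exact code):

```java
import java.time.Instant;

// The parser-free path: epoch millis -> Instant -> epoch millis.
// Instant.ofEpochMilli handles negative (pre-epoch) values without parsing.
public class DirectConversionSketch {
    public static void main(String[] args) {
        long preEpochMillis = -1577923200000L; // around 1920, before epoch
        Instant instant = Instant.ofEpochMilli(preEpochMillis);
        long roundTripped = instant.toEpochMilli();
        assert roundTripped == preEpochMillis; // lossless, no String in sight
        System.out.println(instant + " -> " + roundTripped);
    }
}
```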
Now that we've backported #52016 we can run its tests when we're performing backwards compatibility testing.