
Fix a DST error in date_histogram #52016

Merged
merged 6 commits into from
Feb 11, 2020

Conversation

nik9000
Member

@nik9000 nik9000 commented Feb 6, 2020

When `date_histogram` attempts to optimize itself for a particular
time zone it checks whether the entire shard falls within the same
"transition". Most time zones transition once every six months or
thereabouts, so the optimization can usually kick in.

But it crashes when you feed it a time zone whose last DST
transition was before the epoch. The reason for this is a little twisted:
before this patch it'd find the next and previous transitions in
milliseconds since the epoch. Then it'd cast them to `Long`s and pass them
into the `DateFieldType` to check whether the shard's contents were within
the range. The trouble is they are then converted to `String`s which are
then parsed back to `Instant`s which are then converted to `long`s. And
the parser doesn't like most negative numbers. And everything before
the epoch is negative.

This change removes the
`long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of
passing the `long`s directly into the `DateFieldType`. Sadly, this means
that `MappedFieldType` gets a `long`-specialized implementation of
`isFieldWithinQuery`, which is a little lame, but given that we always
call it with a `long` in this context it feels like the lesser of two
evils. Not because it is more efficient (it is, a little) but because
it is less confusing to trace the execution. The parsing code is fairly
complex and we can simply skip it because we already have `long`s.

Closes #50265
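The failure mode can be reproduced in plain Java (a standalone sketch, not Elasticsearch code): America/Phoenix, the zone from the linked issue, last observed DST in 1967, so the transition before any recent instant lands before the epoch and its millisecond timestamp is negative.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.zone.ZoneOffsetTransition;
import java.time.zone.ZoneRules;

public class PreEpochTransition {
    public static void main(String[] args) {
        ZoneRules rules = ZoneId.of("America/Phoenix").getRules();
        // Arizona stopped observing DST in 1967, so the transition before
        // "now" is pre-epoch and its epoch-millis value is negative.
        ZoneOffsetTransition previous = rules.previousTransition(Instant.now());
        long millis = previous.getInstant().toEpochMilli();
        System.out.println(millis < 0); // true: this is the kind of value
                                        // the old String round trip choked on
    }
}
```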

@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@nik9000 nik9000 requested review from jpountz and abdonpijpelink and removed request for abdonpijpelink February 6, 2020 21:44
Member

@not-napoleon not-napoleon left a comment


LGTM. Nice work finding this one. I like the solution, not going through some weird coercion chain seems like a win.

* Use this instead of
* {@link #isFieldWithinQuery(IndexReader, Object, Object, boolean, boolean, ZoneId, DateMathParser, QueryRewriteContext)}
* when you *know* the {@code fromInclusive} and {@code toInclusive} are always
* floats.
Member


I think you mean they're always longs?

Member Author


+1

@nik9000
Member Author

nik9000 commented Feb 6, 2020

LGTM. Nice work finding this one. I like the solution, not going through some weird coercion chain seems like a win.

@jimczi actually found it! He just didn't get a chance to fix it before going on a long trip. He did mention that @jpountz may want to look at it so I added him as a reviewer too.

Contributor

@jpountz jpountz left a comment


I'd really like to avoid adding the method to MappedFieldType, can we have it only live in DateFieldType and do an instanceof check in the aggregator, and disable the optimization if the instanceof check fails? I think we also need javadocs to clarify that the longs are a number of millis, which is not obvious at all if the field is a date_millis. By the way we might want to have an explicit test for date_millis to make sure it does the right thing with this new method?

@nik9000
Member Author

nik9000 commented Feb 10, 2020

I'd really like to avoid adding the method to MappedFieldType, can we have it only live in DateFieldType and do an instanceof check in the aggregator, and disable the optimization if the instanceof check fails? I think we also need javadocs to clarify that the longs are a number of millis, which is not obvious at all if the field is a date_millis. By the way we might want to have an explicit test for date_millis to make sure it does the right thing with this new method?

I can do all those things. I'm not a fan of instanceof for this sort of thing but if you feel like it is better I'm happy to do it. I agree having a method that only makes sense for dates on MappedFieldType is a bit strange.
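The suggested shape can be sketched like this (a minimal stand-in for illustration only: the class names mirror Elasticsearch's, but these bodies and the `isFieldWithinRange`/`canOptimize` methods are invented for the sketch, not the real API):

```java
// Stand-ins for the real Elasticsearch types; only the shape matters here.
abstract class MappedFieldType {
    // No long-specialized method on the base type.
}

class DateFieldType extends MappedFieldType {
    // Bounds are epoch milliseconds, as the requested javadocs would state.
    boolean isFieldWithinRange(long fromMillis, long toMillis) {
        return fromMillis <= toMillis; // placeholder for the real shard check
    }
}

class DateHistogramAggregator {
    static boolean canOptimize(MappedFieldType fieldType, long from, long to) {
        // instanceof gate: only date fields take the long-based fast path;
        // any other field type simply skips the optimization.
        if (fieldType instanceof DateFieldType) {
            return ((DateFieldType) fieldType).isFieldWithinRange(from, to);
        }
        return false;
    }
}
```

The trade-off discussed above is visible here: the base type stays clean, at the cost of a type check at the call site.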

@nik9000
Member Author

nik9000 commented Feb 10, 2020

Good call on the date_nanos test, and the resolution of the interface.

@nik9000
Member Author

nik9000 commented Feb 11, 2020

@jpountz, I've added something to properly handle nanos. It feels properly paranoid to me now.

@nik9000
Member Author

nik9000 commented Feb 11, 2020

@elasticmachine run elasticsearch-ci/packaging-sample-matrix-unix

Contributor

@jpountz jpountz left a comment


Thanks Nik, this looks good to me now. As you guessed, I meant date_nanos, not date_millis. :) I also like the way you refactored the rewrite of the time zone, I find it easier to read now.

@nik9000
Member Author

nik9000 commented Feb 11, 2020

I also like the way you refactored the rewrite of the time zone, I find it easier to read now.

@nik9000
Member Author

nik9000 commented Feb 11, 2020

Thanks @not-napoleon and @jpountz !

@nik9000 nik9000 merged commit da2b67d into elastic:master Feb 11, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Feb 11, 2020
When `date_histogram` attempts to optimize itself for a particular
time zone it checks whether the entire shard falls within the same
"transition". Most time zones transition once every six months or
thereabouts, so the optimization can usually kick in.

*But* it crashes when you feed it a time zone whose last DST
transition was before the epoch. The reason for this is a little twisted:
before this patch it'd find the next and previous transitions in
milliseconds since the epoch. Then it'd cast them to `Long`s and pass them
into the `DateFieldType` to check whether the shard's contents were within
the range. The trouble is they are then converted to `String`s which are
*then* parsed back to `Instant`s which are then converted to `long`s. And
the parser doesn't like most negative numbers. And everything before
the epoch is negative.

This change replaces the
`long` -> `Long` -> `String` -> `Instant` -> `long` chain with a
`long` -> `Instant` -> `long` path, which avoids the fairly complex
parsing code and handles a bunch of interesting edge cases around
the epoch, and other edge cases around `date_nanos`.

Closes elastic#50265
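The replacement chain can be shown in plain Java (a hypothetical standalone illustration under the assumption that the bound is epoch milliseconds, not the actual Elasticsearch code):

```java
import java.time.Instant;

public class NumericRoundTrip {
    // long -> Instant -> long: no String formatting or parsing in between,
    // so pre-epoch (negative) millisecond values survive unchanged.
    static long roundTrip(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toEpochMilli();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(-86400000L)); // -86400000
    }
}
```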
nik9000 added a commit that referenced this pull request Feb 12, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Feb 12, 2020
nik9000 added a commit that referenced this pull request Feb 13, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Feb 13, 2020
Now that we've backported elastic#52016 we can run its tests when we're
performing backwards compatibility testing.
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Feb 13, 2020
nik9000 added a commit that referenced this pull request Feb 13, 2020
nik9000 added a commit that referenced this pull request Feb 13, 2020
Successfully merging this pull request may close these issues.

date_histogram parse_exception in time_zone America/Phoenix
5 participants