Date/Time parsing: Use java time API instead of exception handling #37222

Merged: 14 commits merged into elastic:master on Jan 11, 2019

spinscale
Contributor

@spinscale spinscale commented Jan 8, 2019

When several formatters are used, the existing way of parsing them is
to throw an exception, catch it, and try the next one. This is
considerably slower than the approach taken in Joda-Time, so
indexing throughput is reduced when a date format like `x||y` is used and `y` is the
format the dates actually match.

This commit now uses the java API to parse the date by appending the
date time formatters to each other and does not rely on exception
handling.
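
To illustrate the difference, here is a minimal sketch of both strategies using only `java.time`. It assumes the formatters are combined with `DateTimeFormatterBuilder.appendOptional`, which is one way to append formatters to each other; the class name, method names, and date patterns are made up for this example and are not taken from the patch.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical example class, not part of Elasticsearch.
public class CombinedDateParsing {

    // Exception-driven approach: every formatter that does not match
    // costs one thrown and caught DateTimeParseException.
    static LocalDate parseWithExceptions(List<DateTimeFormatter> formatters, String input) {
        if (formatters.isEmpty()) {
            throw new IllegalArgumentException("at least one formatter is required");
        }
        DateTimeParseException last = null;
        for (DateTimeFormatter formatter : formatters) {
            try {
                return LocalDate.parse(input, formatter);
            } catch (DateTimeParseException e) {
                last = e; // remember the failure and try the next formatter
            }
        }
        throw last;
    }

    // Combined approach (assumption: appendOptional is used to chain the formatters).
    // Each formatter becomes an optional section; a section that does not match
    // is skipped inside the parser without throwing.
    static DateTimeFormatter combine(List<DateTimeFormatter> formatters) {
        DateTimeFormatterBuilder builder = new DateTimeFormatterBuilder();
        for (DateTimeFormatter formatter : formatters) {
            builder.appendOptional(formatter);
        }
        return builder.toFormatter(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Stand-ins for a format like x||y where the input matches y.
        List<DateTimeFormatter> formatters = Arrays.asList(
            DateTimeFormatter.ofPattern("uuuu/MM/dd", Locale.ROOT),
            DateTimeFormatter.ofPattern("uuuu-MM-dd", Locale.ROOT));

        String input = "2019-01-08";
        LocalDate viaExceptions = parseWithExceptions(formatters, input);
        LocalDate viaCombined = LocalDate.parse(input, combine(formatters));

        System.out.println(viaExceptions.equals(viaCombined));
    }
}
```

In this sketch an input that matches none of the formats still fails with a single `DateTimeParseException`, but a miss on the first format no longer costs an exception of its own. One caveat of optional sections is that an earlier formatter matching only a prefix of the input can consume it and leave unparsed text behind, so the order of the formatters matters.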

This also removes the MergedDateFormatter class, as this kind of
work can now be done before the class is created.

The stats of the added benchmark before this patch:

Benchmark                             Mode  Cnt     Score     Error  Units
DateFormatterBenchmark.parseJavaDate  avgt   30  5638,653 ± 135,143  ns/op
DateFormatterBenchmark.parseJodaDate  avgt   30  1094,825 ±   8,526  ns/op

With this patch:

Benchmark                             Mode  Cnt     Score   Error  Units
DateFormatterBenchmark.parseJavaDate  avgt   30   813,562 ± 6,450  ns/op
DateFormatterBenchmark.parseJodaDate  avgt   30  1091,707 ± 7,008  ns/op

I also played around a bit with the benchmark using parseJoda for joda, but there are no real runtime changes.

@spinscale spinscale added >enhancement :Core/Infra/Core Core issues without another label labels Jan 8, 2019
@spinscale spinscale requested a review from rjernst January 8, 2019 13:16
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@spinscale
Contributor Author

@elasticmachine please retest this

@spinscale
Contributor Author

This requires more investigation. Parsing seems to be broken: when more than one parser is used, only the first one is applied correctly.

@spinscale
Contributor Author

@elasticmachine retest this please

Member

@danielmitterdorfer danielmitterdorfer left a comment

Looks fine overall, I left a few minor comments.

@spinscale
Contributor Author

thanks! applied your comments

@spinscale
Contributor Author

@elasticmachine retest this please

Member

@danielmitterdorfer danielmitterdorfer left a comment

Thanks for iterating. Can you please rerun the microbenchmarks with the new changes and update the description? LGTM if Jenkins is also happy.

@spinscale
Contributor Author

There was no significant change in the benchmark stats, so I didn't update the description.

Member

@rjernst rjernst left a comment

LGTM

@spinscale spinscale merged commit 9f3da01 into elastic:master Jan 11, 2019
spinscale added a commit that referenced this pull request Jan 11, 2019
…37222)

when several formatters are used, the existing way of parsing those is
to throw an exception, catch it, and try the next one. This is
considerably slower than the approach taken in joda time, so that
indexing throughput is reduced when a date format like `x||y` is used
and y is the date format being used.

This commit now uses the java API to parse the date by appending the
date time formatters to each other and does not rely on exception
handling.