Ingest DateProcessor (small) speedup, optimize collections code in DateFormatter.forPattern #91521
Merged: joegallo merged 6 commits into elastic:main from joegallo:ingest-date-processor-speedup on Nov 14, 2022
to match how it's done on the previous ternary for timezone
to keep with the return type from delimitedListToStringArray, rather than wrapping it
This is still pretty simple as a for each, and the for each is faster.
joegallo added the >non-issue, :Data Management/Ingest Node, and v8.6.0 labels on Nov 10, 2022
elasticsearchmachine added the Team:Data Management label on Nov 10, 2022
Pinging @elastic/es-data-management (Team:Data Management)
grcevski approved these changes on Nov 14, 2022
LGTM! I would just add a comment on line 116 in forPattern to say it's performance sensitive code ;), just so that we don't undo it by a refactor in the future.
weizijun added a commit to weizijun/elasticsearch that referenced this pull request on Nov 15, 2022
* main: (163 commits) [DOCS] Edits frequent items aggregation (elastic#91564) Handle providers of optional services in ubermodule classloader (elastic#91217) Add `exportDockerImages` lifecycle task for exporting docker tarballs (elastic#91571) Fix CSV dependency report output file location in DRA CI job Fix variable placeholder for Strings.format calls (elastic#91531) Fix output dir creation in ConcatFileTask (elastic#91568) Fix declaration of dependencies in DRA snapshots CI job (elastic#91569) Upgrade Gradle Enterprise plugin to 3.11.4 (elastic#91435) Ingest DateProcessor (small) speedup, optimize collections code in DateFormatter.forPattern (elastic#91521) Fix inter project handling of generateDependenciesReport (elastic#91555) [Synthetics] Add synthetics-* read to fleet-server (elastic#91391) [ML] Copy more settings when creating DF analytics destination index (elastic#91546) Reduce CartesianCentroidIT flakiness (elastic#91553) Propagate last node to reinitialized routing tables (elastic#91549) Forecast write load during rollovers (elastic#91425) [DOCS] Warn about potential overhead of named queries (elastic#91512) Datastream unavailable exception metadata (elastic#91461) Generate docker images and dependency report in DRA ci job (elastic#91545) Support cartesian_bounds aggregation on point and shape (elastic#91298) Add support for EQL samples queries (elastic#91312) ... # Conflicts: # x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/downsample/RollupShardIndexer.java
Labels
:Data Management/Ingest Node
Execution or management of Ingest Pipelines including GeoIP
>non-issue
Team:Data Management
Meta label for data/management team
v8.6.0
- `splitCombinedPatterns`: keep the array returned by `delimitedListToStringArray`, to avoid an array->collection conversion
- `forPattern`: use a for-each loop rather than the streams approach

It's a relatively small change to the code, but since we run this each time a `DateProcessor` executes on an `IngestDocument`, it's a pretty hot code path. In my little microbenchmark it cuts ~20% of the time spent in a `DateProcessor`, or about 1% of the ingest time overall (but again for emphasis, that's just my little microbenchmark).

Here's the before/after flamegraph:

[image: before/after flamegraph]
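For readers without the diff open, here is a minimal sketch of the kind of change described above. It is not the actual Elasticsearch code: the class name, `splitCombined`, `toFormatter`, and the use of `String.split` on `||` are all stand-ins made up for illustration. It only contrasts a streams pipeline with a plain for-each over an array that is never wrapped in a collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: every name here is hypothetical, not the real DateFormatter code.
final class PatternSplitSketch {

    // Hypothetical stand-in for building a formatter from a single pattern.
    static String toFormatter(String pattern) {
        return "fmt(" + pattern + ")";
    }

    // Hypothetical splitter: returns the String[] as-is instead of wrapping it in a List.
    static String[] splitCombined(String input) {
        return input.split("\\|\\|");
    }

    // "Before" shape: a stream pipeline over the split patterns.
    static List<String> viaStreams(String input) {
        return Arrays.stream(splitCombined(input))
            .map(PatternSplitSketch::toFormatter)
            .collect(Collectors.toList());
    }

    // "After" shape: a for-each over the array, with a pre-sized result list.
    static List<String> viaLoop(String input) {
        String[] patterns = splitCombined(input);
        List<String> formatters = new ArrayList<>(patterns.length);
        for (String pattern : patterns) {
            formatters.add(toFormatter(pattern));
        }
        return formatters;
    }
}
```

Both methods produce the same list; the loop version just skips the stream setup and the intermediate collection, which is the sort of saving that can matter on a hot path. Any actual speedup would need to be measured, as with the microbenchmark mentioned above.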