Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve grok pattern ordering and add sort option #61671

Merged
merged 5 commits into from
Sep 8, 2020

Conversation

danhermann
Copy link
Contributor

The grok processor REST endpoint returns the bundled grok patterns in an undetermined order. This PR changes the default behavior to return them in the order in which they were read from disk which preserves the natural grouping of related patterns such as "S3_ACCESS_LOG" and "ELB_URI" for AWS-related patterns though they share no common prefix. An option to return all patterns sorted by key was also added.

Resolves #40819.

@danhermann danhermann added >enhancement :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.0.0 v7.10.0 labels Aug 28, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Ingest)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Aug 28, 2020
@danhermann
Copy link
Contributor Author

@elasticmachine update branch

@danhermann
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/packaging-sample-windows

@probakowski probakowski self-requested a review September 3, 2020 15:28
Copy link
Contributor

@probakowski probakowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @danhermann for adding this, I left one serialization question and small possible improvement


Request(StreamInput in) throws IOException {
super(in);
this.sorted = in.readBoolean();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be guarded with version check? What will happen if we receive request from older node where there where no sorted field?

} catch (Exception e) {
listener.onFailure(e);
}
}

static Map<String, String> getGrokPatternsResponse(Map<String, String> patterns, boolean sorted) {
return sorted ? new TreeMap<>(patterns) : patterns;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could extract this new TreeMap<>(patterns) to a field to only sort all patterns once as they are constant. We could then skip this method completely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I shuffled around a few things in 9d3a31a to sort the patterns only once while retaining the ability to test the sort option. If you think it's useful, the sorted patterns member could be pulled into a singleton enum or the like for lazy loading. I was on the fence as to whether that would be particularly beneficial.

@danhermann
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/bwc

@danhermann
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Contributor

@probakowski probakowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @danhermann!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v7.10.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ingest: Sort grok patterns when retrieving via REST API
4 participants