Named captures conversion to grok pattern_definitions format #586

Merged

Conversation

graytaylor0
Member

@graytaylor0 graytaylor0 commented Nov 12, 2021

Description

This PR adds a utility class that takes any regex string and extracts its named capture groups into the grok pattern_definitions format. It then replaces each original named capture group with the grok syntax %{SYNTAX:SEMANTIC}. Random pattern names are generated to fit the "PATTERN_NAME regex" form of a pattern definition declaration.
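
For readers who want a concrete picture of the conversion, here is a minimal sketch under stated assumptions. It is not the class added in this PR: the class name `NamedCapturesSketch`, its fields, and the `GENERATED_n` pattern names are hypothetical, and counter-based names stand in for the random names the PR generates so that the output is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; names and structure are assumptions, not the PR's code.
class NamedCapturesSketch {

    // Matches (?<name>body). Both groups are lazy, so a body containing
    // nested parentheses is not handled by this simple sketch.
    private static final Pattern NAMED_CAPTURE = Pattern.compile("\\(\\?<(.+?)>(.+?)\\)");

    // Generated pattern name -> regex body of the named capture group.
    final Map<String, String> patternDefinitions = new LinkedHashMap<>();

    // Input regex with each named capture replaced by %{PATTERN_NAME:captureName}.
    final String mappedRegex;

    NamedCapturesSketch(final String regex) {
        final Matcher matcher = NAMED_CAPTURE.matcher(regex);
        final StringBuilder out = new StringBuilder();
        int last = 0;
        int counter = 0;
        while (matcher.find()) {
            // Assumption: deterministic names instead of the PR's random ones.
            final String patternName = "GENERATED_" + counter++;
            patternDefinitions.put(patternName, matcher.group(2));
            // Copy the text before the match, then emit the grok reference.
            out.append(regex, last, matcher.start())
               .append("%{").append(patternName).append(':')
               .append(matcher.group(1)).append('}');
            last = matcher.end();
        }
        // Copy whatever follows the final match.
        out.append(regex, last, regex.length());
        mappedRegex = out.toString();
    }
}
```

With the input `foo (?<num>\d+) bar`, this sketch produces the mapped regex `foo %{GENERATED_0:num} bar` and a single pattern definition mapping `GENERATED_0` to `\d+`.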

Issues Resolved

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@graytaylor0 graytaylor0 requested a review from a team as a code owner November 12, 2021 23:36
@codecov-commenter

Codecov Report

Merging #586 (19febe0) into main (42c5954) will not change coverage.
The diff coverage is n/a.

@@            Coverage Diff            @@
##               main     #586   +/-   ##
=========================================
  Coverage     92.12%   92.12%           
  Complexity      569      569           
=========================================
  Files            71       71           
  Lines          1727     1727           
  Branches        144      144           
=========================================
  Hits           1591     1591           
  Misses          105      105           
  Partials         31       31           


Collaborator

@chenqi0805 chenqi0805 left a comment

Only minor nits. Not blocking


class GrokNamedCapturesUtil {

private static final String namedCapturesRegex = "\\(\\?\\<(.+?)\\>(.+?)\\)";
Collaborator

Nit: Since this is a static final constant, the name could be all caps (e.g. NAMED_CAPTURES_REGEX).

}

public Map<String, String> getMappedPatternDefinitions() {
return mappedPatternDefinitions;
Collaborator

@chenqi0805 chenqi0805 Nov 13, 2021

Nit: For access safety, I suggest returning a read-only view of the map, i.e. Collections.unmodifiableMap(mappedPatternDefinitions).
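
To illustrate the suggestion, a minimal sketch follows. The holder class `PatternDefinitionsHolder` and its `add` method are hypothetical and exist only to make the example self-contained; only `Collections.unmodifiableMap` and the getter name come from the review thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class used only to demonstrate the read-only view.
class PatternDefinitionsHolder {

    private final Map<String, String> mappedPatternDefinitions = new HashMap<>();

    // Assumed helper so the example has a way to populate the map internally.
    void add(final String patternName, final String regex) {
        mappedPatternDefinitions.put(patternName, regex);
    }

    // Callers get a read-only view: reads pass through to the backing map,
    // while mutating calls such as put() throw UnsupportedOperationException.
    Map<String, String> getMappedPatternDefinitions() {
        return Collections.unmodifiableMap(mappedPatternDefinitions);
    }
}
```

Because the returned map is a view rather than a copy, later internal changes remain visible through it. If a snapshot is wanted instead, copying the map before wrapping it would be the alternative.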


for (final Map.Entry<String, String> patternDefinition : result.getMappedPatternDefinitions().entrySet()) {
assertThat(patternDefinition.getValue().equals(namedCapturesPattern), equalTo(true));
final String expectedResult = String.format("%%{%s:%s}", patternDefinition.getKey(), namedCapturesName);
Collaborator

%%? I was expecting %{%s:%s} but could be wrong.

Member Author

%% is how you do a literal % in String.format
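
A small sketch of the escaping in isolation may help. The class `GrokFormatSketch`, the method `grokReference`, and the sample arguments are hypothetical; only the format string comes from the test under review.

```java
// Hypothetical helper showing how %% yields a literal % in String.format.
class GrokFormatSketch {

    // Builds a grok reference such as %{NUMBER:bytes}.
    // "%%" emits one literal '%', and each "%s" is replaced by an argument.
    static String grokReference(final String patternName, final String captureName) {
        return String.format("%%{%s:%s}", patternName, captureName);
    }
}
```

Calling `GrokFormatSketch.grokReference("NUMBER", "bytes")` returns `%{NUMBER:bytes}`. Using `"%{%s:%s}"` instead would make the formatter read `%{` as the start of a format specifier, which it rejects with an exception because `{` is not a valid conversion.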

}
final String expectedResult = String.format("%s%%{%s:%s} %s %%{%s:%s}%s", randomPrefix, patternDefinitionNames.get(0), namedCapturesNames.get(0), randomMiddle,
patternDefinitionNames.get(1), namedCapturesNames.get(1), randomSuffix);
assertThat(result.getMappedRegex().equals(expectedResult), equalTo(true));
Member

I find it better to use the following form because you get better output when the assertion fails:

assertThat(result.getMappedRegex(), equalTo(expectedResult));

@graytaylor0 graytaylor0 merged commit 816a8bd into opensearch-project:main Nov 13, 2021
@graytaylor0 graytaylor0 deleted the NamedCapturesConversion branch November 16, 2021 03:52
4 participants