-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Grok Prepper Matching Functionality #324
Conversation
Signed-off-by: graytaylor0 <[email protected]>
Signed-off-by: graytaylor0 <[email protected]>
Signed-off-by: graytaylor0 <[email protected]>
Signed-off-by: graytaylor0 <[email protected]>
try (MockedStatic<GrokCompiler> grokCompilerMockedStatic = mockStatic(GrokCompiler.class)) { | ||
|
||
capture = new HashMap<>(); | ||
capture.put("field_capture_1", "value_capture_1"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should leave the capture as an empty map in this setup. Tests which want specific values should set them directly. So, I'd recommend removing these three lines from here.
Additionally, you can make @Nested
tests to help combine tests that should be more similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, this could give a test in which there are no matches. That test would have an empty capture
map.
...grok-prepper/src/test/java/com/amazon/dataprepper/plugins/prepper/grok/GrokPrepperTests.java
Show resolved
Hide resolved
...grok-prepper/src/test/java/com/amazon/dataprepper/plugins/prepper/grok/GrokPrepperTests.java
Outdated
Show resolved
Hide resolved
...ns/grok-prepper/src/test/java/com/amazon/dataprepper/plugins/prepper/grok/GrokPrepperIT.java
Show resolved
Hide resolved
...gins/grok-prepper/src/main/java/com/amazon/dataprepper/plugins/prepper/grok/GrokPrepper.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor comments.
String testData = "{\"message\":\"127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] \\\"GET /apache_pb.gif HTTP/1.0\\\" 200 2326\"}"; | ||
Record<String> record = new Record<>(testData); | ||
|
||
String resultData = "{\"message\":\"127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] \\\"GET /apache_pb.gif HTTP/1.0\\\" 200 2326\"," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit-pick: we could construct it as map to avoid redundant deserialization.
pluginSetting.getSettings().put(GrokPrepperConfig.MATCH, matchConfig); | ||
grokPrepper = new GrokPrepper(pluginSetting); | ||
|
||
String testData = "{\"message\":\"127.0.0.1 user-identifier frank [10/Oct/2000:13:55:36 -0700] \\\"GET /apache_pb.gif HTTP/1.0\\\" 200 2326\"}"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit-pick: It seems to me the testData does not include %{DATA:rawrequest}. From the context I infer it is to cover empty_match = False? If so, we should better comment on this field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah rawrequest being null makes it easier to test keep_empty_captures and to test merging maps with null values to make sure no NPE's can occur. Can you elaborate on what you mean by "better comment on this field"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am correct, rawrequest key is not expected to appear in the keys of result data in this test cases while it appears in the grokking pattern. So we could call that out in the comment (maybe above result data)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will put a comment somewhere for this
Signed-off-by: graytaylor0 <[email protected]>
@@ -43,10 +69,35 @@ public GrokPrepper(final PluginSetting pluginSetting) { | |||
* | |||
* @param records Input records that will be modified/processed | |||
* @return Record modified output records | |||
*/ | |||
*/ | |||
@Override | |||
public Collection<Record<String>> doExecute(Collection<Record<String>> records) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: parameters should be final
for (Map.Entry<String, List<Grok>> entry : fieldToGrok.entrySet()) { | ||
for (Grok grok : entry.getValue()) { | ||
if (recordMap.containsKey(entry.getKey())) { | ||
Match match = grok.match(recordMap.get(entry.getKey()).toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: final
for (Map.Entry<String, List<String>> entry : grokPrepperConfig.getMatch().entrySet()) { | ||
fieldToGrok.put(entry.getKey(), entry.getValue().stream().map(item -> grokCompiler.compile(item, grokPrepperConfig.isNamedCapturesOnly())).collect(Collectors.toList())); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very hard to read and I am not sure what is happening here. Let's pull this out into a separate function. A separate function will allow us to provide some context on what operation we are performing. Breaking up this line into multiple lines with each line focused on a concrete operation with help readability and not hide functionality.
Match match = grok.match(recordMap.get(entry.getKey()).toString()); | ||
match.setKeepEmptyCaptures(grokPrepperConfig.isKeepEmptyCaptures()); | ||
|
||
mergeUpdateWithOriginalMap(recordMap, match.capture()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't read very well. Might I recommend: "mergeCapturesWithRecordMap"
Signed-off-by: graytaylor0 <[email protected]>
Signed-off-by: graytaylor0 <[email protected]>
|
||
## Notes on Patterns | ||
|
||
The Grok Prepper internally uses the [java-grok Library](https://github.com/thekrakken/java-grok), so any patterns compatible there are compatible with Grok Prepper. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments here, but let's defer that to a separate PR where we can focus on the documentation for our users later. I don't want to block this functionality right now.
-
"so any patterns compatible there are compatible with Grok Prepper." - this is passive and "there" refers to the object of the sentence instead of the subject which is a little confusing. Might I recommend:
"The Grok Prepper uses the the java-grok Library internally and supports all java-grok Library compatible patterns." -
We should link to the default patterns as well.
@@ -71,20 +68,20 @@ public GrokPrepper(final PluginSetting pluginSetting) { | |||
* @return Record modified output records | |||
*/ | |||
@Override | |||
public Collection<Record<String>> doExecute(Collection<Record<String>> records) { | |||
public Collection<Record<String>> doExecute(final Collection<Record<String>> records) { | |||
final List<Record<String>> recordsOut = new LinkedList<>(); | |||
|
|||
for (Record<String> record : records) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: final
Signed-off-by: graytaylor0 <[email protected]>
...grok-prepper/src/test/java/com/amazon/dataprepper/plugins/prepper/grok/GrokPrepperTests.java
Outdated
Show resolved
Hide resolved
Signed-off-by: graytaylor0 <[email protected]>
Signed-off-by: graytaylor0 <[email protected]>
Signed-off-by: graytaylor0 [email protected]
Description
Issues Resolved
#304
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.