-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added ANTLR visitor for parsing Logstash configuration #506
Conversation
.../main/java/org/opensearch/dataprepper/logstash/exception/LogstashConfigurationException.java
Outdated
Show resolved
Hide resolved
package org.opensearch.dataprepper.logstash.exception; | ||
|
||
/** | ||
* Exception for visitor when it's unable to convert into Logstash models |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like the javadoc to be more descriptive about what is Visitor? You can update this to LogstashVisitor
and in that mention what the visitor class does.
private final Map<String, Object> hashEntries = new LinkedHashMap<>(); | ||
|
||
@Override | ||
public Object visitConfig(LogstashParser.ConfigContext ctx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename this to LogstashParser.ConfigContext configContext
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can rename the parameters.
...configuration/src/test/java/org/opensearch/dataprepper/logstash/parser/TestDataProvider.java
Outdated
Show resolved
Hide resolved
Sidenote: The first commit for this PR |
ecee63a
to
4935f3d
Compare
|
||
/** | ||
* Exception thrown when {@link org.opensearch.dataprepper.logstash.parser.LogstashVisitor} is unable to convert | ||
* * Logstash configuration into Logstash model objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: there is an extra *
switch (pluginSectionContext.plugin_type().getText()) { | ||
case "input": | ||
logstashPluginType = LogstashPluginType.INPUT; | ||
break; | ||
case "filter": | ||
logstashPluginType = LogstashPluginType.FILTER; | ||
break; | ||
case "output": | ||
logstashPluginType = LogstashPluginType.OUTPUT; | ||
break; | ||
default: | ||
throw new LogstashParsingException("only input, filter and output plugin sections are supported."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic belongs in LogstashPluginType
enum as it is related to the creation of an enum. We can also eliminate the need for a case statement by using a map.
I would recommend updating the enum like so:
public enum LogstashPluginType {
INPUT("input"),
FILTER("filter"),
OUTPUT("output");
private final String value;
private static final Map<Stirng, LogstashPlugingType> VALUES_MAP = Arrays.stream(LogstashPlugingType.values())
.collect(Collectors.toMap(LogstashPlugingType::toString, Function.identity()));
LogstashPlugingType(final String value) {
this.value = value;
}
public String toString() {
return value;
}
public static LogstashPlugingType getByValue(final String value) {
return VALUES_MAP.get(name.toLowerCase());
}
}
Then you can simplify this to:
final LogstashPluginType logstashPluginType = LogstashPlugingType.getByValue(pluginSectionContext.plugin_type().getText());
|
||
filler: (COMMENT | WS | NEWLINE)*; | ||
|
||
plugin_section: plugin_type filler '{' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use camel-case for all our grammar definitions (for example: pluginSection
) This would result in method signatures with a single casing:
public Object visitPluginSsection(LogstashParser.PluginSectionContext pluginSectionContext) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this raises a question as to what convention we should use in the Grammar - follow the convention from Logstash, which would make it easier to map back to it; or use conventions which make the Java code nicer. This code should be used exclusively by the Logstash configuration framework, so I'm fine with using the Logstash conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh okay. given this maps directly and is a manual process I am onboard with keep this as is to assist in the manual mapping process. Thanks for clarifying.
private List<LogstashPlugin> logstashPluginList; | ||
private final Map<LogstashPluginType, List<LogstashPlugin>> pluginSections = new LinkedHashMap<>(); | ||
private final Map<String, Object> hashEntries = new LinkedHashMap<>(); | ||
|
||
@Override | ||
public Object visitConfig(LogstashParser.ConfigContext configContext) { | ||
for(int i = 0; i < configContext.plugin_section().size(); i++) { | ||
visitPlugin_section(configContext.plugin_section().get(i)); | ||
} | ||
return LogstashConfiguration.builder() | ||
.pluginSections(pluginSections) | ||
.build(); | ||
} | ||
|
||
@Override | ||
public Object visitPlugin_section(LogstashParser.Plugin_sectionContext pluginSectionContext) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lists and maps don't need to be declared globally. We can use the methods as intended and return the constructed lists over mutating and reassigning the objects. This will simplify the code:
@Override
public Object visitConfig(final LogstashParser.ConfigContext configContext) {
final Map<LogstashPluginType, List<LogstashPlugin>> pluginSections = new LinkedHashMap<>();
configContext.plugin_section().forEach(pluginSection -> {
final LogstashPluginType logstashPluginType = LogstashPluginType.getByValue(pluginSectionContext.plugin_type().getText();
final List<LogstashPlugin> logstashPluginList = (List<LogstashPlugin>) visitPlugin_section(pluginSection);
pluginSections.put(logstashPluginType, logstashPluginList);
}
);
return LogstashConfiguration.builder()
.pluginSections(pluginSections)
.build();
}
@Override
public Object visitPlugin_section(final LogstashParser.Plugin_sectionContext pluginSectionContext) {
return pluginSectionContext.branch_or_plugin().stream()
.map(this::visitBranch_or_plugin)
.collect(Collectors.toList());
});
}
(see the comment on visitHashEntries() below)
String pluginName = pluginContext.name().getText(); | ||
List<LogstashAttribute> logstashAttributeList = new ArrayList<>(); | ||
|
||
for (int i = 0; i < pluginContext.attributes().attribute().size(); i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is strange. I am not sure if there is a way around it but attributes
is an AttributeContext while attribute
is a list of AttributeContexts...
Is this an issue with our grammar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the attributes will have list of attribute in grammar.
attributes
attribute 1
attribute 2
We are storing each attribute from configuration in a list of type LogstashAttribute.
|
||
hashentries: hashentry (WS hashentry)*; | ||
|
||
hashentry: hashname filler '=>' filler value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hashEntry
@Override | ||
public Object visitHashentries(LogstashParser.HashentriesContext hashentriesContext) { | ||
for (int i = 0; i < hashentriesContext.hashentry().size(); i++) { | ||
visitHashentry((hashentriesContext.hashentry().get(i))); | ||
} | ||
return hashEntries; | ||
} | ||
|
||
@Override | ||
public Object visitHashentry(LogstashParser.HashentryContext hashentryContext) { | ||
|
||
if (hashentryContext.value().getChild(0) instanceof LogstashParser.ArrayContext) | ||
hashEntries.put(hashentryContext.hashname().getText(), visitArray(hashentryContext.value().array())); | ||
|
||
else | ||
hashEntries.put(hashentryContext.hashname().getText(), hashentryContext.value().getText()); | ||
|
||
return hashEntries; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extending off what I said earlier to eliminate the use of the shared class value:
@Override
public Object visitHashentries(LogstashParser.HashentriesContext hashentriesContext) {
final Map<String, Object> hashEntries = new LinkedHashMap<>();
hashentriesContext.hashentry().forEach(hashentryContext -> {
final String key = hashentryContext.hashname().getText();
final Object value = visitHashentry(hashentryContext);
hashEntries.put(key, value);
});
return hashEntries;
}
@Override
public Object visitHashentry(LogstashParser.HashentryContext hashentryContext) {
if (hashentryContext.value().getChild(0) instanceof LogstashParser.ArrayContext)
return visitArray(hashentryContext.value().array());
return hashentryContext.value().getText();
}
void visit_config_test() { | ||
final LogstashParser.ConfigContext configContextMock = mock(LogstashParser.ConfigContext.class); | ||
final LogstashParser.Plugin_sectionContext pluginSectionMock = mock(LogstashParser.Plugin_sectionContext.class); | ||
final LogstashParser.Plugin_typeContext pluginTypeContextMock = mock(LogstashParser.Plugin_typeContext.class); | ||
final LogstashParser.Branch_or_pluginContext branchOrPluginContextMock = mock(LogstashParser.Branch_or_pluginContext.class); | ||
|
||
given(configContextMock.plugin_section()).willReturn(Collections.singletonList(pluginSectionMock)); | ||
given(pluginSectionMock.plugin_type()).willReturn(pluginTypeContextMock); | ||
given(pluginTypeContextMock.getText()).willReturn("input"); | ||
given(pluginSectionMock.branch_or_plugin()).willReturn(Collections.singletonList(branchOrPluginContextMock)); | ||
|
||
LogstashVisitor logstashVisitor = createObjectUnderTest(); | ||
Mockito.doReturn(TestDataProvider.pluginWithOneArrayContextAttributeData()).when(logstashVisitor).visitBranch_or_plugin(branchOrPluginContextMock); | ||
|
||
LogstashConfiguration actualLogstashConfiguration = (LogstashConfiguration) logstashVisitor.visitConfig(configContextMock); | ||
LogstashConfiguration expectedLogstashConfiguration = TestDataProvider.configData(); | ||
|
||
assertThat(actualLogstashConfiguration.getPluginSection(LogstashPluginType.INPUT).size(), | ||
equalTo(expectedLogstashConfiguration.getPluginSection(LogstashPluginType.INPUT).size())); | ||
Mockito.verify(logstashVisitor, Mockito.times(1)).visitPlugin_section(pluginSectionMock); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears you are mocking the same objects over and over again in each test. I would encourage you to use @Mock
annotation and create a class variable to reuse. I would also encourage you to create a LogstashVisitor class variable to assign in a @BeforeEach
function as well. This will help reduce the duplicate code you have.
} | ||
|
||
@Test | ||
void visit_config_test() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name is not very descriptive. What are we testing for here?
import java.util.ArrayList; | ||
|
||
/** | ||
* Class to populate Logstash configuration model objects using ANTLR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the following to the documentation? I am not familiar with ANTLR and how this whole process works. It took me a little while to understand where the LogstashBaseVisitor
is coming from. I didn't realize I had to build my code to generate the class.
- What is a LogstashBaseVisitor?
- How is a LogstashBaseVisitor generated?
- Where do all the methods come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While this is confusing in a PR, this is all part of the build process and I think detailed documentation here is not quite appropriate. If anything, I'd keep it as simple as "... using ANTLR library classes and generated code."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works. Anything to point the reader in the right direction.
* | ||
* @since 1.2 | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove extra lines between Javadocs and their classes/methods. I found this one, but there may be others.
*/ | ||
|
||
@SuppressWarnings("rawtypes") | ||
public class LogstashVisitor extends LogstashBaseVisitor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be package private so that it is not used outside of this work.
import java.util.LinkedHashMap; | ||
|
||
|
||
public class TestDataProvider { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class should also be package private.
|
||
filler: (COMMENT | WS | NEWLINE)*; | ||
|
||
plugin_section: plugin_type filler '{' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this raises a question as to what convention we should use in the Grammar - follow the convention from Logstash, which would make it easier to map back to it; or use conventions which make the Java code nicer. This code should be used exclusively by the Logstash configuration framework, so I'm fine with using the Logstash conventions.
package org.opensearch.dataprepper.logstash.exception; | ||
|
||
/** | ||
* Exception thrown when {@link org.opensearch.dataprepper.logstash.parser.LogstashVisitor} is unable to convert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a big fan of {@link}
, but I suggest removing it here. The LogstashVisitor
is an internal implementation detail. Also, exceptions should generally not declare which classes use them. It goes the other way: classes specify which exceptions they throw.
I suggest rewording this somewhat.
4935f3d
to
49dd6c8
Compare
class LogstashVisitor extends LogstashBaseVisitor { | ||
|
||
@Override | ||
public Object visitConfig(LogstashParser.ConfigContext configContext) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can mark every parameter for all methods implemented in this class as final
value = attributeContext.value().getText().replaceAll("^\"|\"$|^'|'$", ""); | ||
} | ||
|
||
LogstashAttributeValue logstashAttributeValue = LogstashAttributeValue.builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
final
* | ||
* @since 1.2 | ||
*/ | ||
public class LogstashGrammarException extends LogstashParsingException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This exception is never used. Will it be used in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The intention was to use it in the future.
Signed-off-by: Asif Sohail Mohammed <[email protected]>
Signed-off-by: Asif Sohail Mohammed <[email protected]>
added logstash visitor populate logstash model POJO's Signed-off-by: Asif Sohail Mohammed <[email protected]>
updated javadoc for public classes Signed-off-by: Asif Sohail Mohammed <[email protected]>
Signed-off-by: Asif Sohail Mohammed <[email protected]>
Signed-off-by: Asif Sohail Mohammed <[email protected]>
34a37a3
to
3d006ec
Compare
added visitor to populate Logstash model objects using ANTLR library Signed-off-by: Asif Sohail Mohammed <[email protected]>
Description
Issues Resolved
Resolve #465
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.