added ANTLR visitor for parsing Logstash configuration #506

Merged: 6 commits, Nov 5, 2021
1 change: 1 addition & 0 deletions build.gradle
@@ -45,6 +45,7 @@ subprojects {
sourceCompatibility = '1.8'
spotless {
java {
targetExclude 'build/generated-src/antlr/**'
// TODO: enrich format rules
removeUnusedImports()
}
9 changes: 9 additions & 0 deletions config/checkstyle/checkstyle-suppressions.xml
@@ -0,0 +1,9 @@
<?xml version="1.0"?>
<!DOCTYPE suppressions PUBLIC
"-//Puppy Crawl//DTD Suppressions 1.1//EN"
"http://checkstyle.sourceforge.net/dtds/suppressions_1_1.dtd">

<suppressions>
<!-- suppress all checks on files generated by ANTLR -->
<suppress files="data-prepper-logstash-configuration[\\/]build[\\/]generated-src[\\/]antlr[\\/]main[\\/]*" checks="[a-zA-Z0-9]*"/>
</suppressions>
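
As a quick sanity check of the `files` pattern above, the sketch below applies the same regular expression to a few paths with `java.util.regex`. It assumes Checkstyle tests the pattern with a substring search (`Matcher.find()`) against the file path. The class name `SuppressionPatternCheck` and the sample paths are made up for illustration and are not part of this PR.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of this PR.
class SuppressionPatternCheck {
    // Same regex as the `files` attribute; each backslash is doubled for the Java string literal.
    static final Pattern GENERATED_ANTLR = Pattern.compile(
            "data-prepper-logstash-configuration[\\\\/]build[\\\\/]generated-src[\\\\/]antlr[\\\\/]main[\\\\/]*");

    // Assumption: a file counts as suppressed when the pattern is found anywhere in its path.
    static boolean isSuppressed(String path) {
        return GENERATED_ANTLR.matcher(path).find();
    }

    public static void main(String[] args) {
        // Generated parser, Unix-style separators.
        System.out.println(isSuppressed(
                "data-prepper-logstash-configuration/build/generated-src/antlr/main/org/opensearch/dataprepper/logstash/LogstashParser.java"));
        // Generated parser, Windows-style separators.
        System.out.println(isSuppressed(
                "data-prepper-logstash-configuration\\build\\generated-src\\antlr\\main\\LogstashParser.java"));
        // Hand-written source, which should still be checked.
        System.out.println(isSuppressed(
                "data-prepper-logstash-configuration/src/main/java/org/opensearch/dataprepper/logstash/model/LogstashPluginType.java"));
    }
}
```

The `[\\/]` character class accepts either separator, so the suppression should behave the same on Windows and Unix build hosts under the stated assumption.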
4 changes: 4 additions & 0 deletions config/checkstyle/checkstyle.xml
@@ -6,6 +6,10 @@
<module name="Checker">
<property name="charset" value="UTF-8" />

<module name="SuppressionFilter">
<property name="file" value="${config_loc}/checkstyle-suppressions.xml" />
</module>

<!-- Checks Java files and forbids empty Javadoc comments -->
<module name="RegexpMultiline">
<property name="id" value="EmptyJavadoc"/>
23 changes: 23 additions & 0 deletions data-prepper-logstash-configuration/build.gradle
@@ -0,0 +1,23 @@
plugins {
id 'java'
id 'antlr'
id 'idea'
}

repositories {
mavenCentral()
}

dependencies {
antlr "org.antlr:antlr4:4.9.2"
testImplementation "org.hamcrest:hamcrest:2.2"
testImplementation "org.mockito:mockito-inline:${versionMap.mockito}"
testImplementation platform("org.junit:junit-bom:${versionMap.junitJupiter}")
}

generateGrammarSource {
maxHeapSize = "128m"
arguments += ['-listener', '-visitor']
outputDirectory = new File("build/generated-src/antlr/main/org/opensearch/dataprepper/logstash/".toString())
}
compileJava.dependsOn generateGrammarSource
141 changes: 141 additions & 0 deletions data-prepper-logstash-configuration/src/main/antlr/Logstash.g4
@@ -0,0 +1,141 @@
/*
* ANTLR grammar file for parsing Logstash configurations
*/
grammar Logstash;

@header {
package org.opensearch.dataprepper.logstash;
}
/*
* Parser Rules
*/
config: filler plugin_section filler (filler plugin_section)* filler;

filler: (COMMENT | WS | NEWLINE)*;

plugin_section: plugin_type filler '{'
Contributor

Let's use camel case for all our grammar definitions (for example, pluginSection). This would result in method signatures with a single casing:

public Object visitPluginSection(LogstashParser.PluginSectionContext pluginSectionContext) {

Member

I think this raises a question as to which convention we should use in the grammar: follow the convention from Logstash, which would make it easier to map back to it, or use conventions which make the Java code nicer. This code should be used exclusively by the Logstash configuration framework, so I'm fine with using the Logstash conventions.

Contributor

Ah, okay. Given that this maps directly and the mapping is a manual process, I am on board with keeping this as is to assist in the manual mapping process. Thanks for clarifying.
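
To make the trade-off in this thread concrete: ANTLR's Java target names each generated visitor method `visit` plus the rule name with its first letter capitalized, and each context class the capitalized rule name plus `Context`. The sketch below reproduces that naming for both spellings of the rule. `RuleNaming` and its methods are illustrative helpers written for this note, not ANTLR API.

```java
// Hypothetical helper for illustration only; it mimics the naming scheme, it does not call ANTLR.
class RuleNaming {
    // Upper-case the first character and leave the rest (including underscores) untouched.
    static String capitalize(String ruleName) {
        return Character.toUpperCase(ruleName.charAt(0)) + ruleName.substring(1);
    }

    // Name of the visitor method generated for a parser rule.
    static String visitorMethod(String ruleName) {
        return "visit" + capitalize(ruleName);
    }

    // Name of the context class generated for a parser rule.
    static String contextClass(String ruleName) {
        return capitalize(ruleName) + "Context";
    }

    public static void main(String[] args) {
        for (String rule : new String[] {"plugin_section", "pluginSection"}) {
            System.out.println(rule + " -> " + visitorMethod(rule) + "(LogstashParser." + contextClass(rule) + ")");
        }
    }
}
```

With the Logstash-style rule names kept in this PR, the generated Java identifiers carry the underscore through (for example `visitPlugin_section`), which is the mixed casing the first comment points out.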

filler (branch_or_plugin filler)*
'}';

plugin_type: ('input' | 'filter' | 'output');

branch_or_plugin: branch | plugin;

plugin:
name filler '{'
filler
attributes
filler
'}';

attributes:( attribute (filler attribute)*)?;

attribute: name filler '=>' filler value;

name: BAREWORD | STRING;

value: plugin | BAREWORD | STRING | NUMBER | array | hash;

branch: r_if (filler else_if)* (filler r_else)?;

r_if: 'if' filler condition filler '{' filler (branch_or_plugin filler)* '}';

else_if: 'else' filler 'if' filler condition filler '{' filler ( branch_or_plugin filler)* '}';

r_else: 'else' filler '{' filler (branch_or_plugin filler)* '}';

condition: expression (filler boolean_operator filler expression)*;

expression:
(
('(' filler condition filler ')')
| negative_expression
| in_expression
| not_in_expression
| compare_expression
| regexp_expression
| rvalue
);

array:
'['
filler
(
value (filler ',' filler value)*
)?
filler
']';

hash:
'{'
filler
hashentries?
filler
'}';

hashentries: hashentry (WS hashentry)*;
Contributor

hashEntries


hashentry: hashname filler '=>' filler value;
Contributor

hashEntry


hashname: BAREWORD | STRING | NUMBER;
Contributor

hashName


boolean_operator: ('and' | 'or' | 'xor' | 'nand');

negative_expression:
(
('!' filler '(' filler condition filler ')')
| ('!' filler selector)
);

in_expression: rvalue filler in_operator filler rvalue;

not_in_expression: rvalue filler not_in_operator filler rvalue;

rvalue: STRING | NUMBER | selector | array | method_call | regexp;
Contributor

What is the difference between an rvalue and a value?

Collaborator Author

value is the value assigned to a plugin attribute, while rvalue is an operand in a conditional statement.
We don't use rvalue currently, but we need to define all the rules in the grammar to parse the configuration.


regexp: '/' ('\\' | ~'/' .)*? '/';

selector: selector_element+;

compare_expression: rvalue filler compare_operator filler rvalue;

regexp_expression: rvalue filler regexp_operator filler (STRING | regexp);

selector_element: '[' ~( '[' | ']' | ',' )+ ']';

in_operator: 'in';

not_in_operator: 'not' filler 'in';

method_call:
BAREWORD filler '(' filler
(
rvalue ( filler ',' filler rvalue )*
)?
filler ')';

compare_operator: ('==' | '!=' | '<=' | '>=' | '<' | '>') ;

regexp_operator: ('=~' | '!~');

/*
* Lexer Rules
*/

COMMENT: (WS? '#' ~('\r'|'\n')*)+;

NEWLINE: ('\r'? '\n' | '\r')+ -> skip;

WS: ( NEWLINE | ' ' | '\t')+;

fragment DIGIT: [0-9];

NUMBER: '-'? DIGIT+ ('.' DIGIT*)?;

BAREWORD: [a-zA-Z0-9_]+;

STRING: DOUBLE_QUOTED_STRING | SINGLE_QUOTED_STRING;

fragment DOUBLE_QUOTED_STRING : ('"' ( '\\"' | . )*? '"');

fragment SINGLE_QUOTED_STRING : ('\'' ('\'' | . )*? '\'');
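
For readers less familiar with ANTLR lexer syntax, the sketch below approximates the NUMBER and BAREWORD rules above as `java.util.regex` patterns and checks a few inputs. It is only an approximation: it matches whole strings in isolation, whereas the generated lexer tokenizes in context. The class `LexerRuleSketch` and the sample inputs are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; the real tokenizing is done by the generated LogstashLexer.
class LexerRuleSketch {
    // NUMBER: '-'? DIGIT+ ('.' DIGIT*)?
    static final Pattern NUMBER = Pattern.compile("-?[0-9]+(\\.[0-9]*)?");

    // BAREWORD: [a-zA-Z0-9_]+
    static final Pattern BAREWORD = Pattern.compile("[a-zA-Z0-9_]+");

    static boolean isNumber(String text) {
        return NUMBER.matcher(text).matches();
    }

    static boolean isBareword(String text) {
        return BAREWORD.matcher(text).matches();
    }

    public static void main(String[] args) {
        for (String sample : new String[] {"-12.5", "12.", ".5", "grok", "123", "my-field"}) {
            System.out.println(sample + " number=" + isNumber(sample) + " bareword=" + isBareword(sample));
        }
    }
}
```

Two details stand out. A trailing dot is accepted by NUMBER (`12.`) but a leading dot is not (`.5`). Input such as `123` fits both rules; when two lexer rules match the same longest text, ANTLR prefers the rule defined first, so NUMBER wins here because it is declared before BAREWORD.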
@@ -0,0 +1,13 @@
package org.opensearch.dataprepper.logstash.exception;

/**
* Exception for Logstash configuration converter
*
* @since 1.2
*/
public class LogstashConfigurationException extends RuntimeException {

public LogstashConfigurationException(String errorMessage) {
super(errorMessage);
}
}
@@ -0,0 +1,13 @@
package org.opensearch.dataprepper.logstash.exception;

/**
* Exception thrown when ANTLR fails to parse the Logstash configuration
*
* @since 1.2
*/
public class LogstashGrammarException extends LogstashParsingException {
Contributor

This exception is never used. Will it be used in the future?

Collaborator Author

Yes. The intention was to use it in the future.


public LogstashGrammarException(String errorMessage) {
super(errorMessage);
}
}
@@ -0,0 +1,13 @@
package org.opensearch.dataprepper.logstash.exception;

/**
* Exception thrown when ANTLR visitor is unable to convert Logstash configuration into Logstash model objects
*
* @since 1.2
*/
public class LogstashParsingException extends LogstashConfigurationException {

public LogstashParsingException(String errorMessage) {
super(errorMessage);
}
}
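
The three exceptions in this PR form a single chain: LogstashGrammarException extends LogstashParsingException, which extends LogstashConfigurationException. One consequence is that a caller can handle every converter failure by catching the base type. The sketch below shows that, with the classes repeated as nested copies so it compiles on its own. The `parse` and `describe` methods, and the empty-input check, are hypothetical and do not reflect how the converter is actually invoked.

```java
// Hypothetical demo for illustration only; the nested classes mirror the hierarchy added in this PR.
class ExceptionHierarchySketch {
    static class LogstashConfigurationException extends RuntimeException {
        LogstashConfigurationException(String errorMessage) {
            super(errorMessage);
        }
    }

    static class LogstashParsingException extends LogstashConfigurationException {
        LogstashParsingException(String errorMessage) {
            super(errorMessage);
        }
    }

    static class LogstashGrammarException extends LogstashParsingException {
        LogstashGrammarException(String errorMessage) {
            super(errorMessage);
        }
    }

    // Assumed entry point: rejects blank input with the most specific exception type.
    static void parse(String config) {
        if (config == null || config.trim().isEmpty()) {
            throw new LogstashGrammarException("configuration is empty");
        }
    }

    // Catching the base type also catches both subclasses.
    static String describe(String config) {
        try {
            parse(config);
            return "ok";
        } catch (LogstashConfigurationException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("input { stdin { } }"));
        System.out.println(describe(""));
    }
}
```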
@@ -1,12 +1,35 @@
package org.opensearch.dataprepper.logstash.model;

import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

/**
* Types of plugins in Logstash configuration
*
* @since 1.2
*/
public enum LogstashPluginType {
- INPUT,
- FILTER,
- OUTPUT
+ INPUT("input"),
+ FILTER("filter"),
+ OUTPUT("output");

private final String value;

private static final Map<String, LogstashPluginType> VALUES_MAP = Arrays.stream(LogstashPluginType.values())
.collect(Collectors.toMap(LogstashPluginType::toString, Function.identity()));

LogstashPluginType(final String value) {
this.value = value;
}

@Override
public String toString() {
return value;
}

public static LogstashPluginType getByValue(final String value) {
return VALUES_MAP.get(value.toLowerCase());
}
}
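
A short usage sketch for `getByValue` follows, with the enum repeated as it stands after this change so the example compiles on its own. It illustrates three behaviors that follow from the code above: the lookup ignores case, an unknown name yields `null`, and a `null` argument throws a NullPointerException from `toLowerCase()`. `PluginTypeLookupDemo` and its `describe` method are hypothetical and not part of this PR.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Copy of the enum after this change, repeated here so the sketch is self-contained.
enum LogstashPluginType {
    INPUT("input"),
    FILTER("filter"),
    OUTPUT("output");

    private final String value;

    private static final Map<String, LogstashPluginType> VALUES_MAP = Arrays.stream(LogstashPluginType.values())
            .collect(Collectors.toMap(LogstashPluginType::toString, Function.identity()));

    LogstashPluginType(final String value) {
        this.value = value;
    }

    @Override
    public String toString() {
        return value;
    }

    public static LogstashPluginType getByValue(final String value) {
        return VALUES_MAP.get(value.toLowerCase());
    }
}

// Hypothetical caller for illustration only.
class PluginTypeLookupDemo {
    // Callers need to handle the null returned for names that are not plugin sections.
    static String describe(String sectionName) {
        LogstashPluginType type = LogstashPluginType.getByValue(sectionName);
        return type == null ? "unknown section: " + sectionName : "plugin type: " + type.name();
    }

    public static void main(String[] args) {
        System.out.println(describe("filter"));
        System.out.println(describe("OUTPUT"));
        System.out.println(describe("codec"));
    }
}
```

One caveat worth keeping in mind: `toLowerCase()` without an argument uses the default locale, so upper-case input containing `I` may not match under a Turkish locale. Passing `Locale.ROOT` would be one way to harden the lookup if that matters.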