Regex used to parse Grok patterns prints warning with latest Joni #47861

Closed
droberts195 opened this issue Oct 10, 2019 · 1 comment · Fixed by #47870
Labels: :Data Management/Ingest Node, v7.5.0

@droberts195
Contributor

Since the Joni upgrade in #47374, this warning is printed on Elasticsearch startup:

regular expression has redundant nested repeat operator + /%\{(?<name>(?<pattern>[A-z0-9]+)(?::(?<subname>[[:alnum:]@\[\]_:.-]+))?)(?:=(?<definition>(?:(?:[^{}]+|\.+)+)+))?\}/

I guess it means there is a non-fatal inefficiency in this regex:

private static final String GROK_PATTERN =
    "%\\{" +
        "(?<name>" +
            "(?<pattern>[A-z0-9]+)" +
            "(?::(?<subname>[[:alnum:]@\\[\\]_:.-]+))?" +
        ")" +
        "(?:=(?<definition>" +
            "(?:" +
                "(?:[^{}]+|\\.+)+" +
            ")+" +
        ")" +
    ")?" + "\\}";
private static final Regex GROK_PATTERN_REGEX = new Regex(GROK_PATTERN.getBytes(StandardCharsets.UTF_8), 0,
    GROK_PATTERN.getBytes(StandardCharsets.UTF_8).length, Option.NONE, UTF8Encoding.INSTANCE, Syntax.DEFAULT);

That regular expression has not changed for a long time, so fixing it is probably not necessary for correctness, but fixing it before the release of 7.5 would avoid questions on forums, in issues and in support cases.
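
To illustrate what "redundant nested repeat operator" probably refers to: the `definition` group wraps an already repeated group, `(?:[^{}]+|\.+)+`, in a second `(?: ... )+`. Since `(X+)+` accepts the same strings as `X+`, the outer repeat should be removable without changing what matches. Below is a minimal sketch that compares the two forms on a few sample inputs. It uses `java.util.regex` rather than Joni so that it runs without extra dependencies, which means it does not reproduce the warning itself. The class name, the `FLAT` variant and the sample strings are assumptions made for illustration and are not taken from the Elasticsearch code base or from the eventual fix.

```java
import java.util.regex.Pattern;

class NestedRepeatDemo {
    // The "definition" sub-pattern as quoted in this issue: a repeated group
    // wrapped in a second repeated group.
    static final Pattern NESTED = Pattern.compile("(?:(?:[^{}]+|\\.+)+)+");

    // Hypothetical simplification with the outer repeat removed.
    static final Pattern FLAT = Pattern.compile("(?:[^{}]+|\\.+)+");

    // True when both patterns give the same full-match result for the input.
    static boolean sameResult(String input) {
        return NESTED.matcher(input).matches() == FLAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Short samples only: nested quantifiers can backtrack heavily on long
        // non-matching input.
        String[] samples = {"\\d+", "[a-z]+ foo", "...", "", "a{b", "x}"};
        for (String s : samples) {
            System.out.println("'" + s + "'"
                + " nested=" + NESTED.matcher(s).matches()
                + " flat=" + FLAT.matcher(s).matches()
                + " same=" + sameResult(s));
        }
    }
}
```

Agreement on a handful of samples is not a proof, but it is consistent with the outer `+` being redundant, which is what the warning text suggests.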

droberts195 added the :Data Management/Ingest Node and v7.5.0 labels Oct 10, 2019
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (:Core/Features/Ingest)

@martijnvg martijnvg self-assigned this Oct 10, 2019
martijnvg added a commit to martijnvg/elasticsearch that referenced this issue Oct 10, 2019
This prevents the following warning from being printed to console:
`regular expression has redundant nested repeat operator + /%\{(?<name>(?<pattern>[A-z0-9]+)(?::(?<subname>[[:alnum:]@\[\]_:.-]+))?)(?:=(?<definition>(?:(?:[^{}]+|\.+)+)+))?\}/`

The current grok expression does not fail; only this warning is printed.
The warning started being printed after upgrading joni (elastic#47374).

Closes elastic#47861
martijnvg added a commit that referenced this issue Oct 14, 2019