Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Ingest Pipelines Grok processor to accept patterns that contain escaped characters #137245

Merged

Conversation

cjcenizal
Copy link
Contributor

@cjcenizal cjcenizal commented Jul 27, 2022

Fixes #124027

Summary

The original issue reported this Grok pattern as the failure case:

%{IPORHOST:ip}%{SPACE}-%{SPACE}%{DATA:user_name}%{SPACE}\[%{HTTPDATE:timestamp}\]%{SPACE}\"%{WORD:http_method}%{SPACE}%{DATA:url}%{SPACE}HTTP/%{NUMBER:http_version}\"%{SPACE}%{NUMBER:response_code}%{SPACE}%{NUMBER:body_sent_bytes}%{SPACE}\"%{DATA:referrer}\"%{SPACE}\"%{DATA:agent}\"%{SPACE}\"%{IPORHOST:client_ip}\

However, it contains a trailing backslash. I ran it through Grok Debugger, which rejected it, so I believe the trailing backslash is invalid syntax anyway.

I implemented the solution proposed by JL and added a test with a new case:

\[%{HTTPDATE:timestamp}\]%{SPACE}\"%{WORD:http_method}%{SPACE}HTTP/%{NUMBER:http_version}\"

This is valid grok syntax, but the backslashes get rejected by the JSON string validator in the original code. Removing that validator solves the validation problem, and then it looks like no additional escaping or unescaping of slashes is necessary. Reviewers can manually test this fix by saving a Grok processor with the new pattern, above.

Release note

Ingest Pipelines will now accept valid grok syntax for Grok processor patterns, and will no longer incorrectly reject them with an "Invalid JSON string" error.

@cjcenizal cjcenizal added release_note:fix Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more Feature:Ingest Node Pipelines Ingest node pipelines management backport:all-open Backport to all branches that could still receive a release labels Jul 27, 2022
@cjcenizal cjcenizal requested a review from a team as a code owner July 27, 2022 00:07
@elasticmachine
Copy link
Contributor

Pinging @elastic/platform-deployment-management (Team:Deployment Management)

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id before after diff
ingestPipelines 453.0KB 452.9KB -74.0B

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Member

@sabarasaba sabarasaba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for patching this up @cjcenizal! Changes lgtm

// Add "field" value
form.setInputValue('fieldNameField.input', 'test_grok_processor');

// Add the escaped value of \[%{HTTPDATE:timestamp}\]%{SPACE}\"%{WORD:http_method}%{SPACE}HTTP/%{NUMBER:http_version}\"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems a bit redundant to me to write the whole escaped string in this comment

Suggested change
// Add the escaped value of \[%{HTTPDATE:timestamp}\]%{SPACE}\"%{WORD:http_method}%{SPACE}HTTP/%{NUMBER:http_version}\"
// Add the escaped value into the input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you. It's definitely repetitive. TBH I liked having it because it saved me the trouble of having to read the escaped string, parse it for every backslash and then mentally unescape it. This was there's no mental computation necessary, the value is right there for me to read.

@cjcenizal cjcenizal merged commit 8b01dcf into elastic:main Aug 2, 2022
@cjcenizal cjcenizal deleted the bug/ingest-pipelines-grok-pattern-encoding branch August 2, 2022 14:27
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Aug 2, 2022
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Aug 2, 2022
@kibanamachine
Copy link
Contributor

💔 Some backports could not be created

Status Branch Result
7.17 Backport failed because of merge conflicts
8.3
8.4

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 137245

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Aug 2, 2022
…scaped characters. (#137245) (#137866)

(cherry picked from commit 8b01dcf)

Co-authored-by: CJ Cenizal <[email protected]>
kibanamachine added a commit that referenced this pull request Aug 2, 2022
…scaped characters. (#137245) (#137867)

(cherry picked from commit 8b01dcf)

Co-authored-by: CJ Cenizal <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:all-open Backport to all branches that could still receive a release Feature:Ingest Node Pipelines Ingest node pipelines management release_note:fix Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more v8.3.4 v8.4.0 v8.5.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Ingest Pipelines] Invalid error shown for grok processor
5 participants