[BUG] Fix "regular expression has redundant nested repeat operator" in MySQL slowlog module. #17156

Merged: 4 commits, Apr 1, 2020

whataboutpereira
Contributor

@whataboutpereira whataboutpereira commented Mar 20, 2020

What does this PR do?

Remove a superfluous wildcard from the EXPLAIN pattern used in the grok pattern of the MySQL slowlog ingest pipeline.

Why is it important?

It will fix excessive log messages produced every time the pipeline is used:

Mar 18 18:12:07 elk elasticsearch[29353]: regular expression has redundant nested repeat operator * /^# User@Host: (?<USER:user.name>(?:[a-zA-Z0-9._-]+))(\[(?<USER:mysql.slowlog.current_user>(?:[a-zA-Z0-9._-]+))\])? @ (?<HOSTNAME:source.domain>\b(?:[0-9A-Za-z][0-9A-Za-z-]{0,62})(?:\.(?:[0-9A-Za-z][0-9A-Za-z-]{0,62}))*(\.?|\b))? \[(?<IP:source.ip>(?:(?:((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?)|(?:(?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9]))))?\](?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*))(Id:(?:\s*)(?<NUMBER:mysql.thread_id:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Thread_id:(?:\s*)(?<NUMBER:mysql.thread_id>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Schema:(?:\s*)(?<WORD:mysql.slowlog.schema>\b\w+\b)?(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Last_errno: (?<NUMBER:mysql.slowlog.last_errno:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Killed: (?<NUMBER:mysql.slowlog.killed:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(QC_hit: (?<WORD:mysql.slowlog.query_cache_hit>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Query_time: (?<NUMBER:temp.duration:float>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Lock_time: (?<NUMBER:mysql.slowlog.lock_time.sec:float>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Rows_sent: (?<NUMBER:mysql.slowlog.rows_sent:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Rows_examined: (?<NUMBER:mysql.slowlog.rows_examined:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Rows_affected: (?<NUMBER:mysql.slowlog.rows_affected:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Thread_id: (?<NUMBER:mysql.thread_id>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Errno: (?<NUMBER:mysql.slowlog.last_errno:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Killed: (?<NUMBER:mysql.slowlog.killed:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Bytes_received: (?<NUMBER:mysql.slowlog.bytes_received:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Bytes_sent: (?<NUMBER:mysql.slowlog.bytes_sent:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_first: (?<NUMBER:mysql.slowlog.read_first:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_last: (?<NUMBER:mysql.slowlog.read_last:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_key: (?<NUMBER:mysql.slowlog.read_key:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_next: (?<NUMBER:mysql.slowlog.read_next:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_prev: (?<NUMBER:mysql.slowlog.read_prev:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_rnd: (?<NUMBER:mysql.slowlog.read_rnd:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Read_rnd_next: (?<NUMBER:mysql.slowlog.read_rnd_next:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Sort_merge_passes: (?<NUMBER:mysql.slowlog.sort_merge_passes:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Sort_range_count: (?<NUMBER:mysql.slowlog.sort_range_count:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Sort_rows: (?<NUMBER:mysql.slowlog.sort_rows:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Sort_scan_count: (?<NUMBER:mysql.slowlog.sort_scan_count:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Created_tmp_disk_tables: (?<NUMBER:mysql.slowlog.tmp_disk_tables:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Created_tmp_tables: (?<NUMBER:mysql.slowlog.tmp_tables:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Tmp_tables: (?<NUMBER:mysql.slowlog.tmp_tables:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Tmp_disk_tables: (?<NUMBER:mysql.slowlog.tmp_disk_tables>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Tmp_table_sizes: (?<NUMBER:mysql.slowlog.tmp_table_sizes:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Start: (?<TIMESTAMP_ISO8601:event.start>(?:(?>\d\d){1,2})-(?:(?:0?[1-9]|1[0-2]))-(?:(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]))[T ](?:(?:2[0123]|[01][0-9])):?(?:(?:[0-5][0-9]))(?::?(?:(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)))?(?:(?:Z|[+-](?:(?:2[0123]|[01]?[0-9]))(?::?(?:(?:[0-5][0-9])))))?)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(End: (?<TIMESTAMP_ISO8601:event.end>(?:(?>\d\d){1,2})-(?:(?:0?[1-9]|1[0-2]))-(?:(?:(?:0[1-9])|(?:[12][0-9])|(?:3[01])|[1-9]))[T ](?:(?:2[0123]|[01][0-9])):?(?:(?:[0-5][0-9]))(?::?(?:(?:(?:[0-5]?[0-9]|60)(?:[:.,][0-9]+)?)))?(?:(?:Z|[+-](?:(?:2[0123]|[01]?[0-9]))(?::?(?:(?:[0-5][0-9])))))?)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_trx_id: (?<WORD:mysql.slowlog.innodb.trx_id>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(QC_Hit: (?<WORD:mysql.slowlog.query_cache_hit>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Full_scan: (?<WORD:mysql.slowlog.full_scan>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Full_join: (?<WORD:mysql.slowlog.full_join>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Tmp_table: (?<WORD:mysql.slowlog.tmp_table>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Tmp_table_on_disk: (?<WORD:mysql.slowlog.tmp_table_on_disk>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Filesort: (?<WORD:mysql.slowlog.filesort>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Filesort_on_disk: (?<WORD:mysql.slowlog.filesort_on_disk>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Merge_passes: (?<NUMBER:mysql.slowlog.merge_passes:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Priority_queue: (?<WORD:mysql.slowlog.priority_queue>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(No InnoDB statistics available for this query(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_IO_r_ops: (?<NUMBER:mysql.slowlog.innodb.io_r_ops:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_IO_r_bytes: (?<NUMBER:mysql.slowlog.innodb.io_r_bytes:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_IO_r_wait: (?<NUMBER:mysql.slowlog.innodb.io_r_wait.sec:float>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_rec_lock_wait: (?<NUMBER:mysql.slowlog.innodb.rec_lock_wait.sec:float>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_queue_wait: (?<NUMBER:mysql.slowlog.innodb.queue_wait.sec:float>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(InnoDB_pages_distinct: (?<NUMBER:mysql.slowlog.innodb.pages_distinct:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Log_slow_rate_type: (?<WORD:mysql.slowlog.log_slow_rate_type>\b\w+\b)(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(Log_slow_rate_limit: (?<NUMBER:mysql.slowlog.log_slow_rate_limit:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))))(?:([ #
Mar 18 18:12:07 elk elasticsearch[29353]: ]*)))?(?:(# explain:.*
Mar 18 18:12:07 elk elasticsearch[29353]: |#\s*
Mar 18 18:12:07 elk elasticsearch[29353]: )*)?(use (?<WORD:mysql.slowlog.schema>\b\w+\b);
Mar 18 18:12:07 elk elasticsearch[29353]: )?SET timestamp=(?<NUMBER:mysql.slowlog.timestamp:long>(?:(?:(?<![0-9.+-])(?>[+-]?(?:(?:[0-9]+(?:\.[0-9]+)?)|(?:\.[0-9]+))))));
Mar 18 18:12:07 elk elasticsearch[29353]: (?<GREEDYMULTILINE:mysql.slowlog.query>(.|
Mar 18 18:12:07 elk elasticsearch[29353]: )*)/

Checklist

  • My code follows the style guidelines of this project
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
    - [ ] I have added tests that prove my fix is effective or that my feature works

Related issues

…MySQL slowlog module.

Fix for issue #17086.

The grok pattern used in the MySQL slowlog ingest pipeline uses EXPLAIN pattern with a superfluous * wildcard.

This makes Elasticsearch log a warning seemingly every time the pipeline is used.
@elasticmachine
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@cla-checker-service

cla-checker-service bot commented Mar 20, 2020

💚 CLA has been signed

@whataboutpereira
Contributor Author

> ❌ Author of the following commits did not sign a Contributor Agreement:
> d4aef7c
>
> Please, read and sign the above mentioned agreement if you want to contribute to this project

CLA signed.

@andresrc andresrc added [zube]: Inbox [zube]: In Review Team:Services (Deprecated) Label for the former Integrations-Services team and removed [zube]: Inbox labels Mar 20, 2020
@elasticmachine
Collaborator

Pinging @elastic/integrations-services (Team:Services)

@whataboutpereira
Contributor Author

Looks like it's the other way around. Didn't realize expanding a pattern adds (?: ) around the pattern.

@adriansr
Contributor

adriansr commented Mar 20, 2020

Your solution of removing the * from the EXPLAIN pattern causes a failure on one of the tests where there are multiple explain: [...] lines.

I've experimented a little bit and found out that this change instead:

-        "EXPLAIN": "(# explain:.*\n|#\\s*\n)*"
+        "EXPLAIN": "(:?(# explain:.*\n|#\\s*\n)*)"

Makes the Elasticsearch error go away and at the same time passes all the tests. Now, I'm not 100% sure this is a good solution and so many regexps is giving me a headache. @jsoriano WDYT?
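
The test failure described above can be reproduced in miniature. The following is only a rough sketch in Python, not the pipeline itself: the `# Query_time:` header is a simplified stand-in for the full grok expression, the sample log lines are hypothetical, and `header_matches` is an illustrative helper. It assumes, as noted earlier in the thread, that grok wraps an expanded pattern in `(?: )`.

```python
import re

# EXPLAIN definition as it appears in the pipeline, and the variant with the
# trailing * removed (the first attempt in this PR).
EXPLAIN_WITH_STAR = r"(# explain:.*\n|#\s*\n)*"
EXPLAIN_NO_STAR = r"(# explain:.*\n|#\s*\n)"


def header_matches(explain: str, text: str) -> bool:
    """Check a simplified slowlog header against the text.

    Mimics the expansion of %{EXPLAIN}?: the definition is wrapped in a
    non-capturing group and followed by '?'.
    """
    regex = r"^# Query_time: \d+\n(?:" + explain + r")?SET timestamp=\d+;"
    return re.match(regex, text) is not None


# Hypothetical slowlog entry with several "# explain:" lines.
sample = (
    "# Query_time: 2\n"
    "# explain: id select_type table\n"
    "# explain: 1 SIMPLE t1\n"
    "#\n"
    "SET timestamp=1584555127;"
)

print(header_matches(EXPLAIN_WITH_STAR, sample))  # True
print(header_matches(EXPLAIN_NO_STAR, sample))    # False
```

Without the `*`, at most one explain line can be consumed before `SET timestamp=`, which is why entries with multiple explain lines stop matching.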

@adriansr
Contributor

I commented before seeing your last commit and comment. The current solution is nicer than what I propose and the tests pass 👍

@adriansr adriansr requested a review from jsoriano March 20, 2020 18:34
@adriansr
Contributor

jenkins, test this

@whataboutpereira
Contributor Author

Regex. Love it when it works, hate it when it doesn't. :)

@adriansr adriansr self-assigned this Mar 23, 2020
@adriansr adriansr requested a review from sayden March 23, 2020 22:05
@sayden sayden self-assigned this Mar 24, 2020
@sayden sayden removed the request for review from jsoriano March 24, 2020 17:02
@jsoriano
Member

LGTM! Thanks for dealing with these regular expressions!

@adriansr
Contributor

adriansr commented Apr 1, 2020

@whataboutpereira this is good to merge. Can you add an entry to the CHANGELOG.next.asciidoc file, at the end of the Bugfixes - Metricbeat section?

@whataboutpereira
Contributor Author

whataboutpereira commented Apr 1, 2020

> @whataboutpereira this is good to merge. Can you add an entry to the CHANGELOG.next.asciidoc file, at the end of the Bugfixes - Metricbeat section?

Metricbeat or Filebeat? I added it to the Filebeat section for now (since it's a Filebeat module). Let me know if I need to move it to Metricbeat after all. :)

@adriansr
Contributor

adriansr commented Apr 1, 2020

Right, my mistake, this is a Filebeat module.

I've fixed the changelog conflicts with master and reworded the message a little bit. Hope you don't mind.

@adriansr adriansr merged commit 58a3ddc into elastic:master Apr 1, 2020
@zube zube bot removed the [zube]: In Review label Apr 1, 2020
@zube zube bot added the [zube]: Done label Apr 1, 2020
@whataboutpereira
Contributor Author

> Right, my mistake, this is a Filebeat module.
>
> I've fixed the changelog conflicts with master and reworded the message a little bit. Hope you don't mind.

Wonderful, thank you! :)

@adriansr adriansr added the needs_backport PR is waiting to be backported to other branches. label Apr 1, 2020
@whataboutpereira whataboutpereira deleted the patch-1 branch April 1, 2020 12:24
adriansr pushed a commit to adriansr/beats that referenced this pull request Apr 1, 2020
…n MySQL slowlog module. (elastic#17156)

Fix for issue elastic#17086.

The grok pattern used in the MySQL slowlog ingest pipeline uses EXPLAIN pattern with a superfluous * wildcard.

This makes Elasticsearch log a warning seemingly every time the pipeline is used.

Closes elastic#17086

(cherry picked from commit 58a3ddc)
@adriansr adriansr added v7.8.0 and removed needs_backport PR is waiting to be backported to other branches. labels Apr 1, 2020
adriansr pushed a commit to adriansr/beats that referenced this pull request Apr 1, 2020
…n MySQL slowlog module. (elastic#17156)

Fix for issue elastic#17086.

The grok pattern used in the MySQL slowlog ingest pipeline uses EXPLAIN pattern with a superfluous * wildcard.

This makes Elasticsearch log a warning seemingly every time the pipeline is used.

Closes elastic#17086

(cherry picked from commit 58a3ddc)
@adriansr adriansr added the v7.7.0 label Apr 1, 2020
adriansr added a commit that referenced this pull request Apr 1, 2020
…n MySQL slowlog module. (#17156) (#17394)

Fix for issue #17086.

The grok pattern used in the MySQL slowlog ingest pipeline uses EXPLAIN pattern with a superfluous * wildcard.

This makes Elasticsearch log a warning seemingly every time the pipeline is used.

Closes #17086

(cherry picked from commit 58a3ddc)

Co-authored-by: Reio Remma <[email protected]>
adriansr added a commit that referenced this pull request Apr 1, 2020
…n MySQL slowlog module. (#17156) (#17395)

Fix for issue #17086.

The grok pattern used in the MySQL slowlog ingest pipeline uses EXPLAIN pattern with a superfluous * wildcard.

This makes Elasticsearch log a warning seemingly every time the pipeline is used.

Closes #17086

(cherry picked from commit 58a3ddc)

Co-authored-by: Reio Remma <[email protected]>
@coudenysj

Hi, thanks for fixing this.

I see a label 7.7.0 is added, do you have any idea when this will be available?

Or is there a workaround for this issue, because we get a lot of messages in the logs.

@whataboutpereira
Contributor Author

> Hi, thanks for fixing this.
>
> I see a label 7.7.0 is added, do you have any idea when this will be available?
>
> Or is there a workaround for this issue, because we get a lot of messages in the logs.

The patch looks scarier than it is: the actual change was to remove the ? that follows %{EXPLAIN} in the huge regex. Have a look at "Files changed" above.

/usr/share/filebeat/module/mysql/slowlog/ingest/pipeline.json

Before:
%{EXPLAIN}?
After:
%{EXPLAIN}
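
To see why dropping the `?` is safe, here is a small sketch using Python's `re` module as a stand-in for the regex engine Elasticsearch uses for grok (Python does not emit the warning; it only serves to compare matching behaviour). The `SET timestamp=` tail is a simplified excerpt of the real expression, and the sample strings and the `build` and `timestamps` helpers are hypothetical. Because the EXPLAIN definition already ends in `*`, it can match zero lines, so making the whole group optional adds nothing.

```python
import re

# EXPLAIN definition from the pipeline's pattern_definitions.
EXPLAIN = r"(# explain:.*\n|#\s*\n)*"


def build(optional: bool) -> "re.Pattern[str]":
    """Build a simplified tail of the slowlog regex.

    optional=True  mimics the old  %{EXPLAIN}?  (group followed by '?').
    optional=False mimics the new  %{EXPLAIN}   (group only).
    """
    suffix = "?" if optional else ""
    return re.compile(r"(?:" + EXPLAIN + r")" + suffix + r"SET timestamp=(?P<ts>\d+);")


before = build(optional=True)
after = build(optional=False)

# Hypothetical inputs: no explain lines, a bare "#" line, two explain lines.
samples = [
    "SET timestamp=1;",
    "#\nSET timestamp=2;",
    "# explain: a\n# explain: b\nSET timestamp=3;",
]


def timestamps(pattern: "re.Pattern[str]") -> list:
    """Return the captured timestamp for every sample."""
    return [pattern.match(s).group("ts") for s in samples]


print(timestamps(before))  # ['1', '2', '3']
print(timestamps(after))   # ['1', '2', '3']
```

Both variants accept the same inputs, so the only observable effect of the change should be that the warning goes away.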

@anmancipe

I just upgraded from 7.4 to 7.6.2 and I'm still getting the same error: "regular expression has redundant nested repeat operator". Any update on a fix?

@adriansr
Contributor

adriansr commented Apr 10, 2020

@anmancipe this fix didn't make it into 7.6.2. It'll be released with 7.7.0.

For now you can follow the suggestion in the comment by @whataboutpereira above, edit the pipeline to remove the extra ? and (important) delete the existing mysql pipelines in your Elasticsearch cluster so the new one gets installed.

Note that the elasticsearch and activemq modules in Filebeat also had a similar problem. For the former, you already have a fixed pipeline in 7.6.2, but removing the old pipelines in ES is still necessary. For the latter, you'll have to follow a similar process as with mysql and replace the current pipeline by hand.

This is only necessary if you have ever used the elasticsearch or activemq modules.
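
The "delete the existing pipelines" step can be sketched with Python's standard library. This is only an illustration under stated assumptions: the base URL `http://localhost:9200` is a placeholder for your cluster, the pipeline name follows the `filebeat-<version>-mysql-slowlog-pipeline` form seen elsewhere in this thread, and `delete_pipeline_request` is a hypothetical helper. The request is built but deliberately not sent.

```python
from urllib.request import Request


def delete_pipeline_request(base_url: str, pipeline_id: str) -> Request:
    """Build (but do not send) a DELETE request for one ingest pipeline."""
    url = f"{base_url.rstrip('/')}/_ingest/pipeline/{pipeline_id}"
    return Request(url, method="DELETE")


# Assumed values: adjust the host and the version in the pipeline name.
req = delete_pipeline_request(
    "http://localhost:9200",
    "filebeat-7.6.2-mysql-slowlog-pipeline",
)

print(req.get_method(), req.full_url)
# DELETE http://localhost:9200/_ingest/pipeline/filebeat-7.6.2-mysql-slowlog-pipeline

# To actually send it (add authentication and TLS settings as your cluster requires):
#   from urllib.request import urlopen
#   urlopen(req)
```

Once the old pipeline is gone, Filebeat should load the edited pipeline definition the next time it sets up the module.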

@echu2013

> @anmancipe this fix didn't make it into 7.6.2. It'll be released with 7.7.0.
>
> For now you can follow the suggestion in the comment by @whataboutpereira above, edit the pipeline to remove the extra ? and (important) delete the existing mysql pipelines in your Elasticsearch cluster so the new one gets installed.
>
> Note that the elasticsearch and activemq modules in Filebeat also had a similar problem. For the former, you already have a fixed pipeline in 7.6.2, but removing the old pipelines in ES is still necessary. For the latter, you'll have to follow a similar process as with mysql and replace the current pipeline by hand.
>
> This is only necessary if you have ever used the elasticsearch or activemq modules.

I've done the following; is this an appropriate approach?
`
DELETE /_ingest/pipeline/filebeat-7.6.2-mysql-slowlog-pipeline

PUT /_ingest/pipeline/filebeat-7.6.2-mysql-slowlog-pipeline
{
"description" : "Pipeline for parsing MySQL slow logs.",
"processors" : [
{
"grok" : {
"field" : "message",
"patterns" : [
"^# User@Host: %{USER:user.name}(\[%{USER:mysql.slowlog.current_user}\])? @ %{HOSTNAME:source.domain}? \[%{IP:source.ip}?\]%{METRICSPACE}(Id:%{SPACE}%{NUMBER:mysql.thread_id:long}%{METRICSPACE})?(Thread_id:%{SPACE}%{NUMBER:mysql.thread_id}%{METRICSPACE})?(Schema:%{SPACE}%{WORD:mysql.slowlog.schema}?%{METRICSPACE})?(Last_errno: %{NUMBER:mysql.slowlog.last_errno:long}%{METRICSPACE})?(Killed: %{NUMBER:mysql.slowlog.killed:long}%{METRICSPACE})?(QC_hit: %{WORD:mysql.slowlog.query_cache_hit}%{METRICSPACE})?(Query_time: %{NUMBER:temp.duration:float}%{METRICSPACE})?(Lock_time: %{NUMBER:mysql.slowlog.lock_time.sec:float}%{METRICSPACE})?(Rows_sent: %{NUMBER:mysql.slowlog.rows_sent:long}%{METRICSPACE})?(Rows_examined: %{NUMBER:mysql.slowlog.rows_examined:long}%{METRICSPACE})?(Rows_affected: %{NUMBER:mysql.slowlog.rows_affected:long}%{METRICSPACE})?(Thread_id: %{NUMBER:mysql.thread_id}%{METRICSPACE})?(Errno: %{NUMBER:mysql.slowlog.last_errno:long}%{METRICSPACE})?(Killed: %{NUMBER:mysql.slowlog.killed:long}%{METRICSPACE})?(Bytes_received: %{NUMBER:mysql.slowlog.bytes_received:long}%{METRICSPACE})?(Bytes_sent: %{NUMBER:mysql.slowlog.bytes_sent:long}%{METRICSPACE})?(Read_first: %{NUMBER:mysql.slowlog.read_first:long}%{METRICSPACE})?(Read_last: %{NUMBER:mysql.slowlog.read_last:long}%{METRICSPACE})?(Read_key: %{NUMBER:mysql.slowlog.read_key:long}%{METRICSPACE})?(Read_next: %{NUMBER:mysql.slowlog.read_next:long}%{METRICSPACE})?(Read_prev: %{NUMBER:mysql.slowlog.read_prev:long}%{METRICSPACE})?(Read_rnd: %{NUMBER:mysql.slowlog.read_rnd:long}%{METRICSPACE})?(Read_rnd_next: %{NUMBER:mysql.slowlog.read_rnd_next:long}%{METRICSPACE})?(Sort_merge_passes: %{NUMBER:mysql.slowlog.sort_merge_passes:long}%{METRICSPACE})?(Sort_range_count: %{NUMBER:mysql.slowlog.sort_range_count:long}%{METRICSPACE})?(Sort_rows: %{NUMBER:mysql.slowlog.sort_rows:long}%{METRICSPACE})?(Sort_scan_count: %{NUMBER:mysql.slowlog.sort_scan_count:long}%{METRICSPACE})?(Created_tmp_disk_tables: 
%{NUMBER:mysql.slowlog.tmp_disk_tables:long}%{METRICSPACE})?(Created_tmp_tables: %{NUMBER:mysql.slowlog.tmp_tables:long}%{METRICSPACE})?(Tmp_tables: %{NUMBER:mysql.slowlog.tmp_tables:long}%{METRICSPACE})?(Tmp_disk_tables: %{NUMBER:mysql.slowlog.tmp_disk_tables}%{METRICSPACE})?(Tmp_table_sizes: %{NUMBER:mysql.slowlog.tmp_table_sizes:long}%{METRICSPACE})?(Start: %{TIMESTAMP_ISO8601:event.start}%{METRICSPACE})?(End: %{TIMESTAMP_ISO8601:event.end}%{METRICSPACE})?(InnoDB_trx_id: %{WORD:mysql.slowlog.innodb.trx_id}%{METRICSPACE})?(QC_Hit: %{WORD:mysql.slowlog.query_cache_hit}%{METRICSPACE})?(Full_scan: %{WORD:mysql.slowlog.full_scan}%{METRICSPACE})?(Full_join: %{WORD:mysql.slowlog.full_join}%{METRICSPACE})?(Tmp_table: %{WORD:mysql.slowlog.tmp_table}%{METRICSPACE})?(Tmp_table_on_disk: %{WORD:mysql.slowlog.tmp_table_on_disk}%{METRICSPACE})?(Filesort: %{WORD:mysql.slowlog.filesort}%{METRICSPACE})?(Filesort_on_disk: %{WORD:mysql.slowlog.filesort_on_disk}%{METRICSPACE})?(Merge_passes: %{NUMBER:mysql.slowlog.merge_passes:long}%{METRICSPACE})?(Priority_queue: %{WORD:mysql.slowlog.priority_queue}%{METRICSPACE})?(No InnoDB statistics available for this query%{METRICSPACE})?(InnoDB_IO_r_ops: %{NUMBER:mysql.slowlog.innodb.io_r_ops:long}%{METRICSPACE})?(InnoDB_IO_r_bytes: %{NUMBER:mysql.slowlog.innodb.io_r_bytes:long}%{METRICSPACE})?(InnoDB_IO_r_wait: %{NUMBER:mysql.slowlog.innodb.io_r_wait.sec:float}%{METRICSPACE})?(InnoDB_rec_lock_wait: %{NUMBER:mysql.slowlog.innodb.rec_lock_wait.sec:float}%{METRICSPACE})?(InnoDB_queue_wait: %{NUMBER:mysql.slowlog.innodb.queue_wait.sec:float}%{METRICSPACE})?(InnoDB_pages_distinct: %{NUMBER:mysql.slowlog.innodb.pages_distinct:long}%{METRICSPACE})?(Log_slow_rate_type: %{WORD:mysql.slowlog.log_slow_rate_type}%{METRICSPACE})?(Log_slow_rate_limit: %{NUMBER:mysql.slowlog.log_slow_rate_limit:long}%{METRICSPACE})?%{EXPLAIN}(use %{WORD:mysql.slowlog.schema};\n)?SET timestamp=%{NUMBER:mysql.slowlog.timestamp:long};\n%{GREEDYMULTILINE:mysql.slowlog.query}"
],
"pattern_definitions" : {
"EXPLAIN" : """(# explain:.*
|#\s*
)*""",
"GREEDYMULTILINE" : """(.|
)*""",
"METRICSPACE" : """([ #
]*)"""
},
"ignore_missing" : true
}
},
{
"remove" : {
"field" : "message"
}
},
{
"script" : {
"params" : {
"mapping" : {
"Yes" : true,
"No" : false
},
"fields" : [
"query_cache_hit",
"tmp_table",
"tmp_table_on_disk",
"filesort",
"filesort_on_disk",
"priority_queue",
"full_scan",
"full_join"
]
},
"lang" : "painless",
"source" : "for (field in params.fields) { def v = ctx.mysql.slowlog.get(field); if (v != null) { ctx.mysql.slowlog.put(field, params.mapping.get(v)) } }"
}
},
{
"script" : {
"if" : "ctx.temp?.duration != null",
"lang" : "painless",
"source" : "ctx.event.duration = Math.round(ctx.temp.duration * 1000000) * 1000"
}
},
{
"remove" : {
"field" : "temp.duration",
"ignore_missing" : true
}
},
{
"date" : {
"formats" : [
"UNIX"
],
"ignore_failure" : true,
"field" : "mysql.slowlog.timestamp",
"target_field" : "@timestamp"
}
},
{
"remove" : {
"field" : "mysql.slowlog.timestamp",
"ignore_missing" : true
}
}
],
"on_failure" : [
{
"set" : {
"field" : "error.message",
"value" : "{{ _ingest.on_failure_message }}"
}
}
]
}
`

leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023
…n MySQL slowlog module. (elastic#17156) (elastic#17395)

Fix for issue elastic#17086.

The grok pattern used in the MySQL slowlog ingest pipeline uses EXPLAIN pattern with a superfluous * wildcard.

This makes Elasticsearch log a warning seemingly every time the pipeline is used.

Closes elastic#17086

(cherry picked from commit cbfe82d)

Co-authored-by: Reio Remma <[email protected]>
Labels
review Team:Services (Deprecated) Label for the former Integrations-Services team v7.7.0 v7.8.0
Development

Successfully merging this pull request may close these issues.

Filebeat mysql slowlog module regular expression has redundant nested repeat operator.
9 participants