replace aho corasick #302
Kudos, SonarCloud Quality Gate passed!
Awesome results!
if tx.Capture {
    matches := o.matcher.FindAll(value)
    for i, match := range matches {
        if i == 10 {
I know it isn't part of the PR but it would be really cool to get a hint on what this 10 means.
Excellent question :) This is another constant that we should be naming properly.
10 is the maximum number of captures you can have (originally in ModSecurity). This comes from https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-%28v2.x%29#capture:
"Up to 10 captures will be copied on a successful pattern match, each with a name consisting of a digit from 0 to 9. The TX.0 variable always contains the entire area that the regular expression matched. All the other variables contain the captured values, in the order in which the capturing parentheses appear in the regular expression."
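One way the literal could be named, as a rough sketch only: `maxCaptures` and `captureMatches` are hypothetical names that do not appear in the PR, and storing each match under its index ("0" to "9") is an assumption based on the manual text quoted above, not the project's actual transaction API.

```go
package main

import (
	"fmt"
	"strconv"
)

// maxCaptures is a hypothetical name for the literal 10 in the loop under
// review: ModSecurity copies at most 10 captures, named "0" through "9".
const maxCaptures = 10

// captureMatches is an illustrative helper (not part of the PR). It keeps at
// most maxCaptures matches, keyed by their index as a decimal string.
func captureMatches(matches []string) map[string]string {
	captured := make(map[string]string, maxCaptures)
	for i, m := range matches {
		if i == maxCaptures {
			break // everything past the tenth match is dropped
		}
		captured[strconv.Itoa(i)] = m
	}
	return captured
}

func main() {
	// Twelve fake matches; only the first ten should be kept.
	matches := make([]string, 12)
	for i := range matches {
		matches[i] = fmt.Sprintf("match%d", i)
	}
	captured := captureMatches(matches)
	fmt.Println(len(captured)) // prints 10
}
```

With a named constant, the `if i == 10` check in the diff would read `if i == maxCaptures`, which answers the question at the call site.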
@anuraaga, it might be worth having a look at the github.com/petar-dambovaliev/aho-corasick repository to see whether all the effort put into performance there is enough?
Looks quite similar to our implementation; we would have to run benchmarks.
This is a huge improvement: memory consumption is 60% lower and execution is 233% faster.
Benchmark results using CRS:
Old results:
New results: