Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replace aho corasick #302

Merged
merged 1 commit into from
Jul 25, 2022
Merged

replace aho corasick #302

merged 1 commit into from
Jul 25, 2022

Conversation

jptosso
Copy link
Member

@jptosso jptosso commented Jul 24, 2022

This is a huge improvement, memory consumption is 60% less, and execution time was 233% faster

Benchmark results using CRS:

Old results:

Preparing CRS...
CRS PATH: /var/folders/tf/23czq0cd4wd5v54zcvctvtzw0000gn/T/crs1564927737
goos: darwin
goarch: arm64
pkg: github.com/corazawaf/coraza/v3/testing
BenchmarkCRSCompilation-10             3         378610667 ns/op
BenchmarkCRSSimpleGET-10             613           1791376 ns/op
BenchmarkCRSSimplePOST-10            308           3499975 ns/op
PASS
ok      github.com/corazawaf/coraza/v3/testing  13.985s

New results:

Preparing CRS...
CRS PATH: /var/folders/tf/23czq0cd4wd5v54zcvctvtzw0000gn/T/crs1721984072
goos: darwin
goarch: arm64
pkg: github.com/corazawaf/coraza/v3/testing
BenchmarkCRSCompilation-10            21          48269002 ns/op
BenchmarkCRSSimpleGET-10             777           1396721 ns/op
BenchmarkCRSSimplePOST-10            489           2396956 ns/op
PASS
ok      github.com/corazawaf/coraza/v3/testing  5.969s

@sonarcloud
Copy link

sonarcloud bot commented Jul 24, 2022

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 1 Code Smell

74.3% 74.3% Coverage
0.0% 0.0% Duplication

Copy link
Member

@fzipi fzipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome results!

@jptosso jptosso enabled auto-merge (squash) July 25, 2022 02:12
@jptosso jptosso merged commit a1529ab into v2/master Jul 25, 2022
@jptosso jptosso deleted the v2/replace-ahocorasick branch July 25, 2022 14:33
if tx.Capture {
matches := o.matcher.FindAll(value)
for i, match := range matches {
if i == 10 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it isn't part of the PR but it would be really cool to get a hint on what this 10 means.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent question :) This is another constant that we should be naming properly.

10 is the maximum captures you can have (originally in modsecurity). This comes from https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-%28v2.x%29#capture

Up to 10 captures will be copied on a successful pattern match, each with a name consisting of a digit from 0 to 9. The TX.0 variable always contains the entire area that the regular expression matched. All the other variables contain the captured values, in the order in which the capturing parentheses appear in the regular expression. 

@jcchavezs
Copy link
Member

@anuraaga worth for you to have a look at github.com/petar-dambovaliev/aho-corasick repository to see if all effort towards performance is enough?

@jptosso
Copy link
Member Author

jptosso commented Aug 19, 2022

@anuraaga worth for you to have a look at github.com/petar-dambovaliev/aho-corasick repository to see if all effort towards performance is enough?

Looks quite similar to our implementation, we would have to run benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants