0-infinite quantifier inside look behind capture group causes tremendous performance loss #166
Comments
This might be causing some of the serious performance issues in the C++ syntax as well. Part of our preprocessing steps in the C++ grammar optimize `(b+)?` to become `b*`, so that's probably not doing us any favors.
I just looked through all
Oh, well thanks for saving me the trouble @RedCMD. I suppose it's just regular catastrophic backtracking then for C++.
@jeff-hykin
Thank you for reaching out! I think the regex @kkos do you think there are optimization opportunities in Oniguruma around such regular expressions?
Creating a syntax highlighter with the following code and running it on a single-line file filled with thousands of the letter `c` will cause tremendous lag.

Replacing the `*` with a `+`, or changing the regex to `(?<=ab+|a)c`, will remove all lag and allow my machine to easily interpret a million-character single-line file. But then a file filled with `bc` will lag instead.

I can't seem to figure out how the engine matches lookbehinds.
Does it start with the `c` first, then try to match the lookbehind afterwards? Or does it try to match the lookbehind first, then the `c`?

Using `{0,10000}` instead of `*` reduces lag massively, which seems to suggest that the engine is trying to match `b` an infinite number of times (or up to an internal limit) before giving up. But for some reason it only does so when it's allowed to match 0 times??
Maybe it's an underflow error?
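
A rough way to picture why a cap like `{0,10000}` could help is the toy model below. It assumes the pattern under test is `(?<=ab*)c`, and it assumes (a guess for illustration, not something confirmed from Oniguruma's source) that the lookbehind is checked by stepping back one character at a time and trying the body from each start offset. `step_back_attempts` and `max_back` are made-up names for this sketch.

```python
from typing import Optional, Tuple


def step_back_attempts(text: str, max_back: Optional[int] = None) -> Tuple[int, int]:
    """Model (?<=ab*)c with a naive step-back lookbehind.

    Hypothetical model only: returns (matches, attempts), where attempts
    counts how many start offsets were tried for the lookbehind body.
    """
    matches = 0
    attempts = 0
    for pos, ch in enumerate(text):
        if ch != "c":
            continue  # this model checks the literal first, then the lookbehind
        # An unbounded quantifier lets the body reach back to the line start;
        # a bounded one caps how far back the model has to look.
        limit = pos if max_back is None else min(pos, max_back)
        for back in range(1, limit + 1):
            attempts += 1
            start = pos - back
            # Does "a" followed only by "b"s span exactly text[start:pos]?
            if text[start] == "a" and all(c == "b" for c in text[start + 1:pos]):
                matches += 1
                break
    return matches, attempts


line = "c" * 1000
unbounded = step_back_attempts(line)
capped = step_back_attempts(line, max_back=10)
print(unbounded)  # (0, 499500)
print(capped)     # (0, 9945)
```

In this model the unbounded case does work proportional to the square of the line length, while the capped case stays linear. It would be consistent with the `{0,10000}` observation, but it does not by itself explain why `+` avoids the lag, so treat it as one possible explanation only.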
But then using the regex `(?<=a(b+)?)c` also causes an enormous amount of lag, which I don't get. Unless... `(b+)?` gets optimized to `b*`? Or maybe it's because the engine backtracks in the wrong direction and ends up endlessly rechecking `b`?

Noticed single-line files were only getting partially highlighted. Dug around and found out it was caused by this regex:
`(?<={\\d*)\\\\{2}`
`(?<={\\d*)` is the part that was going overboard on every single `\\`.
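
For what it's worth, the lookbehind bodies discussed above accept exactly the same strings, so any speed difference between them would come from how the engine handles each form and not from what they match. A small sketch to check that, assuming the original pattern's body was `ab*` (the original snippet is not shown here). Python's `re` does not support variable-width lookbehind, so this compares only the bodies, not the full patterns.

```python
import re
from itertools import product

# Lookbehind bodies from the issue; "ab*" is the assumed original form.
BODIES = {
    "ab*": re.compile(r"ab*"),
    "ab+|a": re.compile(r"ab+|a"),
    "a(b+)?": re.compile(r"a(b+)?"),
}


def accepted(pattern, max_len=6):
    """Return every string over {a, b, c} up to max_len that the pattern fully matches."""
    return {
        "".join(chars)
        for n in range(max_len + 1)
        for chars in product("abc", repeat=n)
        if pattern.fullmatch("".join(chars))
    }


languages = {name: accepted(p) for name, p in BODIES.items()}
same = len(set(map(frozenset, languages.values()))) == 1
print(same)  # True
```

So rewriting `(b+)?` to `b*` is at least a meaning-preserving change, whether or not the engine actually does it.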