This is bug 2792 from the old Bugzilla, posted by firas. Perl used to allow \K in lookarounds, but it now throws an error. PCRE2 currently supports \K in positive lookarounds, and ignores it in negative ones. However, naive implementations can cause loops. After some discussion on the old list, the following was my (PH) conclusion:
I should have looked more closely at the code in pcre2demo. It has special code to deal with this case. Here is the comment:
/* If the previous match was not an empty string, there is one tricky case to
consider. If a pattern contains \K within a lookbehind assertion at the
start, the end of the matched string can be at the offset where the match
started. Without special action, this leads to a loop that keeps on matching
the same substring. We must detect this case and arrange to move the start on
by one character. The pcre2_get_startchar() function returns the starting
offset that was passed to pcre2_match(). */
OK, so now all is understood (pcre2test no doubt does the same). Perhaps the best thing to do here is to forbid \K in assertions, but to implement a new option in the PCRE2_EXTRA series to allow the current implementation. Then anyone who really needs the current behaviour can get it. We can put lots of warnings in the docs.
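To make the loop described in the pcre2demo comment concrete, here is a minimal, self-contained sketch. It does not call PCRE2. `toy_match_at()` is a hypothetical stand-in for `pcre2_match()` with a pattern such as `(?<=\Ka)`, and `toy_match`, `count_matches()`, `guard` and `max_iterations` are names invented for the illustration. The guard mirrors the check the comment describes (compare the next start offset with the value `pcre2_get_startchar()` would return), and it advances by one byte only, so it ignores UTF-8 sequences.

```c
#include <stddef.h>
#include <string.h>

/* Result of one toy match attempt. */
typedef struct {
    size_t start;      /* reported match start (like ovector[0]) */
    size_t end;        /* reported match end (like ovector[1]) */
    size_t startchar;  /* where the attempt began (like pcre2_get_startchar()) */
} toy_match;

/* Hypothetical stand-in for pcre2_match() with the pattern (?<=\Ka):
   succeed at the first position p >= start_offset that is preceded by 'a'.
   \K inside the lookbehind moves the reported start back to p - 1, so the
   reported end (p) equals the offset where the attempt started. */
static int toy_match_at(const char *subject, size_t length,
                        size_t start_offset, toy_match *m)
{
    for (size_t p = (start_offset > 0 ? start_offset : 1); p <= length; p++) {
        if (subject[p - 1] == 'a') {
            m->start = p - 1;
            m->end = p;
            m->startchar = p;
            return 1;
        }
    }
    return 0;
}

/* Count matches the way a "find all" loop would. With guard == 0 the loop
   restarts at the end of the previous match and keeps finding the same
   substring, so it is cut off after max_iterations. */
size_t count_matches(const char *subject, int guard, size_t max_iterations)
{
    size_t length = strlen(subject);
    size_t start_offset = 0;
    size_t count = 0;
    toy_match m;

    while (count < max_iterations &&
           toy_match_at(subject, length, start_offset, &m)) {
        count++;
        start_offset = m.end;  /* normal case: continue after the match */

        /* The tricky case: the match ended where the attempt started. */
        if (guard && start_offset <= m.startchar) {
            if (m.startchar >= length) break;  /* reached end of subject */
            start_offset = m.startchar + 1;    /* move on by one character */
        }
    }
    return count;
}
```

With the guard enabled, `count_matches("aab", 1, 100)` finds the two positions preceded by `a` and stops. With the guard disabled, the same call keeps reporting the first match until the iteration cap is hit.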
I have done the suggested modification: \K is locked out of lookaround assertions by default, but there is a new option called PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK that re-enables the previous behaviour.