Disable \K in lookarounds #4

PhilipHazel · 2021-08-23T14:01:18Z

This is bug 2792 from the old Bugzilla, posted by firas. Perl used to allow \K in lookarounds, but it now throws an error. PCRE2 currently supports \K in positive lookarounds and ignores it in negative ones. However, an application that naively iterates over successive matches can then loop forever on the same substring. After some discussion on the old list, the following was my (PH) conclusion:

I should have looked more closely at the code in pcre2demo. It has special code to deal with this case. Here is the comment:

/* If the previous match was not an empty string, there is one tricky case to
consider. If a pattern contains \K within a lookbehind assertion at the
start, the end of the matched string can be at the offset where the match
started. Without special action, this leads to a loop that keeps on matching
the same substring. We must detect this case and arrange to move the start on
by one character. The pcre2_get_startchar() function returns the starting
offset that was passed to pcre2_match(). */

OK, so now all is understood (pcre2test no doubt does the same). Perhaps the best thing to do here is to forbid \K in assertions, but to implement a new option in the PCRE2_EXTRA series to allow the current implementation. Then anyone who really needs the current behaviour can get it. We can put lots of warnings in the docs.
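
To illustrate the loop described in the pcre2demo comment above, here is a minimal sketch in plain C. It does not call PCRE2: `toy_match()` is a hypothetical stand-in that mimics what a pattern such as `(?<=\Ka)` would report, and the `startchar` value it fills in plays the role of what `pcre2_get_startchar()` returns. `count_matches()` and its `use_guard` and `max_iterations` parameters are also invented for illustration. The sketch ignores empty matches and multi-unit UTF characters, which a real match loop has to handle as well.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a matcher running (?<=\Ka); NOT the PCRE2 API.
   It looks for the first position at or after start_offset that is preceded
   by 'a'. Because \K sits inside the lookbehind, the reported match is
   [pos-1, pos], while the successful match attempt began at pos. */
static int toy_match(const char *subject, size_t length, size_t start_offset,
                     size_t ovector[2], size_t *startchar)
{
    for (size_t pos = start_offset; pos <= length; pos++) {
        if (pos > 0 && subject[pos - 1] == 'a') {
            ovector[0] = pos - 1;   /* match start, moved back by \K      */
            ovector[1] = pos;       /* match end == where the attempt began */
            *startchar = pos;
            return 1;
        }
    }
    return 0;                       /* no further match */
}

/* Iterate over all matches. Returns the number of matches found, or -1 if
   max_iterations was reached, which here signals a non-terminating loop. */
static int count_matches(const char *subject, int use_guard, int max_iterations)
{
    size_t length = strlen(subject);
    size_t ovector[2];
    size_t startchar;
    size_t start_offset = 0;
    int count = 0;

    while (toy_match(subject, length, start_offset, ovector, &startchar)) {
        if (++count >= max_iterations) return -1;

        /* Usual rule: resume at the end of the previous match. */
        start_offset = ovector[1];

        /* Guard: if that end is not beyond where the attempt began,
           resuming there would find the same match again, so step on. */
        if (use_guard && start_offset <= startchar) {
            if (startchar >= length) break;   /* reached end of subject */
            start_offset = startchar + 1;     /* advance by one character */
        }
    }
    return count;
}
```

With the subject `"banana"`, `count_matches("banana", 1, 100)` returns 3, one match per `a`. With the guard switched off, `count_matches("banana", 0, 100)` keeps finding the first `a` and returns -1 once the iteration cap is hit.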

PhilipHazel · 2021-08-30T16:01:56Z

I have done the suggested modification: \K is locked out of lookaround assertions by default, but there is a new option called PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK that re-enables the previous behaviour.

PhilipHazel self-assigned this Aug 23, 2021

PhilipHazel added the bug label Aug 23, 2021

PhilipHazel closed this as completed Aug 30, 2021

cmb69 mentioned this issue Oct 19, 2021

skip test with libpcre2 10.38 php/php-src#7588

Closed

nikic mentioned this issue Nov 23, 2021

Backwards incompatible change between 10.37 and 10.38 #56

Closed

rlohning mentioned this issue Jan 11, 2022

Heap buffer overflow #77

Closed

SolitaryGrass mentioned this issue May 31, 2023

internal_dfa_match, a stack overflow occurred due to recursive calls. #258

Closed
