Disable \K in lookarounds #4

Closed
PhilipHazel opened this issue Aug 23, 2021 · 1 comment

@PhilipHazel
Collaborator

This is bug 2792 from the old Bugzilla, posted by firas. Perl used to allow \K in lookarounds, but it now throws an error. PCRE2 currently supports \K in positive lookarounds, and ignores it in negative ones. However, a naive loop that searches for successive matches can then keep matching the same substring forever. After some discussion on the old list, the following was my (PH) conclusion:

I should have looked more closely at the code in pcre2demo. It has special code to deal with this case. Here is the comment:

/* If the previous match was not an empty string, there is one tricky case to
consider. If a pattern contains \K within a lookbehind assertion at the
start, the end of the matched string can be at the offset where the match
started. Without special action, this leads to a loop that keeps on matching
the same substring. We must detect this case and arrange to move the start on
by one character. The pcre2_get_startchar() function returns the starting
offset that was passed to pcre2_match(). */
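
As a rough illustration of the rule that comment describes, the following self-contained C sketch computes where the next match attempt should begin after a non-empty match. The helper name `next_start_offset` and its parameters are hypothetical and are not part of PCRE2; in real code `startchar` would come from pcre2_get_startchar() and `match_end` from ovector[1]. The sketch ignores the empty-match and UTF cases that pcre2demo also handles.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a PCRE2 function.
   startchar:      value reported by pcre2_get_startchar() for the last match
   match_end:      ovector[1] of the last (non-empty) match
   subject_length: length of the subject in code units
   Returns 1 and stores the next start offset in *next, or 0 when the
   scan should stop because no further start position exists. */
int next_start_offset(size_t startchar, size_t match_end,
                      size_t subject_length, size_t *next)
{
  assert(next != NULL);

  /* Normal case: the match ended beyond where it started, so resuming
     at its end always makes progress. */
  if (match_end > startchar)
    {
    *next = match_end;
    return 1;
    }

  /* Tricky case: \K inside a lookbehind moved the reported match so that
     it ends at (or before) the starting character. Resuming at match_end
     would find the same substring again, so step past startchar. */
  if (startchar >= subject_length) return 0;
  *next = startchar + 1;
  return 1;
}
```

For example, with the pattern (?<=\Ka) and the subject aaa under the old behaviour, the attempt that starts at offset 1 should report the substring from offset 0 to offset 1, so match_end equals startchar and the sketch resumes at offset 2 rather than retrying at offset 1.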

OK, so now all is understood (pcre2test no doubt does the same). Perhaps the best thing to do here is to forbid \K in assertions, but to implement a new option in the PCRE2_EXTRA series to allow the current implementation. Then anyone who really needs the current behaviour can get it. We can put lots of warnings in the docs.

@PhilipHazel PhilipHazel self-assigned this Aug 23, 2021
@PhilipHazel PhilipHazel added the bug Something isn't working label Aug 23, 2021
@PhilipHazel
Collaborator Author

I have done the suggested modification: \K is locked out of lookaround assertions by default, but there is a new option called PCRE2_EXTRA_ALLOW_LOOKAROUND_BSK that re-enables the previous behaviour.
