Matching issue introduced in last version of PCRE2 library ( {x} doesn't match, {X-1,X+1} does ) #82

Closed
ghost opened this issue Jan 24, 2022 · 3 comments


ghost commented Jan 24, 2022

This came up in another project and turned out to be an issue with the library itself, as it also fails with pcre2test. For the full discussion, see these reports:

firasdib/Regex101#1704 (comment)
php/php-src#7994

Basically, given the subject "aa", [a]{2} fails while [a]{1,3} works.

Sorry, I took a roundabout path to end up here, as I didn't realize this was a library everybody uses, lol.


carenas commented Jan 25, 2022

The issue was released with 10.34 and introduced in commit bf15267:
Author: Philip.Hazel
Date: Mon Sep 9 17:00:19 2019 +0000

Optimize classes such as [Aa] to be a single caseless character.

ChangeLog                |  6 +++-
src/pcre2_compile.c      | 85 ++++++++++++++++++++++++++++++++++++++----------
testdata/testinput10     |  2 ++
testdata/testinput12     |  2 ++
testdata/testinput2      |  2 ++
testdata/testoutput10    |  7 ++++
testdata/testoutput12-16 |  7 ++++
testdata/testoutput12-32 |  7 ++++
testdata/testoutput2     | 14 ++++++++
9 files changed, 114 insertions(+), 18 deletions(-)

It only affects matching when the last character in the class doesn't have the same case as the one that appears in the data, because it is registered as a literal (and not caseless):

$ printf "/[Aa]{2}/\naa\n" | pcre2test
PCRE2 version 10.39 2021-10-29
/[Aa]{2}/
aa
No match
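
For anyone who wants to repeat the check above against their own installation, here is a rough Python sketch that builds the same kind of pcre2test input and pipes it in on stdin, as the printf command does. The helper names are hypothetical, and the second pattern is my adaptation of the {1,3} variant from the original report rather than something taken from this thread. No expected output is shown because it depends on the installed PCRE2 version.

```python
import shutil
import subprocess

# Patterns to compare. The first is the reproduction from this thread; the
# second is an assumed adaptation of the {1,3} variant from the report.
CASES = [
    ("[Aa]{2}", ["aa"]),
    ("[Aa]{1,3}", ["aa"]),
]


def build_pcre2test_input(cases):
    """Build pcre2test input: a /pattern/ line followed by its subject lines.

    An empty line ends one pattern's subjects before the next pattern starts.
    """
    blocks = []
    for pattern, subjects in cases:
        lines = [f"/{pattern}/"] + list(subjects)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def run_pcre2test(script):
    """Feed the script to pcre2test on stdin; return None if it is not installed."""
    exe = shutil.which("pcre2test")
    if exe is None:
        return None
    result = subprocess.run(
        [exe], input=script, capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    output = run_pcre2test(build_pcre2test_input(CASES))
    print(output if output is not None else "pcre2test not found on PATH")
```

A "No match" under the first pattern with a match under the second would suggest the installed library still has the bug.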

@PhilipHazel
Collaborator

This was an optimization bug. [Aa] is turned into caseless 'A' but when it was last in a pattern, the "must have this character" optimization was not getting flagged as caseless. I have committed a small patch that fixes this. Thanks for the report and triage.
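
To illustrate the kind of failure described above, here is a toy Python model. It is not PCRE2's code: RequiredChar, passes_precheck and toy_match are hypothetical names, and Python's re module merely stands in for the real matcher. It only shows how a "must have this character" pre-check can reject a valid subject when the caseless flag is lost.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RequiredChar:
    """Hypothetical record of a character every matching subject must contain."""
    char: str
    caseless: bool


def passes_precheck(subject, required):
    """Cheap rejection test run before the real matcher."""
    if required.caseless:
        return required.char.lower() in subject.lower()
    return required.char in subject


def toy_match(subject, required):
    """Match [Aa]{2}, but only if the pre-check lets the subject through."""
    if not passes_precheck(subject, required):
        return False
    return re.search(r"[Aa]{2}", subject) is not None


# [Aa] collapsed to 'A'; the buggy record forgets that it is caseless.
buggy = RequiredChar("A", caseless=False)
fixed = RequiredChar("A", caseless=True)

print(toy_match("aa", buggy))  # False: rejected by the pre-check
print(toy_match("aa", fixed))  # True
print(toy_match("AA", buggy))  # True: the case happens to agree
```

In this model the subject "aa" is thrown out before the matcher ever runs, which is consistent with the observation that only subjects whose case differs from the recorded character are affected.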

@guilliamxavier

For the record: the fix is commit fdd9479 (released in version 10.40)
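
Putting the version numbers from this thread together (released with 10.34, fixed in 10.40), a minimal sketch of a version check could look like the following. The function names are hypothetical, and the result is only a hint: a distribution package may carry a backported fix under an older version number.

```python
import re

FIRST_AFFECTED = (10, 34)  # release that shipped the bug, per this thread
FIRST_FIXED = (10, 40)     # release that contains the fix, per this thread


def parse_version(banner):
    """Extract (major, minor) from a string such as 'PCRE2 version 10.39 2021-10-29'."""
    found = re.search(r"(\d+)\.(\d+)", banner)
    if found is None:
        raise ValueError(f"no version number in {banner!r}")
    return int(found.group(1)), int(found.group(2))


def is_affected(banner):
    """True if the version falls in the range 10.34 up to, but not including, 10.40."""
    return FIRST_AFFECTED <= parse_version(banner) < FIRST_FIXED


print(is_affected("PCRE2 version 10.39 2021-10-29"))  # True
print(is_affected("10.40"))                           # False
print(is_affected("10.33"))                           # False
```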
