Suppose I create a match pattern from the bytes 0xE6, 0xBC, 0xA2. Together these bytes form a single complete UTF-8 character.
I then compile the pattern with PARTIAL_HARD and MATCH_INVALID_UTF, and use the compiled pattern to match against the subject string consisting of only the first byte of the pattern: 0xE6.
I expect the match to give a partial match consisting of the entire subject string. Instead, it gives no match. Is this correct behavior?
MATCH_INVALID_UTF means (ironically) that anything in the subject that is not perfectly valid UTF is ignored, which is why you can't match against an incomplete UTF sequence such as the lone byte 0xE6.
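To see why the one-byte subject counts as invalid rather than as a valid prefix, here is a minimal sketch in plain C that checks whether a byte string stops partway through a UTF-8 sequence. The helper names `utf8_expected_len` and `utf8_ends_truncated` are hypothetical and are not part of PCRE2; the check looks only at lead bytes and does not validate continuation bytes, so it illustrates the idea and is not a full UTF-8 validator.

```c
#include <stddef.h>

/* Hypothetical helper (not a PCRE2 function): total length of the UTF-8
   sequence announced by a lead byte, or 0 if the byte cannot start one. */
size_t utf8_expected_len(unsigned char lead)
{
    if (lead < 0x80) return 1;                    /* 0xxxxxxx: ASCII      */
    if (lead >= 0xC2 && lead <= 0xDF) return 2;   /* 110xxxxx: 2 bytes    */
    if (lead >= 0xE0 && lead <= 0xEF) return 3;   /* 1110xxxx: 3 bytes    */
    if (lead >= 0xF0 && lead <= 0xF4) return 4;   /* 11110xxx: 4 bytes    */
    return 0;                  /* continuation byte or invalid lead byte  */
}

/* Hypothetical helper: returns 1 if the subject ends before the last
   sequence is complete, 0 otherwise. Continuation bytes are skipped
   without being validated; this is a simplification for illustration. */
int utf8_ends_truncated(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        size_t need = utf8_expected_len(s[i]);
        if (need == 0) return 0;        /* malformed, not merely truncated */
        if (need > len - i) return 1;   /* fewer bytes left than announced */
        i += need;
    }
    return 0;
}
```

With these helpers, 0xE6 announces a three-byte sequence, so a subject holding only that byte ends truncated, while the full 0xE6, 0xBC, 0xA2 does not.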
If you are not in UTF mode (that is, using neither PCRE2_UTF nor PCRE2_MATCH_INVALID_UTF), you can get the partial match:
$ pcre2test
PCRE2 version 10.42 2022-12-11
re> /e6 bc a2/hex
data> \xe6\=ph
Partial match: \xe6
Note that the hex modifier is used in pcre2test only to avoid the ambiguity of writing \x escapes in the pattern, so don't expect to use it in your own pattern string.