Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow NULL to be considered as "" if used together with a length of 0 #54

Closed
carenas opened this issue Nov 21, 2021 · 6 comments
Closed

Comments

@carenas
Copy link
Contributor

carenas commented Nov 21, 2021

I usually prefer to support ptr:NULL, length:0 for string manipulations (all string.h functions support it) but it looks like this is not the case for pcre2. I wouldn't mind supporting it though.

Originally posted by @zherczeg in #53 (comment)

@PhilipHazel
Copy link
Collaborator

The original intention for PCRE2 was that NULL always gives an error, whatever the length, for both patterns and subjects, whereas an empty string is of course a valid pattern and a valid subject. If NULL with zero length is supported as an empty subject, it should also be supported as a pattern, and as a replacement string in pcre2_substitute() (and any others I've overlooked) for consistency. Is it really worth doing this work? How much would such a feature be used?

@zherczeg
Copy link
Collaborator

In most system, don't supporting it requires effort, not supporting it :) Anyway, it is definitely not worth to spending much effort on it.

@ltrzesniewski
Copy link
Contributor

I brought that up because in C#, ReadOnlySpan<char>.Empty is a struct which contains a NULL pointer and a zero length, and PCRE2 didn't like that kind of input, which surprised me. I had to special-case this in PCRE.NET, and I thought maybe other people stumbled upon it as well.

I don't know if I'd even call that a feature, it just seemed like a small inconsistency to me. I can send a PR to support this later on if you deem it's worth the trouble.

@carenas
Copy link
Contributor Author

carenas commented Nov 21, 2021

The original intention for PCRE2 was that NULL always gives an error, whatever the length

I think this makes sense design wise, and the only interface that is currently missing AFAIK (other than the fixes needed so that we don't crash instead) is the use of NULL for the replacement string in pcre2_substitute, which is not documented/coded to return PCRE2_NULL_ERROR.

presume that since we will crash now instead, changing that wouldn't require changes to the API and is unlikely to get anyone surprised but also, since it was not documented to fail until now, and it just happened to work when NULL and 0 were used (for the replacement parameters) we will need to keep that as an exception anyway.

would be great if this could be tested after to avoid regressions, but I am not sure what would be the best way to do so, eventhough I was leaning towards a new modifier flag that will make pcre2test pass NULL instead as a subject, ideally we will need the same for patterns and replacement though, and that is where it gets confusing without adding a lot of weirdly specific flags.

@PhilipHazel
Copy link
Collaborator

I have found (and fixed, but not yet pushed) a couple more places where tests for NULL were needed. During this process, I discovered that pcre2_jit_match(), which is "fast path with no checks use at your own risk" facility does in fact treat NULL with length zero as an empty string. With this bit of extra evidence I have now decided that I will make other functions do the same for just that special case NULL+length=0. NULL with non-zero length will still give an error - pcre2_jit_match() crashes, but that's OK because it is documented not to do sanity checks.

@PhilipHazel
Copy link
Collaborator

Commit 4ef0c51 implements NULL/0 interpretation as an empty string for subjects and replacements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants