-
Notifications
You must be signed in to change notification settings - Fork 471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement array shapes for preg_match()
$matches
by-ref parameter
#2589
Conversation
@mvorisek please provide feedback and if you could come up with a few more test-cases, that would be awesome. whats the expected result for phpstan/phpstan#9502 ? |
memo to me: I need to fix |
@staabm It would be really great to have this implemented now |
I have this PR on my todo list and will rebase it in the near future. You can help by verifying existing test expectations of this PR or help explore missing tests/cases |
c08a944
to
0ce2624
Compare
39d2b3b
to
7ed47c4
Compare
This pull request has been marked as ready for review. |
In the end I combined both approaches. regex based retrieval of the capturing groups and the AST to identify how many non-optional groups are contained in the regex. since the type inference now requires a regex which is parsable by Hoa\Regex I commented some type-assertions which currently don't work because the regex parser cannot parse it. I think these can be fixed in the future by applying fixes to the I am neither a regex nor a grammer expert, therefore I left these things for other people more confident in this topic. In a future PR I would improve the types to be even more precise for e.g. |
|
||
function doNamedSubpattern(string $s): void { | ||
if (preg_match('/\w-(?P<num>\d+)-(\w)/', $s, $matches)) { | ||
// could be assertType('array{0: string, num: string, 1: string, 2: string, 3: string}', $matches); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we can see one example of a regex which is not parsable with the current hoa grammar:
Fatal error: Uncaught Hoa\Compiler\Llk\Parser::parse(): (0) Unexpected token "-" (range) at line 1 and column 4:
/\w-(?P<num>\d+)-(\w)/
↑
in /Users/staabm/workspace/phpstan-src/vendor/hoa/compiler/Llk/Parser.php at line 1.
thrown in /Users/staabm/workspace/phpstan-src/vendor/hoa/compiler/Llk/Parser.php on line 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two issues:
a) (?P<xxx>...
is equivalent of supported (?<xxx>...
- spec https://www.pcre.org/original/doc/html/pcrepattern.html (search for ˙?p<˙), grammar: https://github.com/hoaproject/Regex/blob/master/Source/Grammar.pp#L73
b) -
range - it is reserved character strictly if in [
character group only (otherwise it is regular chanracter and very often used)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Please provide code suggestions for fixes.
the example work in plain php, so we either need to fix the grammar or leave the typing for these cases behind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind forking hoaproject/Regex
and saving there a php test file I can run with all the issues you found?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
look into this PR here and find all commented assertions. these are the cases which hoa cannot parse.
I have spent the last the 3 days of freetime for your feature request. take your time to do the tests yourself. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh I see. I got the impression you are working on all these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can look into more cases later. This one was the most important. What about adding proper Hoa Regex unit testing to phpstan-src? We can assert the parsed regexes using https://github.com/mvorisek/Hoa-Regex/blob/0c8a5cbcf696df8f65168704af14ad5ebd48c641/test.php#L13-L25 function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, why not. Please send a PR to my branch/fork
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not super familiar with phpstan-src codebase - where, what class can I base this unit test on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would create a new file in tests/PHPStan/Type
. we can later on move it somewhere else if this doesn't work for ondrej.
-%token anchor \\(bBAZzG)|\^|\$ | ||
+%token anchor \\([bBAZzG])|\^|\$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unreleased hoa/regex upstream fix
hoaproject/Regex@5f670af
patches/Grammar.patch
Outdated
- ( range() | literal() )+ | ||
+ ( <class_> | range() | literal() )+ <range>? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unreleased hoa/regex upstream fix
hoaproject/Regex@ce7fd7b
hoaproject/Regex@e770ada
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2nd patch was reverted, see #2589 (comment)
-capturing: | ||
+#capturing: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upstream hoa fix
hoaproject/Regex#38 (comment)
This looks amazing, thanks @staabm. Now I'm just wondering how I can get it to recognize Composer\Pcre\Preg::match and co as well to get the same quality of type output there :s If you can think of some way to make this more reusable by third party libs wrapping preg functions it'd be great. |
at first I want to see what @ondrejmirtes thinks about all this. he was not yet involved. After that: I can think of different ways to make this re-usable. for wrapper-libs which are 100% api compatible with the pcre_* native functions we could e.g.
|
Ok cool thanks. Yes composer/pcre is almost one to one.. Except it throws instead of returning false. For the other api returning object results there it's a bit more complex but anyway I see your point and I'll let you wrap things up here first. Just wanted to let you know in case it can influence design decisions along the way but it's not urgent at all. |
I am not sure why we have these 7.4/7.2 remaining errors. are you fine with merging anyway? |
made the expectations more php version specific so we can move on and get a green build. we see some PHPStan extension builds failing because of the problem mentioned in #2589 (comment) |
What about the errors here? https://github.com/phpstan/phpstan-src/actions/runs/9610380632 I know you mentioned I don't know whether you solved that, based on the results probably not. My opinion is that we could hardcode this specifically in TypeSpecifier, so that extension is called correctly with the right context. |
if you have an idea how this need to look like, please go ahead and implement it. the only idea I have atm would be to add a AST visitor which detects a hopefully you have something better in mind :) |
What exactly is the problem, why we cannot check if the context/preg_match result is surely non-falsy? |
I see, thanks. should we have the same for |
@staabm Yeah, it's probably harmless, but do that in a next PR, not here :) |
preg_match()
$matches
by-ref parameter
Thank you! |
An extension can be as simple as this phpstan/phpstan-nette@3e68a5d |
the extension for |
I just noticed that this doesn't seem to play well with strict rules. This example without strict rules works: https://phpstan.org/r/6187bcca-8fc8-459b-9790-21a82e62b345 But when I enable strict rules, I need to cast the return value of Same thing happens when I cast to an |
@westonruter Please open a new bug report about this. |
Done: phpstan/phpstan#11262 |
extensions implemented for PHP-CS-Fixer |
initial support for |
I tried solving the shapes with Hoa Regex AST, but I couldn't find the information about capturing groups on the ast nodes.I don't see a way on how to identify where capturing groups are based on the AST.
therefore I tried the way suggested in phpstan/phpstan#9502 (comment)