Support posix character classes in preg_match() inference #3241

Closed

staabm
Contributor

@staabm staabm commented Jul 15, 2024

closes phpstan/phpstan#11323


  • check out the PR
  • run composer install, so the patch is applied
  • run the test with vendor/bin/phpunit tests/PHPStan/Analyser/NodeScopeResolverTest.php (this runs all preg_match() related tests)

for a faster debug loop, you might just put the test into a test.php file:

<?php

use function PHPStan\Testing\assertType;

function bug11323(string $s): void {
	if (preg_match('/([*|+?{}()]+)([^*|+[:digit:]?{}()]+)/', $s, $matches)) {
		assertType('array{string, string, string}', $matches);
	}
}

and run php bin/phpstan analyze test.php --debug on it
(append an additional --xdebug in case you want to step-debug)

@staabm staabm changed the base branch from 1.12.x to 1.11.x July 15, 2024 12:18
@phpstan-bot
Collaborator

You've opened the pull request against the latest branch 1.12.x. If your code is relevant on 1.11.x and you want it to be released sooner, please rebase your pull request and change its target to 1.11.x.

@Seldaek
Contributor

Seldaek commented Jul 15, 2024

Ok I'll try to debug this, thanks. Might not have time before later this week though.

@Seldaek
Contributor

Seldaek commented Jul 15, 2024

Just adding %token posix_class \[^?:[a-z]+:\] (and not the class_literal line, because that breaks everything) and then using ( <posix_class> | <class_> | range() | literal() | quantifier() | <alternation> | <capturing_> | <_capturing> )+ <range>? to parse the class contents, it gets almost to the end:

Unexpected token ")" (_capturing) at line 1 and column 34:
/([*|+?{}()]+)([^*|+[:digit:]?{}()]+)/
                                 ↑

Unfortunately I can't get it to capture the last ), and I have no clue why it fails there.

This is also super brittle: I can break it by adding [(?:] for example, or any other token, to the character class, which then gets matched as a token and results in:

Unexpected token "(?:" (non_capturing_) at line 1 and column 10:
/([*|+?{}(?:)]+)([^*|+[:digit:]?{}()]+)/
         ↑
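
For illustration only (this is not code from the PR, PHPStan, or hoa/compiler): a minimal sketch of what a context-aware scan of a character class body could look like, assuming that inside a class only POSIX classes are special and everything else is a literal. The function name scanClassBody and the token kinds are hypothetical, and escapes, ranges and the leading ^ are deliberately ignored.

```php
<?php

/**
 * Hypothetical helper, not part of PHPStan or hoa/compiler: splits the body of
 * a regex character class into POSIX classes and single literal characters.
 *
 * @return list<array{string, string}> pairs of [kind, text]
 */
function scanClassBody(string $body): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($body);

    while ($offset < $length) {
        // Assumption: only POSIX classes such as [:digit:] or [:^alpha:] are special here.
        if (preg_match('/\G\[:\^?[a-z]+:\]/', $body, $m, 0, $offset) === 1) {
            $tokens[] = ['posix_class', $m[0]];
            $offset += strlen($m[0]);
            continue;
        }

        // Everything else, including "(", "?" and ":", is treated as a plain literal.
        $tokens[] = ['literal', $body[$offset]];
        $offset++;
    }

    return $tokens;
}

// Class body taken from the failing pattern above, with "(?:" added and the leading "^" removed.
$tokens = scanClassBody('*|+[:digit:]?{}(?:)');
```

With a scan like this, "(?:" inside the class is never offered to the non_capturing_ token, which is the behaviour the generic lexer does not provide out of the box.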

@Seldaek
Contributor

Seldaek commented Jul 15, 2024

@mvorisek are you familiar with this parser? Is there a syntax to turn this silly scanning-of-all-tokens off and tell it to not look at other tokens while it is in the middle of parsing a class section?

@Seldaek
Contributor

Seldaek commented Jul 15, 2024

I guess I'll need to read up on https://github.com/hoaproject/Compiler, but I have no time left for this today.

@staabm
Contributor Author

staabm commented Jul 15, 2024

There seems to be something like documentation here

@mvorisek
Contributor

> @mvorisek are you familiar with this parser? Is there a syntax to turn this silly scanning-of-all-tokens off and tell it to not look at other tokens while it is in the middle of parsing a class section?

Of course class_literal needs to go before class_; more precisely, class_ needs to be removed. If that is not possible, the only option is to list "all" tokens, see phpstan/phpstan#11323 (comment) and my following comment.
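
To illustrate why declaration order matters (a sketch, not the hoa/compiler implementation): the example below assumes a lexer that tries token patterns in array order and takes the first match. The function lexFirstMatch is hypothetical; the token names posix_class and class_ are borrowed from this thread, but their regexes here are simplified assumptions.

```php
<?php

/**
 * Hypothetical first-match lexer, used only for illustration.
 *
 * @param array<string, string> $rules token name => regex body, tried in array order
 * @return list<string> "name:text" entries
 */
function lexFirstMatch(array $rules, string $input): array
{
    $out = [];
    $offset = 0;

    while ($offset < strlen($input)) {
        foreach ($rules as $name => $regex) {
            // \G anchors the match at the current offset; the first rule that matches wins.
            if (preg_match('~\G(?:' . $regex . ')~', $input, $m, 0, $offset) === 1 && $m[0] !== '') {
                $out[] = $name . ':' . $m[0];
                $offset += strlen($m[0]);
                continue 2;
            }
        }

        throw new RuntimeException("Unrecognized input at offset $offset");
    }

    return $out;
}

// posix_class declared before class_: "[:digit:]" is kept as one token.
$posixFirst = lexFirstMatch(
    ['posix_class' => '\[:\^?[a-z]+:\]', 'class_' => '\[', 'literal' => '.'],
    '[[:digit:]'
);

// class_ declared first: the "[" of "[:digit:]" is consumed as a class opener.
$classFirst = lexFirstMatch(
    ['class_' => '\[', 'posix_class' => '\[:\^?[a-z]+:\]', 'literal' => '.'],
    '[[:digit:]'
);
```

Under this assumption, $posixFirst contains a single posix_class token after the opening bracket, while $classFirst splits the same text into a second class_ token followed by single-character literals.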

@staabm
Contributor Author

staabm commented Jul 17, 2024

As far as I understand from the other parallel PRs, current progress is happening in #3244 and this one is obsolete.

Therefore closing.

@staabm staabm closed this Jul 17, 2024
@staabm staabm deleted the bug11323 branch July 17, 2024 16:12
Development

Successfully merging this pull request may close these issues.

preg_match missing array shape for pattern
4 participants