Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHP 8.1: fix retokenization of "&" character #3411

Merged

Conversation

jrfnl
Copy link
Contributor

@jrfnl jrfnl commented Aug 16, 2021

In PHP < 8.1, the ampersand was tokenized as a simple token, basically just a plain string "&".
As of PHP 8.1, due to the introduction of intersection types, PHP is introducing two new tokens for the ampersand.

This PR proposes to "undo" the new PHP 8.1 tokenization of the ampersand in favour of the pre-existing tokenization of the character as T_BITWISE_AND as it has been in PHPCS since forever.

This fixes a lot of current test failures on PHP 8.1, though not all, as there are more tokenizer changes which still need to be accounted for.

We may want to consider adding an extra 'is_reference' array key index to the token array for these tokens, which would allow the File::isReference() method to resolve tokens on PHP 8.1 much more quickly and more easily.

We also may want to have a think about whether we want to move to the PHP 8.1 tokenization in PHPCS 4.x. All the same, this PR should not be held back by a decision like that as, for now, it just needs to be fixed for PHPCS 3.x.

Props also to @afilina (Anna Filina) for doing this together with me during pair programming.

Updated
Includes taking the new tokens into account for the "next token after a function keyword token should be a T_STRING" logic.

This change is already covered extensively by the tests for the File::isReference() method, though that method will need updating for PHP 8.1 intersection types, just like the File::getMethodParameters() method will need adjusting too.

This PR, in combination with PR #3400, fixes all current test failures on PHP 8.1.

@jrfnl
Copy link
Contributor Author

jrfnl commented Aug 17, 2021

Related PHP RFC link: https://wiki.php.net/rfc/pure-intersection-types and the related PR: php/php-src#6799

@jrfnl jrfnl force-pushed the feature/php-8.1-undo-new-reference-token branch from b0eeca7 to 9f01041 Compare August 17, 2021 03:30
In PHP < 8.1, the ampersand was tokenized as a simple token, basically just a plain string "&".
As of PHP 8.1, due to the introduction of intersection types, PHP is introducing two new tokens for the ampersand.

This PR proposes to "undo" the new PHP 8.1 tokenization of the ampersand in favour of the pre-existing tokenization of the character as `T_BITWISE_AND` as it has been in PHPCS since forever.

Includes taking the new tokens into account for the "next token after a function keyword token should be a `T_STRING`" logic.

This change is already covered extensively by the tests for the `File::isReference()` method, though that method will need updating for PHP 8.1 intersection types, just like the `File::getMethodParameters()` method will need adjusting too.

This PR, in combination with PR 3400, fixes all current test failures on PHP 8.1.

We may want to consider adding an extra `'is_reference'` array key index to the token array for these tokens, which would allow the `File::isReference()` method to resolve tokens on PHP 8.1 much more quickly and more easily.

We also may want to have a think about whether we want to move to the PHP 8.1 tokenization in PHPCS 4.x. All the same, this PR should not be held back by a decision like that as, for now, it just needs to be fixed for PHPCS 3.x.
@jrfnl jrfnl force-pushed the feature/php-8.1-undo-new-reference-token branch from 9f01041 to 7c5b4e6 Compare August 17, 2021 03:39
@gsherwood gsherwood added this to the 3.6.1 milestone Aug 17, 2021
gsherwood added a commit that referenced this pull request Aug 26, 2021
@gsherwood gsherwood merged commit 9736ccd into squizlabs:master Aug 26, 2021
@gsherwood
Copy link
Member

Thank you @jrfnl and @afilina

@jrfnl jrfnl deleted the feature/php-8.1-undo-new-reference-token branch August 26, 2021 23:28
@jrfnl
Copy link
Contributor Author

jrfnl commented Aug 26, 2021

We may want to consider adding an extra 'is_reference' array key index to the token array for these tokens, which would allow the File::isReference() method to resolve tokens on PHP 8.1 much more quickly and more easily.

We also may want to have a think about whether we want to move to the PHP 8.1 tokenization in PHPCS 4.x.

@gsherwood Got an opinion about the above suggestions/questions ? Or would you like me to open a new issue for this to discuss this further at a later date ?

@gsherwood
Copy link
Member

Adding and is_reference array index to the token is ok as long as it gets filled out for all PHP versions, but that forces an isReference check for every ampersand and will waste a lot of time. It could be set only for PHP 8.1, but I don't love the idea of an optional array index.

I'm not planning on supporting the PHP 8.1 tokens in version 4 yet. I'll have to look at the BC breaks that are in there to see if it's worth it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants