Skip to content

Commit

Permalink
PHP 8.0 | Tokenizer/PHP: stabilize comment tokenization
Browse files Browse the repository at this point in the history
As described in issue 3002, in PHP 8 a trailing new line is no longer included in a `T_COMMENT` token.

This commit "forward-fills" the PHP 5/7 tokenization of `T_COMMENT` tokens to PHP 8.

Includes extensive unit tests. I'm hoping to have caught everything affected :fingers_crossed:

The initial set of unit tests `StableCommentWhitespaceTest` use Linux line endings `\n`.
The secondary set of unit tests `StableCommentWhitespaceWinTest` use Windows line endings `\r\n` to test that the fix is stable for files using different line ending.

For the tests with Windows line endings, both the test case file as well as the actual test file have been set up to use Windows line endings for all lines, not just the test data lines, to make it simpler to manage the line endings for the files.

The test file has been excluded from the line endings CS check for that reason and a directive has been added to the `.gitattributes` file to safeguard that the line endings of those files will remain Windows line endings.

Fixes 3002
  • Loading branch information
jrfnl committed Jul 20, 2020
1 parent 802a514 commit ee91c16
Show file tree
Hide file tree
Showing 8 changed files with 1,444 additions and 1 deletion.
4 changes: 4 additions & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,7 @@ package.xml export-ignore
phpunit.xml.dist export-ignore
php5-testingConfig.ini export-ignore
php7-testingConfig.ini export-ignore

# Declare files that should always have CRLF line endings on checkout.
*WinTest.inc text eol=crlf
*WinTest.php text eol=crlf
12 changes: 12 additions & 0 deletions package.xml
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,10 @@ http://pear.php.net/dtd/package-2.0.xsd">
<file baseinstalldir="" name="BackfillNumericSeparatorTest.php" role="test" />
<file baseinstalldir="" name="ShortArrayTest.inc" role="test" />
<file baseinstalldir="" name="ShortArrayTest.php" role="test" />
<file baseinstalldir="" name="StableCommentWhitespaceTest.inc" role="test" />
<file baseinstalldir="" name="StableCommentWhitespaceTest.php" role="test" />
<file baseinstalldir="" name="StableCommentWhitespaceWinTest.inc" role="test" />
<file baseinstalldir="" name="StableCommentWhitespaceWinTest.php" role="test" />
</dir>
<file baseinstalldir="" name="AbstractMethodUnitTest.php" role="test" />
<file baseinstalldir="" name="AllTests.php" role="test" />
Expand Down Expand Up @@ -1985,6 +1989,10 @@ http://pear.php.net/dtd/package-2.0.xsd">
<install as="CodeSniffer/Core/Tokenizer/BackfillNumericSeparatorTest.inc" name="tests/Core/Tokenizer/BackfillNumericSeparatorTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.php" name="tests/Core/Tokenizer/ShortArrayTest.php" />
<install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.inc" name="tests/Core/Tokenizer/ShortArrayTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.php" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.php" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" />
<install as="CodeSniffer/Standards/AllSniffs.php" name="tests/Standards/AllSniffs.php" />
<install as="CodeSniffer/Standards/AbstractSniffUnitTest.php" name="tests/Standards/AbstractSniffUnitTest.php" />
</filelist>
Expand Down Expand Up @@ -2040,6 +2048,10 @@ http://pear.php.net/dtd/package-2.0.xsd">
<install as="CodeSniffer/Core/Tokenizer/BackfillNumericSeparatorTest.inc" name="tests/Core/Tokenizer/BackfillNumericSeparatorTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.php" name="tests/Core/Tokenizer/ShortArrayTest.php" />
<install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.inc" name="tests/Core/Tokenizer/ShortArrayTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.php" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.inc" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.php" />
<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" />
<install as="CodeSniffer/Standards/AllSniffs.php" name="tests/Standards/AllSniffs.php" />
<install as="CodeSniffer/Standards/AbstractSniffUnitTest.php" name="tests/Standards/AbstractSniffUnitTest.php" />
<ignore name="bin/phpcs.bat" />
Expand Down
7 changes: 6 additions & 1 deletion phpcs.xml.dist
Original file line number Diff line number Diff line change
Expand Up @@ -143,7 +143,12 @@

<!-- The testing bootstrap file uses string concats to stop IDEs seeing the class aliases -->
<rule ref="Generic.Strings.UnnecessaryStringConcat">
<exclude-pattern>tests/bootstrap.php</exclude-pattern>
<exclude-pattern>tests/bootstrap\.php</exclude-pattern>
</rule>

<!-- This test file specifically *needs* Windows line endings for testing purposes. -->
<rule ref="Generic.Files.LineEndings.InvalidEOLChar">
<exclude-pattern>tests/Core/Tokenizer/StableCommentWhitespaceWinTest\.php</exclude-pattern>
</rule>

</ruleset>
41 changes: 41 additions & 0 deletions src/Tokenizers/PHP.php
Original file line number Diff line number Diff line change
Expand Up @@ -566,6 +566,47 @@ protected function tokenize($string)
continue;
}

/*
PHP 8 tokenizes a new line after a slash comment to the next whitespace token.
*/

if (PHP_VERSION_ID >= 80000
&& $tokenIsArray === true
&& ($token[0] === T_COMMENT && strpos($token[1], '//') === 0)
&& isset($tokens[($stackPtr + 1)]) === true
&& is_array($tokens[($stackPtr + 1)]) === true
&& $tokens[($stackPtr + 1)][0] === T_WHITESPACE
) {
$nextToken = $tokens[($stackPtr + 1)];

// If the next token is a single new line, merge it into the comment token
// and set to it up to be skipped.
if ($nextToken[1] === "\n" || $nextToken[1] === "\r\n" || $nextToken[1] === "\n\r") {
$token[1] .= $nextToken[1];
$tokens[($stackPtr + 1)] = null;

if (PHP_CODESNIFFER_VERBOSITY > 1) {
echo "\t\t* merged newline after comment into comment token $stackPtr".PHP_EOL;
}
} else {
// This may be a whitespace token consisting of multiple new lines.
if (strpos($nextToken[1], "\r\n") === 0) {
$token[1] .= "\r\n";
$tokens[($stackPtr + 1)][1] = substr($nextToken[1], 2);
} else if (strpos($nextToken[1], "\n\r") === 0) {
$token[1] .= "\n\r";
$tokens[($stackPtr + 1)][1] = substr($nextToken[1], 2);
} else if (strpos($nextToken[1], "\n") === 0) {
$token[1] .= "\n";
$tokens[($stackPtr + 1)][1] = substr($nextToken[1], 1);
}

if (PHP_CODESNIFFER_VERBOSITY > 1) {
echo "\t\t* stripped first newline after comment and added it to comment token $stackPtr".PHP_EOL;
}
}//end if
}//end if

/*
If this is a double quoted string, PHP will tokenize the whole
thing which causes problems with the scope map when braces are
Expand Down
119 changes: 119 additions & 0 deletions tests/Core/Tokenizer/StableCommentWhitespaceTest.inc
Original file line number Diff line number Diff line change
@@ -0,0 +1,119 @@
<?php

/* testSingleLineSlashComment */
// Comment

/* testSingleLineSlashCommentTrailing */
echo 'a'; // Comment

/* testSingleLineSlashAnnotation */
// phpcs:disable Stnd.Cat

/* testMultiLineSlashComment */
// Comment1
// Comment2
// Comment3

/* testMultiLineSlashCommentWithIndent */
// Comment1
// Comment2
// Comment3

/* testMultiLineSlashCommentWithAnnotationStart */
// phpcs:ignore Stnd.Cat
// Comment2
// Comment3

/* testMultiLineSlashCommentWithAnnotationMiddle */
// Comment1
// @phpcs:ignore Stnd.Cat
// Comment3

/* testMultiLineSlashCommentWithAnnotationEnd */
// Comment1
// Comment2
// phpcs:ignore Stnd.Cat


/* testSingleLineStarComment */
/* Single line star comment */

/* testSingleLineStarCommentTrailing */
echo 'a'; /* Comment */

/* testSingleLineStarAnnotation */
/* phpcs:ignore Stnd.Cat */

/* testMultiLineStarComment */
/* Comment1
* Comment2
* Comment3 */

/* testMultiLineStarCommentWithIndent */
/* Comment1
* Comment2
* Comment3 */

/* testMultiLineStarCommentWithAnnotationStart */
/* @phpcs:ignore Stnd.Cat
* Comment2
* Comment3 */

/* testMultiLineStarCommentWithAnnotationMiddle */
/* Comment1
* phpcs:ignore Stnd.Cat
* Comment3 */

/* testMultiLineStarCommentWithAnnotationEnd */
/* Comment1
* Comment2
* phpcs:ignore Stnd.Cat */


/* testSingleLineDocblockComment */
/** Comment */

/* testSingleLineDocblockCommentTrailing */
$prop = 123; /** Comment */

/* testSingleLineDocblockAnnotation */
/** phpcs:ignore Stnd.Cat.Sniff */

/* testMultiLineDocblockComment */
/**
* Comment1
* Comment2
*
* @tag Comment
*/

/* testMultiLineDocblockCommentWithIndent */
/**
* Comment1
* Comment2
*
* @tag Comment
*/

/* testMultiLineDocblockCommentWithAnnotation */
/**
* Comment
*
* phpcs:ignore Stnd.Cat
* @tag Comment
*/

/* testMultiLineDocblockCommentWithTagAnnotation */
/**
* Comment
*
* @phpcs:ignore Stnd.Cat
* @tag Comment
*/

/* testSingleLineSlashCommentNoNewLineAtEnd */
// Slash ?>
<?php

/* testCommentAtEndOfFile */
/* Comment
Loading

0 comments on commit ee91c16

Please sign in to comment.