Commit ee91c16

PHP 8.0 | Tokenizer/PHP: stabilize comment tokenization
As described in issue 3002, in PHP 8 a trailing new line is no longer included in a `T_COMMENT` token.

This commit "forward-fills" the PHP 5/7 tokenization of `T_COMMENT` tokens to PHP 8.

Includes extensive unit tests. I'm hoping to have caught everything affected :fingers_crossed:

The initial set of unit tests, `StableCommentWhitespaceTest`, uses Linux line endings `\n`.
The secondary set of unit tests, `StableCommentWhitespaceWinTest`, uses Windows line endings `\r\n` to test that the fix is stable for files using different line endings.

For the tests with Windows line endings, both the test case file and the actual test file have been set up to use Windows line endings for all lines, not just the test data lines, to make the line endings of those files simpler to manage.

The test file has been excluded from the line endings CS check for that reason, and a directive has been added to the `.gitattributes` file to safeguard that the line endings of those files remain Windows line endings.

Fixes 3002
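The tokenization difference described above can be observed directly with PHP's built-in `token_get_all()`. The following is a minimal sketch, not part of the commit; the helper name `dumpTokens` and the sample source string are assumptions made for illustration.

```php
<?php
// Illustrative helper (not part of the commit): describe each token as
// "NAME: content", with new lines made visible via JSON escaping.
function dumpTokens(string $source): array
{
    $lines = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token) === true) {
            $lines[] = token_name($token[0]).': '.json_encode($token[1], JSON_UNESCAPED_SLASHES);
        } else {
            // Single-character tokens such as ";" are plain strings.
            $lines[] = 'CHAR: '.json_encode($token);
        }
    }

    return $lines;
}

$source = "<?php\n// Comment\necho 'a';\n";
echo implode(PHP_EOL, dumpTokens($source)).PHP_EOL;
```

On PHP 5/7 the second line of output should read `T_COMMENT: "// Comment\n"`. On PHP 8 the comment token stops before the new line, which then shows up as a separate `T_WHITESPACE` token.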
1 parent 802a514 commit ee91c16

8 files changed: +1444 -1 lines changed

.gitattributes

Lines changed: 4 additions & 0 deletions
@@ -3,3 +3,7 @@ package.xml export-ignore
 phpunit.xml.dist export-ignore
 php5-testingConfig.ini export-ignore
 php7-testingConfig.ini export-ignore
+
+# Declare files that should always have CRLF line endings on checkout.
+*WinTest.inc text eol=crlf
+*WinTest.php text eol=crlf
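The `eol=crlf` directive matters because the Windows variant of the test is only meaningful while its files really contain `\r\n` throughout. A rough way to sanity-check that is sketched below; the helper `usesOnlyCrlf` is hypothetical and not part of the commit.

```php
<?php
// Hypothetical helper (not part of the commit): true when every line
// break in $contents is a CRLF pair.
function usesOnlyCrlf(string $contents): bool
{
    // A "\n" not preceded by "\r", or a "\r" not followed by "\n",
    // means a non-Windows line ending slipped in.
    return preg_match('/(?<!\r)\n|\r(?!\n)/', $contents) === 0;
}

var_dump(usesOnlyCrlf("<?php\r\n// Comment\r\n")); // bool(true)
var_dump(usesOnlyCrlf("<?php\n// Comment\r\n"));   // bool(false)
```

In practice the string would come from `file_get_contents()` on one of the `*WinTest` files after checkout.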

package.xml

Lines changed: 12 additions & 0 deletions
@@ -119,6 +119,10 @@ http://pear.php.net/dtd/package-2.0.xsd">
 <file baseinstalldir="" name="BackfillNumericSeparatorTest.php" role="test" />
 <file baseinstalldir="" name="ShortArrayTest.inc" role="test" />
 <file baseinstalldir="" name="ShortArrayTest.php" role="test" />
+<file baseinstalldir="" name="StableCommentWhitespaceTest.inc" role="test" />
+<file baseinstalldir="" name="StableCommentWhitespaceTest.php" role="test" />
+<file baseinstalldir="" name="StableCommentWhitespaceWinTest.inc" role="test" />
+<file baseinstalldir="" name="StableCommentWhitespaceWinTest.php" role="test" />
 </dir>
 <file baseinstalldir="" name="AbstractMethodUnitTest.php" role="test" />
 <file baseinstalldir="" name="AllTests.php" role="test" />
@@ -1985,6 +1989,10 @@ http://pear.php.net/dtd/package-2.0.xsd">
 <install as="CodeSniffer/Core/Tokenizer/BackfillNumericSeparatorTest.inc" name="tests/Core/Tokenizer/BackfillNumericSeparatorTest.inc" />
 <install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.php" name="tests/Core/Tokenizer/ShortArrayTest.php" />
 <install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.inc" name="tests/Core/Tokenizer/ShortArrayTest.inc" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.php" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.inc" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.php" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" />
 <install as="CodeSniffer/Standards/AllSniffs.php" name="tests/Standards/AllSniffs.php" />
 <install as="CodeSniffer/Standards/AbstractSniffUnitTest.php" name="tests/Standards/AbstractSniffUnitTest.php" />
 </filelist>
@@ -2040,6 +2048,10 @@ http://pear.php.net/dtd/package-2.0.xsd">
 <install as="CodeSniffer/Core/Tokenizer/BackfillNumericSeparatorTest.inc" name="tests/Core/Tokenizer/BackfillNumericSeparatorTest.inc" />
 <install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.php" name="tests/Core/Tokenizer/ShortArrayTest.php" />
 <install as="CodeSniffer/Core/Tokenizer/ShortArrayTest.inc" name="tests/Core/Tokenizer/ShortArrayTest.inc" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.php" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceTest.inc" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.php" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.php" />
+<install as="CodeSniffer/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" name="tests/Core/Tokenizer/StableCommentWhitespaceWinTest.inc" />
 <install as="CodeSniffer/Standards/AllSniffs.php" name="tests/Standards/AllSniffs.php" />
 <install as="CodeSniffer/Standards/AbstractSniffUnitTest.php" name="tests/Standards/AbstractSniffUnitTest.php" />
 <ignore name="bin/phpcs.bat" />

phpcs.xml.dist

Lines changed: 6 additions & 1 deletion
@@ -143,7 +143,12 @@
 
 <!-- The testing bootstrap file uses string concats to stop IDEs seeing the class aliases -->
 <rule ref="Generic.Strings.UnnecessaryStringConcat">
-<exclude-pattern>tests/bootstrap.php</exclude-pattern>
+<exclude-pattern>tests/bootstrap\.php</exclude-pattern>
+</rule>
+
+<!-- This test file specifically *needs* Windows line endings for testing purposes. -->
+<rule ref="Generic.Files.LineEndings.InvalidEOLChar">
+<exclude-pattern>tests/Core/Tokenizer/StableCommentWhitespaceWinTest\.php</exclude-pattern>
 </rule>
 
 </ruleset>
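The change from `bootstrap.php` to `bootstrap\.php` only makes a difference if exclude patterns are applied as regular expressions, which is what the escaped dot suggests. A small sketch of the effect under that assumption follows; the helper `patternMatches` and the sample paths are hypothetical, and the real matching logic in PHP_CodeSniffer may differ in its details.

```php
<?php
// Hypothetical helper: treat an exclude pattern as a case-insensitive
// regular expression and test it against a file path.
function patternMatches(string $pattern, string $path): bool
{
    return preg_match('`'.$pattern.'`i', $path) === 1;
}

// An unescaped dot matches any character, so an unrelated path is excluded too.
var_dump(patternMatches('tests/bootstrap.php', 'tests/bootstrap-php/file.php'));  // bool(true)

// With the dot escaped, only a literal "." matches.
var_dump(patternMatches('tests/bootstrap\.php', 'tests/bootstrap-php/file.php')); // bool(false)
var_dump(patternMatches('tests/bootstrap\.php', 'tests/bootstrap.php'));          // bool(true)
```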

src/Tokenizers/PHP.php

Lines changed: 41 additions & 0 deletions
@@ -566,6 +566,47 @@ protected function tokenize($string)
                 continue;
             }
 
+            /*
+                PHP 8 tokenizes a new line after a slash comment to the next whitespace token.
+            */
+
+            if (PHP_VERSION_ID >= 80000
+                && $tokenIsArray === true
+                && ($token[0] === T_COMMENT && strpos($token[1], '//') === 0)
+                && isset($tokens[($stackPtr + 1)]) === true
+                && is_array($tokens[($stackPtr + 1)]) === true
+                && $tokens[($stackPtr + 1)][0] === T_WHITESPACE
+            ) {
+                $nextToken = $tokens[($stackPtr + 1)];
+
+                // If the next token is a single new line, merge it into the comment token
+                // and set it up to be skipped.
+                if ($nextToken[1] === "\n" || $nextToken[1] === "\r\n" || $nextToken[1] === "\n\r") {
+                    $token[1] .= $nextToken[1];
+                    $tokens[($stackPtr + 1)] = null;
+
+                    if (PHP_CODESNIFFER_VERBOSITY > 1) {
+                        echo "\t\t* merged newline after comment into comment token $stackPtr".PHP_EOL;
+                    }
+                } else {
+                    // This may be a whitespace token consisting of multiple new lines.
+                    if (strpos($nextToken[1], "\r\n") === 0) {
+                        $token[1] .= "\r\n";
+                        $tokens[($stackPtr + 1)][1] = substr($nextToken[1], 2);
+                    } else if (strpos($nextToken[1], "\n\r") === 0) {
+                        $token[1] .= "\n\r";
+                        $tokens[($stackPtr + 1)][1] = substr($nextToken[1], 2);
+                    } else if (strpos($nextToken[1], "\n") === 0) {
+                        $token[1] .= "\n";
+                        $tokens[($stackPtr + 1)][1] = substr($nextToken[1], 1);
+                    }
+
+                    if (PHP_CODESNIFFER_VERBOSITY > 1) {
+                        echo "\t\t* stripped first newline after comment and added it to comment token $stackPtr".PHP_EOL;
+                    }
+                }//end if
+            }//end if
+
             /*
                 If this is a double quoted string, PHP will tokenize the whole
                 thing which causes problems with the scope map when braces are
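The merge step in the hunk above can also be tried outside the tokenizer. The sketch below is a standalone approximation, not the PHP_CodeSniffer code: the function name `mergeNewlineIntoSlashComments` is hypothetical, it works on the raw output of `token_get_all()`, and instead of checking `PHP_VERSION_ID` it skips comments that already end in a new line so that it behaves the same on PHP 7 and PHP 8.

```php
<?php
// Illustrative, standalone version of the merge step (an assumption-laden
// sketch, not the code from the commit).
function mergeNewlineIntoSlashComments(array $tokens): array
{
    $count = count($tokens);
    for ($i = 0; $i < ($count - 1); $i++) {
        $token = $tokens[$i];
        $next  = $tokens[($i + 1)];

        // Only "//" comments followed by a whitespace token are of interest.
        if (is_array($token) === false
            || $token[0] !== T_COMMENT
            || strpos($token[1], '//') !== 0
            || is_array($next) === false
            || $next[0] !== T_WHITESPACE
        ) {
            continue;
        }

        // PHP < 8 already includes the new line in the comment token.
        $lastChar = substr($token[1], -1);
        if ($lastChar === "\n" || $lastChar === "\r") {
            continue;
        }

        // Move the first new line sequence from the whitespace token
        // into the comment token, checking "\r\n" before "\n".
        if (preg_match('/^(\r\n|\n\r|\n)/', $next[1], $match) === 1) {
            $tokens[$i][1]      .= $match[1];
            $tokens[($i + 1)][1] = substr($next[1], strlen($match[1]));
        }
    }

    // Drop whitespace tokens that were emptied by the merge.
    return array_values(
        array_filter(
            $tokens,
            function ($token) {
                return is_array($token) === false || $token[1] !== '';
            }
        )
    );
}

$tokens = mergeNewlineIntoSlashComments(token_get_all("<?php\n// Comment\n\necho 'a';\n"));
var_dump($tokens[1][1] === "// Comment\n");
```

The commit itself sets the emptied token to `null` and lets the tokenizer loop skip it; filtering the array here is simply the easiest equivalent for a standalone example.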
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+<?php
+
+/* testSingleLineSlashComment */
+// Comment
+
+/* testSingleLineSlashCommentTrailing */
+echo 'a'; // Comment
+
+/* testSingleLineSlashAnnotation */
+// phpcs:disable Stnd.Cat
+
+/* testMultiLineSlashComment */
+// Comment1
+// Comment2
+// Comment3
+
+/* testMultiLineSlashCommentWithIndent */
+// Comment1
+// Comment2
+// Comment3
+
+/* testMultiLineSlashCommentWithAnnotationStart */
+// phpcs:ignore Stnd.Cat
+// Comment2
+// Comment3
+
+/* testMultiLineSlashCommentWithAnnotationMiddle */
+// Comment1
+// @phpcs:ignore Stnd.Cat
+// Comment3
+
+/* testMultiLineSlashCommentWithAnnotationEnd */
+// Comment1
+// Comment2
+// phpcs:ignore Stnd.Cat
+
+
+/* testSingleLineStarComment */
+/* Single line star comment */
+
+/* testSingleLineStarCommentTrailing */
+echo 'a'; /* Comment */
+
+/* testSingleLineStarAnnotation */
+/* phpcs:ignore Stnd.Cat */
+
+/* testMultiLineStarComment */
+/* Comment1
+* Comment2
+* Comment3 */
+
+/* testMultiLineStarCommentWithIndent */
+/* Comment1
+* Comment2
+* Comment3 */
+
+/* testMultiLineStarCommentWithAnnotationStart */
+/* @phpcs:ignore Stnd.Cat
+* Comment2
+* Comment3 */
+
+/* testMultiLineStarCommentWithAnnotationMiddle */
+/* Comment1
+* phpcs:ignore Stnd.Cat
+* Comment3 */
+
+/* testMultiLineStarCommentWithAnnotationEnd */
+/* Comment1
+* Comment2
+* phpcs:ignore Stnd.Cat */
+
+
+/* testSingleLineDocblockComment */
+/** Comment */
+
+/* testSingleLineDocblockCommentTrailing */
+$prop = 123; /** Comment */
+
+/* testSingleLineDocblockAnnotation */
+/** phpcs:ignore Stnd.Cat.Sniff */
+
+/* testMultiLineDocblockComment */
+/**
+* Comment1
+* Comment2
+*
+* @tag Comment
+*/
+
+/* testMultiLineDocblockCommentWithIndent */
+/**
+* Comment1
+* Comment2
+*
+* @tag Comment
+*/
+
+/* testMultiLineDocblockCommentWithAnnotation */
+/**
+* Comment
+*
+* phpcs:ignore Stnd.Cat
+* @tag Comment
+*/
+
+/* testMultiLineDocblockCommentWithTagAnnotation */
+/**
+* Comment
+*
+* @phpcs:ignore Stnd.Cat
+* @tag Comment
+*/
+
+/* testSingleLineSlashCommentNoNewLineAtEnd */
+// Slash ?>
+<?php
+
+/* testCommentAtEndOfFile */
+/* Comment
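The test cases above cover star comments and docblocks alongside slash comments, presumably to confirm that the fix leaves them alone. A rough sketch of why only slash comments are affected follows, using nothing but `token_get_all()`; the helper `commentNewlineMap` is hypothetical and not part of the test suite.

```php
<?php
// Illustrative helper (not part of the commit): map each comment token's
// text (without trailing new lines) to whether the raw token ended in "\n".
function commentNewlineMap(string $source): array
{
    $map = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token) === true
            && in_array($token[0], [T_COMMENT, T_DOC_COMMENT], true) === true
        ) {
            $map[rtrim($token[1], "\r\n")] = (substr($token[1], -1) === "\n");
        }
    }

    return $map;
}

var_export(commentNewlineMap("<?php\n/* Star */\n/** Doc */\n// Slash\n"));
```

Star comments and docblocks end at their closing `*/` on every PHP version, so their entries are `false`. The `// Slash` entry is the only one that changes between PHP 7 and PHP 8.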
