Commit 0d5663c

Tokenizer/PHP: fix tokenization of the default keyword
As per: #3326 (comment)

> After `PHP::tokenize()`, the `DEFAULT` is still tokenized as `T_DEFAULT`. This causes the `Tokenizer::recurseScopeMap()` to assign it as the `scope_opener` to the `;` semi-colon at the end of the constant declaration, with the class close curly brace being set as the `scope_closer`.
> In the `PHP::processAdditional()` method, the `DEFAULT` is subsequently retokenized to `T_STRING` as it is preceded by a `const` keyword, but that is too late.
>
> The `scope_opener` being set on the semi-colon is what is causing the errors to be displayed for the above code sample.

The commit fixes this by:
1. Abstracting the list of `T_STRING` contexts out to a class property.
2. Using the list from the property in all places in the `Tokenizer\PHP` class where keyword tokens are (potentially) being re-tokenized to `T_STRING`, including in the `T_DEFAULT` tokenization code which was added to address the PHP 8.0 `match` expressions.

Note: the issue was not introduced by the `match`-related code. However, having that code in place does make it relatively easy to fix this particular case now.

While this doesn't address #3336 yet, it is a step towards addressing it and will sort out one of the most common causes of bugs.
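To illustrate the idea behind the shared list, here is a minimal, hypothetical sketch. It is not the PHP_CodeSniffer implementation: the constant `TSTRING_CONTEXTS` and the function `isTStringContext()` are names invented for this example, and the list is a subset of the contexts in the commit, limited to token constants that also exist before PHP 8.0. The point is only that one lookup table, keyed by the code of the last non-empty token, can answer the "should this keyword stay a plain name?" question everywhere it is asked.

```php
<?php
// Hypothetical, simplified sketch; the real logic lives in the Tokenizer\PHP class.

/**
 * Subset of the contexts from the commit in which a keyword
 * should be treated as T_STRING (a plain name).
 */
const TSTRING_CONTEXTS = [
    T_OBJECT_OPERATOR      => true,
    T_FUNCTION             => true,
    T_NEW                  => true,
    T_CONST                => true,
    T_NS_SEPARATOR         => true,
    T_PAAMAYIM_NEKUDOTAYIM => true,
];

/**
 * Decide whether a keyword should be treated as T_STRING, based on the
 * code of the last non-empty token that precedes it.
 *
 * @param int $previousTokenCode Token code such as T_CONST or T_SWITCH.
 *
 * @return bool
 */
function isTStringContext(int $previousTokenCode): bool
{
    // isset() on an array keyed by token code is a constant-time lookup.
    return isset(TSTRING_CONTEXTS[$previousTokenCode]);
}
```

With this sketch, `isTStringContext(T_CONST)` is true, so a `DEFAULT` that follows `const` (for example in `const DEFAULT = 'x';` inside a class) would be left as a name, while `isTStringContext(T_SWITCH)` is false and a `default` in a `switch` keeps its keyword meaning.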
1 parent 65ab395 commit 0d5663c

src/Tokenizers/PHP.php

Lines changed: 27 additions & 44 deletions
@@ -454,6 +454,29 @@ class PHP extends Tokenizer
         T_TYPE_UNION => 1,
     ];

+    /**
+     * Contexts in which keywords should always be tokenized as T_STRING.
+     *
+     * @var array
+     */
+    protected $tstringContexts = [
+        T_OBJECT_OPERATOR => true,
+        T_NULLSAFE_OBJECT_OPERATOR => true,
+        T_FUNCTION => true,
+        T_CLASS => true,
+        T_INTERFACE => true,
+        T_TRAIT => true,
+        T_EXTENDS => true,
+        T_IMPLEMENTS => true,
+        T_ATTRIBUTE => true,
+        T_NEW => true,
+        T_CONST => true,
+        T_NS_SEPARATOR => true,
+        T_USE => true,
+        T_NAMESPACE => true,
+        T_PAAMAYIM_NEKUDOTAYIM => true,
+    ];
+
     /**
      * A cache of different token types, resolved into arrays.
      *
@@ -1332,16 +1355,7 @@ protected function tokenize($string)
                     break;
                 }

-                $notMatchContext = [
-                    T_PAAMAYIM_NEKUDOTAYIM => true,
-                    T_OBJECT_OPERATOR => true,
-                    T_NULLSAFE_OBJECT_OPERATOR => true,
-                    T_NS_SEPARATOR => true,
-                    T_NEW => true,
-                    T_FUNCTION => true,
-                ];
-
-                if (isset($notMatchContext[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
+                if (isset($this->tstringContexts[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
                     // Also not a match expression.
                     break;
                 }
@@ -1389,14 +1403,7 @@ protected function tokenize($string)
             if ($tokenIsArray === true
                 && $token[0] === T_DEFAULT
             ) {
-                $ignoreContext = [
-                    T_OBJECT_OPERATOR => true,
-                    T_NULLSAFE_OBJECT_OPERATOR => true,
-                    T_NS_SEPARATOR => true,
-                    T_PAAMAYIM_NEKUDOTAYIM => true,
-                ];
-
-                if (isset($ignoreContext[$finalTokens[$lastNotEmptyToken]['code']]) === false) {
+                if (isset($this->tstringContexts[$finalTokens[$lastNotEmptyToken]['code']]) === false) {
                     for ($x = ($stackPtr + 1); $x < $numTokens; $x++) {
                         if ($tokens[$x] === ',') {
                             // Skip over potential trailing comma (supported in PHP).
@@ -1894,25 +1901,7 @@ function return types. We want to keep the parenthesis map clean,
             if ($tokenIsArray === true && $token[0] === T_STRING) {
                 // Some T_STRING tokens should remain that way
                 // due to their context.
-                $context = [
-                    T_OBJECT_OPERATOR => true,
-                    T_NULLSAFE_OBJECT_OPERATOR => true,
-                    T_FUNCTION => true,
-                    T_CLASS => true,
-                    T_INTERFACE => true,
-                    T_TRAIT => true,
-                    T_EXTENDS => true,
-                    T_IMPLEMENTS => true,
-                    T_ATTRIBUTE => true,
-                    T_NEW => true,
-                    T_CONST => true,
-                    T_NS_SEPARATOR => true,
-                    T_USE => true,
-                    T_NAMESPACE => true,
-                    T_PAAMAYIM_NEKUDOTAYIM => true,
-                ];
-
-                if (isset($context[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
+                if (isset($this->tstringContexts[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
                     // Special case for syntax like: return new self
                     // where self should not be a string.
                     if ($finalTokens[$lastNotEmptyToken]['code'] === T_NEW
@@ -2784,13 +2773,7 @@ protected function processAdditional()
                     }
                 }

-                $context = [
-                    T_OBJECT_OPERATOR => true,
-                    T_NULLSAFE_OBJECT_OPERATOR => true,
-                    T_NS_SEPARATOR => true,
-                    T_PAAMAYIM_NEKUDOTAYIM => true,
-                ];
-                if (isset($context[$this->tokens[$x]['code']]) === true) {
+                if (isset($this->tstringContexts[$this->tokens[$x]['code']]) === true) {
                     if (PHP_CODESNIFFER_VERBOSITY > 1) {
                         $line = $this->tokens[$i]['line'];
                         $type = $this->tokens[$i]['type'];
