Commit 96327b7

Add fast path for ascii chars to isFirstCodepointUppercase
Whenever we know that the Rope is either using ASCII or something that is a superset of ASCII (such as UTF-8), we can make a faster check without needing to extract codepoints and checking for a bigger range of uppercase characters.

I've used the instructions on <https://github.com/oracle/truffleruby/blob/master/doc/contributor/profiling.md#require-profiling> (in combination with a native binary -- `jt build --env native` + `jt -u native ruby ...`) to benchmark the impact of this change and the numbers were too close to the margin of error to tell:

* `master` (without the fix to #2079): avg 57.7ms, median 57ms, p90 64ms
* this commit: avg 57.8ms, median 58ms, p90 63ms
* without fast path (previous commit): avg 56.7ms, median 56ms, p90 63ms

...but since this fast path is both very simple and local to the method, I think it's worth keeping anyway.
1 parent fa10840 commit 96327b7

1 file changed: +10 -3 lines changed

src/main/java/org/truffleruby/parser/lexer/RubyLexer.java

Lines changed: 10 additions & 3 deletions
@@ -3668,8 +3668,15 @@ protected boolean isSpaceArg(int c, boolean spaceSeen) {
     /** Encoding-aware (including multi-byte encodings) check of first codepoint of a given rope, usually to determine
      * if it is a constant */
     private boolean isFirstCodepointUppercase(Rope rope) {
-        byte[] ropeBytes = rope.getBytes();
-        int firstCharacter = rope.encoding.mbcToCode(ropeBytes, 0, ropeBytes.length);
-        return rope.encoding.isUpper(firstCharacter);
+        Encoding ropeEncoding = rope.encoding;
+        int firstByte = rope.get(0) & 0xFF;
+
+        if (ropeEncoding.isAsciiCompatible() && isASCII(firstByte)) {
+            return StringSupport.isAsciiUppercase((byte) firstByte);
+        } else {
+            byte[] ropeBytes = rope.getBytes();
+            int firstCharacter = ropeEncoding.mbcToCode(ropeBytes, 0, ropeBytes.length);
+            return ropeEncoding.isUpper(firstCharacter);
+        }
     }
 }
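
The fast-path/slow-path split in this change can be tried outside TruffleRuby with a minimal standalone sketch. It is not the project's code: it swaps `Rope` and the JCodings `Encoding` for a plain `byte[]` and `java.nio.charset.Charset`, and the class name `FirstCodepointSketch` and the helper `isAsciiCompatible(Charset)` are hypothetical stand-ins. Like the method in the diff, it assumes the input is non-empty. The slow path here decodes the whole input for simplicity, which the real method does not do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class FirstCodepointSketch {

    /** True for the byte values 'A'..'Z'. */
    static boolean isAsciiUppercase(int b) {
        return b >= 'A' && b <= 'Z';
    }

    /**
     * Hypothetical stand-in for Encoding.isAsciiCompatible(); it only knows
     * the few charsets used in this sketch.
     */
    static boolean isAsciiCompatible(Charset charset) {
        return charset.equals(StandardCharsets.US_ASCII)
                || charset.equals(StandardCharsets.ISO_8859_1)
                || charset.equals(StandardCharsets.UTF_8);
    }

    /** Assumes bytes is non-empty and valid in the given charset. */
    static boolean isFirstCodepointUppercase(byte[] bytes, Charset charset) {
        int firstByte = bytes[0] & 0xFF;

        // Fast path: in an ASCII-compatible charset, a first byte below 0x80
        // is a complete ASCII character, so a range check is enough.
        if (isAsciiCompatible(charset) && firstByte < 0x80) {
            return isAsciiUppercase(firstByte);
        }

        // Slow path: decode and inspect the first code point.
        String decoded = new String(bytes, charset);
        return Character.isUpperCase(decoded.codePointAt(0));
    }
}
```

The charset check matters as much as the byte check: in UTF-16BE the text `A` starts with the byte `0x00`, so applying the range test to the first byte alone would misclassify it.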
