-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8358880: Performance of parsing with DecimalFormat can be improved #25644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
👋 Welcome back j3graham! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
Rough performance results on AArch64 M4: Original
Benchmark Mode Cnt Score Error Units
-DecimalFormatParseBench.testParseDoubles thrpt 15 15200.984 ± 409.547 ops/ms
-DecimalFormatParseBench.testParseLongs thrpt 15 25777.899 ± 559.096 ops/ms
This PR
Benchmark Mode Cnt Score Error Units
+DecimalFormatParseBench.testParseDoubles thrpt 15 28041.325 ± 472.657 ops/ms
+DecimalFormatParseBench.testParseLongs thrpt 15 34181.146 ± 655.719 ops/ms |
temp.append("0".repeat(Math.max(0, decimalAt - count))); | ||
return Long.parseLong(temp.toString()); | ||
long pow10 = Math.powExact(10L, Math.max(0, decimalAt - count)); | ||
return Math.multiplyExact(v, pow10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two methods throw ArithmeticException. This needs to be rethrown as NumberFormatException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is a little odd. The parse methods that call getLong
are not supposed to throw NumberFormatException
either. So wherever getLong
is called, it must be preceded by a check to fitsIntoLong
, which should avoid any exceptions here. That said, rethrowing as NFE would avoid new surprises. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will leave this question to I18N reviewers, who are ultimately in charge of DigitList.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing implementation does not throw NumberFormatException
/ArithmeticException
, but ParseException
if parsing is failing. I would expect the same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm not seeing where the original could throw ParseException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I was unclear. I mean the parse()
in the NumberFormat do not throw NumberFormatException/ArithmeticException, but ParseException, so if this piece of code need to throw something, it should be ParseException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parse()
methods where this code gets used in DecimalFormat
unfortunately don't throw ParseException
. The current calls to getLong
always are guarded with a call to fitsIntoLong
, which should avoid any exceptions actually being thrown here. So there is no parse failure as such - instead it tries to parse it as a double
or a BigDecimal
. If getLong were to be called without the guard, the exception would have come from Long.parseLong
, which throws a NumberFormatException
.
I've added a commit to follow @liach's suggestion to at least handle the ArithmeticException so as to not introduce new exceptions into the mix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, sounds reasonable.
Hi @ALUMINIS650, thanks for making a comment in an OpenJDK project! All comments and discussions in the OpenJDK Community must be made available under the OpenJDK Terms of Use. If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please Use "Add GitHub user ALUMINIS650" for the summary. If you are not an OpenJDK Author, Committer or Reviewer, simply check the box below to accept the OpenJDK Terms of Use for your comments.
Your comment will be automatically restored once you have accepted the OpenJDK Terms of Use. |
Webrevs
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the improvements. I think we need to prioritize behavioral compatibility with this change, so we will want to run the JCK tests as well for the extra safety.
test/micro/org/openjdk/bench/java/text/DecimalFormatParseBench.java
Outdated
Show resolved
Hide resolved
src/java.base/share/classes/jdk/internal/math/FloatingDecimal.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On our CI, I did a java_text JCK run as well as tiers 1-3 on all platforms which both came back good. I just have a final comment.
@@ -1824,6 +1837,17 @@ private static BinaryToASCIIConverter getBinaryToASCIIConverter(float f) { | |||
return buf; | |||
} | |||
|
|||
static ASCIIToBinaryConverter readDoubleSignlessDigits(int decExp, char[] digits, int length) { | |||
if (decExp < MIN_DECIMAL_EXPONENT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check needed? I think ASCIIToBinaryConverter
will return the proper zero value when doubleValue()
is invoked.
if (decExponent < MIN_DECIMAL_EXPONENT - 1) {
return (isNegative) ? -0.0 : 0.0;
And if this explicit check is a shortcut, I don't think we would need one for an edge case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately some check is required (a test fails), but I now realize what I had was wrong. The issue is that on line 1084 (https://github.com/openjdk/jdk/pull/25644/files#diff-79e6fd549b5ec5e7f49658581beddcb07fcbb0c09ae8e1117c385b66514da6d2R1084)) exp can overflow and become positive again. I've updated the check to avoid the overflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah got it, I see your point. We would have goten underflow in ASCIIToBinaryConverter.doubleValue()
for some extreme cases without a check.
Is there a specific example you have that requires the switch to the newer check? Adding a comment along those lines might be helpful. Actually, I thought DigitList caps decimalAt
to Integer.MIN/MAX, so then the first check you had would have been fine. (Maybe I am missing something?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a specific example, so I've reverted to my original check. I'm a bit unsettled by the check for an extreme value later in doubleValue()
comparing against MIN_DECIMAL_EXPONENT - 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, the original check you had is easier to understand what is happening without further context, so I prefer your switch back.
I think we are fine from (negative) "extreme values" in doubleValue()
because of the check you have implemented in the first place. i.e. we avoid any potential underflow from int exp = decExponent - kDigits;
. I think we do need a comment to accompany the check. (Why do we check? why not check the max exponent value?)
Also, should the check be against MIN_DECIMAL_EXPONENT - 1
for consistency with doubleValue()
? (Functionally, I don't think it matters.)
This reverts commit 6a07287.
/reviewers 2 reviewer |
@j3graham This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable from a code cleanup point of view.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current form looks good to me. The long parsing did not change much on my machine performance wise, but I think it is a good simplification to include.
This PR replaces construction of intermediate strings to be parsed with more direct manipulation of numbers. It also has a more streamlined mechanism of handling
Long.MIN_VALUE
when parsing longs by usingLong.parseUnsignedLong
As a small side-effect it also eliminates the use of a cached StringBuilder in DigitList.
Testing:
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25644/head:pull/25644
$ git checkout pull/25644
Update a local copy of the PR:
$ git checkout pull/25644
$ git pull https://git.openjdk.org/jdk.git pull/25644/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25644
View PR using the GUI difftool:
$ git pr show -t 25644
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25644.diff
Using Webrev
Link to Webrev Comment