-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Base: fix floating-point div #49561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Base: fix floating-point div #49561
Conversation
the functions defined seem similar to what's inside https://github.com/JuliaLang/julia/blob/master/base/twiceprecision.jl ? |
Can it really be said to fix the issue if the resulting ratio is still nonzero? Is it not possible to be correct in all cases? |
Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616 gets merged Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. TODO: check whether there are improvements for JuliaLang#26553 Updates JuliaLang#49589
Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616 gets merged Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. TODO: check whether there are improvements for JuliaLang#26553 Updates JuliaLang#49589
Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616 gets merged Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. TODO: check whether there are improvements for JuliaLang#26553 Updates JuliaLang#49589
Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent some harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. TODO: write tests TODO: some tests fail sometimes: ``` Test Failed at /home/nsajko/tmp/julia/test/ranges.jl:223 Expression: cmp_sn2(Tw(xw / yw), astuple(x / y)..., slopbits) ``` Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. Proof that JuliaLang#26553 is fixed: ```julia-repl julia> const TwicePrecision = Base.TwicePrecision Base.TwicePrecision julia> a = TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) Base.TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) julia> b = TwicePrecision{Float64}(3.4, 8.881784197001253e-17) Base.TwicePrecision{Float64}(3.4, 8.881784197001253e-17) julia> a_minus_b_inexact = Tuple(a - b) (-2.9714285714285715, 1.0150610510858574e-16) julia> BigFloat(a) == sum(map(BigFloat, Tuple(a))) true julia> BigFloat(b) == sum(map(BigFloat, Tuple(b))) true julia> a_minus_b = BigFloat(a) - BigFloat(b) -2.971428571428571428571428571428577679589762353999797347402693647078382455095635 julia> Float64(a_minus_b) == first(a_minus_b_inexact) true julia> Float64(a_minus_b - Float64(a_minus_b)) == a_minus_b_inexact[begin + 1] true ``` Fixes JuliaLang#26553 Updates JuliaLang#49589
Update the arithmetic code to use algorithms published since twiceprecision.jl got written. Handle complex numbers. Prevent some harmful promotions. Some of the introduced extended precision arithmetic code will also be used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561. Results: The example in JuliaLang#33677 still gives the same result. I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in the meantime. Proof that JuliaLang#26553 is fixed: ```julia-repl julia> const TwicePrecision = Base.TwicePrecision Base.TwicePrecision julia> a = TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) Base.TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17) julia> b = TwicePrecision{Float64}(3.4, 8.881784197001253e-17) Base.TwicePrecision{Float64}(3.4, 8.881784197001253e-17) julia> a_minus_b_inexact = Tuple(a - b) (-2.9714285714285715, 1.0150610510858574e-16) julia> BigFloat(a) == sum(map(BigFloat, Tuple(a))) true julia> BigFloat(b) == sum(map(BigFloat, Tuple(b))) true julia> a_minus_b = BigFloat(a) - BigFloat(b) -2.971428571428571428571428571428577679589762353999797347402693647078382455095635 julia> Float64(a_minus_b) == first(a_minus_b_inexact) true julia> Float64(a_minus_b - Float64(a_minus_b)) == a_minus_b_inexact[begin + 1] true ``` Fixes JuliaLang#26553 Updates JuliaLang#49589
Double-word arithmetics are used, except for rounding modes without `rem`, which now get simple fallback implementations (on master their `div` methods fail when called). The idea to fix `Float16` by widening to `Float32` is taken from Simon Byrne's JuliaLang#49637 A script used for assessing the correctness of `div`: ```julia using Random accurate_div(x, y, r::RoundingMode) = div(BigFloat(x), BigFloat(y), r) function count_wrong_floats( div_fun::Fun, r::RoundingMode, ::Type{F}, ::Type{U}, n::Int, m::Int, ) where {Fun <: Function, F, U} count_wrong_huge_quo = 0 count_wrong_friendly_quo = 0 count_total_huge_quo = 0 count_total_friendly_quo = 0 vec_x = zeros(U, m) vec_y = zeros(U, m) for i ∈ 1:n Random.rand!(vec_x) Random.rand!(vec_y) for (x_raw, y_raw) ∈ zip(vec_x, vec_y) x = reinterpret(F, x_raw) y = reinterpret(F, y_raw) (!isfinite(x) | !isfinite(y)) && continue quo_is_huge = (maxintfloat(F) < abs(x / y)) acc_big = accurate_div(x, y, r) acc = F(acc_big) # Skip cases when the result isn't representable, the # correct result is not specified for this case, and # it's not clear what a user would expect either. (acc == acc_big) || continue if quo_is_huge count_total_huge_quo += true else count_total_friendly_quo += true end d = div_fun(x, y, r) is_ok = (d == acc) | (isnan(d) & isnan(acc)) if !is_ok if quo_is_huge count_wrong_huge_quo += true else count_wrong_friendly_quo += true end end end end ( bad_ratio_huge_quotient = count_wrong_huge_quo / count_total_huge_quo, bad_ratio_friendly_quotient = count_wrong_friendly_quo / count_total_friendly_quo, total_huge_quotient_count = count_total_huge_quo, total_friendly_quotient_count = count_total_friendly_quo, ) end const float_types = (Float16, Float32, Float64) const bits_types = (UInt16, UInt32, UInt64) # not supported on master const rounding_modes_other = ( RoundNearestTiesAway, RoundNearestTiesUp ) const rounding_modes = ( RoundNearest, RoundUp, RoundDown, RoundFromZero, RoundToZero ) function experiment(itcnt::Int, itcnt_inner::Int) for (F, U) ∈ zip(float_types, bits_types) println("$F $U") for rm ∈ rounding_modes println(" $rm") flush(stdout) res = count_wrong_floats( div, rm, F, U, itcnt, itcnt_inner ) ch = res.total_huge_quotient_count cf = res.total_friendly_quotient_count rh = res.bad_ratio_huge_quotient rf = res.bad_ratio_friendly_quotient println(" quotients of huge magnitude:") println(" total count: $ch") println(" ratio of incorrect results: $rh") println(" quotients of more normal magnitude:") println(" total count: $cf") println(" ratio of incorrect results: $rf") flush(stdout) end println() flush(stdout) end nothing end experiment(20, 2^20) ``` Results on master: ``` Float16 UInt16 RoundingMode{:Nearest}() quotients of huge magnitude: total count: 371923 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15566588 ratio of incorrect results: 0.0017671823780522746 RoundingMode{:Up}() quotients of huge magnitude: total count: 375420 ratio of incorrect results: 5.327366682648766e-6 quotients of more normal magnitude: total count: 15567018 ratio of incorrect results: 0.004671543387436181 RoundingMode{:Down}() quotients of huge magnitude: total count: 375158 ratio of incorrect results: 2.665543584303147e-6 quotients of more normal magnitude: total count: 15565611 ratio of incorrect results: 0.004713146178457113 RoundingMode{:FromZero}() quotients of huge magnitude: total count: 376802 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15564500 ratio of incorrect results: 0.005449195284140191 RoundingMode{:ToZero}() quotients of huge magnitude: total count: 374335 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15563355 ratio of incorrect results: 0.003891191841347833 Float32 UInt32 RoundingMode{:Nearest}() quotients of huge magnitude: total count: 73547 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 1227091 ratio of incorrect results: 0.0003218994905920188 RoundingMode{:Up}() quotients of huge magnitude: total count: 73215 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12272614 ratio of incorrect results: 0.0009024972186039583 RoundingMode{:Down}() quotients of huge magnitude: total count: 73355 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12269080 ratio of incorrect results: 0.0009091961255448657 RoundingMode{:FromZero}() quotients of huge magnitude: total count: 73961 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12273238 ratio of incorrect results: 0.0009261614579624383 RoundingMode{:ToZero}() quotients of huge magnitude: total count: 73327 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12268024 ratio of incorrect results: 0.0008781365279363653 Float64 UInt64 RoundingMode{:Nearest}() quotients of huge magnitude: total count: 9807 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11009917 ratio of incorrect results: 4.677601111797664e-5 RoundingMode{:Up}() quotients of huge magnitude: total count: 10015 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11012903 ratio of incorrect results: 0.00013012009639965048 RoundingMode{:Down}() quotients of huge magnitude: total count: 10048 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11011455 ratio of incorrect results: 0.00012886580383791242 RoundingMode{:FromZero}() quotients of huge magnitude: total count: 9902 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11009214 ratio of incorrect results: 0.00012843787031481085 RoundingMode{:ToZero}() quotients of huge magnitude: total count: 9879 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11009033 ratio of incorrect results: 0.00013797760439086704 ``` Results after this change (everything is correct): ``` Float16 UInt16 RoundingMode{:Nearest}() quotients of huge magnitude: total count: 372015 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15568185 ratio of incorrect results: 0.0 RoundingMode{:Up}() quotients of huge magnitude: total count: 375936 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15564509 ratio of incorrect results: 0.0 RoundingMode{:Down}() quotients of huge magnitude: total count: 375737 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15564872 ratio of incorrect results: 0.0 RoundingMode{:FromZero}() quotients of huge magnitude: total count: 377102 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15563927 ratio of incorrect results: 0.0 RoundingMode{:ToZero}() quotients of huge magnitude: total count: 374618 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15567337 ratio of incorrect results: 0.0 Float32 UInt32 RoundingMode{:Nearest}() quotients of huge magnitude: total count: 73567 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12267866 ratio of incorrect results: 0.0 RoundingMode{:Up}() quotients of huge magnitude: total count: 73190 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12271827 ratio of incorrect results: 0.0 RoundingMode{:Down}() quotients of huge magnitude: total count: 73491 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12266797 ratio of incorrect results: 0.0 RoundingMode{:FromZero}() quotients of huge magnitude: total count: 73504 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12269685 ratio of incorrect results: 0.0 RoundingMode{:ToZero}() quotients of huge magnitude: total count: 73552 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12269727 ratio of incorrect results: 0.0 Float64 UInt64 RoundingMode{:Nearest}() quotients of huge magnitude: total count: 9750 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11007512 ratio of incorrect results: 0.0 RoundingMode{:Up}() quotients of huge magnitude: total count: 10046 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11009576 ratio of incorrect results: 0.0 RoundingMode{:Down}() quotients of huge magnitude: total count: 9939 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11014458 ratio of incorrect results: 0.0 RoundingMode{:FromZero}() quotients of huge magnitude: total count: 9855 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11007247 ratio of incorrect results: 0.0 RoundingMode{:ToZero}() quotients of huge magnitude: total count: 9798 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11006320 ratio of incorrect results: 0.0 ``` Results for methods that weren't implemented at all on master (those that miss a `rem` method): ``` Float16 UInt16 RoundingMode{:NearestTiesAway}() quotients of huge magnitude: total count: 373333 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15564325 ratio of incorrect results: 0.0 RoundingMode{:NearestTiesUp}() quotients of huge magnitude: total count: 374334 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 15567551 ratio of incorrect results: 0.0 Float32 UInt32 RoundingMode{:NearestTiesAway}() quotients of huge magnitude: total count: 73722 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12271459 ratio of incorrect results: 0.00301610427904294 RoundingMode{:NearestTiesUp}() quotients of huge magnitude: total count: 73861 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 12273032 ratio of incorrect results: 0.003042850373078144 Float64 UInt64 RoundingMode{:NearestTiesAway}() quotients of huge magnitude: total count: 9981 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11009598 ratio of incorrect results: 0.00044415790658296515 RoundingMode{:NearestTiesUp}() quotients of huge magnitude: total count: 9926 ratio of incorrect results: 0.0 quotients of more normal magnitude: total count: 11013978 ratio of incorrect results: 0.00046023335074756825 ``` Fixes JuliaLang#49450
@giordano there are now tests, I guess you should remove the needs-tests label? |
Just to be clear, I'm not saying that |
div_impl4(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} = | ||
T(div_impl(widen(x), widen(y), r)) | ||
|
||
div_impl3(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} = | ||
round(x / y, r) | ||
|
||
div_impl2(x::NTuple{2,T}, y::T) where {T<:AbstractFloat} = | ||
round(first(mw_div(x, y)), RoundNearest) | ||
|
||
# Approximately rounded x (with respect to y and r) | ||
frac_round(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} = | ||
add12(x, -rem(x, y, r)) | ||
|
||
div_impl1(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} = | ||
div_impl2(frac_round(x, y, r), y) | ||
|
||
div_impl0(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} = | ||
ifelse( | ||
isfinite(abs(x) + abs(y)), | ||
div_impl1(x, y, r), | ||
div_impl3(x, y, r), # try to avoid overflow | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These need better names that actually reflect what they do. (and should probably be collapsed into their callers.
canonicalize2(hi, lo) | ||
end | ||
|
||
div_impl4(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're widening, can't we just use round(x/y, r)
?
δ_t = δ_hi - last(π) # exact operation | ||
δ = δ_t + x_lo | ||
lo = δ / y | ||
canonicalize2(hi, lo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we only care about the high part from this, we could save a few clocks by just returning hi+lo
which would save ~3 adds.
|
||
# "DWDivFP3" AKA "Algorithm 15" from a paper by Joldes, Muller and | ||
# Popescu: https://doi.org/10.1145/3121432 | ||
hi = x_hi / y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be interested in if this speeds up and retains most of the accuracy if you replay x_hi / y
with x_hi * inv(y)
doing so would let you save a division on since then you could do the same for lo = δ / y
(and since we only care about the high word of this, it could fold with the canonicalize2
into return fma(inv(y), δ, hi)
saving another instruction.
Double-word arithmetics are used, except for rounding modes without
rem
, which now get simple fallback implementations (on master theirdiv
methods fail when called).The idea to fix
Float16
by widening toFloat32
is taken fromSimon Byrne's #49637
A script used for assessing the correctness of
div
:Results on master:
Results after this change (everything is correct):
Results for methods that weren't implemented at all on master (those
that miss a
rem
method):Fixes #49450