Skip to content

Base: fix floating-point div #49561

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Base: fix floating-point div #49561

wants to merge 1 commit into from

Conversation

nsajko
Copy link
Contributor

@nsajko nsajko commented Apr 29, 2023

Double-word arithmetics are used, except for rounding modes without
rem, which now get simple fallback implementations (on master their
div methods fail when called).

The idea to fix Float16 by widening to Float32 is taken from
Simon Byrne's #49637

A script used for assessing the correctness of div:

using Random

accurate_div(x, y, r::RoundingMode) =
    div(BigFloat(x), BigFloat(y), r)

function count_wrong_floats(
    div_fun::Fun,
    r::RoundingMode,
    ::Type{F},
    ::Type{U},
    n::Int,
    m::Int,
) where {Fun <: Function, F, U}
    count_wrong_huge_quo = 0
    count_wrong_friendly_quo = 0

    count_total_huge_quo = 0
    count_total_friendly_quo = 0

    vec_x = zeros(U, m)
    vec_y = zeros(U, m)

    for i  1:n
        Random.rand!(vec_x)
        Random.rand!(vec_y)

        for (x_raw, y_raw)  zip(vec_x, vec_y)
            x = reinterpret(F, x_raw)
            y = reinterpret(F, y_raw)

            (!isfinite(x) | !isfinite(y)) && continue

            quo_is_huge = (maxintfloat(F) < abs(x / y))

            acc_big = accurate_div(x, y, r)

            acc = F(acc_big)

            # Skip cases when the result isn't representable, the
            # correct result is not specified for this case, and
            # it's not clear what a user would expect either.
            (acc == acc_big) || continue

            if quo_is_huge
                count_total_huge_quo += true
            else
                count_total_friendly_quo += true
            end

            d = div_fun(x, y, r)

            is_ok = (d == acc) | (isnan(d) & isnan(acc))

            if !is_ok
                if quo_is_huge
                    count_wrong_huge_quo += true
                else
                    count_wrong_friendly_quo += true
                end
            end
        end
    end

    (
        bad_ratio_huge_quotient =
            count_wrong_huge_quo / count_total_huge_quo,
        bad_ratio_friendly_quotient =
            count_wrong_friendly_quo / count_total_friendly_quo,
        total_huge_quotient_count =
            count_total_huge_quo,
        total_friendly_quotient_count =
            count_total_friendly_quo,
    )
end

const float_types = (Float16, Float32, Float64)
const bits_types = (UInt16, UInt32, UInt64)

 # not supported on master
const rounding_modes_other = (
    RoundNearestTiesAway, RoundNearestTiesUp
)

const rounding_modes = (
    RoundNearest, RoundUp, RoundDown, RoundFromZero, RoundToZero
)

function experiment(itcnt::Int, itcnt_inner::Int)
    for (F, U)  zip(float_types, bits_types)
        println("$F $U")
        for rm  rounding_modes
            println("  $rm")
            flush(stdout)
            res = count_wrong_floats(
                div, rm, F, U, itcnt, itcnt_inner
            )
            ch = res.total_huge_quotient_count
            cf = res.total_friendly_quotient_count
            rh = res.bad_ratio_huge_quotient
            rf = res.bad_ratio_friendly_quotient
            println("    quotients of huge magnitude:")
            println("      total count: $ch")
            println("      ratio of incorrect results: $rh")
            println("    quotients of more normal magnitude:")
            println("      total count: $cf")
            println("      ratio of incorrect results: $rf")
            flush(stdout)
        end
        println()
        flush(stdout)
    end
    nothing
end

experiment(20, 2^20)

Results on master:

Float16 UInt16
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 371923
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15566588
      ratio of incorrect results: 0.0017671823780522746
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 375420
      ratio of incorrect results: 5.327366682648766e-6
    quotients of more normal magnitude:
      total count: 15567018
      ratio of incorrect results: 0.004671543387436181
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 375158
      ratio of incorrect results: 2.665543584303147e-6
    quotients of more normal magnitude:
      total count: 15565611
      ratio of incorrect results: 0.004713146178457113
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 376802
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564500
      ratio of incorrect results: 0.005449195284140191
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 374335
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15563355
      ratio of incorrect results: 0.003891191841347833

Float32 UInt32
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 73547
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: https://github.com/JuliaLang/julia/commit/12270911bc594ac6d690f06750d016fa0dffca4e
      ratio of incorrect results: 0.0003218994905920188
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 73215
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12272614
      ratio of incorrect results: 0.0009024972186039583
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 73355
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12269080
      ratio of incorrect results: 0.0009091961255448657
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 73961
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12273238
      ratio of incorrect results: 0.0009261614579624383
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 73327
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12268024
      ratio of incorrect results: 0.0008781365279363653

Float64 UInt64
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 9807
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009917
      ratio of incorrect results: 4.677601111797664e-5
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 10015
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11012903
      ratio of incorrect results: 0.00013012009639965048
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 10048
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11011455
      ratio of incorrect results: 0.00012886580383791242
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 9902
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009214
      ratio of incorrect results: 0.00012843787031481085
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 9879
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009033
      ratio of incorrect results: 0.00013797760439086704

Results after this change (everything is correct):

Float16 UInt16
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 372015
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15568185
      ratio of incorrect results: 0.0
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 375936
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564509
      ratio of incorrect results: 0.0
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 375737
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564872
      ratio of incorrect results: 0.0
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 377102
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15563927
      ratio of incorrect results: 0.0
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 374618
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15567337
      ratio of incorrect results: 0.0

Float32 UInt32
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 73567
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12267866
      ratio of incorrect results: 0.0
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 73190
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12271827
      ratio of incorrect results: 0.0
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 73491
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12266797
      ratio of incorrect results: 0.0
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 73504
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12269685
      ratio of incorrect results: 0.0
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 73552
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12269727
      ratio of incorrect results: 0.0

Float64 UInt64
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 9750
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11007512
      ratio of incorrect results: 0.0
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 10046
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009576
      ratio of incorrect results: 0.0
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 9939
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11014458
      ratio of incorrect results: 0.0
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 9855
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11007247
      ratio of incorrect results: 0.0
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 9798
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11006320
      ratio of incorrect results: 0.0

Results for methods that weren't implemented at all on master (those
that miss a rem method):

Float16 UInt16
  RoundingMode{:NearestTiesAway}()
    quotients of huge magnitude:
      total count: 373333
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564325
      ratio of incorrect results: 0.0
  RoundingMode{:NearestTiesUp}()
    quotients of huge magnitude:
      total count: 374334
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15567551
      ratio of incorrect results: 0.0

Float32 UInt32
  RoundingMode{:NearestTiesAway}()
    quotients of huge magnitude:
      total count: 73722
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12271459
      ratio of incorrect results: 0.00301610427904294
  RoundingMode{:NearestTiesUp}()
    quotients of huge magnitude:
      total count: 73861
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12273032
      ratio of incorrect results: 0.003042850373078144

Float64 UInt64
  RoundingMode{:NearestTiesAway}()
    quotients of huge magnitude:
      total count: 9981
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009598
      ratio of incorrect results: 0.00044415790658296515
  RoundingMode{:NearestTiesUp}()
    quotients of huge magnitude:
      total count: 9926
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11013978
      ratio of incorrect results: 0.00046023335074756825

Fixes #49450

@longemen3000
Copy link
Contributor

the functions defined seem similar to what's inside https://github.com/JuliaLang/julia/blob/master/base/twiceprecision.jl ?

@giordano giordano added maths Mathematical functions needs tests Unit tests are required for this change bugfix This change fixes an existing bug labels Apr 29, 2023
@Seelengrab
Copy link
Contributor

Can it really be said to fix the issue if the resulting ratio is still nonzero? Is it not possible to be correct in all cases?

@nsajko nsajko marked this pull request as draft April 29, 2023 13:12
nsajko added a commit to nsajko/julia that referenced this pull request May 4, 2023
Update the arithmetic code to use algorithms published since
twiceprecision.jl got written.

Handle complex numbers.

Prevent harmful promotions.

Some of the introduced extended precision arithmetic code will also be
used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561.

TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616
gets merged

Results:

The example in JuliaLang#33677 still gives the same result.

I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in
the meantime.

TODO: check whether there are improvements for JuliaLang#26553

Updates JuliaLang#49589
nsajko added a commit to nsajko/julia that referenced this pull request May 4, 2023
Update the arithmetic code to use algorithms published since
twiceprecision.jl got written.

Handle complex numbers.

Prevent harmful promotions.

Some of the introduced extended precision arithmetic code will also be
used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561.

TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616
gets merged

Results:

The example in JuliaLang#33677 still gives the same result.

I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in
the meantime.

TODO: check whether there are improvements for JuliaLang#26553

Updates JuliaLang#49589
nsajko added a commit to nsajko/julia that referenced this pull request May 4, 2023
Update the arithmetic code to use algorithms published since
twiceprecision.jl got written.

Handle complex numbers.

Prevent harmful promotions.

Some of the introduced extended precision arithmetic code will also be
used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561.

TODO: evaluate the accuracy improvements and write tests after JuliaLang#49616
gets merged

Results:

The example in JuliaLang#33677 still gives the same result.

I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in
the meantime.

TODO: check whether there are improvements for JuliaLang#26553

Updates JuliaLang#49589
nsajko added a commit to nsajko/julia that referenced this pull request May 4, 2023
Update the arithmetic code to use algorithms published since
twiceprecision.jl got written.

Handle complex numbers.

Prevent some harmful promotions.

Some of the introduced extended precision arithmetic code will also be
used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561.

TODO: write tests

TODO: some tests fail sometimes:
```
Test Failed at /home/nsajko/tmp/julia/test/ranges.jl:223
  Expression: cmp_sn2(Tw(xw / yw), astuple(x / y)..., slopbits)
```

Results:

The example in JuliaLang#33677 still gives the same result.

I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in
the meantime.

Proof that JuliaLang#26553 is fixed:
```julia-repl
julia> const TwicePrecision = Base.TwicePrecision
Base.TwicePrecision

julia> a = TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17)
Base.TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17)

julia> b = TwicePrecision{Float64}(3.4, 8.881784197001253e-17)
Base.TwicePrecision{Float64}(3.4, 8.881784197001253e-17)

julia> a_minus_b_inexact = Tuple(a - b)
(-2.9714285714285715, 1.0150610510858574e-16)

julia> BigFloat(a) == sum(map(BigFloat, Tuple(a)))
true

julia> BigFloat(b) == sum(map(BigFloat, Tuple(b)))
true

julia> a_minus_b = BigFloat(a) - BigFloat(b)
-2.971428571428571428571428571428577679589762353999797347402693647078382455095635

julia> Float64(a_minus_b) == first(a_minus_b_inexact)
true

julia> Float64(a_minus_b - Float64(a_minus_b)) == a_minus_b_inexact[begin + 1]
true
```

Fixes JuliaLang#26553

Updates JuliaLang#49589
nsajko added a commit to nsajko/julia that referenced this pull request May 4, 2023
Update the arithmetic code to use algorithms published since
twiceprecision.jl got written.

Handle complex numbers.

Prevent some harmful promotions.

Some of the introduced extended precision arithmetic code will also be
used in `base/div.jl` to tackle issue JuliaLang#49450 with PR JuliaLang#49561.

Results:

The example in JuliaLang#33677 still gives the same result.

I wasn't able to evaluate JuliaLang#23497 because of how much Julia changed in
the meantime.

Proof that JuliaLang#26553 is fixed:
```julia-repl
julia> const TwicePrecision = Base.TwicePrecision
Base.TwicePrecision

julia> a = TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17)
Base.TwicePrecision{Float64}(0.42857142857142855, 2.3790493384824782e-17)

julia> b = TwicePrecision{Float64}(3.4, 8.881784197001253e-17)
Base.TwicePrecision{Float64}(3.4, 8.881784197001253e-17)

julia> a_minus_b_inexact = Tuple(a - b)
(-2.9714285714285715, 1.0150610510858574e-16)

julia> BigFloat(a) == sum(map(BigFloat, Tuple(a)))
true

julia> BigFloat(b) == sum(map(BigFloat, Tuple(b)))
true

julia> a_minus_b = BigFloat(a) - BigFloat(b)
-2.971428571428571428571428571428577679589762353999797347402693647078382455095635

julia> Float64(a_minus_b) == first(a_minus_b_inexact)
true

julia> Float64(a_minus_b - Float64(a_minus_b)) == a_minus_b_inexact[begin + 1]
true
```

Fixes JuliaLang#26553

Updates JuliaLang#49589
@nsajko nsajko marked this pull request as ready for review May 5, 2023 18:13
Double-word arithmetics are used, except for rounding modes without
`rem`, which now get simple fallback implementations (on master their
`div` methods fail when called).

The idea to fix `Float16` by widening to `Float32` is taken from
Simon Byrne's JuliaLang#49637

A script used for assessing the correctness of `div`:
```julia
using Random

accurate_div(x, y, r::RoundingMode) =
    div(BigFloat(x), BigFloat(y), r)

function count_wrong_floats(
    div_fun::Fun,
    r::RoundingMode,
    ::Type{F},
    ::Type{U},
    n::Int,
    m::Int,
) where {Fun <: Function, F, U}
    count_wrong_huge_quo = 0
    count_wrong_friendly_quo = 0

    count_total_huge_quo = 0
    count_total_friendly_quo = 0

    vec_x = zeros(U, m)
    vec_y = zeros(U, m)

    for i ∈ 1:n
        Random.rand!(vec_x)
        Random.rand!(vec_y)

        for (x_raw, y_raw) ∈ zip(vec_x, vec_y)
            x = reinterpret(F, x_raw)
            y = reinterpret(F, y_raw)

            (!isfinite(x) | !isfinite(y)) && continue

            quo_is_huge = (maxintfloat(F) < abs(x / y))

            acc_big = accurate_div(x, y, r)

            acc = F(acc_big)

            # Skip cases when the result isn't representable, the
            # correct result is not specified for this case, and
            # it's not clear what a user would expect either.
            (acc == acc_big) || continue

            if quo_is_huge
                count_total_huge_quo += true
            else
                count_total_friendly_quo += true
            end

            d = div_fun(x, y, r)

            is_ok = (d == acc) | (isnan(d) & isnan(acc))

            if !is_ok
                if quo_is_huge
                    count_wrong_huge_quo += true
                else
                    count_wrong_friendly_quo += true
                end
            end
        end
    end

    (
        bad_ratio_huge_quotient =
            count_wrong_huge_quo / count_total_huge_quo,
        bad_ratio_friendly_quotient =
            count_wrong_friendly_quo / count_total_friendly_quo,
        total_huge_quotient_count =
            count_total_huge_quo,
        total_friendly_quotient_count =
            count_total_friendly_quo,
    )
end

const float_types = (Float16, Float32, Float64)
const bits_types = (UInt16, UInt32, UInt64)

 # not supported on master
const rounding_modes_other = (
    RoundNearestTiesAway, RoundNearestTiesUp
)

const rounding_modes = (
    RoundNearest, RoundUp, RoundDown, RoundFromZero, RoundToZero
)

function experiment(itcnt::Int, itcnt_inner::Int)
    for (F, U) ∈ zip(float_types, bits_types)
        println("$F $U")
        for rm ∈ rounding_modes
            println("  $rm")
            flush(stdout)
            res = count_wrong_floats(
                div, rm, F, U, itcnt, itcnt_inner
            )
            ch = res.total_huge_quotient_count
            cf = res.total_friendly_quotient_count
            rh = res.bad_ratio_huge_quotient
            rf = res.bad_ratio_friendly_quotient
            println("    quotients of huge magnitude:")
            println("      total count: $ch")
            println("      ratio of incorrect results: $rh")
            println("    quotients of more normal magnitude:")
            println("      total count: $cf")
            println("      ratio of incorrect results: $rf")
            flush(stdout)
        end
        println()
        flush(stdout)
    end
    nothing
end

experiment(20, 2^20)
```

Results on master:
```
Float16 UInt16
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 371923
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15566588
      ratio of incorrect results: 0.0017671823780522746
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 375420
      ratio of incorrect results: 5.327366682648766e-6
    quotients of more normal magnitude:
      total count: 15567018
      ratio of incorrect results: 0.004671543387436181
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 375158
      ratio of incorrect results: 2.665543584303147e-6
    quotients of more normal magnitude:
      total count: 15565611
      ratio of incorrect results: 0.004713146178457113
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 376802
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564500
      ratio of incorrect results: 0.005449195284140191
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 374335
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15563355
      ratio of incorrect results: 0.003891191841347833

Float32 UInt32
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 73547
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 1227091
      ratio of incorrect results: 0.0003218994905920188
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 73215
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12272614
      ratio of incorrect results: 0.0009024972186039583
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 73355
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12269080
      ratio of incorrect results: 0.0009091961255448657
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 73961
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12273238
      ratio of incorrect results: 0.0009261614579624383
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 73327
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12268024
      ratio of incorrect results: 0.0008781365279363653

Float64 UInt64
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 9807
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009917
      ratio of incorrect results: 4.677601111797664e-5
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 10015
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11012903
      ratio of incorrect results: 0.00013012009639965048
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 10048
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11011455
      ratio of incorrect results: 0.00012886580383791242
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 9902
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009214
      ratio of incorrect results: 0.00012843787031481085
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 9879
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009033
      ratio of incorrect results: 0.00013797760439086704
```

Results after this change (everything is correct):
```
Float16 UInt16
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 372015
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15568185
      ratio of incorrect results: 0.0
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 375936
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564509
      ratio of incorrect results: 0.0
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 375737
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564872
      ratio of incorrect results: 0.0
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 377102
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15563927
      ratio of incorrect results: 0.0
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 374618
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15567337
      ratio of incorrect results: 0.0

Float32 UInt32
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 73567
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12267866
      ratio of incorrect results: 0.0
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 73190
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12271827
      ratio of incorrect results: 0.0
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 73491
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12266797
      ratio of incorrect results: 0.0
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 73504
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12269685
      ratio of incorrect results: 0.0
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 73552
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12269727
      ratio of incorrect results: 0.0

Float64 UInt64
  RoundingMode{:Nearest}()
    quotients of huge magnitude:
      total count: 9750
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11007512
      ratio of incorrect results: 0.0
  RoundingMode{:Up}()
    quotients of huge magnitude:
      total count: 10046
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009576
      ratio of incorrect results: 0.0
  RoundingMode{:Down}()
    quotients of huge magnitude:
      total count: 9939
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11014458
      ratio of incorrect results: 0.0
  RoundingMode{:FromZero}()
    quotients of huge magnitude:
      total count: 9855
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11007247
      ratio of incorrect results: 0.0
  RoundingMode{:ToZero}()
    quotients of huge magnitude:
      total count: 9798
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11006320
      ratio of incorrect results: 0.0
```

Results for methods that weren't implemented at all on master (those
that miss a `rem` method):
```
Float16 UInt16
  RoundingMode{:NearestTiesAway}()
    quotients of huge magnitude:
      total count: 373333
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15564325
      ratio of incorrect results: 0.0
  RoundingMode{:NearestTiesUp}()
    quotients of huge magnitude:
      total count: 374334
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 15567551
      ratio of incorrect results: 0.0

Float32 UInt32
  RoundingMode{:NearestTiesAway}()
    quotients of huge magnitude:
      total count: 73722
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12271459
      ratio of incorrect results: 0.00301610427904294
  RoundingMode{:NearestTiesUp}()
    quotients of huge magnitude:
      total count: 73861
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 12273032
      ratio of incorrect results: 0.003042850373078144

Float64 UInt64
  RoundingMode{:NearestTiesAway}()
    quotients of huge magnitude:
      total count: 9981
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11009598
      ratio of incorrect results: 0.00044415790658296515
  RoundingMode{:NearestTiesUp}()
    quotients of huge magnitude:
      total count: 9926
      ratio of incorrect results: 0.0
    quotients of more normal magnitude:
      total count: 11013978
      ratio of incorrect results: 0.00046023335074756825
```

Fixes JuliaLang#49450
@nsajko
Copy link
Contributor Author

nsajko commented May 5, 2023

@giordano there are now tests, I guess you should remove the needs-tests label?

@giordano giordano removed the needs tests Unit tests are required for this change label May 5, 2023
@nsajko
Copy link
Contributor Author

nsajko commented May 5, 2023

Just to be clear, I'm not saying that div is always correct with this change. In some cases, overflows and underflows should cause the answer to be incorrect. This could presumably be further improved by using frexp and ldexp to try and prevent more overflows and underflows, but I don't see the performance hit being worth it.

Comment on lines +381 to +402
div_impl4(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} =
T(div_impl(widen(x), widen(y), r))

div_impl3(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} =
round(x / y, r)

div_impl2(x::NTuple{2,T}, y::T) where {T<:AbstractFloat} =
round(first(mw_div(x, y)), RoundNearest)

# Approximately rounded x (with respect to y and r)
frac_round(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} =
add12(x, -rem(x, y, r))

div_impl1(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} =
div_impl2(frac_round(x, y, r), y)

div_impl0(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} =
ifelse(
isfinite(abs(x) + abs(y)),
div_impl1(x, y, r),
div_impl3(x, y, r), # try to avoid overflow
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These need better names that actually reflect what they do. (and should probably be collapsed into their callers.

canonicalize2(hi, lo)
end

div_impl4(x::T, y::T, r::RoundingMode) where {T<:AbstractFloat} =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're widening, can't we just use round(x/y, r)?

δ_t = δ_hi - last(π) # exact operation
δ = δ_t + x_lo
lo = δ / y
canonicalize2(hi, lo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we only care about the high part from this, we could save a few clocks by just returning hi+lo which would save ~3 adds.


# "DWDivFP3" AKA "Algorithm 15" from a paper by Joldes, Muller and
# Popescu: https://doi.org/10.1145/3121432
hi = x_hi / y
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be interested in if this speeds up and retains most of the accuracy if you replay x_hi / y with x_hi * inv(y) doing so would let you save a division on since then you could do the same for lo = δ / y (and since we only care about the high word of this, it could fold with the canonicalize2 into return fma(inv(y), δ, hi) saving another instruction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix This change fixes an existing bug maths Mathematical functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect results from floating-point division (fld, cld, div)
5 participants