Document that reduce-like functions use init one or more times #53945

Status: Open. 19 commits proposed for merging into base `master`; changes shown from 13 commits.
6 changes: 6 additions & 0 deletions NEWS.md
@@ -77,6 +77,12 @@ New library features
Standard library changes
------------------------

* The `init` keyword argument in `reduce(op, itr; init)` and other reduction functions with implementation-defined
associativity (`mapreduce`, `maximum`, `minimum`, `sum`, `prod`, `any`, and `all`)
is now guaranteed to be used for non-empty collections one or more times. This ensures that all calls
to the reducing function `op` have one argument that is either `init` or the result from a
previous `op` evaluation. Previously `init` was explicitly allowed to be omitted from the reduction entirely.
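
To see why `init` must be neutral once it may be applied more than once, consider the minimal sketch below. `chunked_reduce` is a hypothetical helper written only for this illustration; it is not part of Base and is not a claim about how Base implements `reduce`. It seeds each of two chunks with `init` and then combines the partial results.

```julia
# Hypothetical helper (not part of Base): reduce `xs` in two chunks,
# seeding each chunk with `init`, then combine the two partial results.
function chunked_reduce(op, xs::AbstractVector; init)
    mid = length(xs) ÷ 2
    left = foldl(op, xs[1:mid]; init=init)
    right = foldl(op, xs[mid+1:end]; init=init)
    return op(left, right)
end

xs = [1, 2, 3, 4]
neutral = chunked_reduce(+, xs; init=0)       # init used twice, result unchanged
non_neutral = chunked_reduce(+, xs; init=10)  # init counted twice
single_use = foldl(+, xs; init=10)            # init counted exactly once
```

With a neutral `init` the number of uses does not matter (`neutral` is `10`), whereas a non-neutral `init` gives `30` in the chunked strategy and `20` in the strictly left-to-right one.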

#### StyledStrings

#### JuliaSyntaxHighlighting
144 changes: 98 additions & 46 deletions base/reduce.jl
@@ -280,29 +280,65 @@ mapreduce_impl(f, op, A::AbstractArrayOrBroadcasted, ifirst::Integer, ilast::Int
"""
mapreduce(f, op, itrs...; [init])

Apply function `f` to each element(s) in `itrs`, and then reduce the result using the binary
function `op`. If provided, `init` must be a neutral element for `op` that will be returned
for empty collections. It is unspecified whether `init` is used for non-empty collections.
In general, it will be necessary to provide `init` to work with empty collections.
Apply function `f` to each element(s) in `itrs`, and then repeatedly call the two-argument
function `op` with those results or results from previous `op` evaluations until a single value is returned.

If provided, `init` is the return value for empty collections and is used one or more times as
an argument to `op` for non-empty collections. The `init` value is not transformed by the function `f`.
Using `init` ensures that all calls to `op` have one argument that is either `init`
or the result from a previous `op` evaluation, and the ordering of these arguments is
unspecified. As it may appear in the reduction one or more times, it must be a neutral element for `op` that
does not change the result by being used more than once. It is generally an error to call `mapreduce`
with empty collections without specifying an `init` value, but in unambiguous cases an
identity value for `op` may be returned; see [`Base.reduce_empty`](@ref) for more details.
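
A short sketch of the empty-collection behavior described above. It assumes the identities that `Base.reduce_empty` supplies for `+` and `*` over `Int`; the variable names are illustrative only.

```julia
# With an explicit `init`, an empty collection simply returns `init`.
with_init = mapreduce(x -> x^2, +, Int[]; init=0)

# Without `init`, `+` and `*` over `Int` are unambiguous cases:
# the identity value comes from `Base.reduce_empty`.
empty_sum  = mapreduce(identity, +, Int[])
empty_prod = mapreduce(identity, *, Int[])

# For other operators such as `max`, supplying a neutral `init`
# explicitly avoids relying on any fallback.
empty_max = mapreduce(identity, max, Int[]; init=typemin(Int))
```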

In contrast with [`mapfoldl`](@ref) and [`mapfoldr`](@ref), the sequence of
function evaluations and the associativity of the reduction is not specified
and may vary between different methods and across Julia versions. Some implementations
may reuse the return value of `f` for elements that appear multiple times in the
collection(s).

For example, `mapreduce(√, +, [1, 4, 9, 16])` may be evaluated as the left-associative
`((√1+√4)+√9)+√16` _or_ the right-associative `√1+(√4+(√9+√16))`
_or_ as the potentially-parallel `(√1+√4)+(√9+√16)` and returns `10.0` regardless.
A non-associative function like `-` is not a valid `op` argument
as `mapreduce(√, -, [1, 4, 9, 16])` may return any of `-8.0`, `-2.0` or
`0.0`, depending upon which of the above associativity strategies is used.
Because floating-point roundoff errors typically break associativity,
even for common operations like `+` that are associative in exact arithmetic,
this also means that the floating-point errors incurred by `mapreduce`
are implementation-defined; for example `mapreduce(identity, +, [.1, .2, .3])` may return
either `0.6` or `0.6000000000000001`.
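
The three groupings named above can be written out by hand to check the arithmetic. This is only a sketch: `foldl` and `foldr` fix the left and right associativity, and the paired grouping is spelled out explicitly; none of this says which strategy `mapreduce` actually picks.

```julia
xs = map(√, [1, 4, 9, 16])             # [1.0, 2.0, 3.0, 4.0]

left   = foldl(-, xs)                   # ((1-2)-3)-4
right  = foldr(-, xs)                   # 1-(2-(3-4))
paired = (xs[1] - xs[2]) - (xs[3] - xs[4])

# Floating-point `+` is associative only in exact arithmetic.
fl = (0.1 + 0.2) + 0.3
fr = 0.1 + (0.2 + 0.3)
```

Here `left`, `right` and `paired` evaluate to `-8.0`, `-2.0` and `0.0`, while `fl` and `fr` evaluate to `0.6000000000000001` and `0.6`.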

While the associativity of the reduction is not defined, `mapreduce` does preserve
the ordering of the iterator for ordered collections, so that the result does *not* require `op` to be commutative. For example,
`mapreduce(uppercase, *, ['j','u','l','i','a'])` is guaranteed to always
return the properly-spelled `"JULIA"` because `Array`s are ordered collections;
in contrast, the operand ordering is not guaranteed with an unordered collection like `Set`.
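
As an illustration of the distinction between associativity and commutativity, the sketch below uses string concatenation, which is associative but not commutative. The groupings are written by hand and the variable names are illustrative; it does not show which grouping `mapreduce` itself uses.

```julia
cs = map(uppercase, ['j', 'u', 'l', 'i', 'a'])

# Different groupings, same left-to-right operand order.
left   = foldl(*, cs)
right  = foldr(*, cs)
paired = (cs[1] * cs[2]) * ((cs[3] * cs[4]) * cs[5])

# Swapping operands changes the result, so `*` is not commutative here.
swapped = cs[2] * cs[1]
```

All three groupings produce `"JULIA"` because the operand order is preserved, whereas `swapped` is `"UJ"` rather than `"JU"`.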

[`mapreduce`](@ref) is functionally equivalent to calling
`reduce(op, map(f, itr); init=init)`, but will in general execute faster since no
`reduce(op, map(f, itrs...); init=init)`, but will in general execute faster since no
intermediate collection needs to be created. See documentation for [`reduce`](@ref) and
[`map`](@ref).

Some commonly-used operators may have special implementations of a mapped reduction, and
are recommended instead of `mapreduce`: [`maximum`](@ref)`(itr)`, [`minimum`](@ref)`(itr)`, [`sum`](@ref)`(itr)`,
[`prod`](@ref)`(itr)`, [`any`](@ref)`(itr)`, [`all`](@ref)`(itr)`.

!!! compat "Julia 1.2"
`mapreduce` with multiple iterators requires Julia 1.2 or later.

# Examples
```jldoctest
julia> mapreduce(x->x^2, +, [1:3;]) # == 1 + 4 + 9
14
```
julia> mapreduce(√, +, [1, 4, 9])
6.0

The associativity of the reduction is implementation-dependent. Additionally, some
implementations may reuse the return value of `f` for elements that appear multiple times in
`itr`. Use [`mapfoldl`](@ref) or [`mapfoldr`](@ref) instead for
guaranteed left or right associativity and invocation of `f` for every value.
julia> mapreduce(identity, +, [.1, .2, .3]) ≈ 0.6
true

julia> mapreduce(uppercase, *, ['j','u','l','i','a'])
"JULIA"
```
"""
mapreduce(f, op, itr; kw...) = mapfoldl(f, op, itr; kw...)
mapreduce(f, op, itrs...; kw...) = reduce(op, Generator(f, itrs...); kw...)
@@ -452,36 +488,52 @@ _mapreduce(f, op, ::IndexCartesian, A::AbstractArrayOrBroadcasted) = mapfoldl(f,
"""
reduce(op, itr; [init])

Reduce the given collection `itr` with the given binary operator `op`. If provided, the
initial value `init` must be a neutral element for `op` that will be returned for empty
collections. It is unspecified whether `init` is used for non-empty collections.

For empty collections, providing `init` will be necessary, except for some special cases
(e.g. when `op` is one of `+`, `*`, `max`, `min`, `&`, `|`) when Julia can determine the
neutral element of `op`.

Reductions for certain commonly-used operators may have special implementations, and
Repeatedly call the two-argument function `op` with the element(s) in `itr`
or results from previous `op` evaluations until a single value is returned.

If provided, `init` is the return value for empty collections and is used one or more times as
an argument to `op` for non-empty collections.
Using `init` ensures that all calls to `op` have one argument that is either `init`
or the result from a previous `op` evaluation, and the ordering of these arguments is
unspecified. As it may appear in a reduction one or more times, it must be a neutral element for `op` that
does not change the result by being used more than once. It is generally an error to call `reduce`
with an empty collection without specifying an `init` value, but in unambiguous cases an
identity value for `op` may be returned; see [`Base.reduce_empty`](@ref) for more details.

In contrast with [`foldl`](@ref) and [`foldr`](@ref), the associativity of the reduction is
not specified and may vary between different methods and across Julia versions.
For example, `reduce(+, [1, 2, 3, 4])` may be evaluated as the left-associative
`((1+2)+3)+4` _or_ the right-associative `1+(2+(3+4))`
_or_ as the potentially-parallel `(1+2)+(3+4)` and returns `10` regardless.
A non-associative function like `-` is not a valid `op` argument
as `reduce(-, [1, 2, 3, 4])` may return any of `-8`, `-2` or
`0`, depending upon which of the above associativity strategies is used.
Because floating-point roundoff errors typically break associativity,
even for common operations like `+` that are associative in exact arithmetic,
this also means that the floating-point errors incurred by `reduce`
are implementation-defined; for example `reduce(+, [.1, .2, .3])` may return
either `0.6` or `0.6000000000000001`.

While the associativity of the reduction is not defined, `reduce` does preserve
the ordering of the iterator for ordered collections. For example,
`reduce(string, ['J','u','l','i','a'])` is guaranteed to always
return the properly-spelled `"Julia"` because `Array`s are ordered collections;
the returned ordering is not guaranteed with an unordered collection like `Set`.

Some commonly-used operators may have special implementations of a reduction, and
should be used instead: [`maximum`](@ref)`(itr)`, [`minimum`](@ref)`(itr)`, [`sum`](@ref)`(itr)`,
[`prod`](@ref)`(itr)`, [`any`](@ref)`(itr)`, [`all`](@ref)`(itr)`.
There are efficient methods for concatenating certain arrays of arrays
by calling `reduce(`[`vcat`](@ref)`, arr)` or `reduce(`[`hcat`](@ref)`, arr)`.

The associativity of the reduction is implementation dependent. This means that you can't
use non-associative operations like `-` because it is undefined whether `reduce(-,[1,2,3])`
should be evaluated as `(1-2)-3` or `1-(2-3)`. Use [`foldl`](@ref) or
[`foldr`](@ref) instead for guaranteed left or right associativity.

Some operations accumulate error. Parallelism will be easier if the reduction can be
executed in groups. Future versions of Julia might change the algorithm. Note that the
elements are not reordered if you use an ordered collection.

# Examples
```jldoctest
julia> reduce(*, [2; 3; 4])
24
julia> reduce(+, [1, 2, 3])
6

julia> reduce(+, [.1, .2, .3]) ≈ 0.6
true

julia> reduce(*, [2; 3; 4]; init=-1)
-24
julia> reduce(string, ['J','u','l','i','a'])
"Julia"
```
"""
reduce(op, itr; kw...) = mapreduce(identity, op, itr; kw...)
@@ -502,7 +554,7 @@ The return type is `Int` for signed integers of less than system word size, and
arguments, a common return type is found to which all arguments are promoted.

The value returned for empty `itr` can be specified by `init`. It must be
the additive identity (i.e. zero) as it is unspecified whether `init` is used
the additive identity (i.e. zero) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -541,7 +593,7 @@ The return type is `Int` for signed integers of less than system word size, and
arguments, a common return type is found to which all arguments are promoted.

The value returned for empty `itr` can be specified by `init`. It must be
the additive identity (i.e. zero) as it is unspecified whether `init` is used
the additive identity (i.e. zero) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -573,7 +625,7 @@ The return type is `Int` for signed integers of less than system word size, and
arguments, a common return type is found to which all arguments are promoted.

The value returned for empty `itr` can be specified by `init`. It must be the
multiplicative identity (i.e. one) as it is unspecified whether `init` is used
multiplicative identity (i.e. one) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -597,7 +649,7 @@ The return type is `Int` for signed integers of less than system word size, and
arguments, a common return type is found to which all arguments are promoted.

The value returned for empty `itr` can be specified by `init`. It must be the
multiplicative identity (i.e. one) as it is unspecified whether `init` is used
multiplicative identity (i.e. one) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -681,7 +733,7 @@ Return the largest result of calling function `f` on each element of `itr`.

The value returned for empty `itr` can be specified by `init`. It must be
a neutral element for `max` (i.e. which is less than or equal to any
other element) as it is unspecified whether `init` is used
other element) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -708,7 +760,7 @@ Return the smallest result of calling function `f` on each element of `itr`.

The value returned for empty `itr` can be specified by `init`. It must be
a neutral element for `min` (i.e. which is greater than or equal to any
other element) as it is unspecified whether `init` is used
other element) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -735,7 +787,7 @@ Return the largest element in a collection.

The value returned for empty `itr` can be specified by `init`. It must be
a neutral element for `max` (i.e. which is less than or equal to any
other element) as it is unspecified whether `init` is used
other element) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -767,7 +819,7 @@ Return the smallest element in a collection.

The value returned for empty `itr` can be specified by `init`. It must be
a neutral element for `min` (i.e. which is greater than or equal to any
other element) as it is unspecified whether `init` is used
other element) as it may be used one or more times
for non-empty collections.

!!! compat "Julia 1.6"
@@ -802,7 +854,7 @@ The value returned for empty `itr` can be specified by `init`. It must be a 2-tu
first and second elements are neutral elements for `min` and `max` respectively
(i.e. which are greater/less than or equal to any other element). As a consequence, when
`itr` is empty the returned `(mn, mx)` tuple will satisfy `mn ≥ mx`. When `init` is
specified it may be used even for non-empty `itr`.
specified it may be used one or more times for non-empty `itr`.

!!! compat "Julia 1.8"
Keyword argument `init` requires Julia 1.8 or later.
@@ -829,7 +881,7 @@ return them as a 2-tuple. Only one pass is made over `itr`.

The value returned for empty `itr` can be specified by `init`. It must be a 2-tuple whose
first and second elements are neutral elements for `min` and `max` respectively
(i.e. which are greater/less than or equal to any other element). It is used for non-empty
(i.e. which are greater/less than or equal to any other element). It is used one or more times for non-empty
collections. Note: it implies that, for empty `itr`, the returned value `(mn, mx)` satisfies
`mn ≥ mx` even though for non-empty `itr` it satisfies `mn ≤ mx`. This is a "paradoxical"
but yet expected result.