Commit a68470c
remove Flux's optimisers, add new train! functions
1 parent: b8bdc2d

9 files changed: +432 additions, -914 deletions

NEWS.md
Lines changed: 16 additions & 5 deletions

@@ -1,11 +1,22 @@
 # Flux Release Notes
 
+## v0.14
+
+* The use of Zygote's implicit parameters (with `Flux.params` and global variables) is deprecated in favour of the explicit style.
+  The function `train!` has new methods (accepting the model itself) to handle this.
+
+* Sub-module `Flux.Optimise` has been removed, in favour of using [Optimisers.jl](https://github.com/FluxML/Optimisers.jl) more deeply.
+  The function `train!` now lives in sub-module `Flux.Train`, and has re-written internals.
+
+* One-hot arrays have moved to a new package [OneHotArrays.jl](https://github.com/FluxML/OneHotArrays.jl)
+
 ## v0.13.4
 * Added [`PairwiseFusion` layer](https://github.com/FluxML/Flux.jl/pull/1983)
 
-## v0.13
+## v0.13 (April 2022)
 
 * After a deprecations cycle, the datasets in `Flux.Data` have
-been removed in favour of MLDatasets.jl.
+been removed in favour of [MLDatasets.jl](https://github.com/JuliaML/MLDatasets.jl).
 * `params` is not exported anymore since it is a common name and is also exported by Distributions.jl
 * `flatten` is not exported anymore due to clash with Iterators.flatten.
 * Remove Juno.jl progress bar support as it is now obsolete.
@@ -48,7 +59,7 @@ been removed in favour of MLDatasets.jl.
 * CUDA.jl 3.0 support
 * Bug fixes and optimizations.
 
-## v0.12.0
+## v0.12.0 (March 2021)
 
 * Add [identity_init](https://github.com/FluxML/Flux.jl/pull/1524).
 * Add [Orthogonal Matrix initialization](https://github.com/FluxML/Flux.jl/pull/1496) as described in [Exact solutions to the nonlinear dynamics of learning in deep linear neural networks](https://arxiv.org/abs/1312.6120).
@@ -73,7 +84,7 @@ been removed in favour of MLDatasets.jl.
 * Adds the [AdaBelief](https://arxiv.org/abs/2010.07468) optimiser.
 * Other new features and bug fixes (see GitHub releases page)
 
-## v0.11
+## v0.11 (July 2020)
 
 * Moved CUDA compatibility to use [CUDA.jl instead of CuArrays.jl](https://github.com/FluxML/Flux.jl/pull/1204)
 * Add [kaiming initialization](https://arxiv.org/abs/1502.01852) methods: [kaiming_uniform and kaiming_normal](https://github.com/FluxML/Flux.jl/pull/1243)
@@ -101,7 +112,7 @@ keyword argument. The `Dropout` struct *whose behavior is left unchanged) is the
 
 See GitHub's releases.
 
-## v0.10.0
+## v0.10.0 (November 2019)
 
 * The default AD engine has switched from [Tracker to Zygote.jl](https://github.com/FluxML/Flux.jl/pull/669)
   - The dependency on Tracker.jl has been removed.
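To make the v0.14 notes above concrete, here is a minimal sketch of the explicit training style that the new `train!` methods enable. It is written against the API as it later shipped in Flux (the `Flux.setup` helper and the do-block `train!` method that takes the model itself); details at this exact commit may differ, and the toy model, data and hyperparameters are invented for illustration.

```julia
using Flux

# A toy model and one (input, target) batch, purely for illustration.
model = Chain(Dense(4 => 8, relu), Dense(8 => 1))
data  = [(randn(Float32, 4, 16), randn(Float32, 1, 16))]

# Optimiser state built from an Optimisers.jl-style rule, paired with the model.
opt_state = Flux.setup(Flux.Adam(0.01), model)

# New-style train!: the model itself is passed in, no implicit Flux.params.
Flux.train!(model, data, opt_state) do m, x, y
    Flux.Losses.mse(m(x), y)
end
```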

src/Flux.jl
Lines changed: 10 additions & 9 deletions

@@ -11,7 +11,7 @@ import Optimisers: Optimisers, trainable, destructure # before v0.13, Flux owne
 
 using Zygote, ChainRulesCore
 using Zygote: Params, @adjoint, gradient, pullback, @nograd
-export gradient
+# export gradient # stop exporting this, to make people say "using Zygote", and make easier to replace
 
 # Pirate error to catch a common mistake. (Internal function `base` because overloading `update!` is more likely to give ambiguities.)
 Optimisers.base(dx::Zygote.Grads) = error("Optimisers.jl cannot be used with Zygote.jl's implicit gradients, `Params` & `Grads`")
@@ -25,14 +25,15 @@ export Chain, Dense, Maxout, SkipConnection, Parallel, PairwiseFusion,
   fmap, cpu, gpu, f32, f64,
   testmode!, trainmode!
 
-include("optimise/Optimise.jl")
-using .Optimise
-using .Optimise: @epochs
-using .Optimise: skip
-export Descent, Adam, Momentum, Nesterov, RMSProp,
-  AdaGrad, AdaMax, AdaDelta, AMSGrad, NAdam, OAdam,
-  AdamW, RAdam, AdaBelief, InvDecay, ExpDecay,
-  WeightDecay, ClipValue, ClipNorm
+include("train/Train.jl")
+using .Train
+export train!
+# Stop exporting these, since Optimisers.jl exports the same names,
+# and with this PR, Flux.Adam() is literally a wrapper around Adam().
+# export Descent, Adam, Momentum, Nesterov, RMSProp,
+#   AdaGrad, AdaMax, AdaDelta, AMSGrad, NAdam, OAdam,
+#   AdamW, RAdam, AdaBelief, InvDecay, ExpDecay,
+#   WeightDecay, ClipValue, ClipNorm
 
 using CUDA
 const use_cuda = Ref{Union{Nothing,Bool}}(nothing)
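Since `gradient` is no longer exported from Flux here, and the optimiser rules now come from Optimisers.jl, explicit-style user code spells things out roughly as below. This is a sketch using the public Zygote and Optimisers.jl APIs, not code from this commit; the toy model, loss and step size are invented for illustration.

```julia
using Flux, Optimisers
using Zygote: gradient          # no longer re-exported by Flux

model = Flux.Dense(3 => 2)
state = Optimisers.setup(Optimisers.Adam(1f-3), model)    # rule and state from Optimisers.jl

x, y = randn(Float32, 3, 8), randn(Float32, 2, 8)
grads = gradient(m -> Flux.Losses.mse(m(x), y), model)    # explicit gradient w.r.t. the model
state, model = Optimisers.update!(state, model, grads[1])

# Passing Zygote's implicit Params/Grads into Optimisers.jl instead would hit
# the pirate `Optimisers.base` error added in the diff above.
```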

src/deprecations.jl
Lines changed: 7 additions & 4 deletions

@@ -34,10 +34,10 @@ struct Zeros
 end
 Zeros(args...) = Zeros() # was used both Dense(10, 2, initb = Zeros) and Dense(rand(2,10), Zeros())
 
-function Optimise.update!(x::AbstractArray, x̄)
-  Base.depwarn("`Flux.Optimise.update!(x, x̄)` was not used internally and has been removed. Please write `x .-= x̄` instead.", :update!)
-  x .-= x̄
-end
+# function Optimise.update!(x::AbstractArray, x̄)
+#   Base.depwarn("`Flux.Optimise.update!(x, x̄)` was not used internally and has been removed. Please write `x .-= x̄` instead.", :update!)
+#   x .-= x̄
+# end
 
 function Diagonal(size::Integer...; kw...)
   Base.depwarn("Flux.Diagonal is now Flux.Scale, and also allows an activation function.", :Diagonal)
@@ -80,3 +80,6 @@ Base.@deprecate_binding RADAM RAdam
 Base.@deprecate_binding OADAM OAdam
 Base.@deprecate_binding ADAGrad AdaGrad
 Base.@deprecate_binding ADADelta AdaDelta
+
+# What remains from the Optimise sub-module has moved to Train:
+Base.@deprecate_binding Optimise Train
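The new `Base.@deprecate_binding Optimise Train` line means old code that reaches into `Flux.Optimise` resolves to `Flux.Train` with a deprecation warning. A generic, self-contained illustration of how that macro behaves (hypothetical `Demo` module, not Flux code; the warning only prints when depwarn is enabled):

```julia
module Demo
    module Train
        hello() = "hello from Train"
    end
    # Makes `Demo.Optimise` an alias for `Demo.Train`, flagged as deprecated.
    Base.@deprecate_binding Optimise Train
end

Demo.Optimise.hello()   # warns (with --depwarn=yes) and returns "hello from Train"
```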

src/optimise/Optimise.jl
Lines changed: 0 additions & 15 deletions

This file was deleted.
