Add more Duplicated methods for Enzyme.jl support (#2471)
* add more Duplicated methods
* update macro to zero, show
* make informative errors if you use Duplicated without loading Enzyme
* note on macro
* fix some tests
* add an Enzyme docs page
* tweaks & tests
* typos
* news, docs
* let Flux own the function update! to avoid piracy
* Revert "let Flux own the function update! to avoid piracy"
This reverts commit ca5a20f.
* demand Optimisers PR
* fixup
* force depwarns
* allow aux in withgradient
* disallow Active
* disallow trivial Duplicated
* don't use ReverseWithPrimal in gradient
* tweak
* giant post-rebase fixup after everything was moved around... all earlier commits are a mess now, probably
* clean up more rebase mess
* fix docs
* try out Ref for withgradient
* don't own `_make_zero!`
* add explicit errors for 2nd order
* more rebase problems
Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
* teach Flux.state about Duplicated
* another explicit error for Zygote mistake
* ahem
* don't use Enzyme's make_zero!, fix some bugs
* maybe this works?
* see if CI likes these
* turns out train! does have tests
* enzyme tests
* fix tests?
* minor comments
---------
Co-authored-by: Carlo Lucibello <carlo.lucibello@gmail.com>
NEWS.md (4 additions & 2 deletions)
@@ -2,7 +2,7 @@
 
 See also [github's page](https://github.com/FluxML/Flux.jl/releases) for a complete list of PRs merged before each release.
 
-## v0.15.0
+## v0.15.0 (December 2024)
 * Recurrent layers have undergone a complete redesign in [PR 2500](https://github.com/FluxML/Flux.jl/pull/2500).
   * `RNNCell`, `LSTMCell`, and `GRUCell` are now exported and provide functionality for single time-step processing: `rnncell(x_t, h_t) -> h_{t+1}`.
   * `RNN`, `LSTM`, and `GRU` no longer store the hidden state internally; it has to be explicitly passed to the layer. Moreover, they now process entire sequences at once, rather than one element at a time: `rnn(x, h) -> h′`.
@@ -12,6 +12,8 @@ See also [github's page](https://github.com/FluxML/Flux.jl/releases) for a compl
   Now Flux re-exports the optimisers from Optimisers.jl. Most users will be unaffected by this change.
   The module is still available for now, but will be removed in a future release.
 * Most Flux layers will [re-use memory via `NNlib.bias_act!`](https://github.com/FluxML/Flux.jl/pull/2327), when possible.
+* Further support for Enzyme.jl, via methods of `Flux.gradient(loss, Duplicated(model))`.
+  Flux now owns & exports `gradient`, but without `Duplicated` this still defaults to calling Zygote.jl.
 * `Flux.params` has been deprecated. Use Zygote's explicit differentiation instead,
   `gradient(m -> loss(m, x, y), model)`, or use `Flux.trainables(model)` to get the trainable parameters.
 * Flux now requires Functors.jl v0.5. This new release of Functors assumes all types to be functors by default. Therefore, applying `@layer` or `@functor` to a type is no longer strictly necessary for Flux's models. However, it is still recommended to use `@layer Model` for additional functionality like pretty printing.
@@ -40,7 +42,7 @@ See also [github's page](https://github.com/FluxML/Flux.jl/releases) for a compl
 * After a deprecations cycle, the macro `@epochs` and the functions `Flux.stop`, `Flux.skip`, `Flux.zeros`, `Flux.ones` have been removed.
 
 ## v0.13.17
-* Apple's Metal GPU acceleration preliminary support via the extension mechanism.
+* Apple's Metal GPU acceleration preliminary support via the extension mechanism.
 
 ## v0.13.16
 * Most greek-letter keyword arguments are deprecated in favour of ascii.
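
In code, the two changelog entries about `gradient` above amount to calls like the following. This is a minimal sketch; the layer, data, and loss are invented for illustration and are not taken from the changelog. The `Duplicated` path is shown on the new Enzyme docs page below.

```julia
using Flux  # Flux v0.15, which owns & exports `gradient`

model = Dense(3 => 2)
x, y = randn(Float32, 3, 4), randn(Float32, 2, 4)
loss(m, x, y) = sum(abs2, m(x) .- y)

# Explicit-style differentiation, replacing the deprecated `Flux.params`:
grads = Flux.gradient(m -> loss(m, x, y), model)   # defaults to Zygote

# A flat list of trainable arrays, if something `params`-like is still needed:
ps = Flux.trainables(model)
```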
# [Automatic Differentiation using Enzyme.jl](@id autodiff-enzyme)

[Enzyme.jl](https://github.com/EnzymeAD/Enzyme.jl) is a new package for automatic differentiation.
Like Zygote.jl, calling `gradient(f, x)` causes it to hook into the compiler and transform the code that is executed while calculating `f(x)`, in order to produce code for `∂f/∂x`.
But it does so much later in the optimisation process (on LLVM instead of Julia's untyped IR), which you can [read about here](https://proceedings.nips.cc/paper/2020/file/9332c513ef44b682e9347822c2e457ac-Paper.pdf).
It needs far fewer custom rules than Zygote/ChainRules, and in particular is able to support mutation of arrays.

Flux now has built-in support for this, using Enzyme's own `Duplicated` type.
Calling `Duplicated` on any Flux model which was defined using `@layer` will allocate space for the gradient,
and passing that to `gradient` (or `withgradient`, or `train!`) will then use Enzyme instead of Zygote.
The gradient functions still return the gradient as usual, which can then be passed to `update!`:

```julia
julia> using Flux, Enzyme

julia> model = Chain(Dense(28^2 => 32, sigmoid), Dense(32 => 10), softmax);  # from model zoo

julia> dup_model = Enzyme.Duplicated(model)  # this allocates space for the gradient
```
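
The diff view cuts the example off here. As a rough sketch of how such a session might continue (the loss, the dummy input, and the optimiser choice below are assumptions, not part of the PR), the `Duplicated` model is passed to `gradient`, and the result to `update!` as usual:

```julia
julia> x = randn(Float32, 28^2, 1);   # dummy input, for illustration only

julia> loss(m) = sum(abs2, m(x));     # toy loss closing over x

julia> grads = Flux.gradient(loss, dup_model);  # the Duplicated argument selects Enzyme

julia> opt_state = Flux.setup(Adam(), model);

julia> Flux.update!(opt_state, model, grads[1]);  # gradients have the usual structure, so update! is unchanged
```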
docs/src/reference/training/zygote.md (3 additions & 1 deletion)
@@ -4,8 +4,10 @@ CollapsedDocStrings = true
 
 # [Automatic Differentiation using Zygote.jl](@id autodiff-zygote)
 
-Flux re-exports the `gradient` from [Zygote](https://github.com/FluxML/Zygote.jl), and uses this function within [`train!`](@ref Flux.train!) to differentiate the model. Zygote has its own [documentation](https://fluxml.ai/Zygote.jl/dev/), in particular listing some [important limitations](https://fluxml.ai/Zygote.jl/dev/limitations/).
+Flux's `gradient` function uses [Zygote](https://github.com/FluxML/Zygote.jl) by default, and the same function is used within [`train!`](@ref Flux.train!) to differentiate the model.
+Zygote has its own [documentation](https://fluxml.ai/Zygote.jl/dev/), in particular listing some [important limitations](https://fluxml.ai/Zygote.jl/dev/limitations/).
 
+Flux also has support for Enzyme.jl, documented [on its own page](@ref autodiff-enzyme).
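
A minimal sketch of this default path (the layer, data, and loss here are placeholders, not from the docs page):

```julia
using Flux

model = Dense(2 => 1)
x = randn(Float32, 2, 5)

# No Duplicated argument, so Flux.gradient falls back to Zygote:
grads = Flux.gradient(m -> sum(abs2, m(x)), model)
```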
 _make_zero_inner!(x) = nothing  # any other Functors leaf type
 
+#= # This _make_zero! matches what Flux allows elsewhere:
+julia> Flux.setup(Adam(), (1:3.)')
+ERROR: model must be fully mutable for `train!` to work, got `x::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}`.
+If `x .+= dx` is in fact ok, define `Optimisers.maywrite(::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}) = true`
+=#
+# Perhaps the canonical way for Enzyme is more like this:
+# function _make_zero!(x::AbstractArray{<:Number})
+#     if Enzyme.guess_activity(typeof(x), Reverse) <: Duplicated
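
The diff stops inside this commented-out alternative. For orientation, here is a minimal sketch of an in-place zeroing pass over a Functors-compatible structure; it only assumes `Functors.fmap` and is an illustration of the idea, not the PR's actual `_make_zero!`:

```julia
using Functors

# Walk the (gradient) structure and zero every numeric array in place,
# leaving non-array leaves (functions, `nothing`, plain numbers) untouched.
function make_zero!(grad)
    Functors.fmap(grad) do x
        x isa AbstractArray{<:Number} && (x .= 0)
        x
    end
    return nothing
end
```

Something along these lines is presumably applied to the shadow half of the `Duplicated` model before computing a fresh gradient.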