
Commit 5f84b68

mcabbott and darsnack authored

doc changes re at-functor and at-layer (#2390)

* doc changes re at-functor and at-layer
* fix a doctest
* more fixes
* public at-layer
* add a sentence comparing to freeze/thaw
* Apply suggestions from code review
* two fixes re SignDecay

Co-authored-by: Kyle Daruwalla <daruwalla@wisc.edu>
1 parent 2ff9304 commit 5f84b68

File tree

12 files changed: +25 additions, −17 deletions

NEWS.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ See also [github's page](https://github.com/FluxML/Flux.jl/releases) for a compl
   This also adds `show` methods for pretty printing.

 ## v0.14.12
-* New `SignDecay` optimiser, like `` WeightNorm` but for L1 norm.
+* New `SignDecay` optimiser, like `WeightDecay` but for L1 norm.

 ## v0.14.0 (July 2023)
 * Flux now requires julia v1.9 or later.

Project.toml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 name = "Flux"
 uuid = "587475ba-b771-5e3f-ad9e-33799f191a9c"
-version = "0.14.12"
+version = "0.14.13"

 [deps]
 Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"

docs/src/models/advanced.md

Lines changed: 1 addition & 1 deletion
@@ -142,7 +142,7 @@ Join(combine, paths...) = Join(combine, paths)
 ```
 Notice that we parameterized the type of the `paths` field. This is necessary for fast Julia code; in general, `T` might be a `Tuple` or `Vector`, but we don't need to pay attention to what it specifically is. The same goes for the `combine` field.

-The next step is to use [`Functors.@layer`](@ref) to make our struct behave like a Flux layer. This is important so that calling `params` on a `Join` returns the underlying weight arrays on each path.
+The next step is to use [`Flux.@layer`](@ref) to make our struct behave like a Flux layer. This is important so that calling `Flux.setup` on a `Join` maps over the underlying trainable arrays on each path.
 ```julia
 Flux.@layer Join
 ```
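For readers skimming the diff, the passage above refers to the `Join` example built earlier on that docs page. A minimal sketch of how the pieces fit together (the layer sizes and the `Adam` rule are illustrative, not part of the commit):

```julia
using Flux

# The Join layer from the advanced-models page: several paths plus a combining function.
struct Join{T, F}
  combine::F
  paths::T
end
Join(combine, paths...) = Join(combine, paths)

# Forward pass: apply each path to its input, then combine the results.
(m::Join)(xs::Tuple) = m.combine(map((f, x) -> f(x), m.paths, xs)...)

# The line this commit documents: lets Flux.setup (and gpu, printing, ...) reach inside.
Flux.@layer Join

model = Join(vcat, Dense(2 => 3), Dense(4 => 3))
opt_state = Flux.setup(Adam(), model)   # optimiser state for the arrays on each path
```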

docs/src/models/basics.md

Lines changed: 1 addition & 1 deletion
@@ -255,7 +255,7 @@ m(5) # => 26

 ## Layer Helpers

-There is still one problem with this `Affine` layer, that Flux does not know to look inside it. This means that [`Flux.train!`](@ref) won't see its parameters, nor will [`gpu`](@ref) be able to move them to your GPU. These features are enabled by the [`@functor`](@ref Functors.@functor) macro:
+There is still one problem with this `Affine` layer, that Flux does not know to look inside it. This means that [`Flux.train!`](@ref) won't see its parameters, nor will [`gpu`](@ref) be able to move them to your GPU. These features are enabled by the [`@layer`](@ref Flux.@layer) macro:

 ```julia
 Flux.@layer Affine
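For context, the `Affine` layer discussed above is defined on that page roughly as follows; this sketch only adds the `@layer` line and a `setup` call to show its effect (the optimiser choice is illustrative):

```julia
using Flux

struct Affine
  W
  b
end
Affine(in::Integer, out::Integer) = Affine(randn(out, in), zeros(out))

# Overload call, so the object can be used as a function.
(m::Affine)(x) = m.W * x .+ m.b

# Without this line, Flux.train! cannot see W and b, and gpu cannot move them.
Flux.@layer Affine

a = Affine(10, 5)
opt_state = Flux.setup(Adam(), a)   # now contains state for both W and b
```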

docs/src/models/functors.md

Lines changed: 5 additions & 1 deletion
@@ -2,7 +2,11 @@

 Flux models are deeply nested structures, and [Functors.jl](https://github.com/FluxML/Functors.jl) provides tools needed to explore such objects, apply functions to the parameters they contain, and re-build them.

-New layers should be annotated using the `Functors.@functor` macro. This will enable [`params`](@ref Flux.params) to see the parameters inside, and [`gpu`](@ref) to move them to the GPU.
+!!! compat "Flux ≤ 0.14"
+    All layers were previously defined with the `Functors.@functor` macro.
+    This still works, but it is recommended that you use the new [`Flux.@layer`](@ref Flux.@layer) macro instead.
+    Both allow [`Flux.setup`](@ref Flux.setup) to see the parameters inside, and [`gpu`](@ref) to move them to the GPU, but [`Flux.@layer`](@ref Flux.@layer) also overloads printing,
+    and offers a way to define `trainable` at the same time.

 `Functors.jl` has its own [notes on basic usage](https://fluxml.ai/Functors.jl/stable/#Basic-Usage-and-Implementation) for more details. Additionally, the [Advanced Model Building and Customisation](@ref man-advanced) page covers the use cases of `Functors` in greater details.
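To make the compat note concrete, here is a hedged sketch of the two styles side by side; `Scaler` and its fields are made up purely for illustration:

```julia
using Flux

# A hypothetical layer type, used only to illustrate the two macros.
struct Scaler
  scale
  bias
end
(m::Scaler)(x) = m.scale .* x .+ m.bias

# Old style (Flux ≤ 0.14, still works):
#   Functors.@functor Scaler
# New style, recommended; also overloads printing and can restrict `trainable`:
Flux.@layer Scaler

s = Scaler(rand(Float32, 3), zeros(Float32, 3))
opt_state = Flux.setup(Descent(0.1), s)   # sees both fields; gpu(s) would move them too
```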

docs/src/models/layers.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ The `Dense` exemplifies several features:

 * The bias vector is always initialised [`Flux.zeros32`](@ref). The keyword `bias=false` will turn this off, i.e. keeping the bias permanently zero.

-* It is annotated with [`@functor`](@ref Functors.@functor), which means that [`params`](@ref Flux.params) will see the contents, and [`gpu`](@ref Flux.gpu) will move their arrays to the GPU.
+* It is annotated with [`@layer`](@ref Flux.@layer), which means that [`Flux.setup`](@ref Flux.setup) will see the contents, and [`gpu`](@ref Flux.gpu) will move their arrays to the GPU.

 By contrast, `Chain` itself contains no parameters, but connects other layers together.
 The section on [dataflow layers](@ref man-dataflow-layers) introduces others like this.
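A short illustration of the `Dense` features listed above (the sizes are arbitrary; `gpu` is a no-op unless a GPU backend package is loaded):

```julia
using Flux

d = Dense(3 => 2, relu)               # bias initialised with Flux.zeros32
d_nobias = Dense(3 => 2; bias=false)  # bias permanently zero

opt_state = Flux.setup(Adam(), d)     # setup sees d.weight and d.bias
d_gpu = gpu(d)                        # moves the arrays when a GPU backend is available
```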

docs/src/saving.md

Lines changed: 2 additions & 2 deletions
@@ -16,12 +16,12 @@ julia> struct MyModel
          net
        end

-julia> Flux.@functor MyModel
+julia> Flux.@layer MyModel

 julia> MyModel() = MyModel(Chain(Dense(10, 5, relu), Dense(5, 2)));

 julia> model = MyModel()
-MyModel(Chain(Dense(10 => 5, relu), Dense(5 => 2)))
+MyModel(Chain(Dense(10 => 5, relu), Dense(5 => 2))) # 67 parameters

 julia> model_state = Flux.state(model);
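The surrounding page goes on to save and restore this state; a sketch of that flow, assuming JLD2 is installed as on the rest of that page:

```julia
using Flux, JLD2

struct MyModel
  net
end
Flux.@layer MyModel

MyModel() = MyModel(Chain(Dense(10 => 5, relu), Dense(5 => 2)))

model = MyModel()
model_state = Flux.state(model)
jldsave("mymodel.jld2"; model_state)       # save the state, not the struct itself

model2 = MyModel()                         # rebuild the same structure, then load into it
Flux.loadmodel!(model2, JLD2.load("mymodel.jld2", "model_state"))
```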

docs/src/training/optimisers.md

Lines changed: 1 addition & 0 deletions
@@ -112,6 +112,7 @@ Similar to optimisers, Flux also defines some simple decays that can be used in
 ExpDecay
 InvDecay
 WeightDecay
+SignDecay
 ```

 ## Gradient Clipping
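As a usage sketch, a decay is typically composed with an update rule, in the same way that page composes `WeightDecay`; the `1e-4` strength and the `Descent` rule here are illustrative values, not from the commit:

```julia
using Flux

# SignDecay adds an L1 penalty to the gradient, composed with a plain descent rule.
opt = Flux.Optimise.Optimiser(SignDecay(1e-4), Descent(0.1))
```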

docs/src/training/training.md

Lines changed: 3 additions & 0 deletions
@@ -384,6 +384,9 @@ Flux.thaw!(opt_state)
 The earlier "implicit" equivalent was to pass to `gradient` an object referencing only
 part of the model, such as `Flux.params(bimodel.layers.enc)`.

+While `adjust!` and `freeze!`/`thaw!` make temporary modifications to the optimiser state,
+permanently removing some fields of a new layer type from training is usually done
+when defining the layer, by calling for example [`@layer`](@ref Flux.@layer)` NewLayer trainable=(weight,)`.

 ## Implicit or Explicit?
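A hedged sketch of that permanent alternative; `NewLayer` and its fields are hypothetical, following the `trainable=(weight,)` form quoted above:

```julia
using Flux

# Hypothetical layer: only `weight` should ever be trained, `bias` stays fixed.
struct NewLayer
  weight
  bias
end
NewLayer(n::Integer) = NewLayer(randn(Float32, n, n), zeros(Float32, n))

(m::NewLayer)(x) = m.weight * x .+ m.bias

Flux.@layer NewLayer trainable=(weight,)

opt_state = Flux.setup(Adam(), NewLayer(3))   # optimiser state is created only for weight
```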

src/Flux.jl

Lines changed: 2 additions & 2 deletions
@@ -34,11 +34,11 @@ export Chain, Dense, Embedding, Maxout, SkipConnection, Parallel, PairwiseFusion

 @compat(public, ( # mark unexported symbols as API, on Julia 1.11
   # modules
-  Losses,
+  Losses, Train,
   # layers
   Bilinear, Scale, dropout,
   # utils
-  outputsize, state,
+  outputsize, state, create_bias, @layer,
 ))

 include("optimise/Optimise.jl")
