`docs/src/models/advanced.md` (18 additions, 14 deletions)
@@ -36,34 +36,38 @@ For an intro to Flux and automatic differentiation, see this [tutorial](https://
 
 Taking reference from our example `Affine` layer from the [basics](@ref man-basics).
 
-By default all the fields in the `Affine` type are collected as its parameters, however, in some cases it may be desired to hold other metadata in our "layers" that may not be needed for training, and are hence supposed to be ignored while the parameters are collected. With Flux, it is possible to mark the fields of our layers that are trainable in two ways.
-
-The first way of achieving this is through overloading the `trainable` function.
+By default all the fields in the `Affine` type are collected as its parameters, however, in some cases it may be desired to hold other metadata in our "layers" that may not be needed for training, and are hence supposed to be ignored while the parameters are collected. With Flux, the way to mark some fields of our layer as trainable is through overloading the `trainable` function:
-Only the fields returned by `trainable` will be collected as trainable parameters of the layer when calling `Flux.params`.
+Only the fields returned by `trainable` will be collected as trainable parameters of the layer when calling `Flux.params`, and only these fields will be seen by `Flux.setup` and `Flux.update!` for training. But all fields will be seen by `gpu` and similar functions, for example:
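The `trainable` overload itself, and the `gpu` example that the added line above points to, sit in context lines that were lost in extraction. As a hedged sketch of the pattern being described, assuming the two-field `Affine` struct from the basics page and the tuple-returning `trainable` convention of the `Flux.params` era:

```julia
using Flux

struct Affine
  W
  b
end

# Hypothetical convenience constructor, not part of the diff.
Affine(in::Integer, out::Integer) = Affine(randn(out, in), zeros(out))

(m::Affine)(x) = m.W * x .+ m.b

Flux.@functor Affine  # let Flux traverse all fields (gpu, cpu, etc.)

# Only the returned fields are collected as trainable parameters;
# here `b` is excluded from `Flux.params` and from training.
Flux.trainable(a::Affine) = (a.W,)
```

With this overload, `Flux.params(Affine(3, 2))` contains only the weight matrix, while `gpu` still moves both `W` and `b`.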
-Another way of achieving this is through the `@functor` macro directly. Here, we can mark the fields we are interested in by grouping them in the second argument:
+Note that there is no need to overload `trainable` to hide fields which do not contain trainable parameters. (For example, activation functions, or Boolean flags.) These are always ignored by `params` and by training:
 
-```julia
-Flux.@functor Affine (W,)
+```julia-repl
+julia> Flux.params(Affine(true, [10, 11, 12.0]))
+Params([])
 ```
 
-However, doing this requires the `struct` to have a corresponding constructor that accepts those parameters.
+It is also possible to further restrict what fields are seen by writing `@functor Affine (W,)`. However, this is not recommended. This requires the `struct` to have a corresponding constructor that accepts only `W` as an argument, and the ignored fields will not be seen by functions like `gpu` (which is usually undesired).
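To make that constructor requirement concrete, here is a minimal sketch; the one-argument constructor and the zero bias it fills in are illustrative assumptions, not part of the diff:

```julia
using Flux

struct Affine
  W
  b
end

# Restricting @functor to `(W,)` means Flux reconstructs the layer
# from `W` alone, so a matching one-argument constructor must exist.
# Illustrative assumption: rebuild with a zero bias of matching size.
Affine(W) = Affine(W, zeros(size(W, 1)))

Flux.@functor Affine (W,)
```

With this declaration `b` is invisible not only to training but also to `gpu` and similar traversals, which is why the diff recommends the `trainable` overload instead.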