@@ -16,8 +16,6 @@ The dropout function. If `active` is `true`,
for each input, either sets that input to `0` (with probability
`p`) or scales it by `1 / (1 - p)`. `dims` specifies the unbroadcasted dimensions,
e.g. `dims=1` applies dropout along columns and `dims=2` along rows.
- This is used as a regularisation, i.e. it reduces overfitting during training.
-
If `active` is `false`, it just returns the input `x`.
Specify `rng` for custom RNGs instead of the default RNG.
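
For a quick, self-contained sketch of the behaviour this docstring describes (a toy version under stated assumptions, not the implementation this commit documents), the elementwise zero-or-rescale rule with a `dims`-shaped mask looks roughly like this; `toy_dropout` and its keyword arguments are made up for illustration:

```julia
# Toy sketch of the documented behaviour; not Flux's actual dropout code.
using Random

function toy_dropout(rng::AbstractRNG, x::AbstractArray, p; dims = :, active = true)
    active || return x                                # `active = false` returns the input unchanged
    # The mask varies along the dimensions in `dims` and is broadcast along the rest.
    msize = dims === Colon() ? size(x) : ntuple(d -> d in dims ? size(x, d) : 1, ndims(x))
    keep = rand(rng, Float32, msize) .> p             # keep each entry with probability 1 - p
    return x .* keep ./ (1 - p)                       # surviving entries are scaled by 1 / (1 - p)
end

x = ones(Float32, 3, 4)
toy_dropout(Random.default_rng(), x, 0.5)             # independent decision per element
toy_dropout(Random.default_rng(), x, 0.5; dims = 1)   # one decision per index along dimension 1
```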
@@ -55,11 +53,16 @@ ChainRulesCore.@non_differentiable dropout_mask(::Any, ::Any, ::Any)
"""
Dropout(p; dims=:, rng = rng_from_array())
- Dropout layer. In the forward pass, applies the [`Flux.dropout`](@ref) function on the input.
+ Dropout layer.
+
+ While training, for each input, this layer either sets that input to `0` (with probability
+ `p`) or scales it by `1 / (1 - p)`. To apply dropout along certain dimension(s), specify the
+ `dims` keyword, e.g. `Dropout(p; dims = 3)` will randomly zero out entire channels on WHCN input
+ (also called 2D dropout). This is used as a regularisation, i.e. it reduces overfitting during
+ training.
- To apply dropout along certain dimension(s), specify the `dims` keyword.
- e.g. `Dropout(p; dims = 3)` will randomly zero out entire channels on WHCN input
- (also called 2D dropout).
+ In the forward pass, this layer applies the [`Flux.dropout`](@ref) function. See that function for more
+ details.
Specify `rng` to use a custom RNG instead of the default.
Custom RNGs are only supported on the CPU.
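
A usage sketch of the layer this docstring describes (assuming `Flux.trainmode!` and `Flux.testmode!` to toggle the `active` state; the arrays and probabilities are illustrative only):

```julia
using Flux

d = Dropout(0.5)                  # elementwise dropout with p = 0.5
Flux.trainmode!(d)                # make dropout active outside of a training loop
x = ones(Float32, 10)
d(x)                              # roughly half the entries are 0, the rest are scaled to 2.0

Flux.testmode!(d)                 # dropout becomes the identity at evaluation time
d(x) == x                         # true

# 2D dropout: with dims = 3, entire channels of a WHCN array are zeroed or rescaled together.
d3 = Dropout(0.4; dims = 3)
Flux.trainmode!(d3)
d3(ones(Float32, 4, 4, 3, 2))
```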