@@ -569,6 +569,9 @@ for more general scheduling techniques.
 
 # Examples
 
+`InvDecay` is typically composed with other optimizers
+as the last transformation of the gradient:
+
 ```julia
 # Inverse decay of the learning rate
 # with starting value 0.001 and decay coefficient 0.01.
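
For reference, a minimal sketch of the inverse-time schedule `InvDecay` applies, assuming the usual scaling of the gradient by `1 / (1 + γ·n)` at step `n`; `inv_decayed` is a hypothetical helper for illustration, not part of Flux:

```julia
# Effective learning rate under inverse time decay: at step n the
# gradient is scaled by 1 / (1 + γ * n), so the effective rate is
# η / (1 + γ * n). Hypothetical helper, not Flux's code.
inv_decayed(η, γ, n) = η / (1 + γ * n)

inv_decayed(0.001, 0.01, 100)  # ⇒ 0.0005: halved after 100 steps
```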
@@ -604,12 +607,16 @@ a minimum of `clip`.
                 two decay operations.
 - `clip`: Minimum value of learning rate.
 
+
+See also the [Scheduling Optimisers](@ref) section of the docs
+for more general scheduling techniques.
+
 # Examples
-To apply exponential decay to an optimiser:
-```julia
-Optimiser(ExpDecay(..), Opt(..))
 
-opt = Optimiser(ExpDecay(), ADAM())
+`ExpDecay` is typically composed with other optimizers
+as the last transformation of the gradient:
+```julia
+opt = Optimiser(ADAM(), ExpDecay())
 ```
 """
 mutable struct ExpDecay <: AbstractOptimiser
@@ -620,7 +627,8 @@ mutable struct ExpDecay <: AbstractOptimiser
   current::IdDict
 end
 
-ExpDecay(opt = 0.001, decay = 0.1, decay_step = 1000, clip = 1e-4) = ExpDecay(opt, decay, decay_step, clip, IdDict())
+ExpDecay(opt = 0.001, decay = 0.1, decay_step = 1000, clip = 1e-4) =
+  ExpDecay(opt, decay, decay_step, clip, IdDict())
 
 function apply!(o::ExpDecay, x, Δ)
   η, s, decay = o.eta, o.step, o.decay
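
For reference, a minimal sketch of the stepwise schedule `ExpDecay` implements: the rate is multiplied by `decay` every `decay_step` steps and floored at `clip`. This is a simplification of the stateful `apply!` method above; `decayed_eta` is a hypothetical helper for illustration, not part of Flux:

```julia
# Effective learning rate after t steps of exponential decay:
# every `decay_step` steps η is multiplied by `decay`, but never
# drops below `clip`. Hypothetical helper, not Flux's code.
function decayed_eta(η, decay, decay_step, clip, t)
  n = t ÷ decay_step            # number of decay events after t steps
  max(η * decay^n, clip)
end

decayed_eta(0.001, 0.1, 1000, 1e-4, 2500)  # ⇒ 1.0e-4 (clipped after two decays)
```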