1794: Tidy up `Maxout` r=mcabbott a=mcabbott
`Maxout` was introduced in #698. This PR:
* adds pretty printing
* changes the explicit signature to `Maxout(layer, layer, layer)` instead of taking a tuple, to match other layers (the tuple form is deprecated)
* adds more examples to the docstring, and combines the two
* stops using `mapreduce`. That was a performance choice at the time, discussed in #647 (comment), but with Zygote it is much slower.
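
To make the last point concrete, here is a minimal sketch of two ways to take the elementwise maximum over several layer outputs. The names `maxout_mapreduce` and `maxout_broadcast` are hypothetical and the layers are plain closures standing in for `Dense`; this is not Flux's source, and the second form is only one possible alternative to `mapreduce`, not necessarily the one this PR adopts.

```julia
# Hypothetical stand-ins for a Maxout forward pass; not Flux's actual code.

# Pairwise reduction: apply each layer, then fold with a broadcast max.
maxout_mapreduce(layers, x) = mapreduce(f -> f(x), (a, b) -> max.(a, b), layers)

# Single broadcast: apply every layer first, then splat the outputs into one max.
maxout_broadcast(layers, x) = max.(map(f -> f(x), layers)...)

# Three toy "layers" (assumed for illustration only).
layers = (x -> 2 .* x, x -> x .+ 1, x -> -x)
x = [0.0, 1.0, 2.0]

# Both formulations compute the same elementwise maximum.
@assert maxout_mapreduce(layers, x) == maxout_broadcast(layers, x)
```

Both functions return the same values, so any speed difference under Zygote comes from how each form is differentiated rather than from what it computes.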
Before:
```
julia> using Flux
julia> m3 = Maxout(() -> Dense(5, 7, tanh), 3)
Maxout{Tuple{Dense{typeof(tanh), Matrix{Float32}, Vector{Float32}}, Dense{typeof(tanh), Matrix{Float32}, Vector{Float32}}, Dense{typeof(tanh), Matrix{Float32}, Vector{Float32}}}}((Dense(5, 7, tanh), Dense(5, 7, tanh), Dense(5, 7, tanh)))
julia> x = rand(Float32, 5, 11);
julia> @btime gradient(sum∘m3, $x);
min 112.792 μs, mean 123.774 μs (930 allocations, 49.09 KiB. GC mean 3.71%)
```
After:
```
julia> m3 = Maxout(() -> Dense(5, 7, tanh), 3)
Maxout(
  Dense(5, 7, tanh), # 42 parameters
  Dense(5, 7, tanh), # 42 parameters
  Dense(5, 7, tanh), # 42 parameters
) # Total: 6 arrays, 126 parameters, 888 bytes.
julia> x = rand(Float32, 5, 11);
julia> @btime gradient(sum∘m3, $x);
min 34.541 μs, mean 38.448 μs (493 allocations, 32.48 KiB. GC mean 6.63%)
```
Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>