@@ -211,48 +211,66 @@ function Base.show(io::IO, l::Diagonal)
end

"""
-     Maxout(over)
+     Maxout(layers...)
+     Maxout(f, n_alts)

- The [Maxout](https://arxiv.org/abs/1302.4389) layer has a number of
- internal layers which all receive the same input. It returns the elementwise
- maximum of the internal layers' outputs.
+ This contains a number of internal layers, each of which receives the same input.
+ Its output is the elementwise maximum of the internal layers' outputs.

- Maxout over linear dense layers satisfies the universal approximation theorem.
- """
- struct Maxout{FS<:Tuple}
-   over::FS
- end
+ Instead of defining layers individually, you can provide a zero-argument function
+ which constructs them, and the number to construct.

- """
-     Maxout(f, n_alts)
+ Maxout over linear dense layers satisfies the universal approximation theorem.
+ See Goodfellow, Warde-Farley, Mirza, Courville & Bengio "Maxout Networks"
+ [1302.4389](https://arxiv.org/abs/1302.4389).

- Construct a Maxout layer over `n_alts` instances of the layer given by `f`.
- The function takes no arguments and should return some callable layer.
- Conventionally, this is a linear dense layer.
+ See also [`Parallel`](@ref) to reduce with other operators.

# Examples
+ ```
+ julia> m = Maxout(Dense([1;;], false, abs2), Dense([3;;]));

- This constructs a `Maxout` layer over 4 internal dense linear layers, each
- identical in structure (784 inputs, 128 outputs):
- ```jldoctest
- julia> insize = 784;
+ julia> m([-2 -1 0 1 2])
+ 1×5 Matrix{Int64}:
+  4  1  0  3  6

- julia> outsize = 128;
+ julia> m3 = Maxout(() -> Dense(5, 7, tanh), 3)
+ Maxout(
+   Dense(5, 7, tanh),  # 42 parameters
+   Dense(5, 7, tanh),  # 42 parameters
+   Dense(5, 7, tanh),  # 42 parameters
+ )                     # Total: 6 arrays, 126 parameters, 888 bytes.

- julia> Maxout(()->Dense(insize, outsize), 4);
+ julia> Flux.outputsize(m3, (5, 11))
+ (7, 11)
```
"""
- function Maxout(f, n_alts)
+ struct Maxout{FS<:Tuple}
+   over::FS
+   Maxout(layers...) = new{typeof(layers)}(layers)
+ end
+
+ function Maxout(f::Function, n_alts::Integer)
  over = Tuple(f() for _ in 1:n_alts)
-   return Maxout(over)
+   return Maxout(over...)
end
@functor Maxout

function (mo::Maxout)(input::AbstractArray)
-   mapreduce(f -> f(input), (acc, out) -> max.(acc, out), mo.over)
+   outs = map(lay -> lay(input), mo.over)
+   return max.(outs...)
+ end
+
+ trainable(mo::Maxout) = mo.over
+
+ function Base.show(io::IO, mo::Maxout)
+   print(io, "Maxout(")
+   _show_layers(io, mo.over)
+   print(io, ")")
end

+
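Aside from the docstring, the functional change in this hunk is in the forward pass: a pairwise `mapreduce` fold becomes map-all-branches-then-splat into a single broadcast `max`. Here is a minimal standalone sketch of why the two styles agree, in plain Julia; the `branches` tuple and input are made up, standing in for the internal layers:

```julia
# Two stand-in "layers" that share one input; names and values are
# illustrative only, not part of Flux.
branches = (a -> abs2.(a), a -> 3 .* a)
x = [-2 -1 0 1 2]

# Old style: fold over the branches, taking a broadcast max pairwise.
old_out = mapreduce(f -> f(x), (acc, out) -> max.(acc, out), branches)

# New style: run every branch, then one broadcast max over all outputs.
outs = map(f -> f(x), branches)
new_out = max.(outs...)

@assert old_out == new_out == [4 1 0 3 6]
```

This is the same computation as the new doctest above, where `Maxout(Dense([1;;], false, abs2), Dense([3;;]))` takes `max.(abs2.(x), 3 .* x)` elementwise.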
"""
    SkipConnection(layer, connection)
@@ -277,6 +295,8 @@ julia> sm = SkipConnection(m, (mx, x) -> cat(mx, x, dims=3));
julia> size(sm(x)) == (5, 5, 11, 10)
true
```
+
+ See also [`Parallel`](@ref), [`Maxout`](@ref).
"""
struct SkipConnection{T,F}
  layers::T
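The cross-reference added here records a relationship spelled out further down in the `Parallel` docstring: a `SkipConnection` is, in effect, a `Parallel` whose second path is `identity`. A hedged sketch of that equivalence; the layer, sizes, and `+` connection are illustrative, not from this commit:

```julia
using Flux

layer = Dense(4, 4, relu)           # any layer; size chosen arbitrarily
x = rand(Float32, 4, 3)

skip = SkipConnection(layer, +)     # computes connection(layer(x), x)
par  = Parallel(+, layer, identity) # computes connection(layer(x), identity(x))

@assert skip(x) == par(x)
```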
@@ -390,7 +410,7 @@
    Parallel(connection, layers...)
    Parallel(connection; name = layer, ...)

- Create a 'Parallel' layer that passes an input array to each path in
+ Create a `Parallel` layer that passes an input array to each path in
`layers`, before reducing the output with `connection`.

Called with one input `x`, this is equivalent to `reduce(connection, [l(x) for l in layers])`.
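As a quick check on the `reduce(connection, [l(x) for l in layers])` equivalence just stated, and on the multi-input `zip` behaviour named in the next hunk's context line, a hedged sketch; the sizes are arbitrary, and reading the `layers` field directly is for illustration only:

```julia
using Flux

m = Parallel(vcat, Dense(2, 3), Dense(2, 4))
x = rand(Float32, 2, 5)

# One input: every path receives `x`, outputs reduced with `connection`.
@assert m(x) == reduce(vcat, [l(x) for l in m.layers])

# Several inputs: they are zipped with the layers, one input per path.
y = rand(Float32, 2, 5)
@assert m(x, y) == vcat(m.layers[1](x), m.layers[2](y))
```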
@@ -399,6 +419,9 @@ If called with multiple inputs, they are `zip`ped with the layers, thus `Paralle
Like [`Chain`](@ref), its sub-layers may be given names using the keyword constructor.
These can be accessed by indexing: `m[1] == m[:name]` is the first layer.

+ See also [`SkipConnection`](@ref) which is `Parallel` with one `identity`,
+ and [`Maxout`](@ref) which reduces by broadcasting `max`.
+
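A small hedged sketch of the keyword constructor and the indexing rule described above; the names `alpha` and `beta` are made up:

```julia
using Flux

m = Parallel(vcat; alpha = Dense(2, 3), beta = Dense(2, 4))

# Sub-layers can be fetched by position or by their given name.
@assert m[1] == m[:alpha]
@assert m[2] == m[:beta]
```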
# Examples

```jldoctest