1462: Add Parallel layer r=DhairyaLGandhi a=darsnack
Since #1289 stalled, I have added an implementation of `Parallel` with some of the changes we discussed during ML calls. This version excludes most of the structural layers in #1289 like `Join`, `Split`, and `Nop`. I also added the ability for the user to specify the reduction operator. If it is acceptable, I would like to remap `SkipConnection` to `Parallel` (not a deprecation exactly).
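To make the proposed API concrete, here is a rough sketch of how such a `Parallel` layer could be used. This is illustrative only: the constructor name and argument order are assumptions based on the description above, not code taken from this PR.
```julia
using Flux

# Assumed usage: `Parallel(reduction, branches...)` applies every branch to the
# input and combines the branch outputs with the user-specified reduction.

# Inception-style block: three convolutional branches, concatenated along the
# channel dimension.
block = Parallel(
  (ys...) -> cat(ys...; dims = 3),
  Conv((1, 1), 16 => 8),
  Conv((3, 3), 16 => 8, pad = 1),
  Conv((5, 5), 16 => 8, pad = 2),
)
size(block(rand(Float32, 32, 32, 16, 1)))   # (32, 32, 24, 1)

# `SkipConnection(layer, +)` can be re-expressed with the same building block:
layer = Dense(10, 10, relu)
x = rand(Float32, 10)
Parallel(+, identity, layer)(x) ≈ SkipConnection(layer, +)(x)   # true
```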
The reason for submitting this PR now is that I am creating pre-trained weights for the networks in FluxML/Metalhead.jl#70, and there is a lot of code there that can be replaced with a `Parallel`. So, I'd like to have `Parallel` in Flux before continuing with training to make the process easier.
### PR Checklist
- [x] Tests are added
- [x] Entry in NEWS.md
- [x] Documentation, if applicable
- [x] Final review from @DhairyaLGandhi (for API changes).
cc @CarloLucibello
Co-authored-by: Kyle Daruwalla <daruwalla@wisc.edu>
Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>
**`NEWS.md`** (1 addition, 0 deletions)
* Removed kwarg only constructors for [`convolutional layers`](https://github.com/FluxML/Flux.jl/pull/1379).
* Add [sparse initialization](https://github.com/FluxML/Flux.jl/pull/1454) as described in [Deep learning via Hessian-free optimization](https://dl.acm.org/doi/abs/10.5555/3104322.3104416).
* Moved GPU CI to use buildkite instead of GitLab
* New [`Parallel` layer](https://github.com/FluxML/Flux.jl/pull/1462) adds inception module-like building blocks.
* Other new features and bug fixes (see GitHub releases page)
**`docs/src/models/advanced.md`** (133 additions, 0 deletions)

...by simply deleting it from `ps`:
```julia
ps = params(m)
delete!(ps, m[2].b)
```
## Custom multiple input or output layer

Sometimes a model needs to receive several separate inputs at once or produce several separate outputs at once. In other words, there are multiple paths within this high-level layer, each processing a different input or producing a different output. A simple example of this in the machine learning literature is the [inception module](https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf).

Naively, we could have a struct that stores the weights along each path and implements the joining/splitting in the forward pass function. But that would mean a new struct any time the operations along each path change. Instead, this guide will show you how to construct a high-level layer (like [`Chain`](@ref)) that is made of multiple sub-layers for each path.

### Multiple inputs: a custom `Join` layer

Our custom `Join` layer will accept multiple inputs at once, pass each input through a separate path, then combine the results together. Note that this layer can already be constructed using [`Parallel`](@ref), but we will first walk through how to do this manually.

We start by defining a new struct, `Join`, that stores the different paths and a combine operation as its fields.
```julia
using Flux
using CUDA

# custom join layer
struct Join{T, F}
  combine::F
  paths::T
end

# allow Join(op, m1, m2, ...) as a constructor
Join(combine, paths...) = Join(combine, paths)
```
Notice that we parameterized the type of the `paths` field. This is necessary for fast Julia code; in general, `T` might be a `Tuple` or `Vector`, but we don't need to pay attention to what it specifically is. The same goes for the `combine` field.
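As a quick, hypothetical check (not part of the original guide) of what the parameterization buys us: with the definitions above, the field types of a constructed `Join` are fully concrete, so the compiler can specialize on them.
```julia
julia> j = Join(vcat, Dense(2, 3), Dense(3, 1));

julia> isconcretetype(fieldtype(typeof(j), :paths))    # a concrete Tuple type, not `Any`
true

julia> isconcretetype(fieldtype(typeof(j), :combine))  # typeof(vcat), also concrete
true
```
If the fields were left untyped, both field types would be `Any`, and every call through the layer would have to be dynamically dispatched.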
The next step is to use [`Flux.@functor`](@ref) to make our struct behave like a Flux layer. This is important so that calling `params` on a `Join` returns the underlying weight arrays on each path.
```julia
Flux.@functor Join
```
Finally, we define the forward pass. For `Join`, this means applying each `path` in `paths` to each input array, then using `combine` to merge the results.
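The forward-pass definition itself is elided in this view of the diff; a minimal sketch that matches the description above (the exact code in the documentation may differ) is:
```julia
# apply the i-th path to the i-th input, then combine the results
(m::Join)(xs::Tuple) = m.combine(map((f, x) -> f(x), m.paths, xs)...)
(m::Join)(xs...) = m(xs)
```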
Lastly, we can test our new layer. Thanks to the proper abstractions in Julia, our layer works on GPU arrays out of the box!
```julia
model = Chain(
  Join(vcat,
    Chain(
      Dense(1, 5),
      Dense(5, 1)
    ),
    Dense(1, 2),
    Dense(1, 1)
  ),
  Dense(4, 1)
) |> gpu

xs = map(gpu, (rand(1), rand(1), rand(1)))

model(xs)
# returns a single float vector with one value
```

#### Using `Parallel`

Flux already provides [`Parallel`](@ref), which offers the same functionality. In this case, `Join` is just syntactic sugar for `Parallel`.
```julia
Join(combine, paths) = Parallel(combine, paths)
Join(combine, paths...) = Join(combine, paths)

# use vararg/tuple version of Parallel forward pass
model = Chain(
  Join(vcat,
    Chain(
      Dense(1, 5),
      Dense(5, 1)
    ),
    Dense(1, 2),
    Dense(1, 1)
  ),
  Dense(4, 1)
) |> gpu

xs = map(gpu, (rand(1), rand(1), rand(1)))

model(xs)
# returns a single float vector with one value
```
### Multiple outputs: a custom `Split` layer
Our custom `Split` layer will accept a single input, then pass that input through multiple separate paths to produce multiple outputs.

We start by following the same steps as the `Join` layer: define a struct, use [`Flux.@functor`](@ref), and define the forward pass.
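The rest of the `Split` walkthrough is cut off in this view; a rough sketch of those same steps (the actual documentation may differ in details) could look like:
```julia
using Flux

# custom split layer
struct Split{T}
  paths::T
end

Split(paths...) = Split(paths)

Flux.@functor Split

# apply each path to the same input, returning a tuple of outputs
(m::Split)(x::AbstractArray) = map(f -> f(x), m.paths)

# example: one shared trunk feeding two output heads
model = Chain(
  Dense(10, 5),
  Split(Dense(5, 1, tanh), Dense(5, 2))
)

y1, y2 = model(rand(Float32, 10))  # a length-1 and a length-2 output
```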