
Commit b08cb67

Add a quick start example, and change some headings (#2069)
* add quickstart page
* tidy welcome page
* adjust folders, and some headings
* move one page to first section
* tweaks
* say linear regression somewhere, just not in the title
* tweaks
* add emoji for API, re-order
* also mention function names
* activations intro
* move destructure to a new file, along with modules
* tweaks
* tweaks
* sciml link
* less negative spacing
* rm all negative spacing
* better Layer Helpers section
* move Custom Layers to Tutorials section
* fixup
* Apply 3 suggestions
* one more

Co-authored-by: Saransh Chopra <saransh0701@gmail.com>
1 parent 1ec32c2 commit b08cb67

File tree

17 files changed: +261 lines, −85 lines

docs/make.jl

Lines changed: 24 additions & 18 deletions

````diff
@@ -9,37 +9,43 @@ makedocs(
     sitename = "Flux",
     # strict = [:cross_references,],
     pages = [
-        "Home" => "index.md",
+        "Getting Started" => [
+            "Welcome" => "index.md",
+            "Quick Start" => "models/quickstart.md",
+            "Fitting a Line" => "models/overview.md",
+            "Gradients and Layers" => "models/basics.md",
+        ],
         "Building Models" => [
-            "Overview" => "models/overview.md",
-            "Basics" => "models/basics.md",
+            "Built-in Layers 📚" => "models/layers.md",
             "Recurrence" => "models/recurrence.md",
-            "Layer Reference" => "models/layers.md",
-            "Loss Functions" => "models/losses.md",
-            "Regularisation" => "models/regularisation.md",
-            "Custom Layers" => "models/advanced.md",
-            "NNlib.jl" => "models/nnlib.md",
-            "Activation Functions" => "models/activation.md",
+            "Activation Functions 📚" => "models/activation.md",
+            "NNlib.jl 📚 (`softmax`, `conv`, ...)" => "models/nnlib.md",
         ],
         "Handling Data" => [
-            "MLUtils.jl" => "data/mlutils.md",
-            "OneHotArrays.jl" => "data/onehot.md",
+            "MLUtils.jl 📚 (`DataLoader`, ...)" => "data/mlutils.md",
+            "OneHotArrays.jl 📚 (`onehot`, ...)" => "data/onehot.md",
         ],
         "Training Models" => [
-            "Optimisers" => "training/optimisers.md",
             "Training" => "training/training.md",
-            "Callback Helpers" => "training/callbacks.md",
-            "Zygote.jl" => "training/zygote.md",
+            "Regularisation" => "models/regularisation.md",
+            "Loss Functions 📚" => "models/losses.md",
+            "Optimisation Rules 📚" => "training/optimisers.md",  # TODO move optimiser intro up to Training
+            "Callback Helpers 📚" => "training/callbacks.md",
+            "Zygote.jl 📚 (`gradient`, ...)" => "training/zygote.md",
         ],
-        "GPU Support" => "gpu.md",
         "Model Tools" => [
+            "GPU Support" => "gpu.md",
             "Saving & Loading" => "saving.md",
-            "Shape Inference" => "outputsize.md",
-            "Weight Initialisation" => "utilities.md",
-            "Functors.jl" => "models/functors.md",
+            "Shape Inference 📚" => "outputsize.md",
+            "Weight Initialisation 📚" => "utilities.md",
+            "Flat vs. Nested 📚" => "destructure.md",
+            "Functors.jl 📚 (`fmap`, ...)" => "models/functors.md",
        ],
         "Performance Tips" => "performance.md",
         "Flux's Ecosystem" => "ecosystem.md",
+        "Tutorials" => [  # TODO, maybe
+            "Custom Layers" => "models/advanced.md",  # TODO move freezing to Training
+        ],
     ],
     format = Documenter.HTML(
         sidebar_sitename = false,
````
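
For readers unfamiliar with Documenter.jl, the nested `pages` vector above is how the sidebar is organised: each `"Section" => ...` pair maps a title either to a markdown file path or to another vector of pairs. A minimal sketch of the same pattern, with the file list abbreviated (this is not the full configuration from the commit):

```julia
using Documenter

makedocs(
    sitename = "Flux",
    pages = [
        "Getting Started" => [
            "Welcome" => "index.md",              # nested pairs become sidebar sub-entries
            "Quick Start" => "models/quickstart.md",
        ],
        "Performance Tips" => "performance.md",   # a bare path is a top-level page
    ],
    format = Documenter.HTML(sidebar_sitename = false),
)
```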

docs/src/assets/flux.css

Lines changed: 0 additions & 2 deletions

````diff
@@ -100,8 +100,6 @@ article pre {
     max-width: none;
     padding: 1em;
     border-radius: 10px 0px 0px 10px;
-    margin-left: -1em;
-    margin-right: -2em;
 }
 
 .hljs-comment {
````

docs/src/assets/oneminute.png

326 KB
New image file.

docs/src/data/mlutils.md

Lines changed: 4 additions & 6 deletions

````diff
@@ -1,25 +1,23 @@
-# Working with data using MLUtils.jl
+# Working with Data, using MLUtils.jl
 
 Flux re-exports the `DataLoader` type and utility functions for working with
 data from [MLUtils](https://github.com/JuliaML/MLUtils.jl).
 
-## DataLoader
+## `DataLoader`
 
-`DataLoader` can be used to handle iteration over mini-batches of data.
+The `DataLoader` can be used to create mini-batches of data, in the format [`train!`](@ref Flux.train!) expects.
 
 `Flux`'s website has a [dedicated tutorial](https://fluxml.ai/tutorials/2021/01/21/data-loader.html) on `DataLoader` for more information.
 
 ```@docs
 MLUtils.DataLoader
 ```
 
-## Utility functions for working with data
+## Utility Functions
 
 The utility functions are meant to be used while working with data;
 these functions help create inputs for your models or batch your dataset.
 
-Below is a non-exhaustive list of such utility functions.
-
 ```@docs
 MLUtils.unsqueeze
 MLUtils.flatten
````
docs/src/destructure.md

Lines changed: 69 additions & 0 deletions

````diff
@@ -0,0 +1,69 @@
+# [Flat vs. Nested Structures](@id man-destructure)
+
+
+A Flux model is a nested structure, with parameters stored within many layers. Sometimes you may want a flat representation of them, to interact with functions expecting just one vector. This is provided by `destructure`:
+
+```julia
+julia> model = Chain(Dense(2=>1, tanh), Dense(1=>1))
+Chain(
+  Dense(2 => 1, tanh),                  # 3 parameters
+  Dense(1 => 1),                        # 2 parameters
+)                   # Total: 4 arrays, 5 parameters, 276 bytes.
+
+julia> flat, rebuild = Flux.destructure(model)
+(Float32[0.863101, 1.2454957, 0.0, -1.6345707, 0.0], Restructure(Chain, ..., 5))
+
+julia> rebuild(zeros(5))  # same structure, new parameters
+Chain(
+  Dense(2 => 1, tanh),                  # 3 parameters  (all zero)
+  Dense(1 => 1),                        # 2 parameters  (all zero)
+)                   # Total: 4 arrays, 5 parameters, 276 bytes.
+```
+
+Both `destructure` and the `Restructure` function can be used within gradient computations. For instance, this computes the Hessian `∂²L/∂θᵢ∂θⱼ` of some loss function, with respect to all parameters of the Flux model. The resulting matrix has off-diagonal entries, which cannot really be expressed in a nested structure:
+
+```julia
+julia> x = rand(Float32, 2, 16);
+
+julia> grad = gradient(m -> sum(abs2, m(x)), model)  # nested gradient
+((layers = ((weight = Float32[10.339018 11.379145], bias = Float32[22.845667], σ = nothing), (weight = Float32[-29.565302;;], bias = Float32[-37.644184], σ = nothing)),),)
+
+julia> function loss(v::Vector)
+         m = rebuild(v)
+         y = m(x)
+         sum(abs2, y)
+       end;
+
+julia> gradient(loss, flat)  # flat gradient, same numbers
+(Float32[10.339018, 11.379145, 22.845667, -29.565302, -37.644184],)
+
+julia> Zygote.hessian(loss, flat)  # second derivative
+5×5 Matrix{Float32}:
+  -7.13131   -5.54714  -11.1393   -12.6504   -8.13492
+  -5.54714   -7.11092  -11.0208   -13.9231   -9.36316
+ -11.1393   -11.0208   -13.7126   -27.9531  -22.741
+ -12.6504   -13.9231   -27.9531    18.0875   23.03
+  -8.13492   -9.36316  -22.741     23.03     32.0
+
+julia> Flux.destructure(grad)  # acts on non-models, too
+(Float32[10.339018, 11.379145, 22.845667, -29.565302, -37.644184], Restructure(Tuple, ..., 5))
+```
+
+### All Parameters
+
+The function `destructure` now lives in [`Optimisers.jl`](https://github.com/FluxML/Optimisers.jl).
+(Be warned this package is unrelated to the `Flux.Optimisers` sub-module! The confusion is temporary.)
+
+```@docs
+Optimisers.destructure
+Optimisers.trainable
+Optimisers.isnumeric
+```
+
+### All Layers
+
+Another kind of flat view of a nested model is provided by the `modules` command. This extracts a list of all layers:
+
+```@docs
+Flux.modules
+```
````
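
To round out the new page above, here is a hedged sketch of the most common reason to want the flat vector: driving a generic optimisation loop that only understands vectors. The learning rate and step count are arbitrary choices, not from the docs:

```julia
using Flux

model = Chain(Dense(2 => 1, tanh), Dense(1 => 1))
flat, rebuild = Flux.destructure(model)

x = rand(Float32, 2, 16)
loss(v) = sum(abs2, rebuild(v)(x))     # loss as a function of the flat parameter vector

for step in 1:100
    g = gradient(loss, flat)[1]        # flat gradient, same shape as `flat`
    flat .-= 0.01f0 .* g               # plain gradient descent on the vector
end

model = rebuild(flat)                  # back to the nested structure
```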

docs/src/index.md

Lines changed: 15 additions & 10 deletions

````diff
@@ -1,26 +1,31 @@
 # Flux: The Julia Machine Learning Library
 
-Flux is a library for machine learning geared towards high-performance production pipelines. It comes "batteries-included" with many useful tools built in, but also lets you use the full power of the Julia language where you need it. We follow a few key principles:
+Flux is a library for machine learning. It comes "batteries-included" with many useful tools built in, but also lets you use the full power of the Julia language where you need it. We follow a few key principles:
 
 * **Doing the obvious thing**. Flux has relatively few explicit APIs for features like regularisation or embeddings. Instead, writing down the mathematical form will work – and be fast.
-* **Extensible by default**. Flux is written to be highly extensible and flexible while being performant. Extending Flux is as simple as using your own code as part of the model you want - it is all [high-level Julia code](https://github.com/FluxML/Flux.jl/blob/ec16a2c77dbf6ab8b92b0eecd11661be7a62feef/src/layers/recurrent.jl#L131). When in doubt, it’s well worth looking at [the source](https://github.com/FluxML/Flux.jl/). If you need something different, you can easily roll your own.
-* **Performance is key**. Flux integrates with high-performance AD tools such as [Zygote.jl](https://github.com/FluxML/Zygote.jl) for generating fast code. Flux optimizes both CPU and GPU performance. Scaling workloads easily to multiple GPUs can be done with the help of Julia's [GPU tooling](https://github.com/JuliaGPU/CUDA.jl) and projects like [DaggerFlux.jl](https://github.com/DhairyaLGandhi/DaggerFlux.jl).
-* **Play nicely with others**. Flux works well with Julia libraries from [data frames](https://github.com/JuliaComputing/JuliaDB.jl) and [images](https://github.com/JuliaImages/Images.jl) to [differential equation solvers](https://github.com/JuliaDiffEq/DifferentialEquations.jl), so you can easily build complex data processing pipelines that integrate Flux models.
+* **Extensible by default**. Flux is written to be highly extensible and flexible while being performant. Extending Flux is as simple as using your own code as part of the model you want - it is all [high-level Julia code](https://github.com/FluxML/Flux.jl/blob/ec16a2c77dbf6ab8b92b0eecd11661be7a62feef/src/layers/recurrent.jl#L131). When in doubt, it’s well worth looking at [the source](https://github.com/FluxML/Flux.jl/tree/master/src). If you need something different, you can easily roll your own.
+* **Play nicely with others**. Flux works well with Julia libraries from [images](https://github.com/JuliaImages/Images.jl) to [differential equation solvers](https://github.com/SciML/DifferentialEquations.jl), so you can easily build complex data processing pipelines that integrate Flux models.
 
 ## Installation
 
-Download [Julia 1.6](https://julialang.org/) or later, if you haven't already. You can add Flux using Julia's package manager, by typing `] add Flux` in the Julia prompt.
+Download [Julia 1.6](https://julialang.org/downloads/) or later, preferably the current stable release. You can add Flux using Julia's package manager, by typing `] add Flux` in the Julia prompt.
 
-If you have CUDA you can also run `] add CUDA` to get GPU support; see [here](gpu.md) for more details.
+This will automatically install several other packages, including [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) which supports Nvidia GPUs. To directly access some of its functionality, you may want to add `] add CUDA` too. The page on [GPU support](gpu.md) has more details.
 
-NOTE: Flux used to have a CuArrays.jl dependency until v0.10.4, replaced by CUDA.jl in v0.11.0. If you're upgrading Flux from v0.10.4 or a lower version, you may need to remove CuArrays (run `] rm CuArrays`) before you can upgrade.
+Other closely associated packages, also installed automatically, include [Zygote](https://github.com/FluxML/Zygote.jl), [Optimisers](https://github.com/FluxML/Optimisers.jl), [NNlib](https://github.com/FluxML/NNlib.jl), [Functors](https://github.com/FluxML/Functors.jl) and [MLUtils](https://github.com/JuliaML/MLUtils.jl).
 
 ## Learning Flux
 
-There are several different ways to learn Flux. If you just want to get started writing models, the [model zoo](https://github.com/FluxML/model-zoo/) gives good starting points for many common ones. This documentation provides a reference to all of Flux's APIs, as well as a from-scratch introduction to Flux's take on models and how they work. Once you understand these docs, congratulations, you also understand [Flux's source code](https://github.com/FluxML/Flux.jl), which is intended to be concise, legible and a good reference for more advanced concepts.
+The [quick start](models/quickstart.md) page trains a simple neural network.
+
+The rest of this documentation provides a from-scratch introduction to Flux's take on models and how they work, starting with [fitting a line](models/overview.md). Once you understand these docs, congratulations, you also understand [Flux's source code](https://github.com/FluxML/Flux.jl), which is intended to be concise, legible and a good reference for more advanced concepts.
+
+Sections with 📚 contain API listings. The same text is available at the Julia prompt, by typing for example `?gpu`.
+
+If you just want to get started writing models, the [model zoo](https://github.com/FluxML/model-zoo/) gives good starting points for many common ones.
 
 ## Community
 
-All Flux users are welcome to join our community on the [Julia forum](https://discourse.julialang.org/), or the [slack](https://discourse.julialang.org/t/announcing-a-julia-slack/4866) (channel #machine-learning). If you have questions or issues we'll try to help you out.
+Everyone is welcome to join our community on the [Julia discourse forum](https://discourse.julialang.org/), or the [slack chat](https://discourse.julialang.org/t/announcing-a-julia-slack/4866) (channel #machine-learning). If you have questions or issues we'll try to help you out.
 
-If you're interested in hacking on Flux, the [source code](https://github.com/FluxML/Flux.jl) is open and easy to understand -- it's all just the same Julia code you work with normally. You might be interested in our [intro issues](https://github.com/FluxML/Flux.jl/labels/good%20first%20issue) to get started or our [contributing guide](https://github.com/FluxML/Flux.jl/blob/master/CONTRIBUTING.md).
+If you're interested in hacking on Flux, the [source code](https://github.com/FluxML/Flux.jl) is open and easy to understand -- it's all just the same Julia code you work with normally. You might be interested in our [intro issues](https://github.com/FluxML/Flux.jl/labels/good%20first%20issue) to get started, or our [contributing guide](https://github.com/FluxML/Flux.jl/blob/master/CONTRIBUTING.md).
````
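
The installation paragraph above uses the Pkg REPL mode; the equivalent function calls, for anyone scripting the setup, would be roughly:

```julia
julia> using Pkg

julia> Pkg.add("Flux")   # same as typing `] add Flux` at the prompt

julia> Pkg.add("CUDA")   # optional, for direct access to Nvidia GPU functionality
```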

docs/src/models/activation.md

Lines changed: 27 additions & 2 deletions

````diff
@@ -5,6 +5,10 @@ These non-linearities used between layers of your model are exported by the [NNl
 
 Note that, unless otherwise stated, activation functions operate on scalars. To apply them to an array you can call `σ.(xs)`, `relu.(xs)` and so on. Alternatively, they can be passed to a layer like `Dense(784 => 1024, relu)` which will handle this broadcasting.
 
+Functions like [`softmax`](@ref) are sometimes described as activation functions, but not by Flux. They must see all the outputs, and hence cannot be broadcasted. See the next page for details.
+
+### Alphabetical Listing
+
 ```@docs
 celu
 elu
@@ -32,8 +36,29 @@ tanhshrink
 trelu
 ```
 
-Julia's `Base.Math` also provide `tanh`, which can be used as an activation function:
+### One More
+
+Julia's `Base.Math` also provides `tanh`, which can be used as an activation function.
+
+Note that many Flux layers will automatically replace this with [`NNlib.tanh_fast`](@ref) when called, as Base's `tanh` is slow enough to sometimes be a bottleneck.
+
+```julia
+julia> using UnicodePlots
+
+julia> lineplot(tanh, -3, 3, height=7)
+           ┌────────────────────────────────────────┐
+         1 │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⣀⠤⠔⠒⠒⠉⠉⠉⠉⠉⠉⠉⠉⠉│ tanh(x)
+           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⠀⠀⡠⠖⠋⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
+           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡇⡰⠊⠁⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
+   f(x)    │⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⡤⡯⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤⠤│
+           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⡠⠎⠁⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
+           │⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⠴⠊⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
+        -1 │⣀⣀⣀⣀⣀⣀⣀⣀⣀⡤⠤⠔⠒⠉⠁⠀⠀⠀⠀⠀⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀│
+           └────────────────────────────────────────┘
+           -3⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀3⠀
+           ⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀x⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
+```
 
 ```@docs
 tanh
-```
+```
````
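
As a small illustration of the broadcasting rule described above (the array `xs` is an arbitrary example, not from the docs):

```julia
using Flux

xs = Float32[-2, 0, 3]

relu.(xs)                    # scalar activation, broadcast over the array
σ.(xs)                       # sigmoid, also elementwise

layer = Dense(3 => 4, relu)  # a layer applies the broadcast for you
layer(xs)

softmax(xs)                  # not elementwise: needs the whole vector, and sums to 1
```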

docs/src/models/basics.md

Lines changed: 19 additions & 6 deletions

````diff
@@ -1,4 +1,4 @@
-# Flux Basics
+# [How Flux Works: Gradients and Layers](@id man-basics)
 
 ## Taking Gradients
 
@@ -211,14 +211,27 @@ m = Chain(x -> x^2, x -> x+1)
 m(5) # => 26
 ```
 
-## Layer helpers
+## Layer Helpers
 
-Flux provides a set of helpers for custom layers, which you can enable by calling
+There is still one problem with this `Affine` layer: Flux does not know to look inside it. This means that [`Flux.train!`](@ref) won't see its parameters, nor will [`gpu`](@ref) be able to move them to your GPU. These features are enabled by the `@functor` macro:
 
-```julia
+```
 Flux.@functor Affine
 ```
 
-This enables a useful extra set of functionality for our `Affine` layer, such as [collecting its parameters](../training/optimisers.md) or [moving it to the GPU](../gpu.md).
+Finally, most Flux layers make bias optional, and allow you to supply the function used for generating random weights. We can easily add these refinements to the `Affine` layer as follows:
+
+```
+function Affine((in, out)::Pair; bias=true, init=Flux.randn32)
+  W = init(out, in)
+  b = Flux.create_bias(W, bias, out)
+  Affine(W, b)
+end
+
+Affine(3 => 1, bias=false, init=ones) |> gpu
+```
 
-For some more helpful tricks, including parameter freezing, please checkout the [advanced usage guide](advanced.md).
+```@docs
+Functors.@functor
+Flux.create_bias
+```
````
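
Putting the pieces of the `Affine` example together, a self-contained sketch might look like the following. The struct and call method come from earlier in basics.md, which this commit does not touch, so treat the exact definitions as an assumption:

```julia
using Flux

struct Affine
  W
  b
end

function Affine((in, out)::Pair; bias=true, init=Flux.randn32)
  W = init(out, in)
  b = Flux.create_bias(W, bias, out)
  Affine(W, b)
end

(m::Affine)(x) = m.W * x .+ m.b   # forward pass

Flux.@functor Affine              # lets train!, gpu and params look inside

layer = Affine(3 => 1, bias=false, init=ones)
layer(rand(Float32, 3))           # 1-element output, no bias added
```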

docs/src/models/functors.md

Lines changed: 1 addition & 1 deletion

````diff
@@ -4,7 +4,7 @@ Flux models are deeply nested structures, and [Functors.jl](https://github.com/F
 
 New layers should be annotated using the `Functors.@functor` macro. This will enable [`params`](@ref Flux.params) to see the parameters inside, and [`gpu`](@ref) to move them to the GPU.
 
-`Functors.jl` has its own [notes on basic usage](https://fluxml.ai/Functors.jl/stable/#Basic-Usage-and-Implementation) for more details. Additionally, the [Advanced Model Building and Customisation](@ref Advanced-Model-Building-and-Customisation) page covers the use cases of `Functors` in greater details.
+`Functors.jl` has its own [notes on basic usage](https://fluxml.ai/Functors.jl/stable/#Basic-Usage-and-Implementation) for more details. Additionally, the [Advanced Model Building and Customisation](../models/advanced.md) page covers the use cases of `Functors` in greater details.
 
 ```@docs
 Functors.@functor
````
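
For context, a hedged sketch of what `@functor` enables; `MyLayer` is a hypothetical struct for illustration, not part of Flux:

```julia
using Flux, Functors

struct MyLayer
  W
  b
end
Functors.@functor MyLayer   # mark the fields as parameter-carrying

m = MyLayer(rand(Float32, 3, 2), zeros(Float32, 3))

Flux.params(m)   # now collects W and b
fmap(size, m)    # fmap walks the structure: MyLayer((3, 2), (3,))
gpu(m)           # gpu/cpu can now move the arrays (a no-op without a GPU)
```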

docs/src/models/layers.md

Lines changed: 0 additions & 9 deletions

````diff
@@ -86,12 +86,3 @@ Many normalisation layers behave differently under training and inference (testi
 Flux.testmode!
 trainmode!
 ```
-
-
-## Listing All Layers
-
-The `modules` command uses Functors to extract a flat list of all layers:
-
-```@docs
-Flux.modules
-```
````
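
The `Flux.modules` listing removed here now lives on the new destructure page. As a hedged reminder of what it does (the penalty below is an illustrative use, not from the docs):

```julia
using Flux

m = Chain(Dense(2 => 3, relu), Dense(3 => 1))

# Flux.modules returns a flat list of every container and layer in the model.
for layer in Flux.modules(m)
    println(summary(layer))
end

# One common use: a per-layer penalty, here applied only to Dense layers.
penalty(model) = sum(sum(abs2, l.weight) for l in Flux.modules(model) if l isa Dense)
penalty(m)
```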
