* remove train from quickstart example
* fixes & suggestions
* better bullet points
* dump train! and gpu from the readme too
* remove a few comments
* rm mention of Zygote
* maybe we should have a much simpler readme example
* tweaks
* no more cbrt, no more abs2
* remove controversial println code, and make it shorter
* fix some fences
* maybe this example should run on the GPU, since it easily can, even though this is slower
* let's replace explicit printing with showprogress macro, it's pretty and doesn't waste lines
* add graph of the loss, since we log it? also move to a folder.
* one more .. perhaps
**README.md** (12 additions & 12 deletions)
````diff
@@ -18,23 +18,23 @@
 Flux is an elegant approach to machine learning. It's a 100% pure-Julia stack, and provides lightweight abstractions on top of Julia's native GPU and AD support. Flux makes the easy things easy while remaining fully hackable.
 
-Works best with [Julia 1.8](https://julialang.org/downloads/) or later. Here's a simple example to try it out:
+Works best with [Julia 1.8](https://julialang.org/downloads/) or later. Here's a very short example to try it out:
 
 ```julia
-using Flux  # should install everything for you, including CUDA
+using Flux, Plots
+data = [([x], 2x-x^3) for x in -2:0.1f0:2]
 
-x = hcat(digits.(0:3, base=2, pad=2)...) |> gpu  # let's solve the XOR problem!
-y = Flux.onehotbatch(xor.(eachrow(x)...), 0:1) |> gpu
-data = ((Float32.(x), y) for _ in 1:100)  # an iterator making Tuples
+model = Chain(Dense(1 => 23, tanh), Dense(23 => 1, bias=false), only)
 
-model = Chain(Dense(2 => 3, sigmoid), BatchNorm(3), Dense(3 => 2)) |> gpu
-optim = Adam(0.1, (0.7, 0.95))
-mloss(x, y) = Flux.logitcrossentropy(model(x), y)  # closes over model
-Flux.train!(mloss, Flux.params(model), data, optim)  # updates model & optim
-
-all((softmax(model(x)) .> 0.5) .== y)  # usually 100% accuracy.
+plot(x -> 2x-x^3, -2, 2, legend=false)
+scatter!(-2:0.1:2, [model([x]) for x in -2:0.1:2])
 ```
 
-See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples. Ask questions on the [Julia discourse](https://discourse.julialang.org/) or [slack](https://discourse.julialang.org/t/announcing-a-julia-slack/4866).
+The [quickstart page](https://fluxml.ai/Flux.jl/stable/models/quickstart/) has a longer example. See the [documentation](https://fluxml.github.io/Flux.jl/) for details, or the [model zoo](https://github.com/FluxML/model-zoo/) for examples. Ask questions on the [Julia discourse](https://discourse.julialang.org/) or [slack](https://discourse.julialang.org/t/announcing-a-julia-slack/4866).
 
 If you use Flux in your research, please [cite](CITATION.bib) our work.
````
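The shortened README example above defines `data` and `model` but, by design, no longer shows a training step. For context only, here is a minimal sketch of how that model could be fitted, assuming Flux's explicit-style `setup`/`gradient`/`update!` API; the optimiser choice and epoch count are this sketch's own, not part of the diff.

```julia
using Flux

data  = [([x], 2x-x^3) for x in -2:0.1f0:2]                           # as in the new README example
model = Chain(Dense(1 => 23, tanh), Dense(23 => 1, bias=false), only)

opt_state = Flux.setup(Adam(), model)            # Adam keeps its momenta inside this state object

for epoch in 1:1000                              # arbitrary length, enough for this toy curve
    for (x, y) in data
        grads = Flux.gradient(model) do m
            (m(x) - y)^2                         # squared error for a single data point
        end
        Flux.update!(opt_state, model, grads[1]) # mutates both model and opt_state
    end
end
```

After such a loop, the `scatter!` call in the example should follow the target curve `2x-x^3` rather than an untrained model's output.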
A second hunk, from the quickstart documentation page, rewrites the notes that follow its longer example:

```diff
 This XOR ("exclusive or") problem is a variant of the famous one which drove Minsky and Papert to invent deep neural networks in 1969. For small values of "deep" -- this has one hidden layer, while earlier perceptrons had none. (What they call a hidden layer, Flux calls the output of the first layer, `model[1](noisy)`.)
 
 Since then things have developed a little.
 
-## Features of Note
+## Features to Note
 
 Some things to notice in this example are:
 
-* The batch dimension of data is always the last one. Thus a `2×1000 Matrix` is a thousand observations, each a column of length 2.
-
-* The `model` can be called like a function, `y = model(x)`. It encapsulates the parameters (and state).
+* The batch dimension of data is always the last one. Thus a `2×1000 Matrix` is a thousand observations, each a column of length 2. Flux defaults to `Float32`, but most of Julia to `Float64`.
 
-* But the model does not contain the loss function, nor the optimisation rule. Instead the [`Adam()`](@ref Flux.Adam) object stores between iterations the momenta it needs.
+* The `model` can be called like a function, `y = model(x)`. Each layer like [`Dense`](@ref Flux.Dense) is an ordinary `struct`, which encapsulates some arrays of parameters (and possibly other state, as for [`BatchNorm`](@ref Flux.BatchNorm)).
 
-* The function [`train!`](@ref Flux.train!) likes data as an iterator generating `Tuple`s, here produced by [`DataLoader`](@ref). This mutates both the `model` and the optimiser state inside `opt`.
+* But the model does not contain the loss function, nor the optimisation rule. The [`Adam`](@ref Flux.Adam) object stores between iterations the momenta it needs. And [`Flux.crossentropy`](@ref Flux.Losses.crossentropy) is an ordinary function.
 
-There are other ways to train Flux models, for more control than `train!` provides:
+* The `do` block creates an anonymous function, as the first argument of `gradient`. Anything executed within this is differentiated.
 
-* Within Flux, you can easily write a training loop, calling [`gradient`](@ref) and [`update!`](@ref Flux.update!).
+Instead of calling [`gradient`](@ref Zygote.gradient) and [`update!`](@ref Flux.update!) separately, there is a convenience function [`train!`](@ref Flux.train!). If we didn't want anything extra (like logging the loss), we could replace the training loop with the following:
 
-* For a lower-level way, see the package [Optimisers.jl](https://github.com/FluxML/Optimisers.jl).
-
-* For higher-level ways, see [FluxTraining.jl](https://github.com/FluxML/FluxTraining.jl) and [FastAI.jl](https://github.com/FluxML/FastAI.jl).
```
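The final added paragraph promises a `train!` version of the training loop, but this excerpt cuts off before that code block. Here is a hedged sketch of what such a call looks like with the explicit-style API (Flux 0.13.9 or later); the `noisy`, `truth`, `model`, `loader` and `optim` definitions are stand-ins for the quickstart page's own objects, which the diff does not show.

```julia
using Flux

# Stand-ins for the quickstart page's objects (assumptions, not shown in this diff):
noisy  = rand(Float32, 2, 1000)                 # batch dimension last: 1000 columns of length 2
truth  = [xor(col[1] > 0.5, col[2] > 0.5) for col in eachcol(noisy)]
model  = Chain(Dense(2 => 3, tanh), BatchNorm(3), Dense(3 => 2), softmax)
target = Flux.onehotbatch(truth, [true, false])
loader = Flux.DataLoader((noisy, target), batchsize=64, shuffle=true)
optim  = Flux.setup(Adam(0.01), model)          # Adam's momenta live in this state object

# train! differentiates the do block for each batch and updates model and optim in place:
for epoch in 1:1_000
    Flux.train!(model, loader, optim) do m, x, y
        Flux.crossentropy(m(x), y)
    end
end
```

Compared with an explicit `gradient`/`update!` loop, this saves two lines per step but gives up the chance to log the loss along the way, which is the trade-off the added paragraph describes.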