
Commit b67a3a9

Create a new 'tutorials' section
1 parent: df06a6d

2 files changed: 27 additions, 109 deletions

docs/make.jl

Lines changed: 3 additions & 1 deletion
@@ -14,7 +14,9 @@ makedocs(
             "Quick Start" => "getting_started/quickstart.md",
             "Fitting a Line" => "getting_started/overview.md",
             "Gradients and Layers" => "getting_started/basics.md",
-            "Linear Regression" => "getting_started/linear_regression.md"
+        ],
+        "Tutorials" => [
+            "Linear Regression" => "tutorials/linear_regression.md",
         ],
         "Building Models" => [
             "Built-in Layers 📚" => "models/layers.md",

docs/src/getting_started/linear_regression.md renamed to docs/src/tutorials/linear_regression.md

Lines changed: 24 additions & 108 deletions
@@ -11,7 +11,8 @@ Under the hood, Flux uses a technique called automatic differentiation to take g

 The following page contains a step-by-step walkthrough of the linear regression algorithm in `Julia` using `Flux`! We will start by creating a simple linear regression model for dummy data and then move on to a real dataset. The first part involves writing some parts of the model on our own; these will later be replaced by `Flux`.

-## A simple linear regression model
+---
+
 Let us start by building a simple linear regression model. This model will be trained on data points of the form `(x₁, y₁), (x₂, y₂), ... , (xₙ, yₙ)`. In the real world, these `x`s can have multiple features, and the `y`s denote a label. In our example, each `x` has a single feature; hence, our data will have `n` data points, each point mapping a single feature to a single label.

 Importing the required `Julia` packages -
@@ -20,8 +21,9 @@ Importing the required `Julia` packages -
 julia> using Flux, Plots
 ```

-### Generating a dataset
-The data usually comes from the real world, which we will be exploring in the last part of this guide, but we don't want to jump straight to the relatively harder part. Here we will generate the `x`s of our data points and map them to the respective `y`s using a simple function. Remember, here each `x` is equivalent to a feature, and each `y` is the corresponding label. Combining all the `x`s and `y`s would create the complete dataset.
+## Generating a dataset
+
+The data usually comes from the real world, which we will explore in the last part of this tutorial, but we don't want to jump straight to the relatively harder part. Here we will generate the `x`s of our data points and map them to the respective `y`s using a simple function. Remember, here each `x` is equivalent to a feature, and each `y` is the corresponding label. Combining all the `x`s and `y`s gives the complete dataset.

 ```jldoctest linear_regression_simple
 julia> x = hcat(collect(Float32, -3:0.1:3)...)
@@ -60,7 +62,7 @@ The data looks random enough now! The `x` and `y` values are still somewhat corr

 We can now proceed to build a model for our dataset!

-### Building a model
+## Building a model

 A linear regression model is defined mathematically as -
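The equation this context line refers to sits outside the hunk; given the model definition `custom_model(W, b, x) = @. W*x + b` in the code removed at the bottom of this diff, it is presumably the standard linear form:

```math
\hat{y} = W x + b
```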
@@ -149,7 +151,7 @@ julia> flux_loss(flux_model, x, y)
 22.74856f0
 ```

-Everything works as before! It almost feels like `Flux` provides us with smart wrappers for the functions we could have written on our own. Now, as the last step of this section, let's see how different the `flux_model` is from our custom `model`. A good way to go about this would be to fix the parameters of both models to be the same. Let's change the parameters of our custom `model` to match that of the `flux_model` -
+Everything works as before! It almost feels like `Flux` provides us with smart wrappers for the functions we could have written on our own. Now, as the last step of this section, let's see how different the `flux_model` is from our `custom_model`. A good way to go about this would be to fix the parameters of both models to be the same. Let's change the parameters of our `custom_model` to match those of the `flux_model` -


 ```jldoctest linear_regression_simple; filter = r"[+-]?([0-9]*[.])?[0-9]+(f[+-]*[0-9])?"
@@ -165,7 +167,9 @@ julia> custom_loss(W, b, x, y), flux_loss(flux_model, x, y)
 (22.74856f0, 22.74856f0)
 ```

-The losses are identical! This means that our `model` and the `flux_model` are identical on some level, and the loss functions are completely identical! The difference in models would be that `Flux`'s [`Dense`](@ref) layer supports many other arguments that can be used to customize the layer further. But, for this guide, let us stick to our simple custom `model`.
+The losses are identical! This means that our `custom_model` and the `flux_model` are identical on some level, and the loss functions are completely identical! The difference between the models is that `Flux`'s [`Dense`](@ref) layer supports many other arguments that can be used to customize the layer further. But, for this tutorial, let us stick to our simple `custom_model`.
+
+## Training the model

 Let's train our model using the classic Gradient Descent algorithm. According to the gradient descent algorithm, the weights and biases should be iteratively updated using the following mathematical equations -
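The update equations referenced in the last context line are outside this hunk. For reference, the classic gradient descent rule, matching the `@. W = W - 0.1 * dLdW` and `@. b = b - 0.1 * dLdb` updates in the copy-pastable code removed at the bottom of this diff, is:

```math
W = W - \eta \frac{\partial L}{\partial W}, \qquad
b = b - \eta \frac{\partial L}{\partial b}
```

where `η` is the learning rate (`0.1` in the removed code).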
@@ -237,8 +241,9 @@ There was a significant reduction in loss, and the parameters were updated!

 We can train the model even more or tweak the hyperparameters to achieve the desired result faster, but let's stop here. We trained our model for 42 epochs, and the loss went down from `22.74856` to `7.6680417`. Time for some visualization!

-### Results
-The main objective of this guide was to fit a line to our dataset using the linear regression algorithm. The training procedure went well, and the loss went down significantly! Let's see what the fitted line looks like. Remember, `Wx + b` is nothing more than a line's equation, with `slope = W[1]` and `y-intercept = b[1]` (indexing at `1` as `W` and `b` are iterable).
+## Results
+
+The main objective of this tutorial was to fit a line to our dataset using the linear regression algorithm. The training procedure went well, and the loss went down significantly! Let's see what the fitted line looks like. Remember, `Wx + b` is nothing more than a line's equation, with `slope = W[1]` and `y-intercept = b[1]` (indexing at `1` as `W` and `b` are iterable).

 Plotting the line and the data points using `Plots.jl` -
 ```jldoctest linear_regression_simple
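The plotting calls themselves lie outside this hunk. For reference, this is how the copy-pastable code removed at the bottom of this diff overlays the fitted line on the data (61 points, with `W` and `b` as trained above):

```julia
using Plots

# Scatter the data, then draw the learned line y = W[1]*x + b[1] over it.
plot(reshape(x, (61, 1)), reshape(y, (61, 1)), lw = 3, seriestype = :scatter,
     label = "", title = "Simple Linear Regression", xlabel = "x", ylabel = "y")
plot!(x -> b[1] + W[1] * x, -3, 3, label = "Custom model", lw = 2)
```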
@@ -252,14 +257,15 @@ julia> plot!((x) -> b[1] + W[1] * x, -3, 3, label="Custom model", lw=2);

 The line fits well! There is room for improvement, but we leave that up to you! You can play with the optimisers, the number of epochs, the learning rate, etc. to improve the fit and reduce the loss!

-## Linear regression model on a real dataset
+### Linear regression model on a real dataset
+
 We now move on to a relatively complex linear regression model. Here we will use a real dataset from [`MLDatasets.jl`](https://github.com/JuliaML/MLDatasets.jl), which will not confine our data points to have only one feature. Let's start by importing the required packages -

 ```jldoctest linear_regression_complex
 julia> using Flux, Statistics, MLDatasets, DataFrames
 ```

-### Data
+## Gathering real data
 Let's start by initializing our dataset. We will be using the [`BostonHousing`](https://juliaml.github.io/MLDatasets.jl/stable/datasets/misc/#MLDatasets.BostonHousing) dataset consisting of `506` data points. Each of these data points has `13` features and a corresponding label, the house's price. The `x`s are still mapped to a single `y`, but now a single `x` data point has 13 features.

 ```jldoctest linear_regression_complex
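For reference, the copy-pastable code removed at the bottom of this diff loads and splits this dataset as follows (a 400/106 train/test split over the 506 observations):

```julia
using Flux, MLDatasets

# 13×506 feature matrix and 1×506 label (price) matrix.
x, y = BostonHousing(as_df=false)[:]

# First 400 observations for training, the remaining 106 for testing.
x_train, x_test = x[:, 1:400], x[:, 401:end]
y_train, y_test = y[:, 1:400], y[:, 401:end]

# Standardise the training features (roughly zero mean, unit standard deviation).
x_train_n = Flux.normalise(x_train)
```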
@@ -300,7 +306,7 @@ julia> std(x_train_n)

 The standard deviation is now close to one! Our data is ready!

-### Model
+## Building a Flux model
 We can now directly use `Flux` and let it do all the work internally! Let's define a model that takes in 13 inputs (13 features) and gives us a single output (the label). We will then pass our entire data through this model in one go, and `Flux` will handle everything for us! Remember, we could have declared a model in plain `Julia` as well. The model will have 14 parameters: 13 weights and 1 bias.

 ```jldoctest linear_regression_complex
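The layer described here is defined in the removed code below as `model = Dense(13 => 1)`. A minimal sketch of its shape, using the layer's `weight` and `bias` fields:

```julia
using Flux

model = Dense(13 => 1)   # affine map x -> W*x .+ b

size(model.weight)       # (1, 13): 13 weights
size(model.bias)         # (1,):    1 bias, i.e. 14 parameters in total
```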
@@ -322,7 +328,8 @@ julia> loss(model, x_train_n, y_train)

 We can now proceed to the training phase!

-### Training
+## Training the Flux model
+
 The training procedure would make use of the same mathematics, but now we can pass in the model inside the `gradient` call and let `Flux` and `Zygote` handle the derivatives!

 ```jldoctest linear_regression_complex
@@ -333,7 +340,7 @@ julia> function train_model()
         end;
 ```

-Contrary to our last training procedure, let's say that this time we don't want to hardcode the number of epochs. We want the training procedure to stop when the loss converges, that is, when `change in loss < δ`. The quantity `δ` can be altered according to a user's need, but let's fix it to `10⁻³` for this guide.
+Contrary to our last training procedure, let's say that this time we don't want to hardcode the number of epochs. We want the training procedure to stop when the loss converges, that is, when `change in loss < δ`. The quantity `δ` can be altered according to a user's need, but let's fix it to `10⁻³` for this tutorial.

 We can write such custom training loops effortlessly using `Flux` and plain `Julia`!
 ```jldoctest linear_regression_complex
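The convergence-based loop survives in the copy-pastable code removed at the bottom of this diff. A condensed sketch of its logic, using the `train_custom_model` name from that removed code (which, note, actually uses a threshold of `1e-4`, although the prose fixes `δ` at `10⁻³`):

```julia
# Take one gradient-descent step at a time until the improvement in the
# loss drops below δ, instead of hardcoding an epoch count.
let δ = 1e-4, prev = loss(model, x_train_n, y_train)
    while true
        train_custom_model()                # one parameter update
        curr = loss(model, x_train_n, y_train)
        abs(prev - curr) < δ && break       # loss has converged
        prev = curr
    end
end
```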
@@ -366,8 +373,9 @@ julia> loss(model, x_train_n, y_train)

 The loss went down significantly! It can be minimized further by choosing an even smaller `δ`.

-### Testing
-The last step of this guide would be to test our model using the testing data. We will first normalise the testing data and then calculate the corresponding loss.
+## Testing the Flux model
+
+The last step of this tutorial would be to test our model using the testing data. We will first normalise the testing data and then calculate the corresponding loss.

 ```jldoctest linear_regression_complex; filter = r"[+-]?([0-9]*[.])?[0-9]+(f[+-]*[0-9])?"
 julia> x_test_n = Flux.normalise(x_test);
@@ -380,100 +388,8 @@ The loss is not as small as the loss of the training data, but it looks good! Th

 ---

-Summarising this guide, we started by generating a random yet correlated dataset for our custom model. We then saw how a simple linear regression model could be built with and without `Flux`, and how they were almost identical.
+Summarising this tutorial, we started by generating a random yet correlated dataset for our `custom_model`. We then saw how a simple linear regression model could be built with and without `Flux`, and how they were almost identical.

 Next, we trained the model by manually writing down the Gradient Descent algorithm and optimising the loss. We also saw how `Flux` provides various wrapper functionalities and keeps the API extremely intuitive and simple for the users.

 After getting familiar with the basics of `Flux` and `Julia`, we moved ahead to build a machine learning model for a real dataset. We repeated the exact same steps, but this time with a lot more features and data points, and by harnessing `Flux`'s full capabilities. In the end, we developed a training loop that was smarter than the hardcoded one and ran the model on our normalised dataset to conclude the tutorial.
-
-## Copy-pastable code
-### Dummy dataset
-```julia
-using Flux, Plots
-
-# data
-x = hcat(collect(Float32, -3:0.1:3)...)
-f(x) = @. 3x + 2
-y = f(x)
-x = x .* reshape(rand(Float32, 61), (1, 61))
-
-# plot the data
-plot(reshape(x, (61, 1)), reshape(y, (61, 1)), lw = 3, seriestype = :scatter, label = "", title = "Generated data", xlabel = "x", ylabel = "y")
-
-# custom model and parameters
-custom_model(W, b, x) = @. W*x + b
-W = rand(Float32, 1, 1)
-b = [0.0f0]
-
-# loss function
-function custom_loss(W, b, x, y)
-    ŷ = custom_model(W, b, x)
-    sum((y .- ŷ).^2) / length(x)
-end;
-
-print("Initial loss: ", custom_loss(W, b, x, y), "\n")
-
-# train
-function train_custom_model()
-    dLdW, dLdb, _, _ = gradient(custom_loss, W, b, x, y)
-    @. W = W - 0.1 * dLdW
-    @. b = b - 0.1 * dLdb
-end
-
-for i = 1:40
-    train_custom_model()
-end
-
-print("Final loss: ", custom_loss(W, b, x, y), "\n")
-
-# plot data and results
-plot(reshape(x, (61, 1)), reshape(y, (61, 1)), lw = 3, seriestype = :scatter, label = "", title = "Simple Linear Regression", xlabel = "x", ylabel = "y")
-plot!((x) -> b[1] + W[1] * x, -3, 3, label = "Custom model", lw = 2)
-```
-### Real dataset
-```julia
-using Flux, Statistics, MLDatasets
-
-# data
-x, y = BostonHousing(as_df=false)[:]
-x_train, x_test, y_train, y_test = x[:, 1:400], x[:, 401:end], y[:, 1:400], y[:, 401:end]
-x_train_n = Flux.normalise(x_train)
-
-# model
-model = Dense(13 => 1)
-
-# loss function
-function loss(model, x, y)
-    ŷ = model(x)
-    Flux.mse(ŷ, y)
-end;
-
-print("Initial loss: ", loss(model, x_train_n, y_train), "\n")
-
-# train
-function train_custom_model()
-    dLdm, _, _ = gradient(loss, model, x_train_n, y_train)
-    @. model.weight = model.weight - 0.000001 * dLdm.weight
-    @. model.bias = model.bias - 0.000001 * dLdm.bias
-end
-
-loss_init = Inf;
-while true
-    train_custom_model()
-    if loss_init == Inf
-        loss_init = loss(model, x_train_n, y_train)
-        continue
-    end
-    if abs(loss_init - loss(model, x_train_n, y_train)) < 1e-4
-        break
-    else
-        loss_init = loss(model, x_train_n, y_train)
-    end
-end
-
-print("Final loss: ", loss(model, x_train_n, y_train), "\n")
-
-# test
-x_test_n = Flux.normalise(x_test);
-print("Test loss: ", loss(model, x_test_n, y_test), "\n")
-```
