Skip to content

Commit c07c8f8

Browse files
committed
Lecture 12: Polishing touches
1 parent ccfb70f commit c07c8f8

File tree

5 files changed

+550
-401
lines changed

5 files changed

+550
-401
lines changed

docs/src/lecture_11/sparse.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -21,10 +21,10 @@ Often, a regularization term is added. There are two possibilities. The [ridge r
2121
Both approaches try to keep the norm of parameters ``w`` small to prevent overfitting. The first approach results in a simpler numerical method, while the second one induces sparsity. Before we start with both topics, we will briefly mention matrix decompositions which plays a crucial part in numerical computations.
2222

2323

24-
## Theory of matrix eigendecomposition
24+
## [Theory of matrix eigendecomposition](@id matrix-eigen)
2525

2626

27-
Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. We there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb^n`` such that
27+
Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. We there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb R^n`` such that
2828

2929
```math
3030
Av = \lambda v,
@@ -85,7 +85,7 @@ Since this formula uses only matrix-vector multiplication and an inversion of a
8585

8686

8787

88-
## Theory of LASSO
88+
## [Theory of LASSO](@id lasso)
8989

9090

9191
Unlike ridge regression, LASSO does not have a closed-form solution. Since it is a structured convex problem, it can be solved the [ADMM algorithm](https://web.stanford.edu/~boyd/papers/pdf/admm_distr_stats.pdf). It is a primal-dual algorithm, which employs the primal original variable ``w``, the primal auxiliary variable ``z`` and the dual variable ``u`` with the iterative updates:

docs/src/lecture_12/diff_eq.md

Lines changed: 79 additions & 39 deletions
Original file line numberDiff line numberDiff line change
@@ -1,20 +1,29 @@
11
# Julia package
22

3-
To solve differential equations, we will use package [DifferentialEquations](https://diffeq.sciml.ai/stable/).
3+
To solve differential equations, we use the package [DifferentialEquations](https://diffeq.sciml.ai/stable/), which consider ODEs in the form
44

5-
## Introduction
6-
7-
DifferentialEquations consider ODEs in the form
85
```math
96
\dot u(t) = f(t, u(t), p(t))
107
```
11-
with the initial condition ``u(t_0)= u_0``. While ``u`` is the solution, ``p`` described external parameters.
128

13-
We will start with a simple problem
9+
with the initial condition ``u(t_0)= u_0``. While ``u`` is the solution, ``p`` describes external parameters.
10+
11+
12+
13+
14+
## Introduction
15+
16+
We start with the following simple problem:
17+
1418
```math
15-
\dot u(t) = 0.98u
19+
\begin{aligned}
20+
\dot u(t) &= 0.98u, \\
21+
u(0) &= 1.
22+
\end{aligned}
1623
```
17-
with initial condition ``u(0) = 1``. This has closed-form solution ``u(t) = e^{0.98t}``. To solve this ODE by ```DifferentialEquations```, we first need to create the problem ```prob``` by supplying function ``f``, the initial point ``u_0`` and the time interval ``[t_0,t_1]`` to the constructor ```ODEProblem```
24+
25+
This equation has the closed-form solution ``u(t) = e^{0.98t}``. To solve it by `DifferentialEquations`, we first need to create the problem `prob` by supplying the function ``f``, the initial point ``u_0`` and the time interval ``[t_0,t_1]`` to the constructor `ODEProblem`.
26+
1827
```@example intro
1928
using DifferentialEquations
2029
@@ -27,49 +36,65 @@ prob = ODEProblem(f, u0, tspan)
2736
2837
nothing # hide
2938
```
30-
We can solve the ODE by
39+
40+
Then we use the `solve` function to solve the equation.
41+
3142
```@example intro
3243
sol = solve(prob)
3344
```
34-
The first line specifies that the solution was successful. We can automatically check whether the solution was successful by ```sol.retcode == :Success```. The second line specifies the interpolation method. Even though the solution was evaluated at only 5 points ```sol.t``` with values ```sol.u```, the interpolation
45+
46+
The first line says that the solution was successful, which can be automatically checked by accessing the field `sol.retcode`. The second line specifies the interpolation method, and the following lines the solution. Even though the solution was evaluated at only five points, the interpolation allows plotting a smooth function.
3547

3648
```@example intro
3749
using Plots
3850
39-
plot(sol)
51+
plot(sol; label="")
4052
4153
savefig("intro.svg") # hide
4254
```
4355

4456
![](intro.svg)
4557

46-
The ```sol``` structure is heavily overloaded. It can be used to evaluate the solution ``u`` at any time
58+
The `sol` structure can be used to evaluate the solution ``u``.
59+
4760
```@example intro
4861
sol(0.8)
4962
```
5063

51-
The next exercise shows how to specify the interpolation technique and compares the resutlts.
64+
65+
66+
67+
68+
69+
70+
The following exercise shows how to specify the interpolation technique and compares the results.
5271

5372
```@raw html
5473
<div class = "exercise-body">
5574
<header class = "exercise-header">Exercise:</header><p>
5675
```
57-
When calling the ```solve``` function, we can specify the interpolation way. Solve the ODE with linear interpolation (```dense=false```) and the Runge-Kutta method of fourth order (```RK4()```). Plot the results and compare them with the default and original solutions.
76+
77+
When calling the `solve` function, we can specify the interpolation way. Solve the ODE with linear interpolation (`dense=false`) and the Runge-Kutta method of the fourth order (`RK4()`). Plot the results and compare them with the default and original solutions.
78+
5879
```@raw html
5980
</p></div>
6081
<details class = "solution-body">
6182
<summary class = "solution-header">Solution:</summary><p>
6283
```
63-
To compues the additional solutions, we add the arguments as specified above
84+
85+
To compute the additional solutions, we add the arguments as specified above.
86+
6487
```@example intro
6588
sol2 = solve(prob, dense=false)
6689
sol3 = solve(prob, RK4())
6790
6891
nothing # hide
6992
```
70-
For plotting, we create a discretization ```ts``` of the time interval and then plot the four functions
93+
94+
We create a discretization ```ts``` of the time interval and then plot the four functions.
95+
7196
```@example intro
72-
ts = collect(range(tspan[1], tspan[2], length=100))
97+
ts = range(tspan...; length = 100)
7398
7499
plot(ts, t->exp(0.98*t), label="True solution", legend=:topleft)
75100
plot!(ts, t->sol(t), label="Default")
@@ -84,23 +109,26 @@ savefig("Comparison.svg") # hide
84109

85110
![](Comparison.svg)
86111

87-
We see that all solutions are the same with the exception of the linear approximation.
112+
We see that all solutions are the same except for the linear approximation.
88113

89114

90115

91116
## Lorenz system
92117

93-
[Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system) is a prime example of the [butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect) in the chaos theory. There, a small changes in the initial conditions results in large changes after a long time. This effect was first described in 1961 during work on weather modelling.
118+
The [Lorenz system](https://en.wikipedia.org/wiki/Lorenz_system) is a prime example of the [butterfly effect](https://en.wikipedia.org/wiki/Butterfly_effect) in the chaos theory. There, small changes in the initial conditions result in large changes in the solution. This effect was first described in 1961 during work on weather modelling.
119+
120+
The following equations describe the three-dimensional Lorenz system:
94121

95-
The three-dimensional Lorenz system is described by the set of equations
96122
```math
97123
\begin{aligned}
98124
\frac{\partial x}{\partial t} &= \sigma (y - x), \\
99125
\frac{\partial y}{\partial t} &= x (\rho - z) - y, \\
100126
\frac{\partial z}{\partial t} &= x y - \beta z.
101127
\end{aligned}
102128
```
103-
We define the right-hand side
129+
130+
We first define the right-hand side of the system.
131+
104132
```@example intro
105133
function lorenz(u, p, t)
106134
σ, ρ, β = p
@@ -112,7 +140,9 @@ end
112140
113141
nothing # hide
114142
```
115-
The parameters are saved in a tuple or array ```p```. Since the right-hand side of the Lorenz system is a vector, we need to return a vector as well. Now, we compute the solution in the same way as before.
143+
144+
The parameters are saved in a tuple or array `p`. Since the right-hand side of the Lorenz system is a vector, we need to return a vector as well. Now, we compute the solution in the same way as before.
145+
116146
```@example intro
117147
u0 = [1.0; 0.0; 0.0]
118148
p = [10; 28; 8/3]
@@ -124,7 +154,9 @@ sol = solve(prob)
124154
125155
nothing # hide
126156
```
127-
Using the same function to plot
157+
158+
We plot the solution:
159+
128160
```@example intro
129161
plot(sol)
130162
@@ -133,7 +165,8 @@ savefig("lorenz0.svg") # hide
133165

134166
![](lorenz0.svg)
135167

136-
results in two-dimensional graph of all coorinates. To plot 3D, we need to specify it
168+
Since this is a two-dimensional graph of all coordinates, we need to specify that we want to plot a 3D graph.
169+
137170
```@example intro
138171
plt1 = plot(sol, vars=(1,2,3), label="")
139172
@@ -142,7 +175,8 @@ savefig("lorenz1.svg") # hide
142175

143176
![](lorenz1.svg)
144177

145-
We see again the power of interpolation. If we used linear interpolation (connected the points)
178+
We see the power of interpolation again. If we used linear interpolation, which amounts to connecting the points, we would obtain a much coarse graph.
179+
146180
```@example intro
147181
plot(sol, vars=(1,2,3), denseplot=false; label="")
148182
@@ -151,34 +185,34 @@ savefig("lorenz2.svg") # hide
151185

152186
![](lorenz2.svg)
153187

154-
we would obtain a much coarse graph. This shows the strength of the ```DifferentialEquations``` package. With a small computational effort, it is able to compute a good solution. Note that the last plotting call is equivalent to
188+
This graph shows the strength of the `DifferentialEquations` package. With a small computational effort, it can compute a good solution. Note that the last plotting call is equivalent to:
189+
155190
```julia
156191
traj = hcat(sol.u...)
157192
plot(traj[1,:], traj[2,:], traj[3,:]; label="")
158193
```
159194

160-
In the introduction to this part, we mentioned the chaos theory. We will elaborate on this in the next exercise.
195+
In the introduction, we mentioned chaos theory. We will elaborate on this in the following exercise.
161196

162197
```@raw html
163198
<div class = "exercise-body">
164199
<header class = "exercise-header">Exercise:</header><p>
165200
```
166-
Consider the same setting as above but perturb the first parameter of ```p``` by the smallest possible value (with respect to the machine precision). Then solve the Lorenz system again and compare results by plotting the two trajectories next to each other.
201+
202+
Use the `nextfloat` function to perturb the first parameter of `p` by the smallest possible value. Then solve the Lorenz system again and compare the results by plotting the two trajectories next to each other.
203+
167204
```@raw html
168205
</p></div>
169206
<details class = "solution-body">
170207
<summary class = "solution-header">Solution:</summary><p>
171208
```
172-
The machine precision can be obtained by ```eps(T)```, where ```T``` is the desired type. However, when we add this to ```p[1]```, we obtain the same number
209+
210+
We start with the smallest possible perturbation of the initial value.
211+
173212
```@example intro
174-
p[1] + eps(eltype(p)) == p[1]
213+
p0 = (nextfloat(p[1]), p[2:end]...)
175214
```
176-
The reason is that ```p[1]``` has value 10 and the sum exceeds the allowed number of valid digits and it is truncated back to 10. We therefore cheat a bit and manually modify the number
177-
```@example intro
178-
p0 = (10.000000000000001,28,8/3)
179215

180-
nothing # hide
181-
```
182216
Then we plot the graphs as before
183217
```@example intro
184218
prob0 = ODEProblem(lorenz, u0, tspan, p0)
@@ -196,31 +230,37 @@ savefig("lorenz4.svg") # hide
196230

197231
![](lorenz4.svg)
198232

199-
The solutions look obviously different. Comparing the terminal states of both solutions
233+
The solutions look different. Comparing the terminal states of both solutions
234+
200235
```@example intro
201236
hcat(sol(tspan[2]), sol0(tspan[2]))
202237
```
238+
203239
shows that they are different by a large margin. This raises a natural question.
204240

205241

206242
```@raw html
207243
<div class = "exercise-body">
208244
<header class = "exercise-header">Exercise:</header><p>
209245
```
246+
210247
Can we trust the solutions? Why?
248+
211249
```@raw html
212250
</p></div>
213251
<details class = "solution-body">
214252
<summary class = "solution-header">Solution:</summary><p>
215253
```
254+
216255
Unfortunately, we cannot. Numerical methods always introduce some errors by
217256
- *Rounding errors* due to representing real numbers in machine precision.
218-
- *Discretization errors* for continuous systems when the derivative is approximated by some kind of finite difference.
219-
However, if the system itself is unstable in the way that an extremely small perturbation results in big differences in solutions, the numerical method even enhances these errors. The solution could be trusted on some small interval but not after it.
257+
- *Discretization errors* for continuous systems when the finite difference method approximates the derivative.
258+
However, if the system itself is unstable and an extremely small perturbation results in big differences in solutions, the numerical method even enhances these errors. The solution could be trusted on some small interval but not after it.
259+
220260
```@raw html
221261
</p></details>
222262
```
223263

224-
The next section will show a situation where we try to mitigate this possible effect by using mathematical formulas to compute the exact solution as long as possible. This delays the necessary discretization and may bring a better stability.
264+
The following section shows a situation where we try to mitigate this possible effect by using mathematical formulas to compute the exact solution as long as possible. This aproach delays the necessary discretization and may bring better stability.
225265

226266

0 commit comments

Comments
 (0)