Commit ccfb70f

Lecture 11: Polishing touches

1 parent 79fd892, commit ccfb70f

4 files changed: +180 -175 lines

docs/make.jl

Lines changed: 1 addition & 1 deletion
@@ -100,9 +100,9 @@ lecture_10 = joinpath.("./lecture_10/", [
])

lecture_11 = joinpath.("./lecture_11/", [
+    "sparse.md",
    "monte.md",
    "glm.md",
-    "sparse.md",
])

lecture_12 = joinpath.("./lecture_12/", [

docs/src/lecture_11/monte.md

Lines changed: 15 additions & 19 deletions
@@ -345,7 +345,7 @@ While the rejection sampling provides a good approximation for the first two dis

This exercise computes the expected value
```math
-\mathbb E_3 \cos(100x) = \int_{-\infty}^\infty \cos(100 x) f_3(x) dx,
+\mathbb E_3 \cos(100X) = \int_{-\infty}^\infty \cos(100 x) f_3(x) dx,
```
where we consider the expectation ``\mathbb E`` with respect to ``d_3\sim N(0, 0.01)`` with density ``f_3``. The first possibility to compute the expectation is to discretize the integral.
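
As a sketch of the discretization route mentioned in the hunk above (the grid, the step, and the definition `d3 = Normal(0, 0.01)` are illustrative assumptions, not part of the commit):

```julia
using Distributions

# Riemann-sum approximation of E[cos(100X)] for X ~ N(0, 0.01):
# ∫ cos(100x) f3(x) dx ≈ Σ cos(100x_i) f3(x_i) Δx on a fine grid.
d3 = Normal(0, 0.01)
xs = range(-0.1, 0.1; length=100_001)  # grid covering ±10σ of the support
dx = step(xs)
sum(cos(100x) * pdf(d3, x) for x in xs) * dx  # ≈ exp(-1/2) ≈ 0.6065
```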

@@ -361,19 +361,15 @@ nothing # hide

The second possibility is to approximate the integral by
```math
-\mathbb E_3 \cos(100x) \approx \frac 1n\sum_{i=1}^n \cos(x_i),
+\mathbb E_3 \cos(100X) \approx \frac 1n\sum_{i=1}^n \cos(100 x_i),
```
-where ``x_i`` are sampled from ``d_3``. We do this in `expectation1`, and `expectation2`, where the formed generates from the Distributions package while the latter uses our rejection sampling.
+where ``x_i`` are sampled from ``d_3``. We do this in `expectation1` and `expectation2`, where the former generates from the Distributions package while the latter uses our rejection sampling. We use the method of the `mean` function, which takes a function as its first argument.

```@example monte
-function expectation1(h, d; n=1000000)
-    xs = rand(d, n)
-    return mean(h.(xs))
-end
+expectation1(h, d; n = 1000000) = mean(h, rand(d, n))

-function expectation2(h, f, x_max, xlims; n=1000000)
-    xs = rejection_sampling(f, f(x_max), xlims...; n=n)
-    return mean(h.(xs))
+function expectation2(h, f, f_max, xlims; n=1000000)
+    return mean(h, rejection_sampling(f, f_max, xlims...; n))
end

nothing # hide
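
For reference, the two-argument `mean(f, itr)` method from the Statistics standard library, which the rewrite above relies on, applies `f` element-wise without materializing the broadcasted array:

```julia
using Statistics

xs = randn(10^6)
mean(abs2.(xs))  # allocates the intermediate array abs2.(xs) first
mean(abs2, xs)   # same result, computed without the intermediate array
```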
@@ -382,29 +378,29 @@ nothing # hide
If it is difficult to sample from ``d_3``, we can use a trick to sample from some other distribution. This is based on the following formula:

```math
-\mathbb E_3 h(x) = \int_{-\infty}^\infty h(x) f_3(x) dx = \int_{-\infty}^\infty h(x) \frac{f_3(x)}{f_1(x)}f_1(x) dx = \mathbb E_1 \frac{h(x)}{f_1(x)}.
+\mathbb E_3 h(X) = \int_{-\infty}^\infty h(x) f_3(x) dx = \int_{-\infty}^\infty h(x) \frac{f_3(x)}{f_1(x)}f_1(x) dx = \mathbb E_1 \frac{h(X)f_3(X)}{f_1(X)}.
```

This gives rise to another implementation of the same thing.

```@example monte
function expectation3(h, f, d_gen; n=1000000)
-    xs = rand(d_gen, n)
-    return mean(h.(xs) .* f.(xs) ./ pdf(d_gen, xs))
+    g(x) = h(x)*f(x)/pdf(d_gen, x)
+    return mean(g, rand(d_gen, n))
end

nothing # hide
```

-We run these three approaches for ``20`` repetition.s
+We run these three approaches for ``20`` repetitions.

```@example monte
n = 100000
n_rep = 20

Random.seed!(666)
e1 = [expectation1(h, d3; n=n) for _ in 1:n_rep]
-e2 = [expectation2(h, f3, d3.μ, xlims; n=n) for _ in 1:n_rep]
+e2 = [expectation2(h, f3, f3(d3.μ), xlims; n=n) for _ in 1:n_rep]
e3 = [expectation3(h, f3, d1; n=n) for _ in 1:n_rep]

nothing # hide
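
A self-contained sketch of the importance-sampling idea behind `expectation3` (the concrete distributions, seed, and sample size are illustrative assumptions, not taken from the lecture):

```julia
using Distributions, Statistics, Random

Random.seed!(0)
target   = Normal(0, 0.01)  # stands in for d3, the distribution we integrate against
proposal = Normal(0, 1)     # stands in for d1, the distribution we can sample easily
h(x) = cos(100x)

# E_target h(X) = E_proposal [h(X) * f_target(X) / f_proposal(X)]
g(x) = h(x) * pdf(target, x) / pdf(proposal, x)
mean(g, rand(proposal, 10^6))  # should be close to exp(-1/2) ≈ 0.607
```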
@@ -505,7 +501,7 @@ Quantiles form an important concept in statistics. Its definition is slightly co

The quantile at level ``\alpha=0.5`` is the median. Quantiles play an important role in estimates, where they form upper and lower bounds for confidence intervals. They are also used in hypothesis testing.

-This part will investigate how quantiles on a finite sample differ from the true quantile. We will consider two ways of computing the quantile. Both of them sample ``n`` points from some distribution ``d``. The first one follows the statistical definition and selects the index of the ``n\alpha`` smallest observation. The second one uses the function `quantile`, which performs some interpolation.
+This part will investigate how quantiles on a finite sample differ from the true quantile. We will consider two ways of computing the quantile. Both of them sample ``n`` points from some distribution ``d``. The first one follows the statistical definition and selects the ``n\alpha``-th smallest observation by the `partialsort` function. The second one uses the function `quantile`, which performs some interpolation.

```@example monte
quantile_sampled1(d, n::Int, α) = partialsort(rand(d, n), floor(Int, α*n))
@@ -514,7 +510,7 @@ quantile_sampled2(d, n::Int, α) = quantile(rand(d, n), α)
nothing # hide
```

-We defined the vectorized version. This is not the efficient way because for every ``n``, the samples will be randomly generated again.
+We defined the vectorized version. This is not efficient because for every ``n``, the samples will be randomly generated again.

```@example monte
quantile_sampled1(d, ns::AbstractVector, α) = quantile_sampled1.(d, ns, α)
@@ -554,7 +550,7 @@ plt2 = deepcopy(plt1)
nothing # hide
```

-Now we add the sampled quantiles and the mean over all repetitions. Since we work with two plots at the same time, we specify into which plot we want to add the new data. It would be better to create a function for plotting and call it for `qs1` and `qs2`, but we wanted to show how to work two plots at the same time.
+Now we add the sampled quantiles and the mean over all repetitions. Since we work with two plots, we specify into which plot we want to add the new data. It would be better to create a function for plotting and call it for `qs1` and `qs2`, but we wanted to show how to work with two plots simultaneously.

```@example monte
for i in 1:size(qs1,1)
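
The pattern in the hunk above relies on the mutating `plot!` methods from Plots.jl, which take the target plot as their first argument; a minimal standalone illustration (the data here are made up):

```julia
using Plots

plt1 = plot(1:10, rand(10); label="series a")
plt2 = deepcopy(plt1)

# The bang variants accept an explicit target, so new series
# can be directed to either figure independently.
plot!(plt1, 1:10, rand(10); label="series b")
plot!(plt2, 1:10, rand(10); label="series c")
```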
@@ -589,7 +585,7 @@ savefig(plt2, "quantile2.svg") # hide
![](quantile1.svg)
![](quantile2.svg)

-Both sampled estimates give a lower estimate than the true quantile. In statistical methodology, these estimates are biased. We observe that the interpolated estimate is closer to the true value, and that computing the quantile even on ``10000`` points gives an uncertainty interval of approximately ``0.25``.
+Both sampled estimates give a lower estimate than the true quantile. In statistical methodology, these estimates are biased. We observe that the interpolated estimate is closer to the true value and that computing the quantile even on ``10000`` points gives an uncertainty interval of approximately ``0.25``.


```@raw html
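
A quick numerical check of the bias observation in the last hunk (the distribution, level, seed, and sample sizes are illustrative assumptions):

```julia
using Distributions, Statistics, Random

Random.seed!(0)
d = Normal()
α = 0.99
q_true = quantile(d, α)  # true quantile of the distribution itself

# Average the sampled quantile over many repetitions to expose the bias.
reps = [quantile(rand(d, 1000), α) for _ in 1:1000]
mean(reps) - q_true      # typically negative, in line with the downward bias noted above
```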

scripts/lecture_11/script.jl

Lines changed: 76 additions & 68 deletions
@@ -9,6 +9,82 @@ using RDatasets
using SpecialFunctions
using Statistics

+
+# Ridge regression
+
+n = 10000
+m = 1000
+
+Random.seed!(666)
+X = randn(n, m)
+
+y = 10*X[:,1] + X[:,2] + randn(n)
+
+# Exercise
+
+
+
+# Comparison
+
+@btime ridge_reg(X, y, 10);
+
+@btime ridge_reg(X, y, 10, Q, Q_inv, λ);
+
+# Dependence on mu
+
+μs = range(0, 1000; length=50)
+
+ws = hcat(ridge_reg.(Ref(X), Ref(y), μs, Ref(Q), Ref(Q_inv), Ref(λ))...)
+
+plot(μs, abs.(ws');
+    label="",
+    yscale=:log10,
+    xlabel="mu",
+    ylabel="weights: log scale",
+)
+
+# LASSO
+
+S(x, η) = max(x-η, 0) - max(-x-η, 0)
+
+function lasso(X, y, μ, Q, Q_inv, λ;
+    max_iter = 100,
+    ρ = 1e3,
+    w = zeros(size(X,2)),
+    u = zeros(size(X,2)),
+    z = zeros(size(X,2)),
+)
+
+    for i in 1:max_iter
+        w = Q * (Diagonal(1 ./ (λ .+ ρ)) * (Q_inv * (X'*y + ρ*(z-u))))
+        z = S.(w + u, μ / ρ)
+        u = u + w - z
+    end
+    return w, u, z
+end
+
+ws = zeros(size(X,2), length(μs))
+
+for (i, μ) in enumerate(μs)
+    global w, u, z
+    w, u, z = i > 1 ? lasso(X, y, μ, Q, Q_inv, λ; w, u, z) : lasso(X, y, μ, Q, Q_inv, λ)
+    ws[:,i] = w
+end
+
+plot(μs, abs.(ws');
+    label="",
+    yscale=:log10,
+    xlabel="mu",
+    ylabel="weights: log scale",
+)
+

plot(0:0.1:10, gamma;
    xlabel="x",
    ylabel="gamma(x): log scale",
@@ -151,71 +227,3 @@ model = glm(@formula(W ~ 1 + N + Y + I + K + F), wages, InverseGaussian(), SqrtL
# Exercise


-
-# Ridge regression
-
-n = 10000
-m = 1000
-
-Random.seed!(666)
-X = randn(n, m)
-
-y = 10*X[:,1] + X[:,2] + randn(n)
-
-# Exercise
-
-
-
-# Comparison
-
-@btime ridge_reg(X, y, 10);
-
-@btime ridge_reg(X, y, 10, Q, Q_inv, λ);
-
-# Dependence on mu
-
-μs = range(0, 1000; length=50)
-
-ws = hcat(ridge_reg.(Ref(X), Ref(y), μs, Ref(Q), Ref(Q_inv), Ref(λ))...)
-
-plot(μs, abs.(ws');
-    label="",
-    yscale=:log10,
-    xlabel="mu",
-    ylabel="weights: log scale",
-)
-
-# LASSO
-
-S(x, η) = max(x-η, 0) - max(-x-η, 0)
-
-function lasso(X, y, μ, Q, Q_inv, λ;
-    max_iter = 100,
-    ρ = 1e3,
-    w = zeros(size(X,2)),
-    u = zeros(size(X,2)),
-    z = zeros(size(X,2)),
-)
-
-    for i in 1:max_iter
-        w = Q * (Diagonal(1 ./ (λ .+ ρ)) * (Q_inv * (X'*y + ρ*(z-u))))
-        z = S.(w + u, μ / ρ)
-        u = u + w - z
-    end
-    return w, u, z
-end
-
-ws = zeros(size(X,2), length(μs))
-
-for (i, μ) in enumerate(μs)
-    global w, u, z
-    w, u, z = i > 1 ? lasso(X, y, μ, Q, Q_inv, λ; w, u, z) : lasso(X, y, μ, Q, Q_inv, λ)
-    ws[:,i] = w
-end
-
-plot(μs, abs.(ws');
-    label="",
-    yscale=:log10,
-    xlabel="mu",
-    ylabel="weights: log scale",
-)