Both approaches try to keep the norm of the parameters ``w`` small to prevent overfitting. The first approach results in a simpler numerical method, while the second one induces sparsity. Before we start with these two topics, we briefly mention matrix decompositions, which play a crucial role in numerical computations.
Consider a square matrix ``A\in \mathbb R^{n\times n}`` with real-valued entries. If there exist ``\lambda\in\mathbb R`` and ``v\in\mathbb R^n`` such that
```math
Av = \lambda v,
```
we say that ``\lambda`` is an eigenvalue of ``A`` and ``v`` is the corresponding eigenvector.
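As a minimal sketch, we can verify this definition numerically in Julia with the `LinearAlgebra` standard library; the matrix below is an arbitrary example chosen only for illustration.

```julia
using LinearAlgebra

# A small symmetric matrix chosen only for illustration.
A = [2.0 1.0; 1.0 3.0]

# eigen returns the eigenvalues and a matrix whose columns are the eigenvectors.
λ, Q = eigen(A)

# Verify the defining relation Av = λv for every eigenpair.
for i in eachindex(λ)
    v = Q[:, i]
    @assert A * v ≈ λ[i] * v
end
```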
If ``A`` is symmetric, its eigenvectors can be chosen perpendicular to each other, and we obtain the eigendecomposition ``A = Q\Lambda Q^\top``, where the columns of ``Q`` are the eigenvectors and ``\Lambda`` is the diagonal matrix of eigenvalues,
and for any real number ``\mu``, we also have
```math
A + \mu I = Q(\Lambda + \mu I) Q^\top.
```
Since the eigenvectors are perpendicular, ``Q`` is an orthonormal matrix and therefore ``Q^{-1} = Q^\top``. This implies that we can easily invert the matrix ``A + \mu I`` by ``(A + \mu I)^{-1} = Q(\Lambda + \mu I)^{-1} Q^\top``.
Because ``\Lambda + \mu I`` is a diagonal matrix, its inverse is simple to compute.
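A minimal Julia sketch of this inversion, again with an arbitrary example matrix and an illustrative value of ``\mu``: inverting the diagonal matrix ``\Lambda + \mu I`` amounts to taking elementwise reciprocals of its diagonal entries.

```julia
using LinearAlgebra

A = [2.0 1.0; 1.0 3.0]   # symmetric matrix chosen only for illustration
μ = 0.1                  # illustrative regularization value
λ, Q = eigen(A)          # A = QΛQ', with Λ = Diagonal(λ)

# (A + μI)⁻¹ = Q(Λ + μI)⁻¹Qᵀ; the diagonal inverse is an elementwise reciprocal.
inv_decomp = Q * Diagonal(1 ./ (λ .+ μ)) * Q'

inv_decomp ≈ inv(A + μ*I)   # should evaluate to true
```

Once the decomposition is available, no full matrix inversion is needed for a new value of ``\mu``.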
Recall that the optimal weights of ridge regression satisfy ``w = (X^\top X + \mu I)^{-1} X^\top y``, and consider the eigendecomposition ``X^\top X = Q\Lambda Q^\top``.
Then the formula for the optimal weights simplifies to
```math
w = Q(\Lambda+\mu I)^{-1} Q^\top X^\top y.
```
Since this formula uses only matrix-vector multiplications and the inversion of a diagonal matrix, we can use it to quickly compute the solution for multiple values of ``\mu``.
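As a minimal sketch of this idea in Julia, assume a hypothetical data matrix `X` and target vector `y`; the decomposition of ``X^\top X`` and the product ``Q^\top X^\top y`` are computed once and reused for every value of ``\mu``.

```julia
using LinearAlgebra, Random

Random.seed!(0)
X = randn(100, 5)   # hypothetical data matrix
y = randn(100)      # hypothetical targets

# Decompose XᵀX once and precompute QᵀXᵀy.
λ, Q = eigen(Symmetric(X' * X))
Qty = Q' * (X' * y)

# w(μ) = Q(Λ + μI)⁻¹QᵀXᵀy; the diagonal inverse is an elementwise reciprocal.
ridge_weights(μ) = Q * (Qty ./ (λ .+ μ))

# Solutions for several regularization strengths, reusing the decomposition.
ws = [ridge_weights(μ) for μ in 10.0 .^ (-3:3)]

# Sanity check against the direct formula for a single μ.
ridge_weights(1.0) ≈ (X' * X + I) \ (X' * y)   # should evaluate to true
```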