docs/src/design/changing_the_primal.md
5 additions & 3 deletions
@@ -460,12 +460,12 @@ The differences in practice are around $10^{-15}$, which while very small on abs
 Roughly speaking:
 `Y=A\B` is the function that finds the least-squares solution to `AY ≈ B`.
 When solving such a system, the efficient way to do so is to factorize `A` into an appropriate factorized form such as `Cholesky` or `QR`, then perform the `\` operation on the factorized form.
-The pullback of `A\B` with respect to `B` is `Ȳ-> A' \ Ȳ`.
-It should be noted that this involves computing the factorization of `A'` (the adjoint of `A`).
+The pullback of `A\B` with respect to `B` is `Ȳ-> A' \ Ȳ`.
+It should be noted that this involves computing the factorization of `A'` (the adjoint of `A`).[^8]
 In this computation the factorization of the original `A` can be reused.
 Doing so can give a 4x speed-up.

-We don't have this in ChainRules.jl yet, because Julia is missing some definitions of `adjoint` of factorizations ([JuliaLang/julia#38293](https://github.com/JuliaLang/julia/issues/38293)).
+We don't have this in ChainRules.jl yet, because Julia is missing some definitions of `adjoint` of factorizations ([JuliaLang/julia#38293](https://github.com/JuliaLang/julia/issues/38293)).[^8]
 We have been promised them for Julia v1.7 though.
 You can see what the code would look like in [PR #302](https://github.com/JuliaDiff/ChainRules.jl/pull/302).

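
The factorization reuse described in this hunk can be sketched roughly as follows. This is a minimal illustration and not the code from PR #302: `solve_with_pullback` is a hypothetical helper name, and the sketch assumes a square, nonsingular `A` so that `lu` applies (LU being a factorization for which `LinearAlgebra` already defines `adjoint`).

```julia
using LinearAlgebra

# Hypothetical helper (not part of ChainRules.jl): solve A \ B and return a
# pullback with respect to B that reuses the factorization from the primal pass.
function solve_with_pullback(A::AbstractMatrix, B::AbstractVecOrMat)
    F = lu(A)        # factorize once; assumes A is square and nonsingular
    Y = F \ B        # primal result
    # F' is the adjoint (conjugate transpose) of the factorization, so this
    # solves A' * X = Ȳ without factorizing A' from scratch.
    B_pullback(Ȳ) = F' \ Ȳ
    return Y, B_pullback
end

A = [4.0 1.0; 2.0 3.0]
B = [1.0, 2.0]
Y, B_pullback = solve_with_pullback(A, B)

Ȳ = [1.0, 0.0]
Bbar = B_pullback(Ȳ)   # should agree with A' \ Ȳ computed directly
```

The closure captures `F`, which is the sense in which the primal computation is changed: the factorization is kept around for the pullback instead of being discarded after computing `Y`.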
@@ -505,3 +505,5 @@ Being able to change the primal computation is practically essential for a high
 Rather than remembering `y` and `ex` to use in the pullback, we could compute `y / (1 + ex)` during the augmented primal, and just remember that.

 [^7]: [Al-Mohy, Awad H. and Higham, Nicholas J. (2009) _Computing the Fréchet Derivative of the Matrix Exponential, with an application to Condition Number Estimation_. SIAM Journal On Matrix Analysis and Applications., 30 (4). pp. 1639-1657. ISSN 1095-7162](http://eprints.maths.manchester.ac.uk/1218/)
+
+[^8]: To be clear here we mean `adjoint` as in the conjugate transpose of a matrix, rather than in the sense of reverse mode AD.
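
The remark about remembering only `y / (1 + ex)` can be sketched as below. The hunk does not show how `y` and `ex` are defined, so this assumes `ex = exp(x)` and `y = ex / (1 + ex)` (a logistic sigmoid, whose derivative is `y / (1 + ex)`); `sigmoid_augmented` is a hypothetical name used only for illustration.

```julia
# Assumed primal (not shown in this hunk): logistic sigmoid via ex = exp(x).
function sigmoid_augmented(x::Real)
    ex = exp(x)
    y = ex / (1 + ex)
    dy_dx = y / (1 + ex)                  # derivative, computed once in the augmented primal
    sigmoid_pullback(Δy) = Δy * dy_dx     # closure remembers only dy_dx, not y and ex
    return y, sigmoid_pullback
end

y, pullback = sigmoid_augmented(0.0)
dx = pullback(1.0)
```

The pullback closure then holds a single number rather than two, at the cost of one extra division during the forward pass.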