docs/src/design/changing_the_primal.md
+1 -6 (1 addition & 6 deletions)
Original file line number
Diff line number
Diff line change
@@ -6,7 +6,6 @@ It might be surprising to some AD authors, who might expect just a function that
6
6
In particular, `rrule` allows you to _change_ how the primal result is computed.
7
7
We will illustrate in this document why being able to change the computation of the primal is crucial for efficient AD.
8
8
9
-
10
9
!!! note "What about `frule`?"
11
10
Discussion here is focused on reverse mode and `rrule`.
12
11
Similar concerns do apply to forward mode and `frule`.
@@ -15,9 +14,6 @@ We will illustrate in this document why being able to change the computation of
15
14
In fact, in forward mode there are even more opportunities to take advantage of sharing work between the primal and derivative computations.
16
15
A particularly notable example is in efficiently calculating the pushforward of solving a differential equation via expanding the system of equations to also include the derivatives before solving it.
17
16
18
-
19
-
20
-
21
17
## The Journey to `rrule`
22
18
23
19
Let's imagine a different system for rules, one that doesn't let you define the computation of the primal.
@@ -136,7 +132,6 @@ And it is faster to reuse the `exp(x)` in computing `σ(x)` and `σ(-x)`.
136
132
How can we incorporate this insight into our system?
137
133
We know we can compute both of these in the primal — because they only depend on `x` and not on `ȳ` — but there is nowhere to put them that is accessible both to the primal pass and the gradient pass code.
138
134
139
-
140
135
What if we introduced some variable called `intermediates` that is also recorded onto the tape during the primal pass?
141
136
We would need to be able to modify the primal pass to do this, so that we can actually put the data into the `intermediates`.
142
137
So we will introduce a function: `augmented_primal`, that will return the primal output plus the `intermediates` that we want to reuse in the gradient pass.
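As a rough illustration of the idea above, the following sketch computes `exp(x)` once in the primal pass and stores what the gradient pass needs in `intermediates`. It relies on the identity ∂σ/∂x = σ(x)·σ(-x). The name `pullback_at`, its argument order, and the layout of `intermediates` are assumptions made for this example only; they are not the ChainRules.jl API.

```julia
# Hypothetical sketch, not the ChainRules.jl API.
σ(x) = 1 / (1 + exp(-x))

# Primal pass: return the primal output plus the values we want to reuse.
function augmented_primal(::typeof(σ), x)
    ex = exp(x)               # computed once, shared by both values below
    y = ex / (1 + ex)         # σ(x)
    σ_neg = 1 / (1 + ex)      # σ(-x)
    intermediates = (y = y, σ_neg = σ_neg)
    return y, intermediates
end

# Gradient pass: uses only ȳ and the recorded intermediates, no new `exp` call.
# (`pullback_at` is an illustrative name chosen for this sketch.)
pullback_at(::typeof(σ), ȳ, intermediates) = ȳ * intermediates.y * intermediates.σ_neg

y, intermediates = augmented_primal(σ, 0.5)
x̄ = pullback_at(σ, 1.0, intermediates)
```

The trade-off is that `intermediates` has to be stored somewhere (here it is simply returned to the caller) so that the gradient pass can find it later.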
@@ -475,7 +470,7 @@ We don't have this in ChainRules.jl yet, because Julia is missing some definitio
475
470
I have been promised them for Julia v1.7 though.
476
471
You can see what the code would look like in [PR #302](https://github.com/JuliaDiff/ChainRules.jl/pull/302).
477
472
478
-
# Conclusion
473
+
##Conclusion
479
474
This document has explained why [`rrule`](@ref) is the way it is.
480
475
In particular, it has highlighted why the primal computation can be changed from simply calling the function.
481
476
Further, it has explained why `rrule` returns a closure for the pullback, rather than it being a separate function.
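To make the closure point concrete, here is a minimal sketch of an `rrule`-shaped function for the sigmoid. The name `rrule_sketch` is hypothetical and is used so the example runs without ChainRulesCore; a real `rrule` pullback also returns a tangent for the function itself, which this sketch leaves out.

```julia
# Hypothetical sketch: `rrule_sketch` mimics the shape of `rrule`
# (primal output plus a pullback closure) without using ChainRulesCore.
σ(x) = 1 / (1 + exp(-x))

function rrule_sketch(::typeof(σ), x)
    ex = exp(x)               # shared work, done once in the primal pass
    y = ex / (1 + ex)         # σ(x)
    σ_neg = 1 / (1 + ex)      # σ(-x)
    # The closure captures `y` and `σ_neg`, so no separate record of
    # intermediates has to be passed around by the caller.
    σ_pullback(ȳ) = ȳ * y * σ_neg
    return y, σ_pullback
end

y, pb = rrule_sketch(σ, 0.5)
x̄ = pb(1.0)
```

Because the captured variables live inside the closure, the caller only needs to keep the pullback itself until the gradient pass.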