* Add more xrefs
* Fix some typos
* Update docs/src/index.md
Co-authored-by: Lyndon White <oxinabox@ucc.asn.au>
docs/src/index.md
12 additions & 11 deletions
@@ -33,7 +33,7 @@ Knowing rules for more complicated functions speeds up the autodiff process as i
 !!! terminology "`frule` and `rrule`"
 `frule` and `rrule` are ChainRules specific terms.
 Their exact functioning is fairly ChainRules specific, though other tools have similar functions.
-The core notion is sometimes called _custom AD primitives_, _custom adjoints_, _custom_gradients_, _custom sensitivities_.
+The core notion is sometimes called _custom AD primitives_, _custom adjoints_, _custom gradients_, _custom sensitivities_.

 The rules are encoded as `frule`s and `rrule`s, for use in forward-mode and reverse-mode differentiation respectively.
@@ -63,7 +63,8 @@ end
 where again `y = foo(args; kwargs...)`,
 and `∂Y` is the result of propagating the derivative information forwards at that point.
 This propagation is call the pushforward.
-One could think of writing `∂Y = pushforward(Δself, Δargs)`, and often we will think of the `frule` as having the primal computation `y = foo(args...; kwargs...)`, and the push-forward `∂Y = pushforward(Δself, Δargs...)`
+Often we will think of the `frule` as having the primal computation `y = foo(args...; kwargs...)`, and the pushforward `∂Y = pushforward(Δself, Δargs...)`,
+even though they are not present in seperate forms in the code.


 !!! note "Why `rrule` returns a pullback but `frule` doesn't return a pushforward"
@@ -228,9 +229,9 @@ Similarly every `pullback` returns an extra `∂self`, which for things without

 ### Pullback/Pushforward and Directional Derivative/Gradient

-The most trivial use of the `pushforward` from within `frule` is to calculate the directional derivative:
+The most trivial use of the `pushforward` from within `frule` is to calculate the [directional derivative](https://en.wikipedia.org/wiki/Directional_derivative):

-If we would like to know the the directional derivative of `f` for an input change of `(1.5, 0.4, -1)`
+If we would like to know the directional derivative of `f` for an input change of `(1.5, 0.4, -1)`

 ```julia
 direction = (1.5, 0.4, -1) # (ȧ, ḃ, ċ)
@@ -244,7 +245,7 @@ y, ∂y_∂b = frule((Zero(), 0, 1, 0), f, a, b, c)
 y, ∂y_∂c = frule((Zero(), 0, 0, 1), f, a, b, c)
 ```

-Similarly, the most trivial use of `rrule` and returned `pullback` is to calculate the [Gradient](https://en.wikipedia.org/wiki/Gradient):
+Similarly, the most trivial use of `rrule` and returned `pullback` is to calculate the [gradient](https://en.wikipedia.org/wiki/Gradient):

 ```julia
 y, f_pullback = rrule(f, a, b, c)
@@ -259,20 +260,20 @@ And we thus have the partial derivatives ``\overline{\mathrm{self}}, = \dfrac{
 The values that come back from pullbacks or pushforwards are not always the same type as the input/outputs of the primal function.
 They are differentials, which correspond roughly to something able to represent the difference between two values of the primal types.
 A differential might be such a regular type, like a `Number`, or a `Matrix`, matching to the original type;
-or it might be one of the `AbstractDifferential` subtypes.
+or it might be one of the [`AbstractDifferential`](@ref ChainRulesCore.AbstractDifferential) subtypes.

 Differentials support a number of operations.
 Most importantly: `+` and `*`, which let them act as mathematical objects.

 The most important `AbstractDifferential`s when getting started are the ones about avoiding work:

-- `Thunk`: this is a deferred computation. A thunk is a [word for a zero argument closure](https://en.wikipedia.org/wiki/Thunk). A computation wrapped in a `@thunk` doesn't get evaluated until `unthunk` is called on the thunk. `unthunk` is a no-op on non-thunked inputs.
-- `One`, `Zero`: There are special representations of `1` and `0`. They do great things around avoiding expanding `Thunks` in multiplication and (for `Zero`) addition.
+- [`Thunk`](@ref): this is a deferred computation. A thunk is a [word for a zero argument closure](https://en.wikipedia.org/wiki/Thunk). A computation wrapped in a `@thunk` doesn't get evaluated until [`unthunk`](@ref) is called on the thunk. `unthunk` is a no-op on non-thunked inputs.
+- [`One`](@ref), [`Zero`](@ref): There are special representations of `1` and `0`. They do great things around avoiding expanding `Thunks` in multiplication and (for `Zero`) addition.

 ### Other `AbstractDifferential`s:
-- `Composite{P}`: this is the differential for tuples and structs. Use it like a `Tuple` or `NamedTuple`. The type parameter `P` is for the primal type.
-- `DoesNotExist`: Zero-like, represents that the operation on this input is not differentiable. Its primal type is normally `Integer` or `Bool`.
-- `InplaceableThunk`: it is like a `Thunk` but it can do in-place `add!`.
+- [`Composite{P}`](@ref Composite): this is the differential for tuples and structs. Use it like a `Tuple` or `NamedTuple`. The type parameter `P` is for the primal type.
+- [`DoesNotExist`](@ref): Zero-like, represents that the operation on this input is not differentiable. Its primal type is normally `Integer` or `Bool`.
+- [`InplaceableThunk`](@ref): it is like a `Thunk` but it can do in-place `add!`.
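
For readers of this diff, the deferred-computation idea behind `Thunk` and `unthunk` described in the last hunk can be illustrated with a minimal, self-contained sketch. It does not use ChainRulesCore: `MyThunk`, `my_unthunk`, `expensive`, and `calls` are hypothetical names introduced only for illustration, and the real `Thunk` type may differ in its details.

```julia
# Hypothetical stand-in for a deferred computation; NOT ChainRulesCore's Thunk.
struct MyThunk{F}
    f::F  # zero-argument closure holding the deferred work
end

# Run the deferred computation when asked.
my_unthunk(t::MyThunk) = t.f()
# No-op on inputs that are not thunks.
my_unthunk(x) = x

# Counter recording how many times the expensive work actually runs.
calls = Ref(0)
expensive() = (calls[] += 1; sum(abs2, 1:1000))

# Wrapping the call defers it: nothing is computed at construction time.
t = MyThunk(() -> expensive())
calls_before = calls[]

# The work happens only once the thunk is unwrapped.
value = my_unthunk(t)
calls_after = calls[]

# A plain value passes through unchanged.
plain = my_unthunk(3.0)
```

In this sketch `calls_before` and `calls_after` differ by one, showing that the work is skipped entirely if the thunk is never unwrapped.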