docs/src/index.md
Lines changed: 22 additions & 27 deletions
@@ -74,14 +74,14 @@ Almost always the _pushforward_/_pullback_ will be declared locally within the `
74
74
##### Less formally
75
75
76
76
- The **pushforward** takes a wiggle in the _input space_, and tells what wobble you would create in the output space, by passing it through the function.
77
-
- The **pullback** takes wobblyness information with respect to the function's output, and tells the equivalent wobblyness with repect to the functions input.
77
+
- The **pullback** takes wobbliness information with respect to the function's output, and tells the equivalent wobbliness with respect to the functions input.
78
78
79
79
##### More formally
80
80
The **pushforward** of ``f`` takes the _sensitivity_ of the input of ``f`` to a quantity, and gives the _sensitivity_ of the output of ``f`` to that quantity
81
81
The **pullback** of ``f`` takes the _sensitivity_ of a quantity to the output of ``f``, and gives the _sensitivity_ of that quantity to the input of ``f``.
82
82
83
83
#### Math
84
-
This is all a bit simplied by talking in 1D.
84
+
This is all a bit simplified by talking in 1D.
85
85
86
86
##### Lighter Math
87
87
For a chain of expressions:
@@ -91,30 +91,25 @@ b = g(a)
91
91
c = h(b)
92
92
```
93
93
94
-
The pullback of `g`, which incorperates the knowledge of `∂b/∂a`,
95
-
applies the chainrule to go from `∂c/∂b` to `∂c/∂a`.
94
+
The pullback of `g`, which incorporates the knowledge of `∂b/∂a`,
95
+
applies the chain rule to go from `∂c/∂b` to `∂c/∂a`.
96
96
97
-
the pushforward of `g`, which also incorperates the knowledge of `∂b/∂a`,
98
-
applies the chainrule to go from `∂a/∂x` to `∂b/∂x`.
97
+
The pushforward of `g`, which also incorporates the knowledge of `∂b/∂a`,
98
+
applies the chain rule to go from `∂a/∂x` to `∂b/∂x`.
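
As a rough illustration of the two statements above, here is a minimal sketch using plain Julia closures rather than the ChainRules API. The concrete choices `f(x) = x^2`, `g = sin`, `h(b) = 3b` and the names `g_pullback` and `g_pushforward` are assumptions made only for this example.

```julia
# Hypothetical chain: a = f(x), b = g(a), c = h(b)
f(x) = x^2
g(a) = sin(a)
h(b) = 3b

# Hand-written propagators for g at the point a.
# Both close over the same local knowledge, ∂b/∂a = cos(a).
g_pullback(a) = dc_db -> dc_db * cos(a)      # maps ∂c/∂b to ∂c/∂a
g_pushforward(a) = da_dx -> cos(a) * da_dx   # maps ∂a/∂x to ∂b/∂x

x = 0.5
a = f(x)

dc_da = g_pullback(a)(3.0)      # ∂c/∂b = 3 because h(b) = 3b
db_dx = g_pushforward(a)(2x)    # ∂a/∂x = 2x because f(x) = x^2
```

Both propagators use the same factor `cos(a)`; they differ only in which side of the chain the incoming derivative comes from.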
99
99
100
100
#### Heavier Math
101
-
If I have some functions: ``g(a)``, ``h(b)`` and ``f(x)=g(h(x))``,
102
-
and I know the pullback of ``g``, at ``h(x)`` written: ``\mathrm{pullback}_{g(a)|a=h(x)}``,
101
+
If I have some functions: ``g(a)``, ``h(b)`` and ``f(x)=g(h(x))``, and I know
102
+
the pullback of ``g``, at ``h(x)`` written: ``\mathrm{pullback}_{g(a)|a=h(x)}``,
103
+
and I know the derivative of ``h`` with respect to its input ``b`` at ``g(x)``,
104
+
written: ``\left.\dfrac{∂h}{∂b}\right|_{b=g(x)}`` Then I can use the pullback to
105
+
find: ``\dfrac{∂f}{∂x}``:
103
106
104
-
and I know the deriviative of h with respect to its input ``b`` at ``g(x)``, written:
The input to the pushforward is often called the _perturbation_.
136
131
If the function is `y = f(x)` often the pushforward will be written `ẏ = pushforward(ṡelf, ẋ)`.
137
-
(`ẏ` is commonly used to represent the pertubation for `y`)
132
+
(`ẏ` is commonly used to represent the perturbation for `y`)
138
133
139
134
!!! note
140
135
@@ -421,18 +416,18 @@ As a notation that is the same across propagators, regardless of direction. (Inc
421
416
422
417
423
418
### Why does `frule` and `rrule` return the function evaluation?
424
-
You might wonder why `frule(f, x)` returns `f(x)` and the pushforward for `f` at `x`, and similarly for `rrule` returing `f(x)` and the pullback for `f` at `x`.
425
-
Why not just return the pushforward/pullback, and let the user call `f(x)` to get the answer seperately?
419
+
You might wonder why `frule(f, x)` returns `f(x)` and the pushforward for `f` at `x`, and similarly for `rrule` returning `f(x)` and the pullback for `f` at `x`.
420
+
Why not just return the pushforward/pullback, and let the user call `f(x)` to get the answer separately?
426
421
427
422
There are two reasons the rules also calculate the `f(x)`.
428
423
1. For some rules the output value is used in the definition of its propagator. For example `tan`.
429
424
2. For some rules an alternative way of calculating `f(x)` can give the same answer while also generating intermediate values that can be used in the calculations within the propagator.
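
The first reason can be sketched for `tan`, whose derivative is `1 + tan(x)^2`. The function name `tan_with_pushforward` below is hypothetical and this is not the ChainRules definition; it only shows how the primal output can be reused inside the propagator.

```julia
# Hypothetical hand-rolled rule for tan, returning the value and a pushforward.
function tan_with_pushforward(x)
    y = tan(x)                        # primal result, computed once
    # d/dx tan(x) = 1 + tan(x)^2, so the propagator reuses y
    # instead of evaluating tan(x) a second time.
    pushforward(dx) = (1 + y^2) * dx
    return y, pushforward
end

y, pushforward = tan_with_pushforward(0.3)
dy = pushforward(1.0)
```

If the rule returned only the propagator, a caller who also needed `tan(x)` would have to compute it again.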
430
425
431
426
### Where are the gradients for keyword arguments?
432
427
_pullbacks_ do not return a gradient for keyword arguments;
433
-
similarly _pushfowards_ do not accept a pertubation for keyword arguments.
428
+
similarly _pushfowards_ do not accept a perturbation for keyword arguments.
434
429
This is because in practice functions are very rarely differentiable with respect to keyword arguments.
435
-
As a rule keyword arguments tend to control side-effects, like logging verbsoity,
430
+
As a rule keyword arguments tend to control side-effects, like logging verbosity,
436
431
or to be functionality changing to perform a different operation, e.g. `dims=3`, and thus not differentiable.
437
-
To the best of our knowledge no julia AD system, with support for the definition of custom primatives, supports differentating with respect to keyword arguments.
432
+
To the best of our knowledge no Julia AD system, with support for the definition of custom primitives, supports differentiating with respect to keyword arguments.
438
433
At some point in the future ChainRules may support these. Maybe.
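
A minimal sketch of what this means in practice, using plain closures and hypothetical names (`scaled_sum`, `scaled_sum_with_pullback`). For brevity it omits the gradient with respect to the function itself, so it is a simplification and not the ChainRules API.

```julia
# Hypothetical function: the keyword `verbose` only controls a side effect.
function scaled_sum(x, w; verbose=false)
    verbose && println("summing ", length(x), " elements")
    return w * sum(x)
end

# Hand-written pullback sketch: one gradient per positional argument,
# and nothing for the keyword argument `verbose`.
function scaled_sum_with_pullback(x, w; verbose=false)
    y = scaled_sum(x, w; verbose=verbose)
    pullback(dy) = (dy * w * ones(length(x)), dy * sum(x))
    return y, pullback
end

y, pullback = scaled_sum_with_pullback([1.0, 2.0], 3.0)
dx, dw = pullback(1.0)
```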