@@ -45,18 +45,25 @@ julia> t()()
### When to `@thunk`?
When writing `rrule`s (and to a lesser extent `frule`s), it is important to `@thunk`
appropriately.
- Propagation rule's that return multiple derivatives are not able to do all the computing themselves.
- By `@thunk`ing the work required for each, they then compute only what is needed.
+ Propagation rules that return multiple derivatives may not have all of those derivatives used.
+ By `@thunk`ing the work required for each derivative, they then compute only what is needed.
+
+ #### How do thunks prevent work?
+ If we have `res = pullback(...) = @thunk(f(x)), @thunk(g(x))`
+ then `dx + res[1]` would evaluate only `f(x)`, not `g(x)`.
+ Also, if we did `Zero() * res[1]`, the result would be `Zero()` and `f(x)` would not be evaluated.
#### So why not thunk everything?
`@thunk` creates a closure over the expression, which (effectively) creates a `struct`
with a field for each variable used in the expression and the call operator overloaded.
Do not use `@thunk` if this would be as much or more work than evaluating the expression itself. Examples include:
- - The expression wrapping something in a `struct`, such as `Adjoint(x)` or `Diagonal(x)`
- The expression being a constant
+ - The expression is merely wrapping something in a `struct`, such as `Adjoint(x)` or `Diagonal(x)`
- The expression being itself a `thunk`
- The expression being from another `rrule` or `frule` (it would be `@thunk`ed if required by the defining rule already)
+ - There is only one derivative being returned, so from the fact that the user called `frule`/`rrule`
+ they clearly will want to use that one.
"""
struct Thunk{F} <: AbstractThunk
    f::F
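
The docstring's paragraph on how thunks prevent work can be illustrated with a small self-contained sketch. `ToyThunk`, `ToyZero`, `toy_pullback`, and the `calls` counter are hypothetical stand-ins invented for this example; they are not the package's own types, and the real `Thunk` and `Zero` may behave differently in detail.

```julia
# Hypothetical stand-ins for illustration only; not the package's types.
struct ToyThunk{F}
    f::F
end
(t::ToyThunk)() = t.f()                      # calling the thunk runs the deferred work

struct ToyZero end                           # stand-in for a symbolic zero

Base.:+(a::Number, t::ToyThunk) = a + t()    # addition needs the value, so it forces the thunk
Base.:*(::ToyZero, ::ToyThunk) = ToyZero()   # zero times anything is zero; the thunk never runs

# Counters recording which deferred computations actually ran.
const calls = Dict(:f => 0, :g => 0)
f(x) = (calls[:f] += 1; 2x)
g(x) = (calls[:g] += 1; 3x)

# A toy pullback returning two deferred derivatives.
toy_pullback(x) = (ToyThunk(() -> f(x)), ToyThunk(() -> g(x)))

res = toy_pullback(1.0)
dx = 10.0

total = dx + res[1]       # forces only the first thunk
z = ToyZero() * res[2]    # forces nothing
```

After this runs, `calls[:f]` is 1 and `calls[:g]` is still 0: only the derivative that was actually consumed was computed.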
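
The docstring's remark that a closure is effectively a `struct` with one field per captured variable can be checked directly. This is a rough sketch: `make_closure` is a hypothetical helper, and the field layout of closures is a Julia implementation detail rather than a documented guarantee.

```julia
# Hypothetical helper: returns a closure that captures x and y.
make_closure(x, y) = () -> x * y

c = make_closure(2.0, 3.0)

captured = fieldnames(typeof(c))   # names of the captured variables
value = c()                        # the closure type is callable
```

Wrapping something as cheap as `Adjoint(x)` in a thunk therefore builds one small struct in order to postpone building another, which is why the list in the docstring advises against it.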