|
1 | 1 | # What to return for non-differentiable points
|
2 | 2 | !!! info "What is the short version?"
|
3 |
| - If the function is not differentiable due to e.g. a branch, like `abs`, your rule can reasonably claim the derivative at that point is the value from either branch, *or* any value in-between (e.g. for `abs` claiming 0 is a good idea). |
4 |
| - If it is not differentiable due to the primal not being defined on one side, you can set it to what ever you like. |
5 |
| - Your rule should claim a derivative that is *useful*. |
6 |
| -In calculus one learns that if the derivative as computed by approaching from the left, |
7 |
| -and the derivative one computes as approaching from the right are not equal then the derivative is not defined, |
8 |
| -and we say the function is not differentiable at that point. |
9 |
| -This is distinct from the notion captured by [`NoTangent`](@ref), which is that the tangent space itself is not defined: because in some sense the primal value can not be perturbed e.g. is is a discrete type. |
| 3 | + If the function is not differentiable, choose to return something useful rather than erroring. |
| 4 | + If a function is not differentiable due to e.g. a branch, like `abs`, your rule can reasonably claim the derivative at that point is the value from either branch, *or* any value in between. |
| 5 | + In particular, for local optima (as in the case of `abs`), claiming the derivative is 0 is a good idea. |
| 6 | + Similarly, if the derivative from one side is not defined, or is not finite, return the derivative from the other side. |
| 7 | + Throwing an error or returning `NaN` is generally the least useful option. |
10 | 8 |
|
11 | 9 | However, contrary to what calculus says, most autodiff systems will return an answer for such functions.
|
12 | 10 | For example, for `abs_left(x) = (x <= 0) ? -x : x`, AD will say the derivative at `x=0` is `-1`.
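
To see why an AD system gives this answer, here is a minimal sketch using a hand-rolled forward-mode dual number. The `Dual` type, the `derivative` helper, and the mirrored `abs_right` function are assumptions made for illustration and are not part of any AD package. Real systems differ in detail, but they follow whichever branch the primal value actually takes in much the same way.

```julia
# Hypothetical minimal dual number: a primal value plus one partial derivative.
struct Dual
    value::Float64
    partial::Float64
end

# Only the operations needed by the two functions below are defined.
Base.:-(d::Dual) = Dual(-d.value, -d.partial)
Base.:<=(d::Dual, x::Real) = d.value <= x   # comparisons look at the primal only
Base.:<(d::Dual, x::Real) = d.value < x

abs_left(x) = (x <= 0) ? -x : x    # x = 0 takes the left (negating) branch
abs_right(x) = (x < 0) ? -x : x    # illustrative mirror: x = 0 takes the right branch

# Seed the input with partial 1 and read off the partial of the output.
derivative(f, x::Real) = f(Dual(x, 1.0)).partial

derivative(abs_left, 0.0)    # -1.0
derivative(abs_right, 0.0)   # 1.0
```

Both functions compute the same primal everywhere, yet the reported derivative at `0` depends only on which branch the comparison selects.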
|
@@ -137,4 +135,4 @@ These rough rules are:
|
137 | 135 | - If the derivative from one side is finite and the other isn't, say it is the derivative taken from the finite side.
|
138 | 136 | - When the derivatives from each side are not equal, strongly consider reporting the average.
|
139 | 137 |
|
140 |
| -Our goal as always, is to get a pragmatically useful result for everyone, which must by necessity also avoid a pathological result for anyone. |
| 138 | +Our goal, as always, is to get a pragmatically useful result for everyone, which must by necessity also avoid a pathological result for anyone. |
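
The two rough rules visible above can be sketched as a small helper. This is only an illustration: `pick_derivative` is a hypothetical name, and it assumes the one-sided derivatives `left` and `right` have already been worked out by hand. It does not cover the other rules in the list, nor the case where neither side is finite.

```julia
# Hypothetical helper: choose a derivative to report at a non-differentiable
# point, given the one-sided derivatives computed by hand.
function pick_derivative(left::Real, right::Real)
    # If only one side is finite, report the derivative from the finite side.
    isfinite(left) && !isfinite(right) && return left
    isfinite(right) && !isfinite(left) && return right
    # Otherwise report the average; this equals the common value when the
    # sides agree. (Assumption: both sides finite. The rules sketched here
    # do not say what to do when neither side is finite.)
    return (left + right) / 2
end

pick_derivative(-1, 1)      # 0.0, the `abs` case at x = 0
pick_derivative(Inf, 0.5)   # 0.5, only the right side is finite
```

For `abs`, the average of the two branch derivatives is 0, which matches the advice to report 0 at a local optimum.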