@@ -21,7 +21,7 @@ Which you *can* do.
21
21
However, there is nowhere to go with an error: the user still wants a derivative, so this is not useful.
22
22
23
23
Let us explore what is useful:
24
- # Case Studies
24
+ ## Case Studies
25
25
26
26
```@setup nondiff
27
27
using Plots
@@ -72,8 +72,9 @@ We could say there derivative at 0 is:
72
72
- 3: which is the mean of `[1, 5]`, and agrees with central finite differencing
73
73
74
74
All of these options are perfectly nice members of the [subderivative](https://en.wikipedia.org/wiki/Subderivative).
75
- Saying it is ` 3 ` is the arguably the nicest, but it is also the most expensive to compute; and it will
76
-
75
+ `3` is arguably the nicest, but it is also the most expensive to compute.
76
+ In general all are acceptable.
77
+
77
78
78
79
### Derivative zero almost everywhere
79
80
@@ -88,25 +89,38 @@ The other option for `x->ceil(x)` would be relax the problem into `x->x`, and th
88
89
But that is too weird; if the user wanted a relaxation of the problem then they would provide one.
89
90
Imposing that relaxation on `ceil` for everyone is not reasonable.
90
91
91
- ### Primal finite, and derivative nonfinite and same on both sides
92
-
92
+ ### Not defined on one side
93
93
```@example nondiff
94
- plot(cbrt)
94
+ plot(x->exp(2log(x)))
95
+ plot!(; xlims=(-10,10), ylims=(-10,10)) #hide
95
96
```
96
97
98
+ We do not have to worry about what to return for the side where it is not defined.
99
+ We will never be asked for the derivative at e.g. `x=-2.5`, since the primal function errors.
100
+ But we do need to worry about the boundary, if that boundary point doesn't error.
101
+
102
+ Since we will never be asked about the left-hand side (as the primal errors), we can use just the right-hand side derivative.
103
+ In this case giving 0.0.
104
+
105
+ Also nice in this case is that it agrees with the symbolic simplification of `x->exp(2log(x))` into `x->x^2`.
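As a short worked check of the boundary value (a sketch, assuming the primal at `x = 0` evaluates to `0`, which is what `exp(-Inf)` gives in floating point), the right-hand derivative at the boundary is:

```math
\lim_{h \to 0^+} \frac{\exp(2\log h) - 0}{h} = \lim_{h \to 0^+} \frac{h^2}{h} = \lim_{h \to 0^+} h = 0
```

This matches the derivative `2x` of the simplified form `x->x^2` evaluated at `x = 0`.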
106
+
97
107
108
+ ### Derivative nonfinite and same on both sides
98
109
99
- ### Primal and derivative Non-finite and different on both sides
100
110
```@example nondiff
101
- plot(x->inv(x^2))
102
- plot!(; xlims=(-1,1), ylims=(-100,100)) #hide
111
+ plot(cbrt)
103
112
```
104
113
105
- In this case the primal isn't finite, so the value of the derivative can be assumed to matter less.
106
- It is not surprising to see a nonfinite gradient for nonfinite primal.
107
- So it is fine to have a the gradient being nonfinite.
114
+ Here we have no real choice but to say the derivative at `0` is `Inf`.
115
+ As an alternative, we could consider returning some large but finite value.
116
+ However, if too large it will just overflow rapidly anyway; and if too small it will not dominate over finite terms.
117
+ It is not possible to find a given value that is always large enough.
118
+ Another alternative would be to consider the derivative at `nextfloat(0.0)` or `prevfloat(0.0)`.
119
+ But this is more or less the same as choosing some large value -- in this case an extremely large value that will rapidly overflow.
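The point about `nextfloat(0.0)` can be checked numerically. The following is a minimal sketch, assuming we write the analytic derivative of `cbrt` by hand; the name `dcbrt` is hypothetical and not part of any package.

```julia
# Analytic derivative of cbrt, written out by hand for illustration.
dcbrt(x) = 1 / (3 * cbrt(x)^2)

at_zero = dcbrt(0.0)             # division by zero: Inf
nearby  = dcbrt(nextfloat(0.0))  # finite, but extremely large

@show isinf(at_zero)
@show isfinite(nearby)
@show isinf(nearby^2)            # one multiplication later it has overflowed anyway
```

The value at `nextfloat(0.0)` is finite, but it overflows as soon as it is squared, so in practice it buys little over returning `Inf` directly.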
120
+
121
+
122
+ ### Derivative nonfinite and different on both sides
108
123
109
- ## Primal finite and derivative nonfinite and different on each side
110
124
```@example nondiff
111
125
plot(x-> sign(x) * cbrt(x))
112
126
```
@@ -115,28 +129,18 @@ In this example, the primal is defined and finite, so we would like a derivative
115
129
We are back in the case of a local minimum, as we were for `abs`.
116
130
We can make most of the same arguments as we made there to justify saying the derivative is zero.
117
131
118
- ### Not defined on one-side
119
- ``` @example nondiff
120
- plot(x->exp(2log(x)))
121
- ```
122
-
123
- We do not have to worry about what to return for the side where it is not defined.
124
- As we will never be asked for the derivative at e.g. ` x=-2.5 ` since the primal function errors.
125
- But we do need to worry about at the boundary -- if that boundary point doesn't error.
126
-
127
- Since we will never be asked about the left-hand side (as the primal errors), we can use just the right-hand side derivative.
128
- In this case giving 0.0.
129
- `
130
- Also nice in this case is that it agrees with the symbolic simplification of ` x->exp(2log(x)) ` into ` x->x^2 ` .
132
+ ## Conclusion
131
133
134
+ From the case studies, a few general rules can be seen for how to choose a value that is _useful_.
135
+ These rough rules are:
136
+ - Say the derivative is 0 at local optima
137
+ - If the derivative from one side is defined and the other isn't, say it is the derivative taken from the defined side.
138
+ - If the derivative from one side is finite and the other isn't, say it is the derivative taken from the finite side.
139
+ - When the derivatives from each side are not equal, strongly consider reporting the average.
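One possible way to read these rough rules as code is sketched below. The helper `pick_derivative` is hypothetical and not part of any package; it assumes the one-sided derivatives are already known, uses `nothing` to mark a side where the primal errors, and treats a sign change between the two sides as a local optimum. The ordering of the checks is also an assumption.

```julia
# Hypothetical helper: choose a single derivative value from the
# left-hand and right-hand derivatives at a point.
function pick_derivative(left, right)
    left === nothing && return right     # only the right side is defined
    right === nothing && return left     # only the left side is defined
    left == right && return left         # sides agree (includes Inf on both sides)
    # slopes change sign across the point: treat as a local optimum
    sign(left) * sign(right) < 0 && return zero(left)
    isfinite(left) && !isfinite(right) && return left    # take the finite side
    isfinite(right) && !isfinite(left) && return right
    return (left + right) / 2            # both finite but unequal: average
end

pick_derivative(-1.0, 1.0)     # abs at 0: local minimum
pick_derivative(nothing, 0.0)  # x->exp(2log(x)) at 0: right side only
pick_derivative(Inf, Inf)      # cbrt at 0: same on both sides
pick_derivative(-Inf, Inf)     # x->sign(x)*cbrt(x) at 0: local minimum
pick_derivative(1, 5)          # unequal finite sides: the mean
```

Real rules would be written per function rather than through a generic helper like this; the sketch only illustrates how the case studies map onto the list.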
132
140
141
+ Our goal, as always, is to get a pragmatically useful result for everyone, which must by necessity also avoid a pathological result for anyone.
133
142
134
- ### Not defined on one side, non-finite on the other
135
- ``` @example nondiff
136
- plot(log)
137
- ```
138
143
139
- Here there is no harm in taking the value on the defined, finite
140
144
141
145
### sub/super-differential convention
142
146
**TODO: Incorporate this with the rest of the document, or move to design notes**