Commit 987b83a

Merge pull request #637 from dreivmeister/patch-1
Update nondiff_points.md
2 parents efc2f86 + dac184e commit 987b83a

1 file changed: +15 -15 lines changed

docs/src/maths/nondiff_points.md

Lines changed: 15 additions & 15 deletions
@@ -29,7 +29,7 @@ gr(framestyle=:origin, legend=false)
 ```@example nondiff
 plot(x->x^3)
 ```
-This is the standard case, one can returned the derivative that is defined according to school room calculus.
+This is the standard case, one can return the derivative that is defined according to school room calculus.
 Here we would reasonably say that at `x=0` the derivative is `3*0^2=0`.
 
 
@@ -40,18 +40,18 @@ Here we would reasonably say that at `x=0` the derivative is `3*0^2=0`.
 plot(abs)
 ```
 
-`abs` is the classic example of a function where the derivative is not defines as the limit from above is not equal to the limit from below
+`abs` is the classic example of a function where the derivative is not defined, as the limit from above is not equal to the limit from below.
 
 $$\operatorname{abs}'(0) = \lim_{h \to 0^-} \dfrac{\operatorname{abs}(0)-\operatorname{abs}(0-h)}{0-h} = -1$$
 $$\operatorname{abs}'(0) = \lim_{h \to 0^+} \dfrac{\operatorname{abs}(0)-\operatorname{abs}(0-h)}{0-h} = 1$$
 
-Now, as discussed in the introduction the AD system would on it's own choose either 1 or -1, depending on implementation.
+Now, as discussed in the introduction, the AD system would on it's own choose either 1 or -1, depending on implementation.
 
 We however have a potentially much nicer answer available to use: 0.
 
 This has a number of advantages.
 - It follows the rule that derivatives are zero at local minima (and maxima).
-- If you leave a gradient decent optimizer running it will eventually actually converge absolutely to the point -- where as with it being 1 or -1 it would never outright converge it would always flee.
+- If you leave a gradient descent optimizer running it will eventually actually converge absolutely to the point -- where as with it being 1 or -1 it would never outright converge it would always flee.
 
 Further:
 - It is a perfectly nice member of the [subderivative](https://en.wikipedia.org/wiki/Subderivative).
@@ -61,9 +61,9 @@ Further:
 plot(x-> x < 0 ? x : 5x)
 ```
 
-Here was have 3 main options, all are good.
+Here we have 3 main options, all are good.
 
-We could say there derivative at 0 is:
+We could say the derivative at 0 is:
 - 1: which agrees with backwards finite differencing
 - 5: which agrees with forwards finite differencing
 - 3: which is the mean of `[1, 5]`, and agrees with central finite differencing
@@ -82,9 +82,9 @@ plot(ceil)
 Here it is most useful to say the derivative is zero everywhere.
 The limits are zero from both sides.
 
-The other option for `x->ceil(x)` would be relax the problem into `x->x`, and thus say it is 1 everywhere
-But that it too weird, if the use wanted a relaxation of the problem then they would provide one.
-We can not be imposing that relaxation on to `ceil` for everyone is not reasonable.
+The other option for `x->ceil(x)` would be to relax the problem into `x->x`, and thus say it is 1 everywhere.
+But that it too weird, if the user wanted a relaxation of the problem then they would provide one.
+We can not be imposing that relaxation on to `ceil`, as it is not reasonable for everyone.
 
 ### Not defined on one-side
 ```@example nondiff
@@ -122,17 +122,17 @@ But this is more or less the same as choosing some large value -- in this case a
 plot(x-> sign(x) * cbrt(x))
 ```
 
-In this example, the primal is defined and finite, so we would like a derivative to defined.
-We are back in the case of a local minimal like we were for `abs`.
+In this example, the primal is defined and finite, so we would like a derivative to be defined.
+We are back in the case of a local minimum like we were for `abs`.
 We can make most of the same arguments as we made there to justify saying the derivative is zero.
 
 ## Conclusion
 
 From the case studies a few general rules can be seen for how to choose a value that is _useful_.
 These rough rules are:
-- Say the derivative is 0 at local optima
-- If the derivative from one side is defined and the other isn't, say it is the derivative taken from defined side.
-- If the derivative from one side is finite and the other isn't, say it is the derivative taken from finite side.
-- When derivative from each side is not equal, strongly consider reporting the average
+- Say the derivative is 0 at local optima.
+- If the derivative from one side is defined and the other isn't, say it is the derivative taken from the defined side.
+- If the derivative from one side is finite and the other isn't, say it is the derivative taken from the finite side.
+- When derivative from each side is not equal, strongly consider reporting the average.
 
 Our goal as always, is to get a pragmatically useful result for everyone, which must by necessity also avoid a pathological result for anyone.
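
A worked check of the averaging rule, as a reader's sketch rather than part of the commit: take the piecewise-linear case study plotted with `plot(x-> x < 0 ? x : 5x)`, written here as $f$, and a step size $h > 0$. Both $f$ and $h$ are names introduced for this illustration and do not appear in the changed file. Under those assumptions the backward, forward, and central difference quotients at 0 are:

$$\dfrac{f(0) - f(0-h)}{h} = \dfrac{0 - (-h)}{h} = 1$$
$$\dfrac{f(0+h) - f(0)}{h} = \dfrac{5h - 0}{h} = 5$$
$$\dfrac{f(0+h) - f(0-h)}{2h} = \dfrac{5h - (-h)}{2h} = 3$$

The central quotient is always the mean of the two one-sided quotients, here $(1 + 5)/2 = 3$, which is why reporting the average agrees with central finite differencing.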
