Now, as discussed in the introduction, the AD system would on its own choose either 1 or -1, depending on implementation.

However, we have a potentially much nicer answer available: 0.

This has a number of advantages.
- It follows the rule that derivatives are zero at local minima (and maxima).
- If you leave a gradient descent optimizer running, it can eventually converge exactly to the point, whereas with a derivative of 1 or -1 it would never outright converge: it would always step away from the point again.

Further:
- It is a perfectly nice [subderivative](https://en.wikipedia.org/wiki/Subderivative), since it lies between the one-sided derivatives -1 and 1.
- It is the mean of the derivatives on each side, which means that it will agree with central finite differencing at the point.
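
The convergence and finite-differencing points above can be checked with a minimal sketch. It assumes the function under discussion is `abs` at 0 (suggested by the one-sided slopes -1 and 1, but an assumption here). `central_fd`, `dabs_one`, and `descend` are hypothetical helpers written for illustration, not part of any AD package.

```julia
# Hypothetical helper: central finite difference with step h.
central_fd(f, x; h=1e-3) = (f(x + h) - f(x - h)) / (2h)

# Assumed convention that returns 1 at the kink of abs.
# The convention that returns 0 at the kink is just `sign`.
dabs_one(x) = x >= 0 ? 1.0 : -1.0

# Plain fixed-step gradient descent that records every iterate.
function descend(df, x; step=0.25, iters=8)
    xs = [x]
    for _ in 1:iters
        x -= step * df(x)
        push!(xs, x)
    end
    return xs
end

central_fd(abs, 0.0)     # central differencing at the kink
descend(sign, 1.0)       # derivative 0 at the kink
descend(dabs_one, 1.0)   # derivative 1 at the kink
```

With `sign` as the derivative the iterates reach 0 and stay there, while with the convention that returns 1 at the kink they keep bouncing between 0 and -0.25.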

### Piecewise slope change
```@example nondiff
plot(x-> x < 0 ? x : 5x)
```
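
As a rough numerical check, the one-sided slopes of this function at 0 are 1 and 5, and a central finite difference lands on their mean, 3. The helper names below are hypothetical and the step `h` is an arbitrary choice.

```julia
f(x) = x < 0 ? x : 5x

# Hypothetical helpers: one-sided and central difference quotients with step h.
left_slope(f, x; h=1/1024)  = (f(x) - f(x - h)) / h
right_slope(f, x; h=1/1024) = (f(x + h) - f(x)) / h
central_fd(f, x; h=1/1024)  = (f(x + h) - f(x - h)) / (2h)

# Slope on the left, slope on the right, and the central estimate at 0.
slopes = (left_slope(f, 0.0), right_slope(f, 0.0), central_fd(f, 0.0))
println(slopes)
```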

### Derivative zero almost everywhere

```@example nondiff
plot(ceil)
```

### Primal finite, and derivative nonfinite and same on both sides

```@example nondiff
plot(cbrt)
```
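
A small numerical sketch of why this case counts as nonfinite and the same on both sides: the one-sided difference quotients of `cbrt` at 0 are both positive and keep growing as the step shrinks. The helper names are hypothetical.

```julia
# Hypothetical helpers: one-sided difference quotients at x with step h.
left_slope(f, x, h)  = (f(x) - f(x - h)) / h
right_slope(f, x, h) = (f(x + h) - f(x)) / h

# Shrink the step and collect (h, left slope, right slope) for cbrt at 0.
slopes = [(h, left_slope(cbrt, 0.0, h), right_slope(cbrt, 0.0, h)) for h in (1e-3, 1e-6, 1e-9)]
foreach(println, slopes)
```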

(A derivative that is nonfinite and different on each side is also possible with a finite and defined primal, for example `x -> sqrt(abs(x))` at 0, but it is not plotted here.)

### Primal and derivative nonfinite and same on both sides
```@example nondiff
plot(x->inv(x^2))
plot!(; xlims=(-1,1), ylims=(-100,100)) #hide
```

### Primal and derivative nonfinite and differing on both sides