
Commit 15c8590

Fix a few crossrefs + update Zygote's page (#2064)
* Fix a few crossrefs + update Zygote's page
* Sync with recent changes
* Fix docs for default_rng_value
1 parent dfd4549 commit 15c8590

File tree: 3 files changed, +17 −13 lines

docs/src/training/zygote.md

Lines changed: 7 additions & 5 deletions
@@ -1,29 +1,31 @@
 # Automatic Differentiation using Zygote.jl
 
-Flux re-exports the `gradient` from [Zygote](https://github.com/FluxML/Zygote.jl), and uses this function within [`train!`](@ref) to differentiate the model. Zygote has its own [documentation](https://fluxml.ai/Zygote.jl/dev/), in particular listing some [important limitations](https://fluxml.ai/Zygote.jl/dev/limitations/).
+Flux re-exports the `gradient` from [Zygote](https://github.com/FluxML/Zygote.jl), and uses this function within [`train!`](@ref Flux.train!) to differentiate the model. Zygote has its own [documentation](https://fluxml.ai/Zygote.jl/dev/), in particular listing some [important limitations](https://fluxml.ai/Zygote.jl/dev/limitations/).
 
-### Implicit style
+## Implicit style
 
 Flux uses primarily what Zygote calls "implicit" gradients, [described here](https://fluxml.ai/Zygote.jl/dev/#Explicit-and-Implicit-Parameters-1) in its documentation.
 
 ```@docs
 Zygote.gradient
 Zygote.Params
 Zygote.Grads
+Zygote.jacobian(loss, ::Params)
 ```
 
-### Explicit style
+## Explicit style
 
 The other way of using Zygote, and using most other AD packages, is to explicitly provide a function and its arguments.
 
 ```@docs
 Zygote.gradient(f, args...)
 Zygote.withgradient(f, args...)
 Zygote.jacobian(f, args...)
+Zygote.withgradient
 ```
 
 
-### ChainRules
+## ChainRules
 
 Sometimes it is necessary to exclude some code, or a whole function, from automatic differentiation. This can be done using [ChainRules](https://github.com/JuliaDiff/ChainRules.jl):
 
@@ -36,4 +38,4 @@ To manually supply the gradient for one function, you should define a method of
 
 ```@docs
 ChainRulesCore.rrule
-```
+```
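
For readers skimming the diff, here is a minimal sketch of the two styles the updated page contrasts (not part of this commit; the names `W`, `b`, `x`, `loss` and `loss2` are invented for illustration):

```julia
using Flux   # re-exports Zygote's `gradient`

W, b, x = rand(2, 3), rand(2), rand(3)

# Implicit style: parameters are gathered into a `Params` collection,
# and gradients are looked up in the returned `Grads` by object identity.
loss() = sum(abs2, W * x .+ b)
gs = gradient(loss, Flux.params(W, b))
gs[W]                      # gradient with respect to W

# Explicit style: the function receives its arguments directly,
# and `gradient` returns one result per argument.
loss2(W, b) = sum(abs2, W * x .+ b)
dW, db = gradient(loss2, W, b)
```

`withgradient` follows the same pattern but also returns the value of the objective alongside the gradient.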
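
Similarly, a hedged sketch of the `rrule` mechanism mentioned in the ChainRules section (the function `twice` and its rule are made up for this example):

```julia
using ChainRulesCore

twice(x) = 2x

# Supplying the gradient by hand: `rrule` returns the primal value and a
# pullback mapping the output cotangent to the input cotangents.
function ChainRulesCore.rrule(::typeof(twice), x)
    y = twice(x)
    twice_pullback(ȳ) = (NoTangent(), 2 * ȳ)   # d(2x)/dx = 2
    return y, twice_pullback
end

# Zygote then uses this rule instead of tracing through `twice`:
# gradient(twice, 3.0) == (2.0,)
```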

src/losses/functions.jl

Lines changed: 1 addition & 1 deletion
@@ -273,7 +273,7 @@ Return the binary cross-entropy loss, computed as
 
     agg(@.(-y * log(ŷ + ϵ) - (1 - y) * log(1 - ŷ + ϵ)))
 
-Where typically, the prediction `ŷ` is given by the output of a [sigmoid](@ref Activation-Functions) activation.
+Where typically, the prediction `ŷ` is given by the output of a [sigmoid](@ref Activation-Functions-from-NNlib.jl) activation.
 The `ϵ` term is included to avoid infinity. Using [`logitbinarycrossentropy`](@ref) is recomended
 over `binarycrossentropy` for numerical stability.
 
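
As a quick illustration of the docstring being edited here (a usage sketch, not part of the diff; the numbers are arbitrary):

```julia
using Flux

y = [0, 1, 0, 1]                  # binary targets
z = [-1.2, 0.3, -0.5, 2.0]        # raw model outputs (logits)
ŷ = Flux.sigmoid.(z)              # predictions in (0, 1)

Flux.binarycrossentropy(ŷ, y)     # mean of @.(-y*log(ŷ+ϵ) - (1-y)*log(1-ŷ+ϵ))

# Numerically safer, as the docstring recommends: work from the logits directly.
Flux.logitbinarycrossentropy(z, y)
```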

src/utils.jl

Lines changed: 9 additions & 7 deletions
@@ -49,18 +49,20 @@ rng_from_array(::CuArray) = CUDA.default_rng()
 @non_differentiable rng_from_array(::Any)
 
 if VERSION >= v"1.7"
-@doc """
-    default_rng_value()
-
-Create an instance of the default RNG depending on Julia's version.
-- Julia version is < 1.7: `Random.GLOBAL_RNG`
-- Julia version is >= 1.7: `Random.default_rng()`
-"""
 default_rng_value() = Random.default_rng()
 else
 default_rng_value() = Random.GLOBAL_RNG
 end
 
+"""
+    default_rng_value()
+
+Create an instance of the default RNG depending on Julia's version.
+- Julia version is < 1.7: `Random.GLOBAL_RNG`
+- Julia version is >= 1.7: `Random.default_rng()`
+"""
+default_rng_value
+
 """
 glorot_uniform([rng = default_rng_value()], size...; gain = 1) -> Array
 glorot_uniform([rng]; kw...) -> Function
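
To make the relocated docstring concrete, a small usage sketch (not part of the diff; `default_rng_value` is called with the `Flux.` prefix here, since the diff does not show it being exported):

```julia
using Flux, Random

rng = Flux.default_rng_value()        # Random.default_rng() on Julia >= 1.7

W1 = Flux.glorot_uniform(2, 3)        # default RNG, returns a 2×3 Array
W2 = Flux.glorot_uniform(rng, 2, 3)   # explicit RNG

init = Flux.glorot_uniform(MersenneTwister(1))   # partially-applied form -> Function
W3 = init(2, 3)
```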
