Commit af2276b

Merge pull request #16 from JuliaGNI/make_script_work
Refactored Symbolic Neural Networks
2 parents e05b6d8 + e09ee57


48 files changed: +1956 −683

.githooks/pre-push

Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@

# pre-push git hook that runs all tests before pushing

red='\033[0;31m'
green='\033[0;32m'
no_color='\033[0m'

reponame=$(basename `git rev-parse --show-toplevel`)


echo "\nRunning pre-push hook\n"
echo "Testing $reponame"
julia --project=@. -e "using Pkg; Pkg.test(\"SymbolicNeuralNetworks\")"

if [[ $? -ne 0 ]]; then
    echo "\n${red}ERROR - Tests must pass before push!\n${no_color}"
    exit 1
fi

echo "\n${green}Git hook was SUCCESSFUL!${no_color}\n"

.github/workflows/CI.yml

Lines changed: 0 additions & 1 deletion
@@ -19,7 +19,6 @@ jobs:
       fail-fast: false
       matrix:
         version:
-          - '1.6'
           - '1.10'
           - '^1.11.0-0'
         os:

Project.toml

Lines changed: 13 additions & 5 deletions
@@ -5,22 +5,30 @@ version = "0.1.2"
 
 [deps]
 AbstractNeuralNetworks = "60874f82-5ada-4c70-bd1c-fa6be7711c8a"
-KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
+Latexify = "23fbe1c1-3f47-55db-b15f-69d7ec21a316"
 LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
 RuntimeGeneratedFunctions = "7e49a35a-f44a-4d26-94aa-eba1b4ca6b47"
-SafeTestsets = "1bc83da4-3b8d-516f-aca4-4fe02f6d838f"
 Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
 
 [compat]
-AbstractNeuralNetworks = "0.1, 0.3, 0.4"
-KernelAbstractions = "0.9"
+AbstractNeuralNetworks = "0.3, 0.4"
+Documenter = "1.8.0"
+ForwardDiff = "0.10.38"
+Latexify = "0.16.5"
 RuntimeGeneratedFunctions = "0.5"
 SafeTestsets = "0.1"
 Symbolics = "5, 6"
+Zygote = "0.6.73"
 julia = "1.6"
 
 [extras]
+Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
+Latexify = "23fbe1c1-3f47-55db-b15f-69d7ec21a316"
+Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
+SafeTestsets = "1bc83da4-3b8d-516f-aca4-4fe02f6d838f"
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
+Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
 
 [targets]
-test = ["Test"]
+test = ["Test", "ForwardDiff", "Random", "Documenter", "Latexify", "SafeTestsets", "Zygote"]

README.md

Lines changed: 47 additions & 30 deletions
@@ -6,51 +6,68 @@
 [![Coverage](https://codecov.io/gh/JuliaGNI/SymbolicNeuralNetworks.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/JuliaGNI/SymbolicNeuralNetworks.jl)
 [![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/S/SymbolicNeuralNetworks.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/S/SymbolicNeuralNetworks.html)
 
-SymbolicNeuralNetworks.jl was created to take advantage of [Symbolics.jl](https://symbolics.juliasymbolics.org/stable/) for training neural networks by accelarating their evaluation and by simplifing the computation of some derivatives of the neural network that may be needed for loss functions. This package is based on [AbstractNeuralNetwork.jl](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl) and can be applied to [GeometricMachineLearning.jl](https://github.com/JuliaGNI/GeometricMachineLearning.jl).
+In a perfect world we probably would not need `SymbolicNeuralNetworks`. Its motivation mainly comes from [`Zygote`](https://github.com/FluxML/Zygote.jl)'s inability to handle second-order derivatives in a decent way[^1]. We also note that if [`Enzyme`](https://github.com/EnzymeAD/Enzyme.jl) matures further, there may be no need for `SymbolicNeuralNetworks` in the future. For now (December 2024) `SymbolicNeuralNetworks` offers a good way to incorporate derivatives into the loss function.
 
-To accelerate the evaluation of the neural network, we change its evaluation method with its code generated by [Symbolics.jl](https://symbolics.juliasymbolics.org/stable/), performs some otpmizations on it, and generate the associate function with [RuntimeGeneratedFunctions.jl](https://github.com/SciML/RuntimeGeneratedFunctions.jl).
+[^1]: In some cases it is possible to perform second-order differentiation with `Zygote`, but it is not entirely clear when this works and when it does not.
 
-One can easily symbolize its neural network which will create another neural networks with the symbolize method
+`SymbolicNeuralNetworks` was created to take advantage of [`Symbolics`](https://symbolics.juliasymbolics.org/stable/) for training neural networks by accelerating their evaluation and by simplifying the computation of arbitrary derivatives of the neural network. This package is based on [`AbstractNeuralNetworks`](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl) and can be applied to [`GeometricMachineLearning`](https://github.com/JuliaGNI/GeometricMachineLearning.jl).
+
+`SymbolicNeuralNetworks` creates a symbolic expression of the neural network, computes arbitrary combinations of derivatives and uses [`RuntimeGeneratedFunctions`](https://github.com/SciML/RuntimeGeneratedFunctions.jl) to compile a `Julia` function.
+
+To create a symbolic neural network, we first design a `model` with [`AbstractNeuralNetworks`](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl):
 ```julia
-symbolize(neuralnet, dim)
+using AbstractNeuralNetworks
+
+c = Chain(Dense(2, 2, tanh), Linear(2, 1))
 ```
-where neuralnet is a neural network in the framework of [AbstractNeuralNetwork.jl](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl) and dim the dimension of the input.
 
-## Example
+We now call `SymbolicNeuralNetwork`:
 
-```Julia
+```julia
 using SymbolicNeuralNetworks
-using GeometricMachineLearning
-using Symbolics
 
-@variables sx[1:2]
-@variables nn(sx)[1:1]
-Dx1 = Differential(sx[1])
-Dx2 = Differential(sx[2])
-vectorfield = [0 1; -1 0] * [Dx1(nn[1]), Dx2(nn[1])]
-eqs = (x = sx, nn = nn, vectorfield = vectorfield)
+nn = SymbolicNeuralNetwork(c)
+```
 
-arch = HamiltonianNeuralNetwork(2)
-shnn = SymbolicNeuralNetwork(arch; eqs = eqs)
+## Example
 
-hnn = NeuralNetwork(arch, Float64)
-fun_vectorfield = functions(shnn).vectorfield
-```
+We now train the neural network using `SymbolicPullback`[^2]:
 
-## Performance
+[^2]: This example is discussed in detail in the docs.
 
-Let see the performance to compute the vectorfield between SymbolicNeuralNetwork's version and Zygote's one:
-```Julia
-using Zygote
+```julia
+pb = SymbolicPullback(nn)
+
+using GeometricMachineLearning
 
-ω∇ₓnn(x, params) = [0 1; -1 0] * Zygote.gradient(x->hnn(x, params)[1], x)[1]
+# we generate the data and process them with `GeometricMachineLearning.DataLoader`
+x_vec = -1.:.1:1.
+y_vec = -1.:.1:1.
+xy_data = hcat([[x, y] for x in x_vec, y in y_vec]...)
+f(x::Vector) = exp.(-sum(x.^2))
+z_data = mapreduce(i -> f(xy_data[:, i]), hcat, axes(xy_data, 2))
 
-println("Comparison of performances between Zygote and SymbolicNeuralNetwork for ω∇ₓnn")
-x = [0.5, 0.8]
-@time ω∇ₓnn(x, hnn.params)[1]
-@time fun_vectorfield(x, hnn.params)
+dl = DataLoader(xy_data, z_data)
+
+nn_cpu = NeuralNetwork(c, CPU())
+o = Optimizer(AdamOptimizer(), nn_cpu)
+n_epochs = 1000
+batch = Batch(10)
+o(nn_cpu, dl, batch, n_epochs, pb.loss, pb)
 ```
 
-Let see another example of the training of a SympNet (an intrasec structure preserving architecture present in [GeometricMachineLearning.jl](https://github.com/JuliaGNI/GeometricMachineLearning.jl)) on an harmonic oscillator the data of which come from [GeometricProblem.jl](https://github.com/JuliaGNI/GeometricProblems.jl) :
+We can also train the neural network with `Zygote`-based[^3] automatic differentiation (AD):
 
+[^3]: Note that here we can actually use `Zygote` without problems as it does not involve any complicated derivatives.
 
+```julia
+pb_zygote = GeometricMachineLearning.ZygotePullback(FeedForwardLoss())
+o(nn_cpu, dl, batch, n_epochs, pb_zygote.loss, pb_zygote)
+```
+
+## Development
+
+We are using git hooks, e.g., to enforce that all tests pass before pushing. In order to activate these hooks, the following command must be executed once:
+```
+git config core.hooksPath .githooks
+```
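
Returning to the training example above: once either pullback has been used for optimization, the fit can be spot-checked against the target function. This is a sketch rather than part of the README; the call syntax `nn_cpu(x, params)` is an assumption here:

```julia
x_test = [0.1, 0.2]
# the trained network should approximate f(x_test) = exp(-0.05) ≈ 0.951
nn_cpu(x_test, nn_cpu.params)
```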

docs/Project.toml

Lines changed: 6 additions & 0 deletions
@@ -1,3 +1,9 @@
 [deps]
+AbstractNeuralNetworks = "60874f82-5ada-4c70-bd1c-fa6be7711c8a"
+CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
 Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+GeometricMachineLearning = "194d25b2-d3f5-49f0-af24-c124f4aa80cc"
+Latexify = "23fbe1c1-3f47-55db-b15f-69d7ec21a316"
 SymbolicNeuralNetworks = "aed23131-dcd0-47ca-8090-d21e605652e3"
+Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
+Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

docs/make.jl

Lines changed: 9 additions & 0 deletions
@@ -1,5 +1,9 @@
 using SymbolicNeuralNetworks
 using Documenter
+using Latexify: LaTeXString
+
+# taken from https://github.com/korsbo/Latexify.jl/blob/master/docs/make.jl
+Base.show(io::IO, ::MIME"text/html", l::LaTeXString) = l.s
 
 DocMeta.setdocmeta!(SymbolicNeuralNetworks, :DocTestSetup, :(using SymbolicNeuralNetworks); recursive=true)
 
@@ -13,9 +17,14 @@ makedocs(;
         canonical="https://JuliaGNI.github.io/SymbolicNeuralNetworks.jl",
         edit_link="main",
         assets=String[],
+        mathengine = MathJax3()
     ),
     pages=[
         "Home" => "index.md",
+        "Tutorials" => [
+            "Vanilla Symbolic Neural Network" => "symbolic_neural_networks.md",
+            "Double Derivative" => "double_derivative.md",
+        ],
     ],
 )

docs/src/double_derivative.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@

# Arbitrarily Combining Derivatives

`SymbolicNeuralNetworks` can compute derivatives of arbitrary order of a neural network. For this we use two `struct`s:
1. [`SymbolicNeuralNetworks.Jacobian`](@ref) and
2. [`SymbolicNeuralNetworks.Gradient`](@ref).

!!! info "Terminology"
    Whereas the name `Jacobian` is standard for the matrix whose entries consist of all partial derivatives of the output of a function, the name `Gradient` is typically not used the way it is done here. Normally a *gradient* collects all the partial derivatives of a scalar function. In `SymbolicNeuralNetworks` the `struct` `Gradient` takes all partial derivatives of a symbolic array with respect to all the parameters of a neural network. So if we compute the `Gradient` of a matrix, the corresponding routine returns *a matrix of neural network parameters*, each of which is the *standard gradient* of a matrix element. So it can be written as:
    ```math
    \mathtt{Gradient}\left( \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1m} \\ m_{21} & m_{22} & \cdots & m_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ m_{n1} & m_{n2} & \cdots & m_{nm} \end{pmatrix} \right) = \begin{pmatrix} \nabla_{\mathbb{P}}m_{11} & \nabla_{\mathbb{P}}m_{12} & \cdots & \nabla_{\mathbb{P}}m_{1m} \\ \nabla_{\mathbb{P}}m_{21} & \nabla_{\mathbb{P}}m_{22} & \cdots & \nabla_{\mathbb{P}}m_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \nabla_{\mathbb{P}}m_{n1} & \nabla_{\mathbb{P}}m_{n2} & \cdots & \nabla_{\mathbb{P}}m_{nm} \end{pmatrix},
    ```
    where ``\mathbb{P}`` are the parameters of the neural network. For computational and consistency reasons each element ``\nabla_{\mathbb{P}}m_{ij}`` is stored as `NeuralNetworkParameters`.

## Jacobian of a Neural Network

[`SymbolicNeuralNetworks.Jacobian`](@ref) differentiates a symbolic expression with respect to the input arguments of a neural network:

```@example jacobian_gradient
using AbstractNeuralNetworks
using SymbolicNeuralNetworks
using SymbolicNeuralNetworks: Jacobian, Gradient, derivative
using Latexify: latexify

c = Chain(Dense(2, 1, tanh; use_bias = false))
nn = SymbolicNeuralNetwork(c)
□ = Jacobian(nn)
# we show the derivative of the network output with respect to its input
derivative(□) |> latexify
```

Note that the output of `nn` is one-dimensional and we use the convention

```math
\square_{ij} = [\mathrm{jacobian}_{x}f]_{ij} = \frac{\partial}{\partial{}x_j}f_i,
```

so the output has shape ``\mathrm{output\_dim}\times\mathrm{input\_dim} = 1\times2``:

```@example jacobian_gradient
@assert size(derivative(□)) == (1, 2) # hide
size(derivative(□))
```
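
This symbolic Jacobian can be cross-checked numerically. The following is a sketch rather than part of the package documentation: it assumes that `NeuralNetwork(c)` initializes numeric parameters, that a `Chain` is callable as `c(x, ps)`, and that `build_nn_function` accepts `derivative(□)` just as it accepts `derivative(g)` further down this page:

```julia
using ForwardDiff

nn_num = NeuralNetwork(c)  # numeric parameters for the same architecture
built_jac = build_nn_function(derivative(□), nn.params, nn.input)

x = [0.3, -0.5]
# the compiled symbolic Jacobian should agree with ForwardDiff
built_jac(x, nn_num.params) ≈ ForwardDiff.jacobian(z -> c(z, nn_num.params), x)
```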

## Gradient of a Neural Network

As described above, [`SymbolicNeuralNetworks.Gradient`](@ref) differentiates every element of the array-valued output with respect to the neural network parameters:

```@example jacobian_gradient
using SymbolicNeuralNetworks: Gradient

g = Gradient(nn)

derivative(g)[1].L1.W |> latexify
```
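
To get numbers out of this symbolic gradient, the same `build_nn_function` mechanism used below for the double derivative applies. A sketch with made-up parameter values (the layer was created with `use_bias = false`, so `W` is its only parameter):

```julia
built_grad = build_nn_function(derivative(g), nn.params, nn.input)

x = [1., 0.]
ps = NeuralNetworkParameters((L1 = (W = [1. 0.],), ))
built_grad(x, ps)[1].L1.W  # gradient of the output with respect to W, evaluated at x
```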

## Double Derivatives

We can easily differentiate a neural network twice by using [`SymbolicNeuralNetworks.Jacobian`](@ref) and [`SymbolicNeuralNetworks.Gradient`](@ref) together. We first use [`SymbolicNeuralNetworks.Jacobian`](@ref) to differentiate the network output with respect to its input:

```@example jacobian_gradient
using AbstractNeuralNetworks
using SymbolicNeuralNetworks
using SymbolicNeuralNetworks: Jacobian, Gradient, derivative
using Latexify: latexify

c = Chain(Dense(2, 1, tanh))
nn = SymbolicNeuralNetwork(c)
□ = Jacobian(nn)
# we show the derivative of the network output with respect to its input
derivative(□) |> latexify
```

We see that the output is a matrix of size ``\mathrm{output\_dim} \times \mathrm{input\_dim}``. We can further compute the gradients of all entries of this matrix with [`SymbolicNeuralNetworks.Gradient`](@ref):

```@example jacobian_gradient
g = Gradient(derivative(□), nn)
nothing # hide
```

So [`SymbolicNeuralNetworks.Gradient`](@ref) differentiates every element of the matrix with respect to all neural network parameters. In order to access the gradient of the ``(1, 1)`` entry of the Jacobian with respect to the bias `b` in the first layer, we write:

```@example jacobian_gradient
matrix_index = (1, 1)
layer = :L1
weight = :b
derivative(g)[matrix_index...][layer][weight] |> latexify
```

If we now want to obtain an executable `Julia` function, we have to use [`build_nn_function`](@ref). We call this function on:

```math
x = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad W = \begin{pmatrix} 1 & 0 \end{pmatrix}, \quad b = \begin{pmatrix} 0 \end{pmatrix}
```

```@example jacobian_gradient
built_function = build_nn_function(derivative(g), nn.params, nn.input)

x = [1., 0.]
ps = NeuralNetworkParameters((L1 = (W = [1. 0.], b = [0.]), ))
built_function(x, ps)[matrix_index...][layer][weight]
```
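
As a sanity check, the number above can be reproduced by hand (under the parameter values assumed above). For ``f(x) = \tanh(w^\mathrm{T} x + b)`` the ``(1, 1)`` Jacobian entry is ``(1 - \tanh^2(z))\, w_1`` with ``z = w^\mathrm{T} x + b``, so its derivative with respect to ``b`` is ``-2\tanh(z)(1 - \tanh^2(z))\, w_1``. With the values above, ``z = 1`` and ``w_1 = 1``:

```julia
z = 1.0                       # wᵀx + b for the values above
-2tanh(z) * (1 - tanh(z)^2)   # ≈ -0.6397
```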

!!! info
    With `SymbolicNeuralNetworks`, the `struct`s [`SymbolicNeuralNetworks.Jacobian`](@ref) and [`SymbolicNeuralNetworks.Gradient`](@ref), together with [`build_nn_function`](@ref), make it easy to build combinations of derivatives. This is much harder when using `Zygote`-based AD.
