Refactored Symbolic Neural Networks #16

Merged
merged 73 commits on Dec 4, 2024

Commits
01f9715
Started cleaning up repository and making things readable.
benedict-96 Oct 28, 2024
70dffa5
Added the githook similar to GML and GP.
benedict-96 Oct 30, 2024
672958b
Converted SymbolicNeuralNetwork to something that's readable and hope…
benedict-96 Oct 30, 2024
47b7c8f
Added a test for the new interface.
benedict-96 Oct 30, 2024
9925266
@view -> @views.
benedict-96 Oct 30, 2024
3e7fb50
Started expanding on documentation.
benedict-96 Oct 30, 2024
139a18f
Exporting HSNN now.
benedict-96 Oct 30, 2024
d621424
Tried to express Hamiltonian vector field through matrix multiplication.
benedict-96 Oct 30, 2024
5e0727d
Fixed typo.
benedict-96 Oct 30, 2024
6a4829a
Added parallelizing of functions and made script work (up to a point).
benedict-96 Oct 31, 2024
7ceb9dc
Attempt to make SymbolicNeuralNetworks work with Chains.
benedict-96 Nov 3, 2024
05d259d
add comments to and clean up buildsymbolic
michakraus Nov 5, 2024
19bc4fc
add comments to and clean up build_eval and build_hamiltonian
michakraus Nov 5, 2024
df4a9e7
Removed unnecessary return.
benedict-96 Nov 5, 2024
466c6db
Made HNN work.
benedict-96 Nov 6, 2024
ef2b13c
Removed utils file (there is a utils folder).
benedict-96 Nov 7, 2024
fd721e9
Finished initial refactoring.
benedict-96 Nov 7, 2024
ae1a5cc
Made build_function work and started writing docstrings.
benedict-96 Nov 9, 2024
e8a35d7
Finished docstrings.
benedict-96 Nov 11, 2024
b977291
Changed namings slightly.
benedict-96 Nov 13, 2024
03d1c87
Fixed script(s).
benedict-96 Nov 13, 2024
76f6a85
Including files at right spot (HNN/LNN after SNN).
benedict-96 Nov 13, 2024
b4c2d9d
Made SHNN a separate struct and added docstrings.
benedict-96 Nov 13, 2024
36395de
Added AbstractArray of BasicSymbolic to possible types in EqT (this c…
benedict-96 Nov 13, 2024
dc5ce72
SymbolicNeuralNetwork -> AbstractSymbolicNeuralNetwork.
benedict-96 Nov 13, 2024
209c365
Parallelized with mapreduce.
benedict-96 Nov 13, 2024
557d34e
Changed name and contents of test.
benedict-96 Nov 18, 2024
022fa0f
Commented out almost all tests.
benedict-96 Nov 18, 2024
6f8ccd3
Implemented SymbolicPullback.
benedict-96 Nov 18, 2024
648a68d
Started adding tests (and fixing things).
benedict-96 Nov 18, 2024
51082b8
Found problem with one test.
benedict-96 Nov 18, 2024
bbab67c
Modified test.
benedict-96 Nov 19, 2024
5d91645
Added plots and comparison to non-Hamiltonian neural network.
benedict-96 Nov 20, 2024
7ccbb16
Made pullback more legible and expanded on tests.
benedict-96 Nov 20, 2024
3cba750
Only testing some cases for now (until performance issues are fixed).
benedict-96 Nov 20, 2024
9ede565
Made generated function work.
benedict-96 Nov 25, 2024
556b2a1
Now testing more stuff.
benedict-96 Nov 25, 2024
1fee7a6
Removed unnecessary line.
benedict-96 Nov 25, 2024
5982d96
Made test quicker.
benedict-96 Nov 25, 2024
25aced5
Added symbolic input to SymbolicNeuralNetwork struct.
benedict-96 Nov 26, 2024
72e1f3f
Changed name of file for consistency with abstract neural networks.
benedict-96 Nov 26, 2024
5555e3f
Added derivative object.
benedict-96 Nov 26, 2024
870fa3a
Added extra files for symbolic parameters of specific layers in order…
benedict-96 Nov 26, 2024
cb1e8d2
Added types.
benedict-96 Nov 26, 2024
6ccbb00
Started making Jacobian work.
benedict-96 Nov 26, 2024
3de2fa2
Put additional routines into separate file.
benedict-96 Nov 26, 2024
a2e6371
Finished Jacobian and started fixing/working on Gradient.
benedict-96 Nov 27, 2024
f47a757
Docs aren't doing anything at the moment.
benedict-96 Nov 27, 2024
e16264c
Added docstrings and fixed functions for various cases.
benedict-96 Nov 28, 2024
25942b0
Added docstrings and fixed a number of typos.
benedict-96 Nov 28, 2024
a26fb20
Fixed problems with compilation and docstrings.
benedict-96 Nov 29, 2024
513197b
Removed all tests except docstring tests.
benedict-96 Nov 30, 2024
c601d0a
Added plot of Gaussian that is used for training.
benedict-96 Nov 30, 2024
7643a5b
Fixed snn script.
benedict-96 Dec 1, 2024
045d800
Added double derivative docs.
benedict-96 Dec 1, 2024
75ca456
Updated Readme.
benedict-96 Dec 2, 2024
838d817
Removed files that aren't used anymore.
benedict-96 Dec 2, 2024
881381c
Finished double derivative docs. Removed HNN example from docs.
benedict-96 Dec 2, 2024
29d0349
Removed files that aren't used anymore.
benedict-96 Dec 2, 2024
6dd3d93
Fixed latexify problem, but introduced type piracy.
benedict-96 Dec 2, 2024
492b7c3
Cleaned up tests.
benedict-96 Dec 2, 2024
2909f6b
Not using one of the tests in symbolic_gradient.jl for now.
benedict-96 Dec 2, 2024
ae7a5d6
Removed ChainRulesCore and KernelAbstractions (not needed).
benedict-96 Dec 2, 2024
916c62b
Merge branch 'main' into make_script_work
benedict-96 Dec 2, 2024
c5d5621
Not testing for 1.6 anymore.
benedict-96 Dec 2, 2024
a4a2e7f
Resolved conflict
benedict-96 Dec 3, 2024
24384bd
Updated docs.
benedict-96 Dec 3, 2024
05ee7b2
Minor fix.
michakraus Dec 3, 2024
ac49890
Remove unnecessary packages from docs Project.
michakraus Dec 3, 2024
d8512a3
Added script for comparing pullbacks and added an explanation for why…
benedict-96 Dec 3, 2024
0fcce26
Fix fix_create_array.
michakraus Dec 3, 2024
6e72a04
Extend pullback comparison script.
michakraus Dec 3, 2024
e09ee57
Updated docstring.
benedict-96 Dec 3, 2024
19 changes: 19 additions & 0 deletions .githooks/pre-push
@@ -0,0 +1,19 @@
# pre-push git hook that runs all tests before pushing

red='\033[0;31m'
green='\033[0;32m'
no_color='\033[0m'

reponame=$(basename `git rev-parse --show-toplevel`)


echo "\nRunning pre-push hook\n"
echo "Testing $reponame"
julia --project=@. -e "using Pkg; Pkg.test(\"SymbolicNeuralNetworks\")"

if [[ $? -ne 0 ]]; then
echo "\n${red}ERROR - Tests must pass before push!\n${no_color}"
exit 1
fi

echo "\n${green}Git hook was SUCCESSFUL!${no_color}\n"
1 change: 0 additions & 1 deletion .github/workflows/CI.yml
@@ -19,7 +19,6 @@ jobs:
fail-fast: false
matrix:
version:
- '1.6'
- '1.10'
- '^1.11.0-0'
os:
18 changes: 13 additions & 5 deletions Project.toml
@@ -5,22 +5,30 @@ version = "0.1.2"

[deps]
AbstractNeuralNetworks = "60874f82-5ada-4c70-bd1c-fa6be7711c8a"
KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
Latexify = "23fbe1c1-3f47-55db-b15f-69d7ec21a316"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
RuntimeGeneratedFunctions = "7e49a35a-f44a-4d26-94aa-eba1b4ca6b47"
SafeTestsets = "1bc83da4-3b8d-516f-aca4-4fe02f6d838f"
Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"

[compat]
AbstractNeuralNetworks = "0.1, 0.3, 0.4"
KernelAbstractions = "0.9"
AbstractNeuralNetworks = "0.3, 0.4"
Documenter = "1.8.0"
ForwardDiff = "0.10.38"
Latexify = "0.16.5"
RuntimeGeneratedFunctions = "0.5"
SafeTestsets = "0.1"
Symbolics = "5, 6"
Zygote = "0.6.73"
julia = "1.6"

[extras]
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
Latexify = "23fbe1c1-3f47-55db-b15f-69d7ec21a316"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
SafeTestsets = "1bc83da4-3b8d-516f-aca4-4fe02f6d838f"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"

[targets]
test = ["Test"]
test = ["Test", "ForwardDiff", "Random", "Documenter", "Latexify", "SafeTestsets", "Zygote"]
77 changes: 47 additions & 30 deletions README.md
@@ -6,51 +6,68 @@
[![Coverage](https://codecov.io/gh/JuliaGNI/SymbolicNeuralNetworks.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/JuliaGNI/SymbolicNeuralNetworks.jl)
[![PkgEval](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/S/SymbolicNeuralNetworks.svg)](https://JuliaCI.github.io/NanosoldierReports/pkgeval_badges/S/SymbolicNeuralNetworks.html)

SymbolicNeuralNetworks.jl was created to take advantage of [Symbolics.jl](https://symbolics.juliasymbolics.org/stable/) for training neural networks by accelerating their evaluation and by simplifying the computation of some derivatives of the neural network that may be needed for loss functions. This package is based on [AbstractNeuralNetwork.jl](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl) and can be applied to [GeometricMachineLearning.jl](https://github.com/JuliaGNI/GeometricMachineLearning.jl).
In a perfect world we probably would not need `SymbolicNeuralNetworks`. Its motivation mainly comes from [`Zygote`](https://github.com/FluxML/Zygote.jl)'s inability to handle second-order derivatives in a decent way[^1]. We also note that if [`Enzyme`](https://github.com/EnzymeAD/Enzyme.jl) matures further, there may be no need for `SymbolicNeuralNetworks` anymore in the future. For now (December 2024) `SymbolicNeuralNetworks` offers a good way to incorporate derivatives into the loss function.

To accelerate the evaluation of the neural network, we replace its evaluation method with code generated by [Symbolics.jl](https://symbolics.juliasymbolics.org/stable/), perform some optimizations on it, and generate the associated function with [RuntimeGeneratedFunctions.jl](https://github.com/SciML/RuntimeGeneratedFunctions.jl).
[^1]: In some cases it is possible to perform second-order differentiation with `Zygote`, but it is not entirely clear when this works and when it does not.
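A minimal sketch of the kind of nested derivative that motivates this (a toy scalar function standing in for a network; whether the nested call succeeds depends on the function and the `Zygote` version):

```julia
using Zygote

# toy scalar "network output" of a two-dimensional input
f(x) = sum(tanh.(x))

# first-order gradient: works reliably
∇f(x) = Zygote.gradient(f, x)[1]

# second order: differentiate through the first gradient again.
# Depending on the function this nested call may work, error, or be very slow,
# which is what SymbolicNeuralNetworks avoids by building derivatives symbolically.
second_order(x) = Zygote.gradient(y -> sum(∇f(y)), x)[1]

second_order([0.5, 0.8])
```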

One can easily symbolize a neural network, which creates another neural network, with the `symbolize` method
`SymbolicNeuralNetworks` was created to take advantage of [`Symbolics`](https://symbolics.juliasymbolics.org/stable/) for training neural networks by accelerating their evaluation and by simplifying the computation of arbitrary derivatives of the neural network. This package is based on [`AbstractNeuralNetwork`](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl) and can be applied to [`GeometricMachineLearning`](https://github.com/JuliaGNI/GeometricMachineLearning.jl).

`SymbolicNeuralNetworks` creates a symbolic expression of the neural network, computes arbitrary combinations of derivatives and uses [`RuntimeGeneratedFunctions`](https://github.com/SciML/RuntimeGeneratedFunctions.jl) to compile a `Julia` function.

To create a symbolic neural network, we first design a `model` with [`AbstractNeuralNetwork`](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl):
```julia
symbolize(neuralnet, dim)
using AbstractNeuralNetworks

c = Chain(Dense(2, 2, tanh), Linear(2, 1))
```
where `neuralnet` is a neural network in the framework of [AbstractNeuralNetwork.jl](https://github.com/JuliaGNI/AbstractNeuralNetworks.jl) and `dim` is the dimension of the input.

## Example
We now call `SymbolicNeuralNetwork`:

```Julia
```julia
using SymbolicNeuralNetworks
using GeometricMachineLearning
using Symbolics

@variables sx[1:2]
@variables nn(sx)[1:1]
Dx1 = Differential(sx[1])
Dx2 = Differential(sx[2])
vectorfield = [0 1; -1 0] * [Dx1(nn[1]), Dx2(nn[1])]
eqs = (x = sx, nn = nn, vectorfield = vectorfield)
nn = SymbolicNeuralNetwork(c)
```

arch = HamiltonianNeuralNetwork(2)
shnn = SymbolicNeuralNetwork(arch; eqs = eqs)
## Example

hnn = NeuralNetwork(arch, Float64)
fun_vectorfield = functions(shnn).vectorfield
```
We now train the neural network by using `SymbolicPullback`[^2]:

## Performance
[^2]: This example is discussed in detail in the docs.

Let us compare the performance of computing the vector field between SymbolicNeuralNetwork's version and Zygote's:
```Julia
using Zygote
```julia
pb = SymbolicPullback(nn)

using GeometricMachineLearning

ω∇ₓnn(x, params) = [0 1; -1 0] * Zygote.gradient(x->hnn(x, params)[1], x)[1]
# we generate the data and process them with `GeometricMachineLearning.DataLoader`
x_vec = -1.:.1:1.
y_vec = -1.:.1:1.
xy_data = hcat([[x, y] for x in x_vec, y in y_vec]...)
f(x::Vector) = exp.(-sum(x.^2))
z_data = mapreduce(i -> f(xy_data[:, i]), hcat, axes(xy_data, 2))

println("Comparison of performances between Zygote and SymbolicNeuralNetwork for ω∇ₓnn")
x = [0.5, 0.8]
@time ω∇ₓnn(x, hnn.params)[1]
@time fun_vectorfield(x, hnn.params)
dl = DataLoader(xy_data, z_data)

nn_cpu = NeuralNetwork(c, CPU())
o = Optimizer(AdamOptimizer(), nn_cpu)
n_epochs = 1000
batch = Batch(10)
o(nn_cpu, dl, batch, n_epochs, pb.loss, pb)
```

Let us look at another example: training a SympNet (an intrinsically structure-preserving architecture provided by [GeometricMachineLearning.jl](https://github.com/JuliaGNI/GeometricMachineLearning.jl)) on a harmonic oscillator whose data come from [GeometricProblems.jl](https://github.com/JuliaGNI/GeometricProblems.jl):
We can also train the neural network with `Zygote`-based[^3] automatic differentiation (AD):

[^3]: Note that here we can actually use `Zygote` without problems as it does not involve any complicated derivatives.

```julia
pb_zygote = GeometricMachineLearning.ZygotePullback(FeedForwardLoss())
o(nn_cpu, dl, batch, n_epochs, pb_zygote.loss, pb_zygote)
```

## Development

We are using git hooks, e.g., to enforce that all tests pass before pushing. In order to activate these hooks, the following command must be executed once:
```
git config core.hooksPath .githooks
```
6 changes: 6 additions & 0 deletions docs/Project.toml
@@ -1,3 +1,9 @@
[deps]
AbstractNeuralNetworks = "60874f82-5ada-4c70-bd1c-fa6be7711c8a"
CairoMakie = "13f3f980-e62b-5c42-98c6-ff1f3baf88f0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
GeometricMachineLearning = "194d25b2-d3f5-49f0-af24-c124f4aa80cc"
Latexify = "23fbe1c1-3f47-55db-b15f-69d7ec21a316"
SymbolicNeuralNetworks = "aed23131-dcd0-47ca-8090-d21e605652e3"
Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
9 changes: 9 additions & 0 deletions docs/make.jl
@@ -1,5 +1,9 @@
using SymbolicNeuralNetworks
using Documenter
using Latexify: LaTeXString

# taken from https://github.com/korsbo/Latexify.jl/blob/master/docs/make.jl
Base.show(io::IO, ::MIME"text/html", l::LaTeXString) = l.s

DocMeta.setdocmeta!(SymbolicNeuralNetworks, :DocTestSetup, :(using SymbolicNeuralNetworks); recursive=true)

@@ -13,9 +17,14 @@ makedocs(;
canonical="https://JuliaGNI.github.io/SymbolicNeuralNetworks.jl",
edit_link="main",
assets=String[],
mathengine = MathJax3()
),
pages=[
"Home" => "index.md",
"Tutorials" => [
"Vanilla Symbolic Neural Network" => "symbolic_neural_networks.md",
"Double Derivative" => "double_derivative.md",
],
],
)

103 changes: 103 additions & 0 deletions docs/src/double_derivative.md
@@ -0,0 +1,103 @@
# Arbitrarily Combining Derivatives

`SymbolicNeuralNetworks` can compute derivatives of arbitrary order of a neural network. For this we use two `struct`s:
1. [`SymbolicNeuralNetworks.Jacobian`](@ref) and
2. [`SymbolicNeuralNetworks.Gradient`](@ref).

!!! info "Terminology"
Whereas the name `Jacobian` is standard for the matrix whose entries consist of all partial derivatives of the output of a function, the name `Gradient` is typically not used the way it is used here. Normally a *gradient* collects all the partial derivatives of a scalar function. In `SymbolicNeuralNetworks` the `struct` `Gradient` computes all partial derivatives of a symbolic array with respect to all the parameters of a neural network. So if we compute the `Gradient` of a matrix, the corresponding routine returns *a matrix of neural network parameters*, each of which is the *standard gradient* of a matrix element. This can be written as:
```math
\mathtt{Gradient}\left( \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1m} \\ m_{21} & m_{22} & \cdots & m_{2m} \\ \vdots & \vdots & \vdots & \vdots \\ m_{n1} & m_{n2} & \cdots & m_{nm} \end{pmatrix} \right) = \begin{pmatrix} \nabla_{\mathbb{P}}m_{11} & \nabla_{\mathbb{P}}m_{12} & \cdots & \nabla_{\mathbb{P}}m_{1m} \\ \nabla_{\mathbb{P}}m_{21} & \nabla_{\mathbb{P}}m_{22} & \cdots & \nabla_{\mathbb{P}}m_{2m} \\ \vdots & \vdots & \vdots & \vdots \\ \nabla_{\mathbb{P}}m_{n1} & \nabla_{\mathbb{P}}m_{n2} & \cdots & \nabla_{\mathbb{P}}m_{nm} \end{pmatrix},
```
where ``\mathbb{P}`` denotes the parameters of the neural network. For computational and consistency reasons each element ``\nabla_\mathbb{P}m_{ij}`` is stored as `NeuralNetworkParameters`.

## Jacobian of a Neural Network

[`SymbolicNeuralNetworks.Jacobian`](@ref) differentiates a symbolic expression with respect to the input arguments of a neural network:

```@example jacobian_gradient
using AbstractNeuralNetworks
using SymbolicNeuralNetworks
using SymbolicNeuralNetworks: Jacobian, Gradient, derivative
using Latexify: latexify

c = Chain(Dense(2, 1, tanh; use_bias = false))
nn = SymbolicNeuralNetwork(c)
□ = Jacobian(nn)
# we show the derivative of the network output with respect to its input
derivative(□) |> latexify
```

Note that the output of `nn` is one-dimensional and we use the convention

```math
\square_{ij} = [\mathrm{jacobian}_{x}f]_{ij} = \frac{\partial}{\partial{}x_j}f_i,
```
so the output has shape ``\mathrm{output\_dim}\times\mathrm{input\_dim} = 1\times2``:

```@example jacobian_gradient
@assert size(derivative(□)) == (1, 2) # hide
size(derivative(□))
```

## Gradient of a Neural Network

As described above [`SymbolicNeuralNetworks.Gradient`](@ref) differentiates every element of the array-valued output with respect to the neural network parameters:

```@example jacobian_gradient
using SymbolicNeuralNetworks: Gradient

g = Gradient(nn)

derivative(g)[1].L1.W |> latexify
```

## Double Derivatives

We can easily differentiate a neural network twice by using [`SymbolicNeuralNetworks.Jacobian`](@ref) and [`SymbolicNeuralNetworks.Gradient`](@ref) together. We first use [`SymbolicNeuralNetworks.Jacobian`](@ref) to differentiate the network output with respect to its input:

```@example jacobian_gradient
using AbstractNeuralNetworks
using SymbolicNeuralNetworks
using SymbolicNeuralNetworks: Jacobian, Gradient, derivative
using Latexify: latexify

c = Chain(Dense(2, 1, tanh))
nn = SymbolicNeuralNetwork(c)
□ = Jacobian(nn)
# we show the derivative of the network output with respect to its input
derivative(□) |> latexify
```

We see that the output is a matrix of size ``\mathrm{output\_dim} \times \mathrm{input\_dim}``. We can further compute the gradients of all entries of this matrix with [`SymbolicNeuralNetworks.Gradient`](@ref):

```@example jacobian_gradient
g = Gradient(derivative(□), nn)
nothing # hide
```

So [`SymbolicNeuralNetworks.Gradient`](@ref) differentiates every element of the matrix with respect to all neural network parameters. In order to access the gradient of the first matrix element with respect to the bias `b` in the first layer, we write:

```@example jacobian_gradient
matrix_index = (1, 1)
layer = :L1
weight = :b
derivative(g)[matrix_index...][layer][weight] |> latexify
```

If we now want to obtain an executable `Julia` function we have to use [`build_nn_function`](@ref). We call this function on:

```math
x = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad W = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad b = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
```

```@example jacobian_gradient
built_function = build_nn_function(derivative(g), nn.params, nn.input)

x = [1., 0.]
ps = NeuralNetworkParameters((L1 = (W = [1. 0.; 0. 1.], b = [0., 0.]), ))
built_function(x, ps)[matrix_index...][layer][weight]
```

!!! info
With `SymbolicNeuralNetworks`, the `struct`s [`SymbolicNeuralNetworks.Jacobian`](@ref), [`SymbolicNeuralNetworks.Gradient`](@ref) and [`build_nn_function`](@ref) it is easy to build combinations of derivatives. This is much harder when using `Zygote`-based AD.
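Putting the pieces on this page together, a minimal end-to-end sketch (the same toy network and values as above, shown only as a recap) is:

```julia
using AbstractNeuralNetworks
using SymbolicNeuralNetworks
using SymbolicNeuralNetworks: Jacobian, Gradient, derivative

c = Chain(Dense(2, 1, tanh))
nn = SymbolicNeuralNetwork(c)

# differentiate the network output with respect to its input ...
□ = Jacobian(nn)
# ... and every entry of that Jacobian with respect to all parameters
g = Gradient(derivative(□), nn)

# compile the symbolic derivative into an executable Julia function
built_function = build_nn_function(derivative(g), nn.params, nn.input)

x = [1., 0.]
ps = NeuralNetworkParameters((L1 = (W = [1. 0.; 0. 1.], b = [0., 0.]), ))
built_function(x, ps)[1, 1][:L1][:b]
```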