Documentation for optimizer #89

Merged: 163 commits, Dec 5, 2023

Commits
efad4e6
Copied Makefile from GI.
benedict-96 Nov 10, 2023
c95c6d1
Started adding bibliography.
benedict-96 Nov 10, 2023
7f80823
Copied Makefile from GI.
benedict-96 Nov 10, 2023
355ba99
Added more references (on symplectic integrators).
benedict-96 Nov 10, 2023
dca1b52
Started to rework this section. Fixed typos.
benedict-96 Nov 10, 2023
bbe8c30
Continued changing relevant bits and corrected typos. The file still …
benedict-96 Nov 10, 2023
5823466
Updated Hairer reference.
benedict-96 Nov 10, 2023
2776bc6
Started adding descriptions to the various layers.
benedict-96 Nov 10, 2023
daa8765
Started refactoring this file. It still uses the old syntax.
benedict-96 Nov 10, 2023
7866c07
Added routines such that the SympNet layers can now deal with NamedTu…
benedict-96 Nov 10, 2023
bbddca5
File containing SympNet architecture.
benedict-96 Nov 14, 2023
0c86757
Added necessary tikz files.
benedict-96 Nov 14, 2023
1cf6ff7
Fixed typo.
benedict-96 Nov 15, 2023
504e841
images -> tikz.
benedict-96 Nov 15, 2023
fd4a232
Added (empty file) references.
benedict-96 Nov 15, 2023
411c20a
Added tikz picture that visualizes dependencies of various structs an…
benedict-96 Nov 18, 2023
a8e92fb
Updated file path.
benedict-96 Nov 18, 2023
7eed150
Added the bibliography with DocumenterCitations.
benedict-96 Nov 18, 2023
c3d365e
Fixed some typos.
benedict-96 Nov 18, 2023
48ea7d3
Updated file path.
benedict-96 Nov 18, 2023
37b0444
Added new files.
benedict-96 Nov 18, 2023
9c840bc
Updated pendulum script to reflect new data types.
benedict-96 Nov 18, 2023
2e82681
Updated SympNet architectures.
benedict-96 Nov 18, 2023
950565d
Started adding the option of symplectic data to the DataLoader.
benedict-96 Nov 18, 2023
f543742
Made layers work with symplectic data.
benedict-96 Nov 18, 2023
704b39f
Added more description and a routine that can be initialized with a N…
benedict-96 Nov 18, 2023
d3aa016
Added some comments.
benedict-96 Nov 18, 2023
4376166
Reorganized the DataLoader and Batch for NamedTuples.
benedict-96 Nov 19, 2023
998a1a9
Fixed some typos in the code section.
benedict-96 Nov 19, 2023
772bd3a
Extended routines to NamedTuples and fixed some typos.
benedict-96 Nov 19, 2023
b6bc346
Forgot to output the batches before. Wrote a related comment into the…
benedict-96 Nov 20, 2023
eb906cf
Changed data format so that the output is a matrix.
benedict-96 Nov 20, 2023
8488428
Continued formatting script for new syntax.
benedict-96 Nov 20, 2023
90baae2
Removed empty line.
benedict-96 Nov 20, 2023
7dda667
Updated TODO.
benedict-96 Nov 20, 2023
b901082
Added Progressbar to optimization routine.
benedict-96 Nov 20, 2023
4032ba1
Now saving time steps as T instead of T+1 (may have to think about po…
benedict-96 Nov 20, 2023
12b4ce8
Fixed typo for the outputs of the linear layers.
benedict-96 Nov 20, 2023
93eafb7
Added training routine for la sympnet.
benedict-96 Nov 21, 2023
9c0521d
Added option to specify type.
benedict-96 Nov 21, 2023
4dbda54
Specified type of input matrix.
benedict-96 Nov 21, 2023
26fdacf
Added various routines that should also run on GPU including matrix-m…
benedict-96 Nov 21, 2023
356c33d
Started refactoring these layers a bit such that fewer structs are ne…
benedict-96 Nov 21, 2023
f7b22c2
Added a few separate tests for SymmetricMatrix.
benedict-96 Nov 21, 2023
121b062
Added an additional option to call the pendulum script with a NamedTu…
benedict-96 Nov 22, 2023
83b098d
Added an additional constructor s.t. SympNets can be called with Data…
benedict-96 Nov 22, 2023
8cbf85d
Updated the script s.t. calling sympnet now takes DataLoader as input…
benedict-96 Nov 22, 2023
8976d52
Added a setindex! routine (needed for updating parameters).
benedict-96 Nov 22, 2023
29c7c2f
put DataLoader under code comment marks.
benedict-96 Nov 22, 2023
90667c3
made specifying the type of Adam an (easier) option.
benedict-96 Nov 22, 2023
d9a5c39
Fixed a typo in the docs.
benedict-96 Nov 22, 2023
366231a
Updated SympNetLayers to save an extra struct and an abstract type. S…
benedict-96 Nov 22, 2023
1a769b9
Minor refactoring to improve readability.
benedict-96 Nov 22, 2023
90e6097
Fixed the Makefiles s.t. they now correctly generate the pngs.
benedict-96 Nov 22, 2023
4ece4eb
Reformulated various sections and fixed typos.
benedict-96 Nov 22, 2023
ebd35ac
Added ENV line for correct plotting.
benedict-96 Nov 22, 2023
5af1ab9
Updated according to new data structures in sympnets.jl
benedict-96 Nov 22, 2023
1953f39
Updated file path to png.
benedict-96 Nov 22, 2023
e0fe713
Added comment.
benedict-96 Nov 22, 2023
1008491
Split the theory and the tutorial sections up into two parts.
benedict-96 Nov 22, 2023
c45adf5
Played around with the parameters a bit to make the LASympnet work (h…
benedict-96 Nov 22, 2023
60673e7
Added the sympnet tutorial.
benedict-96 Nov 22, 2023
acdb284
Added various cross-references and updated the tikz picture to use th…
benedict-96 Nov 22, 2023
87cfccc
Changed the default values of the LA SympNet parameters.
benedict-96 Nov 22, 2023
acdaf0b
Updated the makefile to include the new logos.
benedict-96 Nov 23, 2023
c669e8f
Replaced the name with our logo (also dark mode).
benedict-96 Nov 23, 2023
350205b
Added an optional argument (sidebar_sitename=false) to better display…
benedict-96 Nov 23, 2023
c8f869e
Added generation of tikz images.
benedict-96 Nov 23, 2023
f52d258
Added .yaml file to generate logos.
benedict-96 Nov 23, 2023
1e1d4db
Got rid of extra clean command (should already be included in make all).
benedict-96 Nov 23, 2023
1455922
Added logos (you shouldn't include bitstype).
benedict-96 Nov 23, 2023
a514b2a
Added logo.
benedict-96 Nov 23, 2023
ee17db8
Added logo for darkmode.
benedict-96 Nov 23, 2023
fed1520
Deleted Readme workflow again.
benedict-96 Nov 23, 2023
9fce2b7
type -> GeometricMachineLearning.type
benedict-96 Nov 23, 2023
dd0596c
Fixed a problem with tests that appeared because of the new syntax.
benedict-96 Nov 23, 2023
6c42e9c
Added an extra option just to generate logos.
benedict-96 Nov 23, 2023
d707115
Updated names of exported structs.
benedict-96 Nov 23, 2023
b15bc2f
Deleted test that appeared double for some reason.
benedict-96 Nov 23, 2023
384afac
type -> GeometricMachineLearning.type. This function is no longer exp…
benedict-96 Nov 23, 2023
56b08c5
Remove Readme.yml.
benedict-96 Nov 23, 2023
3cbbab6
Started adding markdown for general manifolds.
benedict-96 Nov 24, 2023
d1ad571
Completed proof that SO(N) is a manifold.
benedict-96 Nov 24, 2023
3c2ff82
Added file describing basic topological concepts needed for manifolds.
benedict-96 Nov 24, 2023
a32fa22
Started basic description of NN optimizers.
benedict-96 Nov 24, 2023
9f62150
Added files on the existence and uniqueness theorem, the inverse func…
benedict-96 Nov 26, 2023
9daa61f
Added the right reference for the Bishop book at the bottom.
benedict-96 Nov 27, 2023
9387f2c
Remove old transformer script (Lux and Flux dependencies).
benedict-96 Nov 27, 2023
a47bdef
Changed name to something more descriptive.
benedict-96 Nov 27, 2023
dd31393
Now only doing things once instead of multiple times (computation of …
benedict-96 Nov 28, 2023
877f32f
Now exporting BFGS-related structs.
benedict-96 Nov 28, 2023
e22bc4f
Cache and Optimizer for BFGS.
benedict-96 Nov 28, 2023
afa46c6
Added routines for vec, zero, get_backend, assign! and copy for the s…
benedict-96 Nov 28, 2023
e3c4dfd
Minor change of one comment and now allow different arrays to be opti…
benedict-96 Nov 28, 2023
fd5854e
Added routine for bfgs
benedict-96 Nov 28, 2023
2790fea
Added routines for vec (and its inverse), zero, get_backend, assign! …
benedict-96 Nov 28, 2023
80dbaa7
Added some comments.
benedict-96 Nov 28, 2023
af717fc
Put comments in front of functions that are probably not needed.
benedict-96 Nov 28, 2023
6fa0937
Started adding documentation for BFGS.
benedict-96 Nov 28, 2023
c80b1c6
Added new transformer script explicitly for bfgs, up to now not diffe…
benedict-96 Nov 28, 2023
3cc8976
Changed backend to GPU (in general) instead of CPU.
benedict-96 Nov 28, 2023
dc64ce1
Added a test for the vectorization.
benedict-96 Nov 28, 2023
3082631
changed some hyperparameters.
benedict-96 Nov 29, 2023
7426e9c
Adjusted script to new syntax. Not working at the moment.
benedict-96 Nov 29, 2023
3d718df
Moved DataLoader to the front of the file <- DataLoader doesn't depen…
benedict-96 Nov 29, 2023
6031a94
Changed equals sign to broadcasting operation (assign! and broadcast …
benedict-96 Nov 29, 2023
e9b6741
Changed equals sign to broadcast operation (assign! should serve the …
benedict-96 Nov 29, 2023
9ddaa52
Added another optimize_for_one_epoch! routine for the case if we have…
benedict-96 Nov 29, 2023
4a4b8a2
Added an accuracy routine that takes neural network as input.
benedict-96 Nov 29, 2023
5aa3bf8
main change is that we now add a small scalar to Y'*S to make sure it…
benedict-96 Nov 29, 2023
4d3f2a1
Added a constructor for optimizer if the input is a neural network.
benedict-96 Nov 29, 2023
83e2702
Added a short test script for the bfgs optimizer.
benedict-96 Nov 29, 2023
420cd34
Now not loading LinearAlgebra anymore.
benedict-96 Nov 29, 2023
6ec91a8
Fixed typo dimesnion -> dimension.
benedict-96 Nov 29, 2023
5fbed13
Finished a rough outline of the documentation. Included a derivation …
benedict-96 Dec 3, 2023
f2aa41d
Fixed a typo that originated from not having defined for .
benedict-96 Dec 3, 2023
90f59cf
Decreased batch size.
benedict-96 Dec 3, 2023
8cd3bda
Added documentation for and .
benedict-96 Dec 3, 2023
7f07da4
Fixed a problem that had its origin in not being able to index GPU ar…
benedict-96 Dec 3, 2023
4bff014
Added a test for bfgs on the Stiefel manifold.
benedict-96 Dec 3, 2023
a33ef13
Added an architecture for the classification transformer as a .
benedict-96 Dec 3, 2023
f2099fa
Removed the temporary workaround and the cpu allocation.
benedict-96 Dec 3, 2023
9740b83
Added an additional way of computing H. These two probably are equiva…
benedict-96 Dec 3, 2023
24634f2
Added new file stiefel_projection.jl for the array of the same name.
benedict-96 Dec 4, 2023
0bd6c09
Removed file. The remaining One struct is not needed anymore.
benedict-96 Dec 4, 2023
c8d6522
Removed a comment.
benedict-96 Dec 4, 2023
f7c9ff0
Now using the newly implemented StiefelProjection struct. This gets r…
benedict-96 Dec 4, 2023
a5c2026
Removed arrays/auxiliary.jl
benedict-96 Dec 4, 2023
2823f5b
Added another constructor for the case when we just have two integers…
benedict-96 Dec 4, 2023
54b21bd
Added a missing type dependency.
benedict-96 Dec 4, 2023
9bde4ba
Changed the way the indexing is done for the Grassmann manifold. This…
benedict-96 Dec 4, 2023
2420ad9
Increased number of steps for convergence test. Was occasionally fail…
benedict-96 Dec 4, 2023
015d324
Also importing function init_optimizer_cache now because this is no l…
benedict-96 Dec 4, 2023
0513445
Added restriction that the type has to be a number. Needed for Julia …
benedict-96 Dec 4, 2023
cdfd554
Merge pull request #88 from JuliaGNI/refactoring_and_documentation
michakraus Dec 4, 2023
41a46c4
Merge branch 'main' into bfgs
benedict-96 Dec 4, 2023
fb53122
Got rid of CUDA and GPUArrays dependencies.
benedict-96 Dec 4, 2023
3863768
Merge pull request #90 from JuliaGNI/bfgs
michakraus Dec 4, 2023
2ea1dae
Removed the with version 1. This is not in the GI Documenter.yml file…
benedict-96 Dec 4, 2023
955912f
Merge pull request #92 from JuliaGNI/bfgs
michakraus Dec 4, 2023
249e4b8
Started adding markdown for general manifolds.
benedict-96 Nov 24, 2023
d77882d
Completed proof that SO(N) is a manifold.
benedict-96 Nov 24, 2023
3d47412
Added file describing basic topological concepts needed for manifolds.
benedict-96 Nov 24, 2023
8d6bc4a
Started basic description of NN optimizers.
benedict-96 Nov 24, 2023
9e9bd25
Added files on the existence and uniqueness theorem, the inverse func…
benedict-96 Nov 26, 2023
dad1157
Added the right reference for the Bishop book at the bottom.
benedict-96 Nov 27, 2023
a9a6429
Added the basic topological and analysis files as well as the bfgs op…
benedict-96 Dec 4, 2023
fca4061
Added new references.
benedict-96 Dec 4, 2023
4f1cba9
Fixed references in files. They are now all called with @bibliography.
benedict-96 Dec 4, 2023
6b274ef
Fixed typo (space after #).
benedict-96 Dec 4, 2023
981b04b
Fixed typo.
benedict-96 Dec 4, 2023
9ae14b8
Fixed yaml file in branch documentation_for_optimizer_yaml_test.
benedict-96 Dec 5, 2023
883706a
Added package dependencies.
benedict-96 Dec 5, 2023
831144b
Merge branch 'documentation_for_optimizer' of https://github.com/Juli…
benedict-96 Dec 5, 2023
ed48915
Resolved merge conflicts.
benedict-96 Dec 5, 2023
5aa570b
Now making directory assets for the Documenter.
benedict-96 Dec 5, 2023
9445f19
Added two logos (normal and dark mode) via upload.
benedict-96 Dec 5, 2023
9bf1c7f
Merge branch 'documentation_for_optimizer' of https://github.com/Juli…
benedict-96 Dec 5, 2023
19b8c67
Updated path for logos.
benedict-96 Dec 5, 2023
d7fc42a
Update README.md
benedict-96 Dec 5, 2023
6856585
Now not using local pngs for logos in README.md
benedict-96 Dec 5, 2023
7e000dd
Removed pngs. Those are now saved online.
benedict-96 Dec 5, 2023
b64a779
fixed reference.
benedict-96 Dec 5, 2023
30 changes: 15 additions & 15 deletions .github/workflows/Documenter.yml
@@ -4,35 +4,35 @@ on:
push:
branches:
- main
tags: '*'
pull_request:

jobs:
build:
name: Documentation
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
with:
version: '1'
- name: Install dependencies
run: |
- uses: actions/checkout@v3
- run: |
sudo apt-get install imagemagick
sudo apt-get install poppler-utils
sudo apt-get install texlive-xetex
sudo apt-get install texlive-science
mkdir docs/src/assets
make all -C docs/src/tikz
- uses: julia-actions/setup-julia@latest
- run: |
julia --project=docs -e '
using Pkg
Pkg.develop(PackageSpec(path=pwd()))
Pkg.instantiate()
Pkg.build()
Pkg.precompile()'
- name: Run doctests
run: |
julia --project=docs -e '
Pkg.precompile()
using Documenter: doctest
using GeometricMachineLearning
doctest(GeometricMachineLearning)'
- name: Build and deploy Documentation
run: julia --project make.jl
working-directory: docs
julia --project=docs docs/make.jl
- uses: julia-actions/julia-buildpkg@v1
- uses: julia-actions/julia-docdeploy@v1
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
DOCUMENTER_KEY: ${{ secrets.DOCUMENTER_KEY }}
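
The updated workflow installs the image toolchain (imagemagick, poppler-utils, TeX Live), builds the TikZ figures, and then hands the Julia side over to the standard buildpkg/docdeploy actions. For local testing, the Julia steps can be reproduced with something like the following sketch; it assumes the repository root as working directory and skips the apt-get/TikZ steps.

```julia
# Rough local equivalent of the Julia steps in the workflow above
# (a sketch; run from the repository root, TikZ/image steps not included).
using Pkg

Pkg.activate("docs")                    # same effect as --project=docs
Pkg.develop(PackageSpec(path = pwd()))  # develop GeometricMachineLearning itself
Pkg.instantiate()
Pkg.build()
Pkg.precompile()

# Doctests, as in the explicit doctest step:
using Documenter: doctest
using GeometricMachineLearning
doctest(GeometricMachineLearning)

# Build the documentation (on CI this is handled by the docdeploy action).
include(joinpath("docs", "make.jl"))
```

On CI the final step is covered by `julia-actions/julia-buildpkg@v1` and `julia-actions/julia-docdeploy@v1`, with `GITHUB_TOKEN` and `DOCUMENTER_KEY` passed through the environment.
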
9 changes: 7 additions & 2 deletions README.md
@@ -1,4 +1,9 @@
# Geometric Machine Learning
<picture>
<source media="(prefers-color-scheme: light)" srcset="https://github.com/JuliaGNI/GeometricMachineLearning.jl/assets/55493704/8d6d1410-b857-4e0f-8609-50e43be9a268">
<source media="(prefers-color-scheme: dark)" srcset="https://github.com/JuliaGNI/GeometricMachineLearning.jl/assets/55493704/014929d1-2297-4b2c-9359-58cadbb03a0e">
<img alt="Shows a black logo in light color mode and a white one in dark color mode.">
</picture>


[![Stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://juliagni.github.io/GeometricMachineLearning.jl/stable)
[![Latest](https://img.shields.io/badge/docs-latest-blue.svg)](https://juliagni.github.io/GeometricMachineLearning.jl/latest)
@@ -52,4 +57,4 @@ plot(trajectory_to_plot)
The optimization of the first layer is done on the Stiefel Manifold $St(n, N)$, and the optimizer used is the manifold version of Adam (see (Brantner, 2023)).

## References
- Brantner B. Generalizing Adam To Manifolds For Efficiently Training Transformers[J]. arXiv preprint arXiv:2305.16901, 2023.
- Brantner B. Generalizing Adam To Manifolds For Efficiently Training Transformers[J]. arXiv preprint arXiv:2305.16901, 2023.
15 changes: 15 additions & 0 deletions docs/Makefile
@@ -0,0 +1,15 @@

.PHONY: documenter images

all: images documenter

documenter:
julia --color=yes --project=. make.jl

images:
$(MAKE) all -C src/tikz

clean:
$(MAKE) empty -C src/tikz
rm -Rf build
rm -Rf src/tutorial
17 changes: 17 additions & 0 deletions docs/Project.toml
@@ -1,4 +1,21 @@
[deps]
AbstractNeuralNetworks = "60874f82-5ada-4c70-bd1c-fa6be7711c8a"
BandedMatrices = "aae01518-5342-5314-be14-df237901396f"
ChainRulesCore = "d360d2e6-b24c-11e9-a2a3-2a2ae2dbcce4"
Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
DocumenterCitations = "daee34ce-89f3-4625-b898-19384cb65244"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
GeometricBase = "9a0b12b7-583b-4f04-aa1f-d8551b6addc9"
GeometricEquations = "c85262ba-a08a-430a-b926-d29770767bf2"
GeometricIntegrators = "dcce2d33-59f6-5b8d-9047-0defad88ae06"
GeometricMachineLearning = "194d25b2-d3f5-49f0-af24-c124f4aa80cc"
InteractiveUtils = "b77e0a4c-d291-57a0-90e8-8db25a27a240"
KernelAbstractions = "63c18a36-062a-441e-b654-da1e3ab1ce7c"
LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e"
NNlib = "872c559c-99b0-510c-b3b7-b6c96a88d5cd"
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"
ProgressMeter = "92933f4c-e287-5a05-a399-4b506db050ca"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
TimerOutputs = "a759f4b9-e2f1-59dc-863e-4aeb61b1ea8f"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
22 changes: 21 additions & 1 deletion docs/make.jl
@@ -1,8 +1,15 @@
using GeometricMachineLearning
using Documenter
using DocumenterCitations
# using Weave

# this is necessary to avoid warnings. See https://documenter.juliadocs.org/dev/man/syntax/
ENV["GKSwstype"] = "100"

bib = CitationBibliography(joinpath(@__DIR__, "src", "GeometricMachineLearning.bib"))

makedocs(;
plugins=[bib],
modules=[GeometricMachineLearning],
authors="Michael Kraus, Benedikt Brantner",
repo="https://github.com/JuliaGNI/GeometricMachineLearning.jl/blob/{commit}{path}#L{line}",
@@ -11,28 +18,39 @@ makedocs(;
prettyurls=get(ENV, "CI", "false") == "true",
canonical="https://juliagni.github.io/GeometricMachineLearning.jl",
assets=String[],
# specifies that we do not display the package name again (it's already in the logo)
sidebar_sitename=false,
),
pages=[
"Home" => "index.md",
"Architectures" => [
"SympNet" => "architectures/sympnet.md",
],
"Manifolds" => [
"Concepts from General Topology" => "manifolds/basic_topology.md",
"General Theory on Manifolds" => "manifolds/manifolds.md",
"The Inverse Function Theorem" => "manifolds/inverse_function_theorem.md",
"The Submersion Theorem" => "manifolds/submersion_theorem.md",
"Homogeneous Spaces" => "manifolds/homogeneous_spaces.md",
"Stiefel" => "manifolds/stiefel_manifold.md",
"Grassmann" => "manifolds/grassmann_manifold.md",
"Differential Equations and the EAU theorem" => "manifolds/existence_and_uniqueness_theorem.md",
],
"Arrays" => [
"Global Tangent Space" => "arrays/stiefel_lie_alg_horizontal.md",
],
"Optimizer Framework" => "Optimizer.md",
"Optimizer Framework" => [
"Optimizers" => "Optimizer.md",
"General Optimization" => "optimizers/general_optimization.md",
],
"Optimizer Functions" => [
"Horizontal Lift" => "optimizers/manifold_related/horizontal_lift.md",
"Global Sections" => "optimizers/manifold_related/global_sections.md",
"Retractions" => "optimizers/manifold_related/retractions.md",
"Geodesic Retraction" => "optimizers/manifold_related/geodesic.md",
"Cayley Retraction" => "optimizers/manifold_related/cayley.md",
"Adam Optimizer" => "optimizers/adam_optimizer.md",
"BFGS Optimizer" => "optimizers/bfgs_optimizer.md",
],
"Special Neural Network Layers" => [
"Attention" => "layers/attention_layer.md",
@@ -49,9 +67,11 @@ makedocs(;
"Projection and Reduction Error" => "reduced_order_modeling/projection_reduction_errors.md",
],
"Tutorials" =>[
"Sympnets" => "tutorials/sympnet_tutorial.md",
"Linear Wave Equation" => "tutorials/linear_wave_equation.md",
"MNIST" => "tutorials/mnist_tutorial.md",
],
"References" => "references.md",
"Library" => "library.md",
],
)
153 changes: 153 additions & 0 deletions docs/src/GeometricMachineLearning.bib
@@ -0,0 +1,153 @@
@article{brantner2023generalizing,
title={Generalizing Adam To Manifolds For Efficiently Training Transformers},
author={Brantner, Benedikt},
journal={arXiv preprint arXiv:2305.16901},
year={2023}
}

@article{jin2020sympnets,
title={SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems},
author={Jin, Pengzhan and Zhang, Zhen and Zhu, Aiqing and Tang, Yifa and Karniadakis, George Em},
journal={Neural Networks},
volume={132},
pages={166--179},
year={2020},
publisher={Elsevier}
}

@article{jin2022optimal,
title={Optimal unit triangular factorization of symplectic matrices},
author={Jin, Pengzhan and Lin, Zhangli and Xiao, Bo},
journal={Linear Algebra and its Applications},
year={2022},
publisher={Elsevier}
}

@book{hairer2006geometric,
title={Geometric Numerical Integration: Structure-Preserving Algorithms for Ordinary Differential Equations},
author={Hairer, Ernst and Lubich, Christian and Wanner, Gerhard},
year={2006},
publisher={Springer}
}


@book{leimkuhler2004simulating,
title={Simulating Hamiltonian Dynamics},
author={Leimkuhler, Benedict and Reich, Sebastian},
number={14},
year={2004},
publisher={Cambridge University Press}
}

@book{lang2012fundamentals,
title={Fundamentals of differential geometry},
author={Lang, Serge},
volume={191},
year={2012},
publisher={Springer Science \& Business Media}
}

@book{lipschutz1965general,
title={General Topology},
author={Seymour Lipschutz},
year={1965},
publisher={McGraw-Hill Book Company},
location={New York City, New York}
}

@book{bishop1980tensor,
title={Tensor Analysis on Manifolds},
author={Richard L. Bishop and Samuel I. Goldberg},
year={1980},
publisher={Dover Publications},
location={Mineola, New York}
}

@book{wright2006numerical,
title={Numerical optimization},
author={Stephen J. Wright and Jorge Nocedal},
year={2006},
publisher={Springer Science+Business Media},
location={New York, NY}
}

@article{fresca2021comprehensive,
title={A comprehensive deep learning-based approach to reduced order modeling of nonlinear time-dependent parametrized PDEs},
author={Fresca, Stefania and Dede’, Luca and Manzoni, Andrea},
journal={Journal of Scientific Computing},
volume={87},
pages={1--36},
year={2021},
publisher={Springer}
}

@article{buchfink2023symplectic,
title={Symplectic model reduction of Hamiltonian systems on nonlinear manifolds and approximation with weakly symplectic autoencoder},
author={Buchfink, Patrick and Glas, Silke and Haasdonk, Bernard},
journal={SIAM Journal on Scientific Computing},
volume={45},
number={2},
pages={A289--A311},
year={2023},
publisher={SIAM}
}

@article{peng2016symplectic,
title={Symplectic model reduction of Hamiltonian systems},
author={Peng, Liqian and Mohseni, Kamran},
journal={SIAM Journal on Scientific Computing},
volume={38},
number={1},
pages={A1--A27},
year={2016},
publisher={SIAM}
}

@article{luong2015effective,
title={Effective approaches to attention-based neural machine translation},
author={Luong, Minh-Thang and Pham, Hieu and Manning, Christopher D},
journal={arXiv preprint arXiv:1508.04025},
year={2015}
}

@article{bahdanau2014neural,
title={Neural machine translation by jointly learning to align and translate},
author={Bahdanau, Dzmitry and Cho, Kyunghyun and Bengio, Yoshua},
journal={arXiv preprint arXiv:1409.0473},
year={2014}
}

@article{greif2019decay,
title={Decay of the Kolmogorov N-width for wave problems},
author={Greif, Constantin and Urban, Karsten},
journal={Applied Mathematics Letters},
volume={96},
pages={216--222},
year={2019},
publisher={Elsevier}
}

@article{blickhan2023registration,
title={A registration method for reduced basis problems using linear optimal transport},
author={Blickhan, Tobias},
journal={arXiv preprint arXiv:2304.14884},
year={2023}
}

@article{lee2020model,
title={Model reduction of dynamical systems on nonlinear manifolds using deep convolutional autoencoders},
author={Lee, Kookjin and Carlberg, Kevin T},
journal={Journal of Computational Physics},
volume={404},
pages={108973},
year={2020},
publisher={Elsevier}
}

@article{vaswani2017attention,
title={Attention is all you need},
author={Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N and Kaiser, Lukasz and Polosukhin, Illia},
journal={Advances in neural information processing systems},
volume={30},
year={2017}
}
12 changes: 9 additions & 3 deletions docs/src/Optimizer.md
@@ -4,7 +4,9 @@ In order to generalize neural network optimizers to [homogeneous spaces](manifol

Starting from an element of the tangent space $T_Y\mathcal{M}$[^1], we need to perform two mappings to arrive at $\mathfrak{g}^\mathrm{hor}$, which we refer to by $\Omega$ and a red horizontal arrow:

![](images/general_optimization_with_boundary.png)
[^1]: In practice this is obtained by first using an AD routine on a loss function $L$, and then computing the Riemannian gradient based on this. See the section on the [Stiefel manifold](manifolds/stiefel_manifold.md) for an example of this.

![](tikz/general_optimization_with_boundary.png)

Here the mapping $\Omega$ is a [horizontal lift](optimizers/manifold_related/horizontal_lift.md) from the tangent space onto the **horizontal component of the Lie algebra at $Y$**.

@@ -13,6 +15,10 @@ The red line maps the horizontal component at $Y$, i.e. $\mathfrak{g}^{\mathrm{h
The $\mathrm{cache}$ stores information about previous optimization steps and is dependent on the optimizer. The elements of the $\mathrm{cache}$ are also in $\mathfrak{g}^\mathrm{hor}$. Based on this, the optimizer ([Adam](optimizers/adam_optimizer.md) in this case) computes a final velocity, which is the input of a [retraction](optimizers/manifold_related/retractions.md). Because this *update* is done for $\mathfrak{g}^{\mathrm{hor}}\equiv{}T_Y\mathcal{M}$, we still need to perform a mapping, called `apply_section` here, that then finally updates the network parameters. The two red lines are described in [global sections](optimizers/manifold_related/global_sections.md).
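
To make the chain of mappings more concrete, here is a minimal, self-contained sketch of a single update step on the Stiefel manifold: the Euclidean (AD) gradient is turned into a Riemannian gradient, scaled, and mapped back onto the manifold with a retraction. The names `riemannian_gradient`, `qr_retraction` and `manifold_step` are illustrative placeholders rather than the package's API, and the QR-based retraction merely stands in for the geodesic and Cayley retractions linked above.

```julia
using LinearAlgebra

# Sketch of one manifold update step for Y on St(n, N) = {Y : Y'Y = I}.
# The package routes this through g^hor, a global section and apply_section;
# here everything is collapsed into a plain projection + retraction.

# Riemannian gradient computed from the Euclidean (AD) gradient ∇L
# (canonical-type metric assumed, as in the footnote above).
riemannian_gradient(Y, ∇L) = ∇L - Y * (∇L' * Y)

# QR-based retraction: maps Y plus a tangent step back onto the manifold.
qr_retraction(Y, V) = Matrix(qr(Y + V).Q)[:, 1:size(Y, 2)]

# One naive gradient step with learning rate η (no optimizer cache here).
function manifold_step(Y, ∇L; η = 1e-2)
    G = riemannian_gradient(Y, ∇L)
    return qr_retraction(Y, -η * G)
end

# Usage: Y stays (numerically) on the Stiefel manifold after the update.
N, n = 10, 3
Y = qr_retraction(randn(N, n), zeros(N, n))   # random point on St(n, N)
∇L = randn(N, n)
Y₁ = manifold_step(Y, ∇L)
@assert norm(Y₁' * Y₁ - I) < 1e-10
```

In the actual optimizer the velocity handed to the retraction comes from the cache update in $\mathfrak{g}^\mathrm{hor}$ (Adam here), not from a bare gradient step, and `apply_section` then propagates the update to the network parameters.
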

## References
- Brantner B. Generalizing Adam To Manifolds For Efficiently Training Transformers[J]. arXiv preprint arXiv:2305.16901, 2023.

[^1]: In practice this is obtained by first using an AD routine on a loss function $L$, and then computing the Riemannian gradient based on this. See the section on the [Stiefel manifold](manifolds/stiefel_manifold.md) for an example of this.
```@bibliography
Pages = []
Canonical = false

brantner2023generalizing
```