
Commit

build based on 88f903c
Documenter.jl committed Apr 11, 2024
1 parent a4b1ea8 commit 8f2741e
Showing 47 changed files with 2,120 additions and 2,120 deletions.
2 changes: 1 addition & 1 deletion latest/.documenter-siteinfo.json
@@ -1 +1 @@
{"documenter":{"julia_version":"1.10.2","generation_timestamp":"2024-04-10T16:20:04","documenter_version":"1.3.0"}}
{"documenter":{"julia_version":"1.10.2","generation_timestamp":"2024-04-11T14:12:05","documenter_version":"1.3.0"}}
2 changes: 1 addition & 1 deletion latest/Optimizer/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion latest/architectures/autoencoders/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion latest/architectures/sympnet/index.html
@@ -81,4 +81,4 @@
\begin{pmatrix}
q \\
K^T \mathrm{diag}(a)\sigma(Kq+b)+p
\end{pmatrix}.\]</p><p>The parameters of this layer are the <em>scaling matrix</em> <span>$K\in\mathbb{R}^{m\times d}$</span>, the bias <span>$b\in\mathbb{R}^{m}$</span> and the <em>scaling vector</em> <span>$a\in\mathbb{R}^{m}$</span>. The name &quot;gradient layer&quot; comes from the fact that the expression <span>$[K^T\mathrm{diag}(a)\sigma(Kq+b)]_i = \sum_jk_{ji}a_j\sigma(\sum_\ell{}k_{j\ell}q_\ell+b_j)$</span> is the gradient (with respect to <span>$q$</span>) of the function <span>$\sum_ja_j\tilde{\sigma}(\sum_\ell{}k_{j\ell}q_\ell+b_j)$</span>, where <span>$\tilde{\sigma}$</span> is an antiderivative of <span>$\sigma$</span>. We refer to the first dimension of <span>$K$</span> as the <em>upscaling dimension</em>.</p><p>If we denote by <span>$\mathcal{M}^G$</span> the set of gradient layers, a <span>$G$</span>-SympNet is a function of the form <span>$\Psi=g_k \circ g_{k-1} \circ \cdots \circ g_0$</span>, where <span>$(g_i)_{0\leq i\leq k} \subset (\mathcal{M}^G)^k$</span>. The index <span>$k$</span> is again the <em>number of hidden layers</em>.</p><p>Further note the different roles played by round and square brackets here: the latter indicate a nonlinear operation, as opposed to multiplication by a regular vector or matrix.</p>
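<p>To make this concrete, here is a minimal plain-Julia sketch of such a gradient layer acting on a pair <span>$(q, p)$</span>. It is for illustration only: the name <code>GradientLayerQ</code> is hypothetical and this is not the API of <code>GeometricMachineLearning.jl</code>.</p><pre><code class="language-julia"># Minimal sketch of a gradient layer (hypothetical names, not the
# GeometricMachineLearning.jl API): q passes through unchanged and
# p is shifted by the gradient term from the text.
struct GradientLayerQ
    K   # scaling matrix of size (m, d); m is the upscaling dimension
    b   # bias vector of length m
    a   # scaling vector of length m
end

sigma(x) = 1 / (1 + exp(-x))   # sigmoid activation

function (layer::GradientLayerQ)(q, p)
    grad = layer.K' * (layer.a .* sigma.(layer.K * q .+ layer.b))
    return q, p .+ grad
end

# Example with d = 2 and upscaling dimension m = 5:
layer = GradientLayerQ(randn(5, 2), randn(5), randn(5))
qnew, pnew = layer(rand(2), rand(2))</code></pre><p>Composing several such layers then gives a <span>$G$</span>-SympNet as defined above.</p>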
<h3 id="Universal-approximation-theorems"><a class="docs-heading-anchor" href="#Universal-approximation-theorems">Universal approximation theorems</a><a id="Universal-approximation-theorems-1"></a><a class="docs-heading-anchor-permalink" href="#Universal-approximation-theorems" title="Permalink"></a></h3><p>In order to state the <em>universal approximation theorem</em> for both architectures, we first need a few definitions.</p><p>Let <span>$U$</span> be an open set of <span>$\mathbb{R}^{2d}$</span>, and let us denote by <span>$\mathcal{SP}^r(U)$</span> the set of <span>$C^r$</span>-smooth symplectic maps on <span>$U$</span>. We now define a topology on <span>$C^r(K, \mathbb{R}^n)$</span>, the set of <span>$C^r$</span>-smooth maps from a compact set <span>$K\subset\mathbb{R}^{n}$</span> to <span>$\mathbb{R}^{n}$</span>, via the norm</p><p class="math-container">\[||f||_{C^r(K,\mathbb{R}^{n})} = \underset{|\alpha|\leq r}{\sum} \underset{1\leq i \leq n}{\max}\underset{x\in K}{\sup} |D^\alpha f_i(x)|,\]</p><p>where the differential operator <span>$D^\alpha$</span> is defined by</p><p class="math-container">\[D^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}},\]</p><p>with <span>$|\alpha| = \alpha_1 + \cdots + \alpha_n$</span>.</p><p><strong>Definition</strong> <span>$\sigma$</span> is <strong><span>$r$</span>-finite</strong> if <span>$\sigma\in C^r(\mathbb{R},\mathbb{R})$</span> and <span>$\int |D^r\sigma(x)|dx &lt;+\infty$</span>.</p><p><strong>Definition</strong> Let <span>$m,n,r\in \mathbb{N}$</span> with <span>$m,n&gt;0$</span> be given, <span>$U$</span> an open set of <span>$\mathbb{R}^m$</span>, and <span>$I,J\subset C^r(U,\mathbb{R}^n)$</span>. We say <span>$J$</span> is <strong><span>$r$</span>-uniformly dense on compacta in <span>$I$</span></strong> if <span>$J \subset I$</span> and for any <span>$f\in I$</span>, <span>$\epsilon&gt;0$</span>, and any compact <span>$K\subset U$</span>, there exists <span>$g\in J$</span> such that <span>$||f-g||_{C^r(K,\mathbb{R}^{n})} &lt; \epsilon$</span>.</p><p>We can now state the universal approximation theorems:</p><p><strong>Theorem (Approximation theorem for LA-SympNets)</strong> For any positive integer <span>$r$</span> and any open set <span>$U\subset \mathbb{R}^{2d}$</span>, the set of <span>$LA$</span>-SympNets is <span>$r$</span>-uniformly dense on compacta in <span>$\mathcal{SP}^r(U)$</span> if the activation function <span>$\sigma$</span> is <span>$r$</span>-finite.</p><p><strong>Theorem (Approximation theorem for G-SympNets)</strong> For any positive integer <span>$r$</span> and any open set <span>$U\subset \mathbb{R}^{2d}$</span>, the set of <span>$G$</span>-SympNets is <span>$r$</span>-uniformly dense on compacta in <span>$\mathcal{SP}^r(U)$</span> if the activation function <span>$\sigma$</span> is <span>$r$</span>-finite.</p><p>There are many <span>$r$</span>-finite activation functions commonly used in neural networks, for example:</p><ul><li>the sigmoid <span>$\sigma(x)=\frac{1}{1+e^{-x}}$</span>, for any positive integer <span>$r$</span> (see the short numerical check below),</li><li>tanh <span>$\tanh(x)=\frac{e^x-e^{-x}}{e^x+e^{-x}}$</span>, for any positive integer <span>$r$</span>.</li></ul>
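<p>As a quick numerical illustration of <span>$r$</span>-finiteness for the sigmoid in the case <span>$r = 1$</span>, the following plain-Julia snippet approximates <span>$\int |D^1\sigma(x)|dx$</span> with a Riemann sum; since <span>$\sigma' = \sigma(1-\sigma) \geq 0$</span>, the integral equals <span>$\sigma(+\infty) - \sigma(-\infty) = 1$</span>. This is only an illustrative sketch, not part of the package.</p><pre><code class="language-julia"># Riemann-sum check that the sigmoid is 1-finite: the integral of the
# absolute value of its first derivative over the real line is finite.
sigmoid(x) = 1 / (1 + exp(-x))
dsigmoid(x) = sigmoid(x) * (1 - sigmoid(x))   # first derivative, nonnegative

dx = 1e-3
xs = -50.0:dx:50.0                 # the integrand decays exponentially in |x|
integral = sum(abs(dsigmoid(x)) for x in xs) * dx

println(integral)                  # prints a value close to 1.0</code></pre>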
<p>The universal approximation theorems state that we can, in principle, get arbitrarily close to any symplectomorphism defined on <span>$\mathbb{R}^{2d}$</span>. But they do not tell us how to optimize the network. This can be done with any common <a href="../../Optimizer/">neural network optimizer</a>, and these optimizers always rely on a corresponding loss function.</p><h2 id="Loss-function"><a class="docs-heading-anchor" href="#Loss-function">Loss function</a><a id="Loss-function-1"></a><a class="docs-heading-anchor-permalink" href="#Loss-function" title="Permalink"></a></h2><p>To train a SympNet, one needs data along a trajectory, so that the model learns to perform an integration step. These data are <span>$(Q,P)$</span>, where <span>$Q[i,j]$</span> (respectively <span>$P[i,j]$</span>) is the real number <span>$q_j(t_i)$</span> (respectively <span>$p_j(t_i)$</span>), i.e. the <span>$j$</span>-th coordinate of the generalized position (respectively momentum) at the <span>$i$</span>-th time step. One also needs a loss function, defined as</p><p class="math-container">\[Loss(Q,P) = \underset{i}{\sum} d(\Phi(Q[i,-],P[i,-]), [Q[i+1,-]\ P[i+1,-]]^T),\]</p><p>where <span>$d$</span> is a distance on <span>$\mathbb{R}^{2d}$</span>: the prediction obtained from the data at time step <span>$i$</span> is compared to the data at time step <span>$i+1$</span>.</p>
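<p>A direct translation of this loss into plain Julia might look as follows. This is a sketch under the assumptions that consecutive rows of <code>Q</code> and <code>P</code> correspond to consecutive time steps, that <code>phi</code> maps one <span>$(q, p)$</span> pair to the next, and that <span>$d$</span> is the Euclidean distance; the name <code>trajectory_loss</code> is hypothetical and not the <code>GeometricMachineLearning.jl</code> API.</p><pre><code class="language-julia"># Sketch of the trajectory loss: compare the network prediction from time
# step i with the data at time step i + 1, summed over the trajectory.
# phi is any map taking (q, p) to the next (q, p), e.g. a SympNet forward pass.
using LinearAlgebra: norm

function trajectory_loss(phi, Q::AbstractMatrix, P::AbstractMatrix)
    loss = 0.0
    for i in 1:(size(Q, 1) - 1)
        q_pred, p_pred = phi(Q[i, :], P[i, :])
        loss += norm(vcat(q_pred, p_pred) .- vcat(Q[i + 1, :], P[i + 1, :]))
    end
    return loss
end</code></pre>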
<p>See the <a href="../../tutorials/sympnet_tutorial/">tutorial section</a> for an introduction to using SympNets with <code>GeometricMachineLearning.jl</code>.</p><h2 id="References"><a class="docs-heading-anchor" href="#References">References</a><a id="References-1"></a><a class="docs-heading-anchor-permalink" href="#References" title="Permalink"></a></h2><div class="citation noncanonical"><dl><dt>[1]</dt><dd><div>P. Jin, Z. Zhang, A. Zhu, Y. Tang and G. E. Karniadakis. <em>SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems</em>. Neural Networks <strong>132</strong>, 166–179 (2020).</div></dd></dl></div><section class="footnotes is-size-7"><ul><li class="footnote" id="footnote-1"><a class="tag is-link" href="#citeref-1">1</a>Note that if <span>$k=1$</span> then the <span>$LA$</span>-SympNet consists of only one linear layer.</li></ul></section></article>

