Symplectic autoencoders are a type of neural network suitable for treating Hamiltonian parametrized PDEs with slowly decaying Kolmogorov $n$-width. They are based on proper symplectic decomposition (PSD) and symplectic neural networks (SympNets).
Hamiltonian PDEs are partial differential equations that, like their ODE counterparts, have a Hamiltonian associated with them. An example of this is the linear wave equation (see (Buchfink et al., 2023)) with Hamiltonian
\[\mathcal{H}(q, p; \mu) := \frac{1}{2}\int_\Omega\mu^2(\partial_\xi{}q(t,\xi;\mu))^2 + p(t,\xi;\mu)^2d\xi.\]
The PDE corresponding to this Hamiltonian can be obtained in the same way as in the ODE case:
\[\partial_t{}q(t,\xi;\mu) = \frac{\delta{}\mathcal{H}}{\delta{}p} = p(t,\xi;\mu), \quad \partial_t{}p(t,\xi;\mu) = -\frac{\delta{}\mathcal{H}}{\delta{}q} = \mu^2\partial_{\xi{}\xi}q(t,\xi;\mu)\]
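For completeness, the sign in the second equation follows from the variational derivative of $\mathcal{H}$ with respect to $q$, obtained via integration by parts (assuming vanishing boundary terms):
\[\delta\mathcal{H} = \int_\Omega\mu^2\partial_\xi{}q\,\partial_\xi(\delta{}q)\,d\xi = -\int_\Omega\mu^2\partial_{\xi\xi}q\,\delta{}q\,d\xi \quad\Longrightarrow\quad \frac{\delta\mathcal{H}}{\delta{}q} = -\mu^2\partial_{\xi\xi}q.\]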
As with regular parametric PDEs, we also associate a solution manifold with Hamiltonian PDEs: a finite-dimensional manifold on which the dynamics of the full system can be described through a Hamiltonian ODE.
As with any other reduced order modeling technique, we first discretize the PDE. This should be done with a structure-preserving scheme, yielding a (high-dimensional) Hamiltonian ODE. Discretizing the wave equation above with finite differences yields a Hamiltonian system:
\[\mathcal{H}_\mathrm{discr}(z(t;\mu);\mu) := \frac{1}{2}z(t;\mu)^T\begin{bmatrix} -\mu^2D_{\xi{}\xi} & \mathbb{O} \\ \mathbb{O} & \mathbb{I} \end{bmatrix} z(t;\mu),\]
where $D_{\xi{}\xi}$ is the finite-difference approximation of $\partial_{\xi\xi}$ and $z = (q, p)^T$.
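As a concrete illustration, here is a minimal NumPy sketch of assembling this discrete Hamiltonian with central finite differences; the grid size, Dirichlet boundary conditions, and function names are assumptions for illustration, not taken from the text:

```python
import numpy as np

def discrete_hamiltonian_matrix(N, mu, dx):
    """Assemble K such that H_discr(z) = 0.5 * z^T K z for the wave equation."""
    # Central second-difference operator D_xixi on N interior grid points
    # (homogeneous Dirichlet boundaries assumed).
    D = (np.diag(-2.0 * np.ones(N))
         + np.diag(np.ones(N - 1), 1)
         + np.diag(np.ones(N - 1), -1)) / dx**2
    # Block matrix [[-mu^2 D, 0], [0, I]] acting on z = (q, p).
    return np.block([
        [-mu**2 * D,       np.zeros((N, N))],
        [np.zeros((N, N)), np.eye(N)],
    ])

def hamiltonian(z, K):
    return 0.5 * z @ K @ z
```

Since $-D_{\xi\xi}$ is positive definite under these boundary conditions, the resulting quadratic form is positive for nonzero $z$.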
In Hamiltonian reduced order modeling we try to find a symplectic submanifold of the solution space that captures the dynamics of the full system as well as possible.
Similar to the regular PDE case we again build an encoder $\Psi^\mathrm{enc}$ and a decoder $\Psi^\mathrm{dec}$; but now both these mappings are required to be symplectic!
Concretely this means:
- The encoder is a mapping from a high-dimensional symplectic space to a low-dimensional symplectic space, i.e. $\Psi^\mathrm{enc}:\mathbb{R}^{2N}\to\mathbb{R}^{2n}$ such that $\nabla\Psi^\mathrm{enc}\mathbb{J}_{2N}(\nabla\Psi^\mathrm{enc})^T = \mathbb{J}_{2n}$.
- The decoder is a mapping from a low-dimensional symplectic space to a high-dimensional symplectic space, i.e. $\Psi^\mathrm{dec}:\mathbb{R}^{2n}\to\mathbb{R}^{2N}$ such that $(\nabla\Psi^\mathrm{dec})^T\mathbb{J}_{2N}\nabla\Psi^\mathrm{dec} = \mathbb{J}_{2n}$.
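For linear maps the Jacobian is the matrix itself, so both conditions can be checked numerically. A sketch (the function names and the use of NumPy are illustrative assumptions):

```python
import numpy as np

def J(k):
    """Poisson matrix J_{2k} = [[0, I], [-I, 0]]."""
    return np.block([[np.zeros((k, k)), np.eye(k)],
                     [-np.eye(k),       np.zeros((k, k))]])

def is_symplectic_decoder(A, n, N, tol=1e-10):
    """A: (2N, 2n) matrix; checks A^T J_{2N} A = J_{2n}."""
    return np.allclose(A.T @ J(N) @ A, J(n), atol=tol)

def is_symplectic_encoder(B, n, N, tol=1e-10):
    """B: (2n, 2N) matrix; checks B J_{2N} B^T = J_{2n}."""
    return np.allclose(B @ J(N) @ B.T, J(n), atol=tol)
```

For instance, a block-diagonal matrix built from a $\Phi$ with orthonormal columns (the cotangent lift discussed below) satisfies the decoder condition, and its transpose satisfies the encoder condition.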
If these two maps are constrained to be linear, then good solutions can readily be found with proper symplectic decomposition (PSD).
For PSD the two mappings $\Psi^\mathrm{enc}$ and $\Psi^\mathrm{dec}$ are constrained to be linear, orthonormal (i.e. $\Psi^T\Psi = \mathbb{I}$) and symplectic. The easiest way to enforce this is through the so-called cotangent lift:
\[\Psi_\mathrm{CL} =
\begin{bmatrix} \Phi & \mathbb{O} \\ \mathbb{O} & \Phi \end{bmatrix},\]
where $\Phi\in{}St(n,N)\subset\mathbb{R}^{N\times{}n}$, i.e. $\Phi$ is an element of the Stiefel manifold. If the snapshot matrix is of the form:
\[M = \left[\begin{array}{c:c:c:c}
\hat{q}_1(t_0) & \hat{q}_1(t_1) & \quad\ldots\quad & \hat{q}_1(t_f) \\
\hat{q}_2(t_0) & \hat{q}_2(t_1) & \ldots & \hat{q}_2(t_f) \\
\vdots & \vdots & \ddots & \vdots \\
\hat{q}_N(t_0) & \hat{q}_N(t_1) & \ldots & \hat{q}_N(t_f) \\
\hat{p}_1(t_0) & \hat{p}_1(t_1) & \ldots & \hat{p}_1(t_f) \\
\hat{p}_2(t_0) & \hat{p}_2(t_1) & \ldots & \hat{p}_2(t_f) \\
\vdots & \vdots & \ddots & \vdots \\
\hat{p}_{N}(t_0) & \hat{p}_{N}(t_1) & \ldots & \hat{p}_{N}(t_f) \\
\end{array}\right],\]
then $\Phi$ can be computed in a very straightforward manner:
- Rearrange the rows of the matrix $M$ such that we end up with an $N\times2(f+1)$ matrix: $\hat{M} := [M_q, M_p]$, where $M_q$ and $M_p$ are the upper ($q$) and lower ($p$) halves of $M$.
- Perform an SVD: $\hat{M} = U\Sigma{}V^T$; set $\Phi\gets{}U\mathtt{[:,1:n]}$, i.e. the first $n$ columns of $U$.
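The two steps above can be sketched in a few lines of NumPy (function name and conventions are illustrative assumptions):

```python
import numpy as np

def cotangent_lift(M, n):
    """Compute the cotangent-lift PSD matrix from a snapshot matrix M.

    M has 2N rows (q-rows stacked over p-rows) and f+1 snapshot columns.
    """
    N = M.shape[0] // 2
    M_q, M_p = M[:N, :], M[N:, :]
    # Step 1: rearrange into an N x 2(f+1) matrix.
    M_hat = np.hstack([M_q, M_p])
    # Step 2: SVD; keep the n leading left singular vectors.
    U, _, _ = np.linalg.svd(M_hat, full_matrices=False)
    Phi = U[:, :n]
    # Assemble the block-diagonal lift [[Phi, 0], [0, Phi]].
    Z = np.zeros((N, n))
    return np.block([[Phi, Z], [Z, Phi]])
```

Because $U$ has orthonormal columns, the resulting $\Psi_\mathrm{CL}$ is orthonormal and symplectic by construction.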
For details on the cotangent lift (and other methods for linear symplectic model reduction) consult (Peng and Mohseni, 2016).
PSD suffers from shortcomings similar to those of regular POD: it is a linear map and the approximation space $\tilde{\mathcal{M}}= \{\Psi^\mathrm{dec}(z_r)\in\mathbb{R}^{2N}:z_r\in\mathbb{R}^{2n}\}$ is strictly linear. For problems with slowly decaying Kolmogorov $n$-width this leads to very poor approximations.
In order to overcome this difficulty we use neural networks, more specifically SympNets, together with cotangent lift-like matrices. The resulting architecture, the symplectic autoencoder, is demonstrated in the following image:
So we alternate between SympNet and PSD layers. Because all the PSD layers are based on matrices $\Phi\in{}St(n,N)$ we have to optimize on the Stiefel manifold.
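One step of such a Stiefel-manifold optimization can be sketched as follows; this is a hedged illustration using the standard tangent-space projection and a QR-based retraction, not the optimizer actually used for symplectic autoencoders:

```python
import numpy as np

def stiefel_step(Phi, euclid_grad, lr=0.01):
    """One gradient step on St(n, N), keeping Phi^T Phi = I.

    Phi: (N, n) with orthonormal columns; euclid_grad: Euclidean gradient.
    """
    # Project the Euclidean gradient onto the tangent space at Phi:
    # G - Phi * sym(Phi^T G).
    PtG = Phi.T @ euclid_grad
    riem_grad = euclid_grad - Phi @ (PtG + PtG.T) / 2
    # Retract back onto the manifold via a QR decomposition.
    Q, R = np.linalg.qr(Phi - lr * riem_grad)
    # Fix column signs so the retraction is uniquely defined.
    Q = Q * np.sign(np.sign(np.diag(R)) + 0.5)
    return Q
```

The QR retraction guarantees the updated matrix again has orthonormal columns, so every PSD-like layer stays on the Stiefel manifold throughout training.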
- Buchfink, Patrick, Silke Glas, and Bernard Haasdonk. "Symplectic model reduction of Hamiltonian systems on nonlinear manifolds and approximation with weakly symplectic autoencoder." SIAM Journal on Scientific Computing 45.2 (2023): A289-A311.
- Peng, Liqian, and Kamran Mohseni. "Symplectic model reduction of Hamiltonian systems." SIAM Journal on Scientific Computing 38.1 (2016): A1-A27.