An inherent challenge for many practical PDE solvers is the large dimensionality of the resulting problems.
Our model is typically discretized with $O(n^3)$ spatial samples for a three-dimensional problem (with $n$ samples along each axis), and time-dependent phenomena additionally require a discretization in time, which quickly leads to very large numbers of unknowns.
One popular way to reduce the complexity is to map a spatial state of our system $\mathbf{s}_t \in \mathbb{R}^{d}$ into a much lower dimensional state $\mathbf{c}_t \in \mathbb{R}^{m}$, with $m \ll d$, and to perform the time evolution in this reduced, so-called latent space.
However, it is crucial that the encoder and decoder do a good job at reducing the dimensionality of the problem, which makes this a very good task for DL approaches. Furthermore, we then need a model for the time evolution of the latent space states, as proposed, e.g., in {cite}`wiewel2019lss` & {cite}`wiewel2020lsssubdiv`, which in turn employs the encoder/decoder of Kim et al. {cite}`bkim2019deep`.
Figure `timeseries-lsp-overview`: For time series predictions with ROMs, we encode the state of our system with an encoder $f_e$, predict the time evolution with $f_t$, and then decode the full spatial information with a decoder $f_d$.
Reducing the dimension and complexity of computational models, often called reduced order modeling (ROM) or model reduction, is a classic topic in computational science. Traditional approaches often employ techniques such as principal component analysis (PCA) to arrive at a basis for a chosen space of solutions. However, being linear by construction, these approaches have inherent limitations when it comes to representing complex, non-linear solution manifolds. In practice, all "interesting" solutions are highly non-linear, and hence DL has received a substantial amount of interest as a way to learn non-linear representations. Due to the non-linearity, DL representations can potentially yield high accuracy with fewer degrees of freedom in the reduced model compared to classic approaches.
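To make the contrast concrete, the following minimal sketch (an illustration of the classic linear approach, not code from the works cited here) builds a linear reduced basis via an SVD of a snapshot matrix, i.e., the proper orthogonal decomposition underlying many PCA-based ROMs; the array sizes and the choice of 16 modes are assumptions.

```python
import numpy as np

# Snapshot matrix: each column is one flattened simulation state s_t
# (illustrative assumption: d = 4096 spatial unknowns, 200 snapshots).
d, num_snapshots, m = 4096, 200, 16
S = np.random.rand(d, num_snapshots)            # stand-in for real simulation data

S_mean = S.mean(axis=1, keepdims=True)          # center the snapshots
U, sigma, _ = np.linalg.svd(S - S_mean, full_matrices=False)
basis = U[:, :m]                                # first m POD/PCA modes, shape (d, m)

# Linear "encoder" and "decoder": project onto the modes and reconstruct.
def encode(s):
    return basis.T @ (s - S_mean[:, 0])         # latent code c, shape (m,)

def decode(c):
    return S_mean[:, 0] + basis @ c             # approximate full state, shape (d,)

s = S[:, 0]
print("reconstruction error:", np.linalg.norm(decode(encode(s)) - s))
```

Since both the projection and the reconstruction are linear maps, all representable states live in an $m$-dimensional linear subspace, which is exactly the limitation that motivates learned, non-linear encoders and decoders.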
The canonical NN for reduced models is an autoencoder. This denotes a network whose sole task is to reconstruct a given input $\mathbf{s}$ while passing it through a low-dimensional bottleneck, the latent space: the encoder $f_e$ with weights $\theta_e$ compresses the input into a code $\mathbf{c} = f_e(\mathbf{s};\theta_e)$, and the decoder $f_d$ with weights $\theta_d$ reconstructs the full state from it. Both parts are trained jointly to minimize the reconstruction error $| f_d( f_e( \mathbf{s} ;\theta_e) ;\theta_d) - \mathbf{s} |_2^2$.
Autoencoder networks are typically realized as stacks of convolutional layers.
While the details of these layers can be chosen flexibly, a key property of all
autoencoder architectures is that no connection between the encoder and decoder parts may
exist. Hence, the network has to be separable into an encoder and a decoder.
This is natural, as any connections (or information) shared between encoder and decoder
would prevent using the encoder or decoder in a standalone manner. E.g., the decoder has to be able to reconstruct a full state purely from a latent space code $\mathbf{c}$.
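As an illustration, here is a minimal convolutional autoencoder sketch in TensorFlow/Keras; the $64\times 64$ single-channel grid and the latent size of 32 are assumed values for this example, not taken from the works cited above. Encoder and decoder are built as two separate models, so each can also be used on its own.

```python
import tensorflow as tf

latent_dim = 32  # size m of the latent space (illustrative choice)

# Encoder f_e: convolutions that successively reduce the spatial resolution.
encoder = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 1)),
    tf.keras.layers.Conv2D(16, 4, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(32, 4, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2D(64, 4, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(latent_dim),            # latent code c
], name="f_e")

# Decoder f_d: mirrors the encoder with transposed convolutions.
decoder = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    tf.keras.layers.Dense(8 * 8 * 64, activation="relu"),
    tf.keras.layers.Reshape((8, 8, 64)),
    tf.keras.layers.Conv2DTranspose(32, 4, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2DTranspose(16, 4, strides=2, padding="same", activation="relu"),
    tf.keras.layers.Conv2DTranspose(1, 4, strides=2, padding="same"),
], name="f_d")

# The autoencoder is simply the composition f_d(f_e(s)); no skip connections exist.
autoencoder = tf.keras.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(states, states, ...)  # train to reconstruct the input states
```

Calling `encoder(s)` then yields the latent code $\mathbf{c}$ for a state, and `decoder(c)` reconstructs a full state from it.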
One popular variant of autoencoders is worth mentioning here: the so-called variational autoencoders, or VAEs. These autoencoders follow the structure above, but additionally employ a loss term to shape the latent space of $\mathbf{c}$ such that it follows a chosen prior distribution. Typically, a normal distribution is used as target, which encourages each dimension of the latent space to have zero mean and unit standard deviation.
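A rough sketch of the additional ingredients, assuming the encoder outputs a mean and a log-variance per latent dimension; the function name and the weighting factor `beta` are illustrative choices, not part of a specific reference implementation.

```python
import tensorflow as tf

def vae_latent_sample_and_kl(z_mean, z_log_var):
    """Reparameterization trick plus the KL term that pulls the latent
    distribution towards a standard normal N(0, I)."""
    eps = tf.random.normal(tf.shape(z_mean))
    z = z_mean + tf.exp(0.5 * z_log_var) * eps                # sampled latent code c
    kl = -0.5 * tf.reduce_mean(
        tf.reduce_sum(1.0 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1))
    return z, kl

# During training, the KL term is simply added to the reconstruction loss:
# loss = mse(f_d(z), s) + beta * kl, with beta weighting the regularizer.
```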
The goal of the temporal prediction is to compute a latent space state at time $t+1$ given the latent state at time $t$:

$$ \text{arg min}_{\theta_p} | f_p( \mathbf{c}_{t} ;\theta_p) - \mathbf{c}_{t+1} |_2^2 $$

where the prediction network is denoted by $f_p$, with weights $\theta_p$.
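As a sketch of this step, assume the frames of a simulation have already been encoded into latent codes; the randomly filled `latents` array below is a stand-in for the real $f_e(\mathbf{s}_t)$ data, and the network sizes are illustrative.

```python
import tensorflow as tf

latent_dim = 32
# Hypothetical pre-computed latent codes c_0 ... c_T of a simulation sequence.
latents = tf.random.normal([1000, latent_dim])            # stand-in for f_e(s_t)

c_t, c_next = latents[:-1], latents[1:]                   # training pairs (c_t, c_{t+1})

# f_p: a small fully connected network operating purely in latent space.
f_p = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(latent_dim),
])
f_p.compile(optimizer="adam", loss="mse")                 # the L2 objective from above
f_p.fit(c_t, c_next, epochs=10, batch_size=64)

# At inference time, long rollouts stay entirely in the cheap latent space:
c = latents[:1]
for _ in range(100):
    c = f_p(c)                                            # c_{t+1} = f_p(c_t)
```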
In classical dynamical systems literature, a data-driven prediction of future states
is typically formulated in terms of the so-called _Koopman operator_, which usually takes
the form of a matrix, i.e. uses a linear approach.
Traditional works have focused on obtaining good _Koopman operators_ that yield
high accuracy in combination with a basis to span the space of solutions. In the approach
outlined above, the $f_p$ network can be seen as a non-linear Koopman operator.
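For comparison, such a linear operator can be fitted to data directly with a least-squares solve; the following minimal sketch uses hypothetical latent codes and is not the procedure of any specific work cited here.

```python
import numpy as np

m = 32
C = np.random.rand(1000, m)          # hypothetical latent codes c_0 ... c_T, one per row

# Find the matrix K that best maps each c_t onto c_{t+1} in the least-squares sense,
# i.e. a linear stand-in for the prediction network f_p.
K, *_ = np.linalg.lstsq(C[:-1], C[1:], rcond=None)

prediction = C[:-1] @ K              # linear one-step predictions
error = np.linalg.norm(prediction - C[1:]) / np.linalg.norm(C[1:])
print("relative one-step error of the linear operator:", error)
```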
In order for this approach to work, we either need an appropriate history of previous states to uniquely identify the right next state, or our network has to internally store the previous history of states it has seen.
For the former variant, the prediction network $f_p$ receives a window of several previous latent states, e.g. $\mathbf{c}_{t-k}, \dots, \mathbf{c}_{t}$, as input; for the latter, recurrent architectures such as LSTMs are a natural choice, as they carry an internal state from one step to the next.
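A brief sketch of the recurrent variant, where a short history of $k$ latent codes is processed by an LSTM; the data is again a hypothetical stand-in and all sizes are illustrative.

```python
import tensorflow as tf

latent_dim, k = 32, 6               # illustrative: predict c_{t+1} from the last k codes

# Recurrent predictor: consumes a short history of latent codes and outputs the next one.
f_p = tf.keras.Sequential([
    tf.keras.Input(shape=(k, latent_dim)),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(latent_dim),
])

# Hypothetical training data: windows of k consecutive codes and their successors.
latents = tf.random.normal([1000, latent_dim])
windows = tf.stack([latents[i:i + k] for i in range(1000 - k)])   # [N, k, latent_dim]
targets = latents[k:]                                             # [N, latent_dim]

f_p.compile(optimizer="adam", loss="mse")
f_p.fit(windows, targets, epochs=5, batch_size=64)
```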
In the formulation above we have clearly split the encoding/decoding and the time prediction parts. However, in practice an end-to-end training of all networks involved in a certain task is usually preferable, as the networks can adjust their behavior in accordance with the other components involved in the task.
For the time prediction, we can formulate the objective in terms of the full states:

$$ \text{arg min}_{\theta_e,\theta_p,\theta_d} | f_d( f_p( f_e( \mathbf{s}_{t} ;\theta_e) ;\theta_p) ;\theta_d) - \mathbf{s}_{t+1} |_2^2 $$

Ideally, this step is furthermore unrolled over multiple time steps to stabilize the evolution. The resulting training will be significantly more expensive, as more weights need to be trained at once, and a much larger number of intermediate states needs to be processed. However, the increased cost typically pays off with a reduced overall inference error.
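A rough sketch of one such end-to-end training step with a short unrollment is shown below; the small dense stand-in networks and all sizes are placeholders for illustration, and in practice they would be the convolutional encoder/decoder and the latent predictor from the sketches above.

```python
import tensorflow as tf

# Stand-in networks so the sketch is self-contained; in practice these would be the
# convolutional encoder/decoder f_e, f_d and the latent predictor f_p from above.
d, latent_dim, unroll_steps = 4096, 32, 3
encoder = tf.keras.Sequential([tf.keras.Input(shape=(d,)), tf.keras.layers.Dense(latent_dim)])
decoder = tf.keras.Sequential([tf.keras.Input(shape=(latent_dim,)), tf.keras.layers.Dense(d)])
f_p     = tf.keras.Sequential([tf.keras.Input(shape=(latent_dim,)), tf.keras.layers.Dense(latent_dim)])

optimizer = tf.keras.optimizers.Adam(1e-4)
variables = encoder.trainable_variables + f_p.trainable_variables + decoder.trainable_variables

@tf.function
def train_step(s_batch):                      # s_batch: [batch, unroll_steps + 1, d]
    with tf.GradientTape() as tape:
        c = encoder(s_batch[:, 0])            # encode only the first frame
        loss = 0.0
        for i in range(1, unroll_steps + 1):
            c = f_p(c)                        # advance in latent space
            loss += tf.reduce_mean(tf.square(decoder(c) - s_batch[:, i]))
    grads = tape.gradient(loss, variables)    # gradients flow through all three networks
    optimizer.apply_gradients(zip(grads, variables))
    return loss

loss = train_step(tf.random.normal([8, unroll_steps + 1, d]))   # one step on dummy data
```

Because the gradient tape spans encoder, latent predictor and decoder, all three sets of weights receive gradients from the unrolled prediction error.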
Figure `timeseries-lss-subdiv-prediction`: Several time frames of an example prediction from {cite}`wiewel2020lsssubdiv`, which additionally couples the
learned time evolution with a numerically solved advection step.
The learned prediction is shown at the top, the reference simulation at the bottom.
To summarize, DL allows us to move from linear subspaces to non-linear manifolds, and provides a basis for performing complex steps (such as time evolutions) in the resulting latent space.
For practical experiments in this area of deep learning, we can recommend this latent space simulation code, which realizes an end-to-end training for encoding and prediction. Alternatively, this learned model reduction code focuses on the encoding and decoding aspects.
Both are available as open source and use a combination of TensorFlow and mantaflow as DL and fluid simulation frameworks.