Deprecated ContextWindowDataset.observables attribute? #10

Open
Danfoa opened this issue Mar 28, 2024 · 6 comments
Danfoa commented Mar 28, 2024

I noticed that the BaseModel class makes reference to an observables attribute from the ContextWindowDataset class, which seems to be undefined and undocumented within that class. For example, this issue can be observed here:

def predict(
    self,
    data: ContextWindowDataset,
    t: int = 1,
    predict_observables: bool = True,
    reencode_every: int = 0,
):
    """
    Predicts the state or, if the system is stochastic, its expected value :math:`\mathbb{E}[X_t | X_0 = X]` after ``t`` instants given the initial conditions ``data.lookback(self.lookback_len)`` being the lookback slice of ``data``.
    If ``data.observables`` is not ``None``, returns the analogue quantity for the observable instead.

    Args:
        data (ContextWindowDataset): Dataset of context windows. The lookback window of ``data`` will be used as the initial condition, see the note above.
        t (int): Number of steps in the future to predict (returns the last one).
        predict_observables (bool): Return the prediction for the observables in ``data.observables``, if present. Default to ``True``.
        reencode_every (int): When ``t > 1``, periodically reencode the predictions as described in :footcite:t:`Fathi2023`. Only available when ``predict_observables = False``.

    Returns:
        The predicted (expected) state/observable at time :math:`t`. The result is composed of arrays with shape matching ``data.lookforward(self.lookback_len)`` or the contents of ``data.observables``. If ``predict_observables = True`` and ``data.observables != None``, the returned ``dict`` will contain the special key ``__state__`` containing the prediction for the state as well.
    """

The data.observables attribute is neither defined nor documented in ContextWindowDataset, which leads to confusion. Similarly, the purpose of the predict_observables argument of the base model’s predict method is unclear.

I suggest that this attribute either be clearly defined and documented within the ContextWindowDataset class or removed from the function signature and documentation to avoid confusion. Generally, dynamically adding attributes to class instances without prior definition or documentation is not a recommended practice.

Additionally, it would be beneficial to establish a clear vocabulary for the types of observable functions. Currently, it's not clear whether "observables" refers to state observables, which are scalar/vector-valued functions analytically computed/measured from the dynamical system, or to state latent observables/features, which are scalar/vector-valued functions we learn or define based on state observables.

Does kooplearn have an established nomenclature for distinguishing these types of observables? @pietronvll @vladi-iit @g-turri Should we make this distinction?

Danfoa changed the title from "Update documentation of data.observables" to "Deprecated ContextWindowDataset.observables attribute?" on Mar 28, 2024
pietronvll commented

Hi, @Danfoa! Thanks for opening this issue. Indeed, the observables attribute from ContextWindowDataset is work-in-progress and undocumented.

The idea of the API is that upon creating an instance of ContextWindowDataset, you can also specify a dictionary of observables associated with the states in the context windows.

As of now, the way to create an observables attribute is to dynamically add it to an instantiated ContextWindowDataset as

ctxs = ContextWindowDataset(data)
ctxs.observables = {
  'obs_1': data_obs_1,
  'obs_2': data_obs_2
}

which, as you say, is not good practice. To address this issue, we can start by modifying TensorContextDataset,

class TensorContextDataset(ContextWindowDataset):

by adding an observables property with getter and setter methods, where the setter checks that the first two dimensions of ctxs.data and ctxs.observables['obs_1'] coincide. Recall that the first two dimensions of ctxs.data are (number of contexts, context length).
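
To make the proposal concrete, here is a minimal sketch of such a property (hedged: the _observables backing attribute and the error message are my own, not existing kooplearn code):

class TensorContextDataset(ContextWindowDataset):
    @property
    def observables(self):
        # Defaults to None whenever no observables are attached.
        return getattr(self, "_observables", None)

    @observables.setter
    def observables(self, observables_dict):
        for name, obs in observables_dict.items():
            # The first two dimensions must match (number of contexts, context length).
            if obs.shape[:2] != self.data.shape[:2]:
                raise ValueError(
                    f"Observable '{name}' has leading dimensions {obs.shape[:2]}, "
                    f"expected {self.data.shape[:2]}."
                )
        self._observables = observables_dict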

It is reasonable to initialize the observables property to None whenever observables are not needed. See also this code snippet, which parses the observables dict to perform observables forecasting in the Linear, Nonlinear and Kernel models.

def parse_observables(observables_dict, data: ContextWindow, data_fit: ContextWindow):
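
For context, a hedged sketch of what such parsing might do (this is my reading of the predict docstring above, not kooplearn's actual implementation):

def parse_observables(observables_dict, data, data_fit):
    # Sketch: always forecast the state under the special "__state__" key,
    # then add each user-provided observable after a shape check.
    parsed = {"__state__": data_fit.data}
    if observables_dict is not None:
        for name, obs in observables_dict.items():
            # Leading dimensions must match (number of contexts, context length).
            assert obs.shape[:2] == data_fit.data.shape[:2]
            parsed[name] = obs
    return parsed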

I am assigning this issue to myself and @Danfoa. Let's create a branch out of dev and work from it. @GregoirePacreau might be interested in it as well.

pietronvll added the documentation and enhancement labels on Mar 29, 2024
GregoirePacreau commented Mar 29, 2024

Just to verify: if I have a series

$$(X_i, Y_i)_{i \in [T]}$$

and want to compute

$$E(f(Y_{T+1}) \vert X_{T+1})$$

do I need to have, in the observables dictionary, an array containing $(f(X_i),f(Y_i))_{i\in [T]}$ (in context form)?

What would be the behaviour of the modes method when several observables are provided? Will it give a sequence of modes, one for each observable?

And finally, should we allow functionals in the observables dictionary, or at least have a method that creates the correct array given a functional?

Danfoa commented Mar 29, 2024

Establishing a shared nomenclature for these concepts is crucial from the outset, especially since the term "observables" remains somewhat ambiguous to me within our context.

In the Kooplearn framework, to ensure clarity both among ourselves and for our users, it's essential to have well-defined terminology and documentation differentiating between:

  • State observables: Analytical or estimated measurements from the dynamical system, forming the X and Y data matrices, from which features/kernels, etc., are computed.
  • Latent (state) observables: Latent features or observables learned for latent variable Koopman models.
  • New/unseen/untrained observables: Measurements or observables not included in the state observables but predicted after learning the feature space. Note that predicting these requires first solving a regression problem to define each new observable's projection onto the learned latent-space basis, as hinted at by @GregoirePacreau in a previous comment.

While the terms I've used are suggestions, our aim should be to clearly delineate these categories in our documentation and nomenclature for transparency and ease of understanding.

pietronvll commented

My definitions, which I try to use throughout kooplearn and which also follow the definitions in our papers, are:

States

A state of the dynamical system/stochastic process is usually denoted $x_{t}$ (deterministic dynamics) or $X_{t}$ (stochastic process). States are defined on the state space $\mathcal{X}$, which is usually $\mathbb{R}^{d}$ or a subset thereof. As the name suggests, a state provides the full knowledge of the system at a given instant in time.

@Danfoa, I know that according to the previous description, states are not defined uniquely. For example, any bijective transformation of a state (which in turn is an observable, see below) is again another state. As the algorithms in kooplearn assume perfect observability of the system, they require context windows of states upon fitting. Apart from being states, however, kooplearn does not impose any restriction on their representation.

In short: a state in kooplearn is any variable giving a full description of the system. Context windows are sequences of observed states, and they are used for fitting and inference with kooplearn models.

Observables

Observables are arbitrary functions of states $f : \mathcal{X} \to \mathbb{R}^{p}$. Observables may or may not describe the dynamical system/stochastic process completely. Observables can be used as states if they give a full description. If they do not provide a complete description, however, they should not be used as states (that is, to fit kooplearn models), as it is known that partially observed Markov processes are not Markovian in general.

Given these definitions, we can further distinguish between

  1. Learned observables: functions of the state provided by data-driven algorithms. These can be e.g. the eigenfunctions of a kooplearn.models.Kernel model or the learned DPNets feature map kooplearn.models.feature_maps.NNFeatureMap.
  2. Measurements: functions of the state which can be measured experimentally. These can be e.g. the COM velocity of a robot, the volatility of a stock, or the average energy of a molecule.

To answer @GregoirePacreau's question:

  1. Yes, the observable dictionary should contain a tensor with the evaluation of $f$ on the same states contained in the context window.
  2. The modes function now returns a tuple (modes_dict, eigs) containing a dictionary with the same structure as ctxs.observables and an array with the eigenvalues corresponding to each of the modes.

return results, _eigs

To take the $i$-th mode of the observable f you should use

ctxs = TensorContextDataset( ... )
ctxs.observables = {
    'f': data_obs_f
}
modes, eigs = model.modes(ctxs)
modes_f = modes['f']

# i-th mode of f
modes_f[i]

pietronvll commented

Something I just noticed that should be added to this PR:

Handling observables is a bit awkward: calling predict(test_data, ...) (or modes) looks for a dict-like observables attribute within test_data, and this dict-like attribute should contain raw numpy arrays of the observables evaluated on the training data.

This should be fixed: observables should be defined on the training data from the get-go, and the dict-like attribute should contain TensorContextDatasets of observables instead of raw numpy arrays.
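
A hedged sketch of what the fixed usage could look like (variable names hypothetical):

train_ctxs = TensorContextDataset(train_data)
# Observables attached to the training data from the get-go, stored as
# TensorContextDataset instances rather than raw numpy arrays.
train_ctxs.observables = {
    'f': TensorContextDataset(f_evaluated_on_train_data),
}
model.fit(train_ctxs)
# At predict time, no observables attribute on the test data is needed:
pred = model.predict(test_ctxs, predict_observables=True)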

Danfoa commented Apr 2, 2024

Have a look at this DataClass, which I use to store observations (including state) from a given Markov dynamical system.

The key idea is to keep observables separate and named, for instance:

  • Kinetic Energy: tensor of shape (time_horizon, 1)
  • CoM momentum: tensor of shape (time_horizon, 3)
  • Temperature: tensor of shape (time_horizon, 1)

If you want all or some of the observables stacked, you get a view of the data. In general, this added structure, knowing which parts of the state are vectors, scalars, etc., is needed to handle the system's symmetries. Adding this structure to a ContextWindowLike class can enable us to log relevant additional metrics, such as the error of each of these observables independently (if requested).

That DataClass has become quite helpful for me; maybe there is something there we can adapt to kooplearn.
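
A hedged sketch of how the idea could be adapted (names are hypothetical; note np.concatenate returns a copy, so a true view would require a shared backing buffer):

from dataclasses import dataclass, field
import numpy as np

@dataclass
class SystemObservations:
    # Named observables, each an array of shape (time_horizon, obs_dim),
    # e.g. {'kinetic_energy': (T, 1), 'com_momentum': (T, 3)}.
    observables: dict[str, np.ndarray] = field(default_factory=dict)

    def stacked(self, names=None):
        # Stack all/some observables along the feature axis.
        names = list(self.observables) if names is None else names
        return np.concatenate([self.observables[n] for n in names], axis=-1)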

In general, to solve the problem of the five inheritance classes, I propose introducing a single ContextWindowLike class which is either a wrapper around or directly inherited from Tensor/ndarray, similar in spirit to GeometricTensor, which wraps a tensor and offers some additional structural attributes and methods. From what I can see, the idea behind introducing the TensorContextDataset class, which is the main interaction point of torch with kooplearn's data paradigm, is to:

  1. Allow the user to get the past_horizon/present/lookback and future/prediction_horizon/lookforward windows without handcrafting the time indexing at every point.
  2. Design pipelines based on processing "trajectories/sequences of observables" instead of individual states.
  3. Keep track of the context_len and (ideally) the features_shape.
  4. Automatically handle the backend.

In practice, when using torch, TensorContextDataset is already being used as a wrapper for a "trajectory" Tensor/ndarray (correct me if I'm wrong). I think we can design a Tensor wrapper class, in the spirit of GeometricTensor, which covers these four features while still being processed as a tensor by the myriad of torch-native functions that expect Tensors instead of TensorContextDataset.
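
A hedged sketch of such a wrapper as a torch.Tensor subclass (my own sketch, not kooplearn code; note that custom attributes such as context_length are not automatically propagated through tensor operations, which a full implementation would need to handle):

import torch

class ContextWindowTensor(torch.Tensor):
    # Minimal sketch of a Tensor subclass carrying context metadata,
    # in the spirit of GeometricTensor.
    @staticmethod
    def __new__(cls, data, context_length):
        tensor = torch.as_tensor(data).as_subclass(cls)
        tensor.context_length = context_length
        return tensor

    def lookback(self, lookback_len):
        # First lookback_len steps of every context window.
        return self[:, :lookback_len, ...]

    def lookforward(self, lookback_len):
        # Remaining steps after the lookback window.
        return self[:, lookback_len:, ...]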

I don't see the need for the five levels of abstraction, which in practice are, or will become, problematic.
