
Developing random method for conditional autoregression #4518

Closed
ckrapu opened this issue Mar 9, 2021 · 14 comments · Fixed by #4596

Comments

@ckrapu
Contributor

ckrapu commented Mar 9, 2021

Support for conditional autoregressions, AKA Gaussian Markov random fields, was added in PR #4504, but the .random method was left unimplemented because there was no obvious choice of algorithm that would scale well with the field dimension. Some discussion of this can be seen in issue #3689. Efficient methods for doing this may exist, however, and could be implemented.

To be more explicit, the core difficulty is sampling x ~ CAR(mu, tau, alpha, W). The standard way to do this is to take a unit isotropic Gaussian vector z and solve z = Lx, where L is the Cholesky factor of the CAR-structured precision matrix. This operation has cubic complexity in the dimension of x and is therefore a nonstarter for CAR/GMRF vectors with more than a few hundred sites.
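
For reference, a minimal NumPy sketch of this naive dense approach (the function name car_sample_naive and the dense handling of W are purely illustrative, not a proposed implementation):

import numpy as np

def car_sample_naive(mu, tau, alpha, W, rng=None):
    """Naive CAR draw via a dense Cholesky factor; cost grows cubically with the field size."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))          # degree matrix
    Q = tau * (D - alpha * W)           # CAR-structured precision matrix
    L = np.linalg.cholesky(Q)           # Q = L @ L.T  (the cubic-cost step)
    z = rng.standard_normal(len(mu))    # unit isotropic Gaussian vector
    # Solving L.T @ x = z gives Cov(x) = Q^{-1}; then shift by the mean.
    return mu + np.linalg.solve(L.T, z)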

@ckrapu
Contributor Author

ckrapu commented Mar 9, 2021

A potentially viable method is described in the paper Fast sampling of Gaussian Markov random fields by Havard Rue. The basic recipe is as follows:

  1. Sample hyperparameters tau, alpha for the CAR scale and autocorrelation, respectively
  2. Compute a permutation of the rows of adjacency matrix W which minimizes the matrix bandwidth (Scipy method referenced here)
  3. Calculate the Cholesky decomposition L of the precision matrix Q = tau*(D - alpha * W) using efficient routines for banded matrices (Scipy function)
  4. Solve the equation z = Lx for x which is relatively quick since L is triangular. (Scipy function)

Note that step 2 doesn't need to be done for each sample - it only needs to be done once for each value of W.
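
A rough SciPy sketch of steps 2-4, assuming a sparse, symmetric adjacency matrix W (the function name car_sample_banded and the dense intermediate used for band-packing are just for illustration, not the eventual implementation):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.linalg import cholesky_banded, solve_banded

def car_sample_banded(mu, tau, alpha, W, rng=None):
    """Sketch of the banded-Cholesky CAR draw outlined in steps 2-4."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    W = sp.csr_matrix(W)
    n = W.shape[0]

    # Step 2: bandwidth-reducing permutation (depends only on W, so it can be cached).
    perm = reverse_cuthill_mckee(W, symmetric_mode=True)
    Wp = W[perm, :][:, perm]

    # Step 3: banded Cholesky of the permuted precision Q = tau * (D - alpha * W).
    D = sp.diags(np.asarray(Wp.sum(axis=1)).ravel())
    Qp = (tau * (D - alpha * Wp)).tocoo()
    bw = int(np.abs(Qp.row - Qp.col).max())   # bandwidth after permutation
    Qd = Qp.toarray()                         # dense only to keep the sketch short
    ab = np.zeros((bw + 1, n))                # upper banded storage: ab[bw + i - j, j] = Q[i, j]
    for k in range(bw + 1):
        ab[bw - k, k:] = np.diagonal(Qd, offset=k)
    U = cholesky_banded(ab)                   # Q = U.T @ U with U upper triangular, banded

    # Step 4: solve U @ x = z (banded triangular solve), then undo the permutation.
    z = rng.standard_normal(n)
    x = solve_banded((0, bw), U, z)
    y = np.empty(n)
    y[perm] = x
    return mu + y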

@brandonwillard
Contributor

Distribution.random will be removed shortly during the move to v4, so, if anyone is looking to do this, it should be done in the context of an Aesara RandomVariable Op.

@aerubanov
Contributor

Hello @brandonwillard! I'm looking for my first contribution to pymc and am considering this issue. But I'm a little bit confused by your comment about RandomVariables, as I'm not very familiar with the codebase yet. I checked the classes in distributions/multivariate.py on the v4 branch, but found that the code in the random method is commented out. Could you please give an example of using RandomVariable instead of Distribution.random?

@brandonwillard
Contributor

brandonwillard commented Mar 24, 2021

In v4, Distribution.random is effectively replaced with Aesara's RandomVariable.perform (or—more commonly—RandomVariable.rng_fn).

The v4 Multinomial is a reasonable reference, since it provides a custom Aesara MultinomialRV rather than simply reusing an existing RandomVariable. Unfortunately, it's not the best example, because MultinomialRV only extends an existing RandomVariable instead of defining one from scratch.

A more complete example would look as follows:

import numpy as np
import scipy.stats

from typing import List, Tuple

import aesara.tensor as aet

from aesara.graph.op import Op
from aesara.tensor.var import TensorVariable
from aesara.tensor.random.op import RandomVariable

from pymc3.distributions.distribution import Distribution


# Create your own `RandomVariable`...
class BlahRV(RandomVariable):
    name: str = "blah"

    # Provide the number of dimensions for this RV (e.g. `0` for a scalar, `1`
    # for a vector, etc.)
    ndim_supp: int = 0

    # Provide the number of dimensions for each parameter of the RV (e.g. if
    # there's only one vector parameter, `[1]`; for two parameters, one a
    # matrix and the other a scalar, `[2, 0]`; etc.)
    ndims_params: List[int] = [0, 0]

    # The NumPy/Aesara dtype for this RV (e.g. `"int32"`, `"int64"`).
    dtype: str = "floatX"

    # A pretty text and LaTeX representation for the RV
    _print_name: Tuple[str, str] = ("blah", "\\operatorname{blah}")

    # If you want to add a custom signature and default values for the
    # parameters, do it like this.
    def __call__(self, loc=0.0, scale=1.0, size=None, **kwargs) -> TensorVariable:
        return super().__call__(loc, scale, size=size, **kwargs)

    # This is the Python code that produces samples.  Its signature will always
    # start with a NumPy `RandomState` object, then the distribution
    # parameters, and, finally, the size.
    #
    # This is effectively the v4 replacement for `Distribution.random`.
    @classmethod
    def rng_fn(
        cls,
        rng: np.random.RandomState,
        loc: np.ndarray,
        scale: np.ndarray,
        size: Tuple[int, ...],
    ) -> np.ndarray:
        return scipy.stats.blah.rvs(loc, scale, random_state=rng, size=size)


# Create an instance of the `RandomVariable` `Op`...
blah = BlahRV()


# Now, in PyMC3 v4, we need to create a `Distribution` interface for our new
# `RandomVariable`, `BlahRV`.
# Instead of `Distribution`, use a more specific `Distribution` subclass, when
# possible.
class Blah(Distribution):

    # Specify the `RandomVariable` `Op` that this `Distribution` will use.
    rv_op: Op = blah

    def __new__(cls, name, *args, **kwargs):
        # If you want to set a default transform for all instances of this
        # class, this is currently the most direct way to do that.
        kwargs.setdefault("transform", some_transform)
        return super().__new__(cls, name, *args, **kwargs)

    @classmethod
    def dist(cls, loc, scale, *args, **kwargs) -> TensorVariable:
        """Create an instances of `self.rv_op`."""
        # Here, we can transform inputs and whatnot.
        # This will contain all the parameter setup logic from
        # the old `Distribution.__init__`.
        loc = aet.as_tensor_variable(loc)
        scale = aet.as_tensor_variable(scale)
        # The first argument to `super().dist` is a list of the parameters to `rv_op`.
        return super().dist([loc, scale], *args, **kwargs)

    # This is basically one-to-one with the old `Distribution.logp`
    def logp(
        value: TensorVariable, loc: TensorVariable, scale: TensorVariable
    ) -> TensorVariable:
        """Create a log-likelihood graph."""
        return aet.log(value - loc) - scale

@aerubanov
Contributor

@brandonwillard thank you so much for your help!

@canyon289
Member

Some additional details for people reading this, in case it's not obvious. Some of the edits need to occur in Aesara here:
https://github.com/pymc-devs/aesara/blob/master/aesara/tensor/random/basic.py

Others need to occur in the v4 branch of PyMC3, in the appropriate distribution-type module (in this case, the continuous module):

https://github.com/pymc-devs/pymc3/pull/4579/files#diff-f6da6e3a7c05955ac32a065b9eb57831718dd764fc5e92f8fcd98eaf92e1d79a

We only want to add distributions to Aesara that are defined in NumPy:
pymc-devs/pytensor#321 (comment)

@brandonwillard
Contributor

Some of the edits need to occur in Aesara here
https://github.com/pymc-devs/aesara/blob/master/aesara/tensor/random/basic.py

Regarding the things discussed here, there aren't any edits that need to occur in Aesara.

@zoj613
Contributor

zoj613 commented Mar 31, 2021

Is there interest in an Intrinsic Conditional Autoregressive (ICAR) distribution? I have used it in the past for research. There is a paper [1] that formally specifies this distribution.

References

[1] https://www.sciencedirect.com/science/article/abs/pii/S2211675317301574

@brandonwillard
Contributor

as well as a technique by [2] explaining how such a distribution can be sampled from efficiently.

We could definitely use that, and I like its references.

@zoj613
Contributor

zoj613 commented Mar 31, 2021

as well as a technique by [2] explaining how such a distribution can be sampled from efficiently.

We could definitely use that, and I like its references.

Since the first reference is behind a paywall, section 4 of this paper has a nice summary of that reference: https://www.apps.stat.vt.edu/ferreira/Ferreira_2019.pdf

Essentially, the logp of an ICAR random effect can be computed directly, and its limiting distribution is a singular multivariate Gaussian.
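
For concreteness, a small NumPy sketch of that unnormalized log-density (assuming a single connected graph, omitting the sum-to-zero constraint, and using the illustrative name icar_logp):

import numpy as np

def icar_logp(x, W, tau=1.0):
    """Unnormalized ICAR log-density: a pairwise penalty over the edges of W."""
    x = np.asarray(x, dtype=float)
    i, j = np.nonzero(np.triu(W))    # count each undirected edge once
    n = len(x)
    # log p(x | tau) = (n - 1)/2 * log(tau) - tau/2 * sum_{i~j} (x_i - x_j)^2 + const
    return 0.5 * (n - 1) * np.log(tau) - 0.5 * tau * np.sum((x[i] - x[j]) ** 2)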

@fonnesbeck
Member

@zoj613
Contributor

zoj613 commented Mar 31, 2021

Here's the first paper: drive.google.com/file/d/1gMOljbPudJnVVco33izZ3Dzqud7nue5f/view?usp=sharing

Thank you very much.

@conorhassan

conorhassan commented Jul 10, 2022

Hi @zoj613, are you planning on finishing this issue? If not, could I pick it up? Thanks

@twiecki
Member

twiecki commented Jul 10, 2022

@conorhassan It's been long enough, you can pick it up.
