
Developing random method for conditional autoregression #4518

Closed
ckrapu opened this issue Mar 9, 2021 · 14 comments · Fixed by #4596

Comments

@ckrapu
Contributor

ckrapu commented Mar 9, 2021

Support for conditional autoregressions, AKA Gaussian Markov random fields, was added in PR #4504, but the .random method was left unimplemented because there was no obvious choice of algorithm that would scale well with the field dimension. Some discussion of this can be seen in issue #3689. Efficient methods for doing this may exist, however, and could be implemented.

To be more explicit, the core difficulty is sampling x ~ CAR(mu, tau, alpha, W). The standard way to do this is to take a unit isotropic Gaussian vector z and solve z = Lx, where L is the Cholesky factor of the CAR-structured precision matrix. This operation has cubic complexity in the dimension of x and is therefore a nonstarter for CAR/GMRF vectors with more than a few hundred sites.
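
For reference, a minimal NumPy sketch of this naive dense approach (the function name car_sample_naive and the dense handling of W are purely illustrative, not a proposed implementation):

import numpy as np

def car_sample_naive(mu, tau, alpha, W, rng=None):
    """Naive CAR draw via a dense Cholesky factor; cost grows cubically with the field size."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))          # degree matrix
    Q = tau * (D - alpha * W)           # CAR-structured precision matrix
    L = np.linalg.cholesky(Q)           # Q = L @ L.T  (the cubic-cost step)
    z = rng.standard_normal(len(mu))    # unit isotropic Gaussian vector
    # Solving L.T @ x = z gives Cov(x) = Q^{-1}; then shift by the mean.
    return mu + np.linalg.solve(L.T, z)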

@ckrapu
Contributor Author

ckrapu commented Mar 9, 2021

A potentially viable method is described in the paper Fast sampling of Gaussian Markov random fields by Havard Rue. The basic recipe is as follows:

  1. Sample hyperparameters tau, alpha for the CAR scale and autocorrelation, respectively
  2. Compute a permutation of the rows of adjacency matrix W which minimizes the matrix bandwidth (Scipy method referenced here)
  3. Calculate the Cholesky decomposition L of the precision matrix Q = tau*(D - alpha * W) using efficient routines for banded matrices (Scipy function)
  4. Solve the equation z = Lx for x which is relatively quick since L is triangular. (Scipy function)

Note that step 2 doesn't need to be done for each sample - it only needs to be done once for each value of W.
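
A rough SciPy sketch of steps 2-4, assuming a sparse, symmetric adjacency matrix W (the function name car_sample_banded and the dense intermediate used for band-packing are just for illustration, not the eventual implementation):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee
from scipy.linalg import cholesky_banded, solve_banded

def car_sample_banded(mu, tau, alpha, W, rng=None):
    """Sketch of the banded-Cholesky CAR draw outlined in steps 2-4."""
    rng = np.random.default_rng() if rng is None else rng
    mu = np.asarray(mu, dtype=float)
    W = sp.csr_matrix(W)
    n = W.shape[0]

    # Step 2: bandwidth-reducing permutation (depends only on W, so it can be cached).
    perm = reverse_cuthill_mckee(W, symmetric_mode=True)
    Wp = W[perm, :][:, perm]

    # Step 3: banded Cholesky of the permuted precision Q = tau * (D - alpha * W).
    D = sp.diags(np.asarray(Wp.sum(axis=1)).ravel())
    Qp = (tau * (D - alpha * Wp)).tocoo()
    bw = int(np.abs(Qp.row - Qp.col).max())   # bandwidth after permutation
    Qd = Qp.toarray()                         # dense only to keep the sketch short
    ab = np.zeros((bw + 1, n))                # upper banded storage: ab[bw + i - j, j] = Q[i, j]
    for k in range(bw + 1):
        ab[bw - k, k:] = np.diagonal(Qd, offset=k)
    U = cholesky_banded(ab)                   # Q = U.T @ U with U upper triangular, banded

    # Step 4: solve U @ x = z (banded triangular solve), then undo the permutation.
    z = rng.standard_normal(n)
    x = solve_banded((0, bw), U, z)
    y = np.empty(n)
    y[perm] = x
    return mu + y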

@brandonwillard
Contributor

Distribution.random will be removed shortly during the move to v4, so, if anyone is looking to do this, it should be done in the context of an Aesara RandomVariable Op.

@aerubanov
Contributor

Hello @brandonwillard! I'm looking for my first contribution to pymc and am considering this issue. But I'm a little bit confused by your comment about RandomVariables, as I'm not very familiar with the codebase yet. I checked the classes in distributions/multivariate.py on the v4 branch, but found that the code in the random method is commented out. Could you please give an example of using RandomVariable instead of Distribution.random?

@brandonwillard
Contributor

brandonwillard commented Mar 24, 2021

In v4, Distribution.random is effectively replaced with Aesara's RandomVariable.perform (or—more commonly—RandomVariable.rng_fn).

The v4 Multinomial is a reasonable reference, since it provides a custom Aesara MultinomialRV rather than simply reusing an existing RandomVariable. Unfortunately, it's not the best example, because MultinomialRV only extends an existing RandomVariable instead of defining one from scratch.

A more complete example would look as follows:

import numpy as np
import scipy.stats

from typing import List, Tuple

import aesara.tensor as aet

from aesara.graph.op import Op
from aesara.tensor.var import TensorVariable
from aesara.tensor.random.op import RandomVariable

from pymc3.distributions.distribution import Distribution


# Create your own `RandomVariable`...
class BlahRV(RandomVariable):
    name: str = "blah"

    # Provide the number of dimensions for this RV (e.g. `0` for a scalar, `1`
    # for a vector, etc.)
    ndim_supp: int = 0

    # Provide the number of dimensions for each parameter of the RV (e.g. if
    # there's only one vector parameter, `[1]`; for two parameters, one a
    # matrix and the other a scalar, `[2, 0]`; etc.)
    ndims_params: List[int] = [0, 0]

    # The NumPy/Aesara dtype for this RV (e.g. `"int32"`, `"int64"`).
    dtype: str = "floatX"

    # A pretty text and LaTeX representation for the RV
    _print_name: Tuple[str, str] = ("blah", "\\operatorname{blah}")

    # If you want to add a custom signature and default values for the
    # parameters, do it like this.
    def __call__(self, loc=0.0, scale=1.0, size=None, **kwargs) -> TensorVariable:
        return super().__call__(loc, scale, size=size, **kwargs)

    # This is the Python code that produces samples.  Its signature will always
    # start with a NumPy `RandomState` object, then the distribution
    # parameters, and, finally, the size.
    #
    # This is effectively the v4 replacement for `Distribution.random`.
    @classmethod
    def rng_fn(
        cls,
        rng: np.random.RandomState,
        loc: np.ndarray,
        scale: np.ndarray,
        size: Tuple[int, ...],
    ) -> np.ndarray:
        return scipy.stats.blah.rvs(loc, scale, random_state=rng, size=size)


# Create an instance of the `RandomVariable` `Op`...
blah = BlahRV()


# Now, in PyMC3 v4, we need to create a `Distribution` interface for our new
# `RandomVariable`, `BlahRV`.
# Instead of `Distribution`, use a more specific `Distribution` subclass, when
# possible.
class Blah(Distribution):

    # Specify the `RandomVariable` `Op` that this `Distribution` will use.
    rv_op: Op = blah

    def __new__(cls, name, *args, **kwargs):
        # If you want to set a default transform for all instances of this
        # class, this is currently the most direct way to do that.
        kwargs.setdefault("transform", some_transform)
        return super().__new__(cls, name, *args, **kwargs)

    @classmethod
    def dist(cls, loc, scale, *args, **kwargs) -> TensorVariable:
        """Create an instances of `self.rv_op`."""
        # Here, we can transform inputs and whatnot.
        # This will contain all the parameter setup logic from
        # the old `Distribution.__init__`.
        loc = aet.as_tensor_variable(loc)
        scale = aet.as_tensor_variable(scale)
        # The first argument to `super().dist` is a list of the parameters to `rv_op`.
        return super().dist([loc, scale], *args, **kwargs)

    # This is basically one-to-one with the old `Distribution.logp`
    def logp(
        value: TensorVariable, loc: TensorVariable, scale: TensorVariable
    ) -> TensorVariable:
        """Create a log-likelihood graph."""
        return aet.log(value - loc) - scale

@aerubanov
Contributor

@brandonwillard thank you so much for your help!

@canyon289
Member

Some additional details for people reading this, in case it's not obvious. Some of the edits need to occur in Aesara here:
https://github.com/pymc-devs/aesara/blob/master/aesara/tensor/random/basic.py

Others need to occur in the v4 branch of PyMC3, in the appropriate distribution-type module (in this case, the continuous module):

https://github.com/pymc-devs/pymc3/pull/4579/files#diff-f6da6e3a7c05955ac32a065b9eb57831718dd764fc5e92f8fcd98eaf92e1d79a

We only want to add distributions to Aesara that are defined in NumPy:
pymc-devs/pytensor#321 (comment)

@brandonwillard
Contributor

Some of the edits need to occur in Aesara here
https://github.com/pymc-devs/aesara/blob/master/aesara/tensor/random/basic.py

Regarding the things discussed here, there aren't any edits that need to occur in Aesara.

@zoj613
Contributor

zoj613 commented Mar 31, 2021

Is there interest in an Intrinsic Conditional Autoregressive (ICAR) distribution? I have used it in the past for research. There is a paper [1] that formally specifies this distribution.

References

[1] https://www.sciencedirect.com/science/article/abs/pii/S2211675317301574

@brandonwillard
Contributor

as well as a technique by [2] explaining how such a distribution can be sampled from efficiently.

We could definitely use that, and I like its references.

@zoj613
Contributor

zoj613 commented Mar 31, 2021

as well as a technique by [2] explaining how such a distribution can be sampled from efficiently.

We could definitely use that, and I like its references.

Since the first reference is behind a paywall, section 4 of this paper has a nice summary of that reference: https://www.apps.stat.vt.edu/ferreira/Ferreira_2019.pdf

Essentially, the logp of an ICAR random effect can be computed directly, and its limiting distribution is a singular multivariate Gaussian.
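
For concreteness, a small NumPy sketch of that unnormalized log-density (assuming a single connected graph, omitting the sum-to-zero constraint, and using the illustrative name icar_logp):

import numpy as np

def icar_logp(x, W, tau=1.0):
    """Unnormalized ICAR log-density: a pairwise penalty over the edges of W."""
    x = np.asarray(x, dtype=float)
    i, j = np.nonzero(np.triu(W))    # count each undirected edge once
    n = len(x)
    # log p(x | tau) = (n - 1)/2 * log(tau) - tau/2 * sum_{i~j} (x_i - x_j)^2 + const
    return 0.5 * (n - 1) * np.log(tau) - 0.5 * tau * np.sum((x[i] - x[j]) ** 2)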

@fonnesbeck
Member

@zoj613
Contributor

zoj613 commented Mar 31, 2021

Here's the first paper: drive.google.com/file/d/1gMOljbPudJnVVco33izZ3Dzqud7nue5f/view?usp=sharing

Thank you very much.

@conorhassan

conorhassan commented Jul 10, 2022

Hi @zoj613, are you planning on finishing this issue? If not, could I pick it up? Thanks

@twiecki
Member

twiecki commented Jul 10, 2022

@conorhassan It's been long enough, you can pick it up.
