Developing random method for conditional autoregression #4518
A potentially viable method is described in the paper *Fast sampling of Gaussian Markov random fields* by Håvard Rue. The basic recipe is as follows:
Note that step 2 doesn't need to be done for each sample - it only needs to be done once for each value of
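As a rough illustration of that factor-once, reuse-often idea (a sketch, not the paper's exact algorithm), assuming the common CAR precision parameterization `Q = tau * (D - alpha * W)`: the expensive factorization is computed once per parameter value, and each draw then only needs triangular solves. A dense Cholesky is used below for brevity; for large fields this would be a sparse Cholesky with a fill-reducing permutation.

```python
import numpy as np


def factor_precision(tau, alpha, W):
    """Done once per (tau, alpha, W): factor the CAR precision matrix.

    The parameterization Q = tau * (D - alpha * W) is an assumption here,
    and the dense Cholesky stands in for a sparse one (as in Rue, 2001).
    """
    W = np.asarray(W, dtype=float)
    D = np.diag(W.sum(axis=1))
    return np.linalg.cholesky(tau * (D - alpha * W))


def draw_car(mu, L, rng, size=1):
    """Done per draw: sample z ~ N(0, I), solve L.T @ x = z, shift by the mean."""
    z = rng.standard_normal((L.shape[0], size))
    x = np.linalg.solve(L.T, z)  # x has covariance (L @ L.T)^{-1} = Q^{-1}
    return mu + x.T


# Usage sketch: the factorization is reused across all draws.
# rng = np.random.default_rng(0)
# L = factor_precision(tau=1.0, alpha=0.9, W=W)
# samples = draw_car(mu, L, rng, size=1000)
```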
Hello @brandonwillard! I'm looking for my first contribution in pymc and considering this issue. But I'm a little bit confused by your comment about
In
A more complete example would look as follows:

```python
from typing import List, Tuple

import numpy as np
import scipy.stats

import aesara.tensor as aet

from aesara.graph.op import Op
from aesara.tensor.var import TensorVariable
from aesara.tensor.random.op import RandomVariable

from pymc3.distributions.distribution import Distribution

# NOTE: `scipy.stats.blah` and `some_transform` below are placeholders for a
# real sampler and a real transform.


# Create your own `RandomVariable`...
class BlahRV(RandomVariable):
    name: str = "blah"

    # Provide the number of dimensions for this RV (e.g. `0` for a scalar, `1`
    # for a vector, etc.)
    ndim_supp: int = 0

    # Provide the number of dimensions for each parameter of the RV (e.g. if
    # there's only one vector parameter, `[1]`; for two parameters, one a
    # matrix and the other a scalar, `[2, 0]`; etc.)
    ndims_params: List[int] = [0, 0]

    # The NumPy/Aesara dtype for this RV (e.g. `"int32"`, `"int64"`).
    dtype: str = "floatX"

    # A pretty text and LaTeX representation for the RV
    _print_name: Tuple[str, str] = ("blah", "\\operatorname{blah}")

    # If you want to add a custom signature and default values for the
    # parameters, do it like this.
    def __call__(self, loc=0.0, scale=1.0, size=None, **kwargs) -> TensorVariable:
        return super().__call__(loc, scale, size=size, **kwargs)

    # This is the Python code that produces samples.  Its signature will always
    # start with a NumPy `RandomState` object, then the distribution
    # parameters, and, finally, the size.
    #
    # This is effectively the v4 replacement for `Distribution.random`.
    @classmethod
    def rng_fn(
        cls,
        rng: np.random.RandomState,
        loc: np.ndarray,
        scale: np.ndarray,
        size: Tuple[int, ...],
    ) -> np.ndarray:
        return scipy.stats.blah.rvs(loc, scale, random_state=rng, size=size)


# Create an instance of the `RandomVariable` `Op`...
blah = BlahRV()


# Now, in PyMC3 v4, we need to create a `Distribution` interface for our new
# `RandomVariable`, `BlahRV`.
# Instead of `Distribution`, use a more specific `Distribution` subclass, when
# possible.
class Blah(Distribution):
    # Specify the `RandomVariable` `Op` that this `Distribution` will use.
    rv_op: Op = blah

    def __new__(cls, name, *args, **kwargs):
        # If you want to set a default transform for all instances of this
        # class, this is currently the most direct way to do that.
        kwargs.setdefault("transform", some_transform)
        return super().__new__(cls, name, *args, **kwargs)

    @classmethod
    def dist(cls, loc, scale, *args, **kwargs) -> TensorVariable:
        """Create an instance of `self.rv_op`."""
        # Here, we can transform inputs and whatnot.
        # This will contain all the parameter setup logic from
        # the old `Distribution.__init__`.
        loc = aet.as_tensor_variable(loc)
        scale = aet.as_tensor_variable(scale)
        # The first argument to `super().dist` is a list of the parameters to `rv_op`.
        return super().dist([loc, scale], *args, **kwargs)

    # This is basically one-to-one with the old `Distribution.logp`
    def logp(
        value: TensorVariable, loc: TensorVariable, scale: TensorVariable
    ) -> TensorVariable:
        """Create a log-likelihood graph."""
        return aet.log(value - loc) - scale
```
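For completeness, a rough sketch of how such a distribution would then be used in a model, assuming the placeholders above (`scipy.stats.blah`, `some_transform`) were swapped for real implementations; the exact v4 API may still shift:

```python
import pymc3 as pm

with pm.Model():
    x = Blah("x", loc=0.0, scale=1.0)
    # Prior draws now go through `BlahRV.rng_fn` rather than the old
    # `Distribution.random` method.
    prior = pm.sample_prior_predictive(samples=100)
```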
@brandonwillard thank you so much for your help!
Some additional details for people reading this, in case it's not obvious: some of the edits need to occur in Aesara here, while others need to occur in the v4 branch of PyMC3 in the correct distribution-type module (this one is the continuous module). We only want to add the distributions in Aesara that are defined in NumPy.
Regarding the things discussed here, there aren't any edits that need to occur in Aesara.
Is there interest in an Intrinsic Conditional Autoregressive (ICAR) distribution? I have used it in the past for research. There is a paper [1] that formally specifies this distribution.

References

[1] https://www.sciencedirect.com/science/article/abs/pii/S2211675317301574
We could definitely use that, and I like its references.
Since the first reference is behind a paywall, section 4 of this paper has a nice summary of that reference: https://www.apps.stat.vt.edu/ferreira/Ferreira_2019.pdf. Essentially, the logp of an ICAR random effect can be computed directly, and its limiting distribution is a singular multivariate Gaussian.
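As a rough illustration of that direct logp (a sketch of the usual Besag-style pairwise-difference form, not the cited paper's exact specification), assuming a symmetric weighted adjacency matrix `W` and omitting the normalizing constant and the sum-to-zero constraint:

```python
import numpy as np


def icar_logp(x, W, tau=1.0):
    """Unnormalized ICAR log-density in the pairwise-difference form."""
    W = np.asarray(W, dtype=float)
    i, j = np.nonzero(np.triu(W))  # visit each undirected edge once
    return -0.5 * tau * np.sum(W[i, j] * (x[i] - x[j]) ** 2)
```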
Here's the first paper: https://drive.google.com/file/d/1gMOljbPudJnVVco33izZ3Dzqud7nue5f/view?usp=sharing
Thank you very much.
Hi @zoj613, are you planning on finishing this issue? If not, could I pick it up? Thanks!
@conorhassan It's been long enough, you can pick it up.
Support for conditional autoregressions, a.k.a. Gaussian Markov random fields, was added in PR #4504, but the `.random` method was left unimplemented because there was no obvious choice of algorithm that would scale well as a function of the field dimension. A bit of discussion on this can be seen in issue #3689. There may exist efficient methods for doing this which could be implemented, however.

To be more explicit, the core difficulty is sampling `x ~ CAR(mu, tau, alpha, W)`, where the standard way to do this is by taking a unit isotropic Gaussian vector `z` and solving `z = Lx`, where `L` is the Cholesky decomposition of the CAR-structured precision matrix. This operation has cubic complexity in the dimension of `x` and is thus a nonstarter for CAR/GMRF vectors with more than a few hundred sites.
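To make the cost concrete, here is a rough dense sketch of that standard approach, assuming the common CAR precision parameterization `Q = tau * (D - alpha * W)` (which may differ from PyMC3's internal form); the Cholesky factorization is the cubic-cost step:

```python
import numpy as np


def car_draw_naive(mu, tau, alpha, W, rng):
    """One CAR/GMRF draw via a dense Cholesky of the precision matrix."""
    W = np.asarray(W, dtype=float)
    Q = tau * (np.diag(W.sum(axis=1)) - alpha * W)  # assumed parameterization
    L = np.linalg.cholesky(Q)              # O(n^3): the bottleneck noted above
    z = rng.standard_normal(Q.shape[0])    # unit isotropic Gaussian vector
    # A triangular solve turns z into a draw with precision Q; shift by the mean.
    return mu + np.linalg.solve(L.T, z)
```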