draw_values draws from marginal distributions #3210
Comments
Yes, I think I also had a similar issue (I cannot find which issue it is now): if you have y ~ Normal(X*b, ...) with X being a latent variable with observed data, prior prediction is not conditioned correctly. I think the current implementation samples the observed variable, which necessarily produces samples of all of its parents as a by-product. When there is no observed variable, or there are multiple observed variables, things do not work well. Fixing it might be difficult in terms of finding a fast implementation (doing a forward pass on the DAG would surely work - we had an earlier implementation, but it is much slower).
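To make the described setup concrete, here is a minimal sketch of one possible reading of that model; the data, variable names and numbers are illustrative assumptions, not taken from the referenced issue:

```python
import numpy as np
import pymc3 as pm

# placeholder data; purely illustrative
X_data = np.random.randn(100)
y_data = 2.0 * X_data + 0.1 * np.random.randn(100)

with pm.Model() as model:
    X = pm.Normal('X', mu=0, sd=1, observed=X_data)   # predictor with its own observations
    b = pm.Normal('b', mu=0, sd=10)
    y = pm.Normal('y', mu=X * b, sd=0.1, observed=y_data)
    # The comment reports that forward (prior predictive) draws of y are not
    # conditioned correctly on X in setups like this one.
    prior = pm.sample_prior_predictive(samples=500)
```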
Looking at this now - a few observations:
Another note:
```python
with pm.Model() as model:
    a = pm.Normal('a', mu=0, sd=100, observed=0)
    b = pm.Normal('b', mu=a, sd=1e-4, observed=1e-6)
    c = pm.Normal('c', mu=a, sd=1e-4, observed=1e-6)
    d = pm.Normal('d')
    trace = pm.sample()
    posterior = pm.sample_posterior_predictive(trace)
```

```python
with pm.Model() as model:
    a = pm.Normal('a', mu=0, sd=10)
    b = pm.Normal('b', mu=a, sd=1e-2)
    c = pm.Normal('c', mu=a, sd=1e-2)
    trace = pm.sample()
```
I post here a related error I get when I try to sample from the posterior distribution of a copula model.

Versions and main components

PyMC3 Version: tried with 3.5 and dev branch

Reproducible example

```python
import numpy as np
import pymc3 as pm
import scipy.special as ss
import matplotlib.pyplot as plt
import seaborn as sns
import scipy as sp
import theano.tensor as tt
from theano import shared

corr = 0.6
_sigma = np.array([[1, corr], [corr, 1]])
_mus = np.array([-1, 2])
_sigmas = np.array([.5, .5])
N = 10000
u = np.random.multivariate_normal((0, 0), _sigma, size=N)
c = sp.stats.norm.cdf(u)
generated = sp.stats.lognorm.ppf(c, _sigmas, scale=np.exp(_mus))

def lognormal_cdf(values, mu, sigma):
    return pm.math.invprobit((tt.log(values) - mu) / sigma)

def jacobian_det(f_inv_x, x):
    grad = tt.reshape(pm.theanof.gradient(tt.sum(f_inv_x), [x]), x.shape)
    return tt.log(tt.abs_(grad))

with pm.Model() as model:
    data = shared(generated)
    # hackish way to create a Cholesky matrix that yields a covariance matrix
    # with 1s on the diagonal and corr everywhere else
    corr = pm.LKJCorr('corr', n=2, eta=2)[0]
    chol = tt.set_subtensor(tt.eye(2, 2)[([1, 1], [0, 1])], [corr, tt.sqrt(1 - tt.pow(corr, 2))])
    # r = pm.Uniform('r', lower=-1, upper=1)
    # cov = pm.Deterministic('cov',
    #                        tt.stacklists([[1., r],
    #                                       [r, 1.]]))
    mus = pm.Normal('mus', 0, 1, shape=2)
    sigmas = pm.HalfCauchy('sigmas', beta=2, shape=2)
    ps = lognormal_cdf(data, mus, sigmas)
    cs = pm.math.probit(ps)
    us_dist = pm.MvNormal('copula', mu=np.zeros(2), chol=chol, observed=cs)
    pm.Potential('jacob_det', jacobian_det(cs, data))
    trace = pm.sample(1000)

sample = pm.sample_posterior_predictive(trace, model=model, vars=[us_dist, mus, sigmas], samples=10000)
```
```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-14dbe60ba96f> in <module>()
----> 1 sample=pm.sample_posterior_predictive(trace,model=model,vars=[us_dist,mus,sigmas],samples=10000)
      2
      3 simulated_c=sp.stats.norm.cdf(sample['copula'])
      4 simulated=np.exp(sp.stats.norm.ppf(simulated_c)*sample['sigmas']+sample['mus'])

~\AppData\Local\Continuum\Anaconda3\envs\newroot\lib\site-packages\pymc3\sampling.py in sample_posterior_predictive(trace, samples, model, vars, size, random_seed, progressbar)
   1118     # draw once to inspect the shape
   1119     var_values = list(zip(varnames,
-> 1120                           draw_values(vars, point=model.test_point, size=size)))
   1121     ppc_trace = defaultdict(list)
   1122     for varname, value in var_values:

~\AppData\Local\Continuum\Anaconda3\envs\newroot\lib\site-packages\pymc3\distributions\distribution.py in draw_values(params, point, size)
    310     while to_eval or missing_inputs:
    311         if to_eval == missing_inputs:
--> 312             raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
    313         to_eval = set(missing_inputs)
    314         missing_inputs = set()

ValueError: Cannot resolve inputs for ['copula']
```

FYI: with version 3.4 I was able to successfully sample from the copula, but the samples of mus and sigmas were wrong (coming from the prior, I guess). With version 3.5 and the dev branch, the samples of mus and sigmas seem to be correct, but this error appeared.
* Fix for #3225. Made Triangular `c` attribute be handled consistently with scipy.stats. Added test and updated example code.
* Fix for #3210, which uses a completely different approach than PR #3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariables` or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling. Ed2 tensors fix a `_values` attribute after they first call `sample` and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object-oriented paradigm cannot set a fixed `_values`; it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.
* Removed a leftover print statement.
* Added release notes, and added draw values context managers to mixture and multivariate distributions that make many calls to `draw_values` or other distributions' `random` methods within their own `random`.
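For intuition, here is a rough, self-contained sketch of the caching idea the commit describes: a context object that keeps the values already drawn during one top-level `draw_values` call so that nested calls reuse them. The class and function names below are illustrative assumptions, not PyMC3's actual internals:

```python
import threading

# Illustrative only: a per-thread cache of values drawn during one top-level
# call, so nested calls see their parents' draws instead of re-sampling them.
_local = threading.local()

class DrawValuesContext:
    def __enter__(self):
        parent = getattr(_local, 'current', None)
        # Nested contexts share the cache opened by the outermost context.
        self.drawn_values = parent.drawn_values if parent is not None else {}
        self._parent = parent
        _local.current = self
        return self

    def __exit__(self, *exc):
        _local.current = self._parent
        return False

def draw_value(variable, sampler):
    """Return a cached draw for `variable`, or call `sampler()` once and cache it.
    `sampler` may itself call draw_value for the variable's parents."""
    with DrawValuesContext() as ctx:
        if variable in ctx.drawn_values:
            return ctx.drawn_values[variable]
        value = sampler()
        ctx.drawn_values[variable] = value
        return value
```

With a shared cache like this, when `b` and `c` each ask for a value of their common parent `a`, the second request gets the value drawn for the first one, so the draws come from the joint distribution rather than from independent marginals.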
Closed by #3273
Description of your problem

`draw_values` draws from the marginal distributions of the requested variables instead of the joint distribution.

Please provide a minimal, self-contained, and reproducible example.
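A minimal sketch of the kind of model described below, with assumed numbers: two variables `b` and `c` that both depend on a common parent `a`, with `draw_values` called on them directly.

```python
import numpy as np
import pymc3 as pm
from pymc3.distributions.distribution import draw_values

with pm.Model() as model:
    a = pm.Normal('a', mu=0, sd=100)
    b = pm.Normal('b', mu=a, sd=1e-2)
    c = pm.Normal('c', mu=a, sd=1e-2)
    # With sd tiny compared to a's prior, b and c should be almost perfectly
    # correlated through a when drawn jointly.
    b_vals, c_vals = draw_values([b, c], size=1000)

print(np.corrcoef(b_vals, c_vals)[0, 1])  # expected ~1; per this issue, ~0
```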
Please provide the full traceback.
This model is a rather simple hierarchical model. The expected joint distribution of variables `b` and `c` is:

However, the values drawn with `draw_values` are conditionally independent from one another. The marginals match but the joint distribution does not. Given that #2983 relies heavily on `draw_values` to make `sample_prior_predictive` work, this problem is really important.

I've given this issue some thought before submitting it here, and I think I found the origin of the problem and also a possible solution, which I'll pull in later. What seems to be happening is:

1. `draw_values` sees that it must sample from `a`, `b` and `c`.
2. They are not `Apply` nodes with owners, so there is no dependency tree being searched and nothing happens to `givens` or `point`.
3. `draw_values` arrives at its final stage, where it makes three separate calls to `_draw_value`, one for each variable.
4. In `_draw_value`, it is decided that each `param`'s `random` method must be called.
5. `random` in turn issues a new call to `draw_values` to get the `Normal`'s `mu` and `sd`. The values returned by this call to `draw_values` are never made available to the other calls in point 3.

Thus, effectively, each variable ends up being drawn from its marginal distribution.

As it stands now, the only way that the nested `draw_values` calls from point 5 could be made aware of any value given to `a` is by modifying the `point` dictionary with the values drawn from `a` before calling the conditionally dependent `b` and `c`.

My proposal is as follows. Each distribution should store a list of the `theano.Variable`s it is conditionally dependent on. For example, `Normal` should have an attribute called `conditional_on` which should be the list `[self.mu, self.tau]`. If this attribute is adopted as a convention for all distributions (in the end, every distribution already implicitly uses it when calling `draw_values` from its `random` method), we could build a dependency graph (I suppose it should be a DAG) between all the variables used by `draw_values`. I also imagine that this DAG could be built behind the scenes when adding random variables into a model, to save some unnecessary computations. This DAG should also take into account deterministic relations, given by `theano` owner input nodes, something that is already partly in place in `draw_values`. Knowing the full DAG, we can know which nodes are independent from everything else, start drawing their values, put them into the `point` dictionary, and then move on to the next nodes in the hierarchy.

Versions and main components
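As a rough illustration of the `conditional_on` / dependency-DAG proposal above (the helper names and data structures here are assumptions for illustration, not PyMC3 internals), drawing variables in topological order and threading the results through a `point` dictionary could look like this:

```python
from collections import defaultdict, deque

def topological_order(conditional_on):
    """`conditional_on` maps every variable (roots included, with an empty list)
    to the variables it is conditionally dependent on."""
    indegree = {v: len(parents) for v, parents in conditional_on.items()}
    children = defaultdict(list)
    for v, parents in conditional_on.items():
        for p in parents:
            children[p].append(v)
    queue = deque(v for v, d in indegree.items() if d == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for child in children[v]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(indegree):
        raise ValueError('the dependency graph contains a cycle')
    return order

def draw_joint(samplers, conditional_on, point=None):
    """`samplers` maps each variable name to a callable that takes the current
    `point` dict and returns one draw conditioned on the parent values in it."""
    point = dict(point or {})
    for var in topological_order(conditional_on):
        if var not in point:                   # keep any values fixed by the caller
            point[var] = samplers[var](point)  # parents were drawn earlier
    return point

# Example wiring for the model in this issue: b and c both depend on a.
# conditional_on = {'a': [], 'b': ['a'], 'c': ['a']}
# samplers = {'a': lambda pt: np.random.normal(0, 100),
#             'b': lambda pt: np.random.normal(pt['a'], 1e-2),
#             'c': lambda pt: np.random.normal(pt['a'], 1e-2)}
```

Drawing `a` first and putting its value into `point` before the conditionally dependent `b` and `c` are drawn is the behaviour requested in this issue: the result is a draw from the joint distribution, not a product of marginals.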