Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Triangular distribution random sampling is incorrectly implemented #3225

Closed
lbignell opened this issue Oct 11, 2018 · 2 comments
Closed

Triangular distribution random sampling is incorrectly implemented #3225

lbignell opened this issue Oct 11, 2018 · 2 comments

Comments

@lbignell
Copy link

There seems to be a problem with the way scipy.stats.triang is called by Triangular.random.
An error is thrown even if you use a combination of 'upper', 'lower', and 'c' arguments found in the documentation, see below.
The fact that I can recover the correct parameters in the model below if I catch the error suggests that the issue is limited to when random is sampled.
Calling sample_ppc(dummytrace, model=dummymodel) throws the same error, presumably because it calls the random method.

However, when I change the way scipy.stats.triang is called in the source code to be analogous to that below, it doesn't fix the problem, so maybe I'm missing something.

Description of your problem

Please provide a minimal, self-contained, and reproducible example.

theupper = 8
thelower = 2
thec = 6.5
numobs = 100000
xvals = np.linspace(thelower,theupper,1000)
#This is how I think stats.triang should be called, which differs from Triangular.random.
dummyPDF = stats.triang(loc=thelower, scale=theupper - thelower, c=(thec - thelower)/(theupper-thelower))
yvals = dummyPDF.pdf(xvals)
thedist = pymc3.Triangular.dist(lower=thelower, upper=theupper, c=thec)
allOK = True
dummydata = dummyPDF.rvs(numobs)
try:
    thesamples = thedist.random(size=numobs)
except ValueError:
    #This always gets thrown. I've pasted the error traceback for if it's raised below.
    allOK = False
fig = plt.figure()
plt.plot(xvals, yvals, 'r')
if allOK:
    plt.hist(thesamples, histtype='step', bins=100, range=(thelower,theupper), normed=True);

plt.hist(dummydata, histtype='step', bins=100, range=(thelower,theupper), normed=True);
dummymodel = pymc3.Model()
with dummymodel:
    varlower = pymc3.Normal('varlower', mu=thelower, sd=5)
    varupper = pymc3.Normal('varupper', mu=theupper, sd=5)
    varc = pymc3.Normal('varc', mu=thec, sd=5)
    dummyObs = pymc3.Triangular('dummyObs', lower=varlower, upper=varupper, c=varc, observed=dummydata)
    start = pymc3.find_MAP()
    step = pymc3.Metropolis()
    dummytrace = pymc3.sample(draws=500, step=step, start=start)

pymc3.summary(dummytrace)

Please provide the full traceback.
If the error is caught:

varlower:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  1.993            0.005            0.000            [1.982, 2.002]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  1.982          1.990          1.993          1.997          2.002


varupper:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  8.006            0.003            0.000            [8.001, 8.012]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  8.001          8.004          8.005          8.006          8.014


varc:

  Mean             SD               MC Error         95% HPD interval
  -------------------------------------------------------------------
  
  6.502            0.007            0.001            [6.488, 6.512]

  Posterior quantiles:
  2.5            25             50             75             97.5
  |--------------|==============|==============|--------------|
  
  6.488          6.496          6.502          6.505          6.519

image

If the error is raised:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-198-9f2618c889b8> in <module>()
     14 dummydata = dummyPDF.rvs(numobs)
     15 try:
---> 16     thesamples = thedist.random(size=numobs)
     17 except ValueError:
     18     allOK = False

/home/lbignell/anaconda3/lib/python3.5/site-packages/pymc3/distributions/continuous.py in random(self, point, size)
   1860                                       point=point)
   1861         return generate_samples(stats.triang.rvs, c=c-lower, loc=lower, scale=upper-lower,
-> 1862                                 size=size, dist_shape=self.shape, random_state=None)
   1863 
   1864     def logp(self, value):

/home/lbignell/anaconda3/lib/python3.5/site-packages/pymc3/distributions/distribution.py in generate_samples(generator, *args, **kwargs)
    399     if broadcast_shape == (1,) and prefix_shape == ():
    400         if size is not None:
--> 401             samples = generator(size=size, *args, **kwargs)
    402         else:
    403             samples = generator(size=1, *args, **kwargs)

/home/lbignell/anaconda3/lib/python3.5/site-packages/scipy/stats/_distn_infrastructure.py in rvs(self, *args, **kwds)
    938         cond = logical_and(self._argcheck(*args), (scale >= 0))
    939         if not np.all(cond):
--> 940             raise ValueError("Domain error in arguments.")
    941 
    942         if np.all(scale == 0):

ValueError: Domain error in arguments.

Please provide any additional information below.

Versions and main components

  • PyMC3 Version: 3.2
  • Theano Version: 0.9.0
  • Python Version: 3.5.3
  • Operating system: Ubuntu 16.04
  • How did you install PyMC3: conda
@lucianopaz
Copy link
Contributor

Yes, it looks like c is handled differently in the Triangular distribution than in scipy.stats.triang. It's also inconsistent in the examples on the website. It should be an easy fix.

lucianopaz added a commit to lucianopaz/pymc that referenced this issue Nov 14, 2018
…istently with scipy.stats. Added test and updated example code.
twiecki pushed a commit that referenced this issue Nov 17, 2018
…with scipy.stats. Added test and updated example code. (#3253)
@lucianopaz
Copy link
Contributor

This can now be closed by #3253. Thanks @lbignell for reporting the bug!

ColCarroll pushed a commit to ColCarroll/pymc3 that referenced this issue Nov 22, 2018
…istently with scipy.stats. Added test and updated example code. (pymc-devs#3253)
junpenglao pushed a commit that referenced this issue Dec 3, 2018
* Fix for #3225. Made Triangular `c` attribute be handled consistently with scipy.stats. Added test and updated example code.

* Fix for #3210 which uses a completely different approach than PR #3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariables` or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling. Ed2 tensors fix a `_values` attribute after they first call `sample` and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object oriented paradigm cannot set a fixed _values, it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.

* Removed leftover print statement

* Added release notes and draw values context managers to mixture and multivariate distributions that make many calls to draw_values or other distributions random methods within their own random.
twiecki pushed a commit that referenced this issue Dec 22, 2018
* Fix for #3225. Made Triangular `c` attribute be handled consistently with scipy.stats. Added test and updated example code.

* Added a more detailed error message for Broken pipes.

* Not a fix for #3140

* Fixed import of time. Trimmed the broken pipe exception handling. Added release notes.

* Moved maintenance message to release notes of pymc3.7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants