
Refactor basic Distributions #4508

Merged
merged 3 commits into from
Mar 16, 2021
Conversation

kc611
Contributor

@kc611 kc611 commented Mar 7, 2021

This PR refactors a few distributions according to the new RandomVariable class setup.

Refactored the following distributions:

  • Beta
  • Cauchy
  • Exponential
  • Half Cauchy
  • Half Normal
  • Inverse Gamma
  • Bernoulli
  • NegativeBinomial
  • Poisson
  • MvNormal
  • Multinomial
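To make the dispatch-based setup concrete, here is a minimal sketch of the registration pattern the refactoring follows, using `functools.singledispatch` and hypothetical stand-in classes (the real `RandomVariable` ops live in `aesara.tensor.random.basic`; this is not the pymc3 implementation itself):

```python
from functools import singledispatch
import math

# Minimal stand-ins for the Aesara RandomVariable op classes
# (hypothetical names, for illustration only).
class RandomVariable: ...
class ExponentialRV(RandomVariable): ...

@singledispatch
def _logp(op, value, *params):
    raise NotImplementedError(f"No logp registered for {type(op).__name__}")

# Refactoring a distribution then amounts to registering a logp
# implementation keyed on its RandomVariable op type:
@_logp.register(ExponentialRV)
def exponential_logp(op, value, lam):
    # log-pdf of Exponential(lam): log(lam) - lam * value, for value >= 0
    return math.log(lam) - lam * value

logp_val = _logp(ExponentialRV(), 2.0, 0.5)
```

Calling `_logp` with an `ExponentialRV` instance as the first argument dispatches to the registered function; an unregistered op type raises `NotImplementedError`.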

@michaelosthege michaelosthege added this to the vNext (4.0.0) milestone Mar 8, 2021
@michaelosthege
Member

@kc611 can you rebase on the latest v4 version? Then you'll get many tests enabled for the GitHub Actions runs.
We treat XPASS as failures now, so you'll find out which tests were fixed by your distribution refactoring. You can then remove the xfail decorator on those.

@kc611 kc611 changed the title Refactored Beta,Exponential and HalfNormal Distribution Refactored distributions in distributions.continuous Mar 10, 2021
@kc611
Contributor Author

kc611 commented Mar 10, 2021

so you'll find out which tests got fixed by your distribution refactoring.

Actually there still seem to be some precision issues with the tests (the distributions return values as expected, but they don't exactly match the values the tests expect), which is causing even the initially refactored normal distribution test to fail. So I guess some changes beyond refactoring are needed for the tests to actually pass.

@brandonwillard might know more about this.

@ricardoV94
Member

ricardoV94 commented Mar 10, 2021

We should merge the changes in this PR #4497 to the V4 branch

In that PR we changed the check_logcdf precision of the normal distribution as we found it was not passing on all possible values in float32 runs.

In addition, to ensure deterministic behavior, we should set the optional n_samples argument of check_logcdf and check_logp to n_samples=-1 for all distributions marked xfail due to precision issues. This will make those tests fail deterministically on float32, even after refactoring.

However we cannot do this to any of the tests mentioned in #4420, or else the runs will take way too long to complete.

Alternatively we can temporarily re-seed the TestMatchesScipy class, temporarily reverting the effect of this PR #4461
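The `n_samples=-1` semantics described above can be sketched in plain Python (`select_test_points` is an illustrative stand-in, not pymc3's actual `check_logcdf`/`check_logp` helpers):

```python
import random

def select_test_points(values, n_samples=100, seed=0):
    # n_samples=-1: evaluate every point, deterministically.
    # n_samples > 0: draw a random (seed-dependent) subset, which is why
    # precision failures on float32 were nondeterministic across runs.
    if n_samples == -1:
        return list(values)
    rng = random.Random(seed)
    return rng.sample(list(values), min(n_samples, len(values)))

all_points = select_test_points(range(1000), n_samples=-1)
subset = select_test_points(range(1000), n_samples=10)
```

With `n_samples=-1` the full grid is always tested, so a point that trips a float32 precision limit fails on every run rather than only when the random subset happens to include it.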

@brandonwillard
Contributor

Actually there still seem to be some precision issues with the tests (the distributions return values as expected, but they don't exactly match the values the tests expect), which is causing even the initially refactored normal distribution test to fail. So I guess some changes beyond refactoring are needed for the tests to actually pass.

Which ones exactly?

In a few of the tests I went through that involved sampling, some had issues caused by implicit expectations regarding test values. The root of this difference is the logic in v3 that sets a random variable's corresponding TensorVariable's test value (i.e. var.tag.test_value) to a distribution-specific value (e.g. a mean or mode). v4 no longer does this, so tests that implicitly relied on very specific default values would fail.
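The v3-vs-v4 test-value difference described above can be illustrated with simple stand-in classes (these are not the real pymc3/aesara classes; the names mirror the `var.tag.test_value` attribute mentioned):

```python
# Illustrative stand-ins for a variable and its tag attribute.
class Tag:
    pass

class TensorVariable:
    def __init__(self):
        self.tag = Tag()

def make_rv_v3(mean):
    var = TensorVariable()
    # v3 behavior: seed the test value with a distribution-specific
    # default (e.g. the mean or mode).
    var.tag.test_value = mean
    return var

def make_rv_v4(mean):
    # v4 behavior: no default test value is attached.
    return TensorVariable()

v3_var = make_rv_v3(0.0)
v4_var = make_rv_v4(0.0)
```

A test that implicitly reads `var.tag.test_value` succeeds against `v3_var` but raises `AttributeError` against `v4_var`, which is the failure mode described above.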

@brandonwillard brandonwillard changed the title Refactored distributions in distributions.continuous Refactore distributions in distributions.continuous Mar 12, 2021

@brandonwillard brandonwillard left a comment


Can you try (re)enabling the tests for our currently converted Distributions in pymc3.tests.test_distributions_random?

@brandonwillard brandonwillard changed the title Refactore distributions in distributions.continuous Refactor distributions in distributions.continuous Mar 12, 2021
@kc611
Contributor Author

kc611 commented Mar 13, 2021

Which ones exactly?

The ones in test_distribution.py. test_normal fails with the same error as all the others. Strangely, test_uniform runs without a hitch.

Can you try (re)enabling the tests for our currently converted Distributions in pymc3.tests.test_distributions_random?

Looks like the module you specified has not yet been updated for the latest RV changes. The main testing functions and classes will need to be updated accordingly before we can actually run those tests.

@brandonwillard
Contributor

Looks like the module you specified has not yet been updated for the latest RV changes. The main testing functions and classes will need to be updated accordingly before we can actually run those tests.

Yes, it's a whole other task, but it's a critical one, because we need to reinstate the tests so that we can demonstrate—to ourselves and others—which things are and aren't working in v4.

@kc611
Contributor Author

kc611 commented Mar 14, 2021

In pymc3.tests.test_distributions_random, the BaseTestCase class originally tested the shapes returned by the .random() method. Is it still necessary to do that over here? Now that the RVs have been moved to Aesara, test_basic in aesara.tests.tensor.random seems to be doing the same thing (testing the output shapes of random samples for different distributions), or am I missing something here?

The other two helpers, pymc3_random and pymc3_random_discrete, test the actual output values of the samples, and keeping them here makes much more sense than keeping BaseTestCase.
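For reference, the kind of shape check BaseTestCase performed can be sketched in plain Python (`draw_normal` is an illustrative stand-in, not the pymc3 or Aesara API):

```python
import random

def draw_normal(size, mu=0.0, sigma=1.0, seed=123):
    # Return a nested list of draws with the requested (rows, cols) shape;
    # a pure-Python stand-in for a distribution's .random(size=...) output.
    rng = random.Random(seed)
    rows, cols = size
    return [[rng.gauss(mu, sigma) for _ in range(cols)] for _ in range(rows)]

# The shape test simply asserts that the draws match the requested size:
draws = draw_normal((5, 3))
```

Tests of this form are what overlaps with test_basic in aesara.tests.tensor.random, whereas pymc3_random-style tests compare the sampled values themselves against a reference distribution.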

@brandonwillard
Contributor

Is it still necessary to do that over here? Now that the RVs have been moved to Aesara, test_basic in aesara.tests.tensor.random seems to be doing the same thing (testing the output shapes of random samples for different distributions), or am I missing something here?

Those tests are almost certainly doing the same thing as the tests in aesara.tests.tensor.random. With that in mind, we can remove the redundant tests (i.e. the ones whose corresponding RandomVariables are already tested in Aesara) and later convert the remaining tests into tests for the new RandomVariables that we'll implement in PyMC3.

```python
# beta = draw_values([self.beta], point=point, size=size)[0]
# return generate_samples(self._random, beta, dist_shape=self.shape, size=size)
@_logp.register(HalfCauchyRV)
def half_cauchy_logp(op, value, beta, alpha):
```
Contributor Author


I'm a bit skeptical as to why this function requires the extra alpha input argument even though it does not use it. I assumed it was related to HalfCauchy being a special case of the Cauchy distribution above. The argument passed here was a TensorVariable with a constant value of {1} while running test_distribution.

Member

@ricardoV94 ricardoV94 Mar 15, 2021


Where do you find it requires alpha?

Is it possible it's coming from the new random method? The scipy method allows for loc and scale, but we only used scale in pymc3. We should make sure the alpha/beta is not being confused in the random calls, since beta corresponds to the second optional argument (the scale) in the scipy version: https://github.com/pymc-devs/aesara/blob/b852bd24472e13ae2a405b36eaad462830c89228/aesara/tensor/random/basic.py#L228
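The loc/scale mix-up can be demonstrated with a plain-Python half-Cauchy log-pdf (`half_cauchy_logpdf` is a stand-in written for this comment, not the Aesara op):

```python
import math

def half_cauchy_logpdf(value, loc=0.0, scale=1.0):
    # log-pdf of a half-Cauchy with support [loc, inf):
    # log(2) - log(pi * scale) - log(1 + ((value - loc) / scale)^2)
    if value < loc:
        return -math.inf
    z = (value - loc) / scale
    return math.log(2.0) - math.log(math.pi * scale) - math.log(1.0 + z * z)

# A single positional argument is consumed as `loc`, not `scale`:
as_loc = half_cauchy_logpdf(3.0, 2.0)          # interpreted as loc=2, scale=1
as_scale = half_cauchy_logpdf(3.0, scale=2.0)  # loc=0, scale=2 (pymc3's beta)
```

The two calls give different densities for the same value, which is exactly the silent parameter swap being warned about: pymc3's single `beta` parameter means the scale, while a lone positional argument to the loc/scale-parameterized version sets the loc.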

Contributor Author

@kc611 kc611 Mar 15, 2021


When running the test in test_distribution with only the beta parameter, the logp method received 4 parameters instead of 3. I assumed that had something to do with the mathematics involved, so I put in the extra argument. Strangely, the value of beta is passed first; the other one (the fourth argument) always remains constant.

Is it possible it's coming from the new random method?

Probably not, since the random method doesn't have any say in what happens in logp. The RVs just exist for dispatch purposes in the case of logp.

Member


Still, it seems that if a single variable is passed to the random op it will be taken as a loc parameter, while the logp will take it as a scale parameter, given the order of the arguments in the aesara op, no?

Member


Not to distract from the extra parameter, which is certainly a problem.

@brandonwillard
Contributor

I've added a commit to this PR that allows for a much simpler Distribution refactoring process by automating the dispatch registrations. Now, any relevant log* or transform functions found within a Distribution class will automatically have dispatch entries created for them.

If the relevant Distribution tests pass, we should merge this.
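A rough sketch of how such automated registration could look (hypothetical names and a simplified `register_logp_methods` helper; the actual commit's mechanics may differ):

```python
from functools import singledispatch
import math

@singledispatch
def _logp(op, value, *params):
    raise NotImplementedError

def register_logp_methods(rv_type, dist_cls):
    # Scan the Distribution class for a logp implementation and create a
    # dispatch entry for it, keyed on the RandomVariable op type.
    logp = getattr(dist_cls, "logp", None)
    if logp is not None:
        _logp.register(rv_type)(
            lambda op, value, *params: logp(value, *params)
        )

# Hypothetical stand-ins for the RV op and the Distribution class:
class NormalRV: ...

class Normal:
    @staticmethod
    def logp(value, mu, sigma):
        # log-pdf of Normal(mu, sigma)
        return (
            -0.5 * ((value - mu) / sigma) ** 2
            - math.log(sigma)
            - 0.5 * math.log(2 * math.pi)
        )

register_logp_methods(NormalRV, Normal)
normal_logp_at_zero = _logp(NormalRV(), 0.0, 0.0, 1.0)
```

With a helper like this, a Distribution class only needs to define its log* methods; no manual `@_logp.register(...)` boilerplate per distribution.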
