
Function to optimize prior under constraints #5231

Merged · 54 commits merged into main on Jan 4, 2022
Conversation

@AlexAndorra (Contributor) commented Nov 30, 2021

Ready for review 🍾

This PR introduces the pm.find_optim_prior function, which finds the optimal parameters of a prior distribution under some constraints.

For instance, while building a model we sometimes think: "I need a Gamma that has 95% of its probability mass between 0.1 and 0.4." It can take a long time to find a combination of parameters that gives a decent Gamma by hand.
pm.find_optim_prior answers the question: "Which alpha and beta parameters fit this constraint?" (a usage sketch follows the notes below).

Notes:

  • pm.find_optim_prior can be used with any PyMC distribution for which the log-CDF is implemented.
  • For now, it only works for two-parameter distributions, as there are exactly two constraints (a lower and an upper bound).
  • When possible, it uses Aesara and the gradient of the PyMC log-CDF for more efficient computation. Otherwise, it falls back to SciPy's defaults.
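
Here is a minimal usage sketch of the Gamma example above. The keyword names init_guess and mass are assumptions pieced together from this discussion; the merged docstring is authoritative:

```python
import pymc as pm

# Sketch, not the exact merged API: find the Gamma parameters that put
# ~95% of the probability mass between 0.1 and 0.4.
opt_params = pm.find_optim_prior(
    pm.Gamma,
    lower=0.1,
    upper=0.4,
    mass=0.95,  # target probability mass in [lower, upper]
    init_guess={"alpha": 1.0, "beta": 10.0},  # optimizer starting point
)
print(opt_params)  # a dict of optimized parameters, e.g. {"alpha": ..., "beta": ...}
```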


@codecov (bot) commented Nov 30, 2021

Codecov Report

Merging #5231 (d89e375) into main (600fe90) will increase coverage by 0.00%.
The diff coverage is 87.80%.


@@           Coverage Diff           @@
##             main    #5231   +/-   ##
=======================================
  Coverage   79.08%   79.08%           
=======================================
  Files          87       88    +1     
  Lines       14394    14435   +41     
=======================================
+ Hits        11383    11416   +33     
- Misses       3011     3019    +8     
| Impacted Files | Coverage Δ |
|---|---|
| pymc/func_utils.py | 87.50% <87.50%> (ø) |
| pymc/__init__.py | 96.07% <100.00%> (+0.07%) ⬆️ |
| pymc/parallel_sampling.py | 86.33% <0.00%> (-1.00%) ⬇️ |

@ricardoV94 (Member) left a comment:

Looks cool, left some suggestions below

@ferrine (Member) commented Nov 30, 2021

I would design the API differently, to make it less verbose and more elegant. Ideally, something like this:

```python
intercept = pm.Normal.fit("intercept", lower=0.1, upper=0.9)
# That could (in future) be extended as
intercept = pm.Normal.fit("intercept", data=input_data)
```

@ricardoV94 (Member) replied:

> I would design the API differently, to make it less verbose and more elegant. Ideally, something like this:
>
> ```python
> intercept = pm.Normal.fit("intercept", lower=0.1, upper=0.9)
> # That could (in future) be extended as
> intercept = pm.Normal.fit("intercept", data=input_data)
> ```

I don't agree. You still need to provide initial parameter guesses, lower/upper can themselves be distribution parameters, and you might want to specify that some parameters should be fixed, as I mentioned above. All of that is more difficult if you mix optimization goals with distribution initialization.

Also, are you suggesting this would automatically add a new random variable to the model? I think this suggestion can create a lot of ambiguity, similar to how .dist often confuses people. Especially because you need to play nice with all the additional keywords like dims, size, observed, transform, and so on...

@ricardoV94 (Member) left a comment:

Should this go into util.py? That's not a very discoverable file name but this is also a very small helper function.

@AlexAndorra (Contributor, Author) commented:

> I would design the API differently, to make it less verbose and more elegant. Ideally, something like this

@ferrine I agree with @ricardoV94 on this, at least for a first version, where we can try and see how people use that feature and what is missing (or not).

> Should this go into util.py? That's not a very discoverable file name but this is also a very small helper function.

@ricardoV94 I actually put it into util.py originally, but ran into issues when trying to make find_optim_prior accessible at the pm level 🤷‍♂️ It's not something I'm a master at, so if someone knows how to do that, I'm happy to put it back into util.py.

@ricardoV94 (Member) replied:

You can import it explicitly in __init__.py, I think.

@ricardoV94 (Member) commented Dec 1, 2021

I suggest that, instead of minimizing two distances, logcdf(lower) - np.log(alpha/2) and logcdf(upper) - np.log(1 - alpha/2), we just minimize logdiffexp(logcdf(upper), logcdf(lower)) - np.log(mass). Among other things, this allows one to set the lower point to zero, or to a value at the edge of the distribution's support.

@aseyboldt do you see any loss in functionality by doing things like this?
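
For concreteness, here is a minimal sketch of that single objective. The logdiffexp helper and the squared-error wrapper are illustrative, not the PR's exact code:

```python
import numpy as np

def logdiffexp(log_a, log_b):
    """Numerically stable log(exp(log_a) - exp(log_b)), assuming log_a >= log_b."""
    return log_a + np.log1p(-np.exp(log_b - log_a))

def objective(logcdf, lower, upper, mass):
    # Squared distance between the log probability mass actually contained
    # in [lower, upper] and the log of the target mass (e.g. 0.95).
    return (logdiffexp(logcdf(upper), logcdf(lower)) - np.log(mass)) ** 2
```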

@ricardoV94 (Member) left a comment:

Perhaps add a test with a discrete distribution?

@AlexAndorra (Contributor, Author) commented:

All green @ricardoV94 🥳 Let's make that Xmas miracle happen!

Review thread in pymc/func_utils.py:

```python
    jac = pm.aesaraf.compile_pymc([dist_params], aesara_jac, allow_input_downcast=True)
# when PyMC cannot compute the gradient
# TODO: use specific gradient, not implemented exception
except Exception:
```
@ricardoV94 (Member) commented Dec 29, 2021:

I think the exceptions we want here are aesara.gradient.NullTypeGradError as well as NotImplementedError.

Suggested change:

```diff
-except Exception:
+except (NotImplementedError, NullTypeGradError):
```

@AlexAndorra (Contributor, Author) replied:

The exception that's actually thrown is a TypeError, raised by the Aesara grad method (around line 501 of aesara/gradient.py):

```python
if cost is not None and cost.ndim != 0:
    raise TypeError("cost must be a scalar.")
```

I added that error to the except clause and tests pass locally. aesara.gradient.NullTypeGradError and NotImplementedError don't seem to be raised, but I kept them in case they come up in cases we may have forgotten.

@ricardoV94 (Member) replied on Dec 30, 2021:

We shouldn't catch that TypeError. That means we produced a wrong input to aesara grad.

The other two exceptions are the ones that (should) appear when a grad is not implemented for an Op.

@AlexAndorra (Contributor, Author) replied on Jan 1, 2022:
That means we should make those two exceptions appear then, shouldn't we? Because they are not raised right now; only the TypeError is raised.
(Here's to my first GH comment of the year 🥂)

@ricardoV94 (Member) replied on Jan 1, 2022:

> That means we should make those two exceptions appear then, shouldn't we? Because they are not raised right now; only the TypeError is raised.
> (Here's to my first GH comment of the year 🥂)

Those two exceptions mean there is no grad implemented for some Op in the logcdf, which can very well happen and is a good reason to silently default to the SciPy approximation. The TypeError, on the other hand, should not be caught, as it means we did something wrong.

In fact, there was a point during this PR when the function was always silently defaulting to the SciPy approximation, because we were passing two values to grad and suppressing the resulting TypeError.
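
Put together, the intended pattern looks roughly like this. It is a sketch assuming the cost and dist_params variables from the reviewed code, not the verbatim merged implementation; "2-point" is one way to request a finite-difference jacobian from scipy.optimize.minimize:

```python
from aesara.gradient import NullTypeGradError, grad

try:
    # Use the symbolic Aesara gradient of the objective when available.
    aesara_jac = grad(cost, dist_params)
    jac = pm.aesaraf.compile_pymc([dist_params], aesara_jac, allow_input_downcast=True)
except (NotImplementedError, NullTypeGradError):
    # Some Op in the logcdf has no gradient implemented: fall back to
    # scipy.optimize's finite-difference approximation.
    jac = "2-point"
```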

@ricardoV94 (Member) commented Jan 3, 2022:

That seems to be the case. Locally it is passing in float64/float32 and 1e-5 precision, so we don't need to have the separate test just for the Poisson anymore.

@AlexAndorra (Contributor, Author) replied:

Ha ha damn, I just pushed without that change. Tests are indeed passing locally. Gonna refactor the tests and push again

@AlexAndorra (Contributor, Author) commented:

Hot damn, tests are passing locally 🔥 Pushed!
Why does the symbolic gradient help so much with numerical errors?

@ricardoV94 (Member) replied:

Because the logcdf uses gammaincc, whose gradient is notoriously tricky. We somewhat recently added a (more) numerically stable implementation to Aesara.
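
For context, the Gamma CDF is a regularized incomplete gamma function, which is why that gradient enters the picture. A quick numerical illustration with SciPy (illustrative only; the PR itself relies on Aesara's implementation):

```python
from scipy.special import gammainc  # regularized lower incomplete gamma

# The CDF of Gamma(alpha, beta) at x is gammainc(alpha, beta * x);
# the survival function 1 - CDF is gammaincc(alpha, beta * x).
alpha, beta = 2.0, 10.0
lower, upper = 0.1, 0.4
mass = gammainc(alpha, beta * upper) - gammainc(alpha, beta * lower)
print(f"P({lower} < X < {upper}) = {mass:.3f}")
```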


@ricardoV94 (Member) left a comment:
See my reply above. We shouldn't catch the TypeError

@ricardoV94 mentioned this pull request on Jan 3, 2022
@ricardoV94 (Member) left a comment:

We should move the new tests to a file test_func_utils.py if func_utils.py is where the functions are going to live.

@AlexAndorra (Contributor, Author) replied:

Good point, just did it

Review thread on the find_optim_prior docstring:

```
Returns
-------
The optimized distribution parameters as a dictionary, with the parameters'
names as keys and the optimized values as values.
```
@ricardoV94 (Member) commented:

Follow-up PR (not to be sadistic with this one): we should add a code example to the docstring.

@AlexAndorra (Contributor, Author) replied:

Not to preach to my own choir here, but I actually should add that here, in this PR. Don't merge in the meantime; will ping when done.

@AlexAndorra (Contributor, Author) commented:

Done @ricardoV94 -- we can merge once tests pass
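
The docstring example added at this point could look something like the following. This is a hypothetical rendering: fixed_params is assumed from the earlier discussion about fixing some parameters, and the merged docstring is authoritative:

```python
import pymc as pm

# Optimize mu and sigma of a StudentT while keeping nu fixed,
# targeting 90% of the probability mass between -1 and 1.
opt = pm.find_optim_prior(
    pm.StudentT,
    lower=-1.0,
    upper=1.0,
    mass=0.90,
    init_guess={"mu": 0.0, "sigma": 1.0},
    fixed_params={"nu": 7},
)
```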

@ricardoV94 merged commit 75ea2a8 into main on Jan 4, 2022.
@AlexAndorra deleted the optim-prior branch on Jan 4, 2022.