Function to optimize prior under constraints #5231
Conversation
Codecov Report

@@           Coverage Diff           @@
##             main    #5231   +/-   ##
=======================================
  Coverage   79.08%   79.08%
=======================================
  Files          87       88    +1
  Lines       14394    14435   +41
=======================================
+ Hits        11383    11416   +33
- Misses       3011     3019    +8
Looks cool, left some suggestions below
I would do the API differently to make that less verbose and more elegant. Ideally, something like this:

intercept = pm.Normal.fit("intercept", lower=0.1, upper=0.9)
# That could (in future) be extended as
intercept = pm.Normal.fit("intercept", data=input_data)
I don't agree. You still need to provide initial parameter guesses, lower/upper can be distribution parameters, and you might want to specify that some parameters should be fixed, as I mentioned above. All that is more difficult if you mix optimization goals with distribution initialization. Also, are you suggesting this would automatically add a new random variable to the model? I think this suggestion can create a lot of ambiguity, similar to how the
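To make the trade-off concrete, here is a rough sketch of the two styles under discussion. The keyword names used for the standalone helper (init_guess, fixed_params) are assumptions for illustration, not the confirmed signature:

import pymc as pm

# Style A (proposal above): optimization folded into the RV constructor
# intercept = pm.Normal.fit("intercept", lower=0.1, upper=0.9)

# Style B (counter-proposal): a standalone helper that returns optimized
# parameters, keeping the optimizer's starting point and any fixed
# parameters explicit (keyword names are hypothetical)
params = pm.find_optim_prior(
    pm.Normal,
    lower=0.1,
    upper=0.9,
    init_guess={"mu": 0.5, "sigma": 0.3},
    # fixed_params={"sigma": 0.3},  # hypothetically hold sigma fixed
)

with pm.Model():
    intercept = pm.Normal("intercept", **params)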
Should this go into util.py? That's not a very discoverable file name, but this is also a very small helper function.
@ferrine I agree with @ricardoV94 on this, at least for a first version, where we can try and see how people use that feature and what is missing (or not).
@ricardoV94 I actually put that into
You can import it explicitly in
I suggest that instead of minimizing two distances: @aseyboldt do you see any loss in functionality by doing things like this?
Perhaps add a test with a discrete distribution?
All green @ricardoV94 🥳 Let's make that Xmas miracle happen!
pymc/func_utils.py
    jac = pm.aesaraf.compile_pymc([dist_params], aesara_jac, allow_input_downcast=True)
# when PyMC cannot compute the gradient
# TODO: use specific gradient, not implemented exception
except Exception:
I think the exception we want here is the one found in aesara.gradient.NullTypeGradError, as well as NotImplementedError:
- except Exception:
+ except (NotImplementedError, NullTypeGradError):
The exception that's thrown is a TypeError, raised by the Aesara grad method (line 501) in aesara/gradient.py:

if cost is not None and cost.ndim != 0:
    raise TypeError("cost must be a scalar.")
I added that error in the except clause and tests pass locally. aesara.gradient.NullTypeGradError and NotImplementedError don't seem to be raised, but I kept them in case they are raised by other cases we may have forgotten.
We shouldn't catch that TypeError. That means we produced a wrong input to aesara grad.
The other two exceptions are the ones that (should) appear when a grad is not implemented for an Op.
That means we should make these two exceptions appear then, shouldn't we? Because they are not raised right now -- only the TypeError is raised
(here is to my first GH comment of the year 🥂 )
Those two exceptions mean there is no grad implemented for some Op in the cdf, which can very well happen and is a good reason to silently default to the scipy approximation. The TypeError, on the other hand, should not be caught, as that means we did something wrong.
In fact, there was a point during this PR when it was always silently defaulting to the scipy approximation, because we were passing two values to grad and suppressing the TypeError.
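For reference, a minimal sketch of the fallback logic being discussed, reusing the variable names from the snippet above. The concrete target expression and the "2-point" string handed to scipy are assumptions for illustration, not the PR's exact code:

import aesara
import aesara.tensor as at
import pymc as pm
from aesara.gradient import NullTypeGradError

# Hypothetical scalar target: squared difference between the mass the
# candidate Normal puts between the bounds and the requested 95%
dist_params = at.dvector("dist_params")  # [mu, sigma]
rv = pm.Normal.dist(dist_params[0], dist_params[1])
target = (at.exp(pm.logcdf(rv, 0.9)) - at.exp(pm.logcdf(rv, 0.1)) - 0.95) ** 2

try:
    # symbolic jacobian of the scalar target wrt the free parameters
    aesara_jac = aesara.grad(target, wrt=dist_params)
    jac = pm.aesaraf.compile_pymc([dist_params], aesara_jac, allow_input_downcast=True)
# fall back to scipy's finite-difference approximation only when Aesara
# cannot compute a gradient for some Op in the logcdf graph
except (NotImplementedError, NullTypeGradError):
    jac = "2-point"

# `jac` (the compiled function or the "2-point" string) can then be
# passed on to scipy's optimizer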
That seems to be the case. Locally it is passing in both float64 and float32 at 1e-5 precision, so we don't need the separate test just for the Poisson anymore.
Ha ha damn, I just pushed without that change. Tests are indeed passing locally. Gonna refactor the tests and push again
Hot damn, tests are passing locally 🔥 Pushed!
Why does the symbolic gradient help so much with numerical errors?
Because the logcdf uses gammaincc, whose gradient is notoriously tricky. We somewhat recently added a numerically stable(r) implementation to Aesara.
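A toy illustration (not from the PR) of taking that gradient symbolically through Aesara, assuming at.gammaincc and its gradient are available in the installed version:

import aesara
import aesara.tensor as at

# The Poisson logcdf goes through gammaincc; differentiating it symbolically
# avoids the noisy finite-difference estimates the optimizer would otherwise use
k = at.dscalar("k")    # e.g. value + 1 in the Poisson logcdf
mu = at.dscalar("mu")  # the rate parameter being optimized
cdf = at.gammaincc(k, mu)
dcdf_dmu = aesara.grad(cdf, wrt=mu)  # exact symbolic derivative
f = aesara.function([k, mu], dcdf_dmu)
print(f(3.0, 1.5))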
See my reply above. We shouldn't catch the TypeError.
We should move the new tests to a file test_func_utils.py, if func_utils.py is where the functions are going to live.
Good point, just did it.
Returns
-------
The optimized distribution parameters as a dictionary with the parameters'
name as key and the optimized value as value.
Follow-up PR (not to be sadistic with this one): we should add a code example in the docstrings.
I'm not preaching for my choir here, but I actually should add that here. Don't merge in the meantime. Will ping when done
Done @ricardoV94 -- we can merge once tests pass
Ready for review 🍾
This PR introduces the pm.find_optim_prior function, which finds the optimal prior parameters of a distribution under some constraints.

For instance, sometimes, while building a model, we think "I need a Gamma that has 95% probability mass between 0.1 and 0.4". It may take a long time to find an appropriate combination of parameters that gives a decent Gamma. pm.find_optim_prior basically answers the question: "What are the alpha and beta parameters that fit this constraint?".

Notes: pm.find_optim_prior can be used with any PyMC distribution for which the log-cdf function is implemented.
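To close with a concrete picture of the new helper, here is a hedged usage sketch based on the description above; the exact keyword names (init_guess, mass) are assumptions rather than quotes from the PR diff:

import pymc as pm

# Ask for a Gamma that puts ~95% of its probability mass between 0.1 and 0.4
# (keyword names are assumptions based on the PR description)
opt_params = pm.find_optim_prior(
    pm.Gamma,
    lower=0.1,
    upper=0.4,
    mass=0.95,
    init_guess={"alpha": 1.0, "beta": 10.0},
)
print(opt_params)  # a dict mapping parameter names to optimized values

with pm.Model():
    # use the optimized parameters directly as the prior
    effect = pm.Gamma("effect", **opt_params)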