Implement betaincinv and gammainc[c]inv functions #502
Conversation
I was finally able to get my hands back on this, apologies for resuming it so late. I have only implemented the `impl` function so far; gradients are not implemented yet, and I am not sure whether they are needed for these inverse regularized functions. Let me know what can be improved here.
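For context, this is roughly what the `impl`-only stage of a scalar Op looks like in PyTensor; the class and helper names below are assumptions for illustration, not the PR's actual code:

```python
import scipy.special

from pytensor.scalar.basic import ScalarOp, upgrade_to_float


class BetaIncInv(ScalarOp):
    """Inverse of the regularized incomplete beta function (illustrative sketch)."""

    nin = 3  # a, b, y
    nfunc_spec = ("scipy.special.betaincinv", 3, 1)

    def impl(self, a, b, y):
        # Python-level evaluation delegates to SciPy; gradients and C/JAX
        # implementations would be added separately.
        return scipy.special.betaincinv(a, b, y)


betaincinv = BetaIncInv(upgrade_to_float, name="betaincinv")
```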
Codecov Report
Additional details and impacted files:

```diff
@@           Coverage Diff            @@
##             main     #502     +/-  ##
=========================================
  Coverage   80.77%   80.77%
=========================================
  Files         162      162
  Lines       46105    46122     +17
  Branches    11266    11267      +1
=========================================
+ Hits        37240    37255     +15
- Misses       6638     6640      +2
  Partials     2227     2227
```
Thanks for picking it up!
@amyoshino here is a more typical PR to add new scalar/elemwise Ops that you can use as a template: 2dc7591
For JAX dispatching, it seems like these exist in tensorflow-probability: https://www.tensorflow.org/probability/api_docs/python/tfp/substrates/jax/math/betaincinv. We can do something like https://github.com/pymc-devs/pytensor/pull/403/files to provide them in the JAX backend.
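A rough sketch of how the JAX dispatch could look, assuming tensorflow-probability is installed and using a hypothetical `BetaIncInv` scalar Op name and location (the actual registration in the PR may differ):

```python
import tensorflow_probability.substrates.jax as tfp

from pytensor.link.jax.dispatch import jax_funcify
from pytensor.scalar.math import BetaIncInv  # hypothetical Op name/location


@jax_funcify.register(BetaIncInv)
def jax_funcify_BetaIncInv(op, **kwargs):
    def betaincinv(a, b, x):
        # tfp's JAX substrate provides the numerical kernel.
        return tfp.math.betaincinv(a, b, x)

    return betaincinv
```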
Co-authored-by: Ricardo Vieira <[email protected]>
I see I am missing quite a few things, thanks for the guidance!
I have added the first derivatives for the new functions. I have not set up the JAX dispatching yet, but plan to do so in the next push. Thanks for all your guidance, and I hope to improve it soon!
@ricardoV94 Phew, what a journey to get all the tests passing for this PR. I believe it is now all good and ready for review. Let me use this comment to summarize the steps and assumptions used in this PR; I hope it helps the reviewing process.

Overall, I implemented wrappers around the corresponding `scipy.special` functions, and wrote the first derivatives based on references for the derivatives of the inverse regularized incomplete beta and gamma functions.

I wasn't able to implement every derivative, though. This PR will help issue pymc-devs/pymc#6845, which will benefit from the implemented functions for the icdf formulas.
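For reference, the derivative of each inverse with respect to its last argument follows from the inverse function theorem; these are the standard identities, not necessarily the exact form used in the PR. With $w = I^{-1}(a, b; y)$ for the regularized incomplete beta function $I$, and $w = P^{-1}(a; y)$ for the regularized lower incomplete gamma function $P$:

$$\frac{\partial}{\partial y} I^{-1}(a, b; y) = \left(\frac{\partial I(a, b; w)}{\partial w}\right)^{-1} = \frac{B(a, b)}{w^{a-1}\,(1 - w)^{b-1}}$$

$$\frac{\partial}{\partial y} P^{-1}(a; y) = \Gamma(a)\, e^{w}\, w^{1-a}$$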
@amyoshino sorry I still didn't have time to review, just this comment here caught my attention:

Instead of implementing a new Op, let's try to use this form instead?

```python
def beta(a, b):
    return (gamma(a) * gamma(b)) / gamma(a + b)
```

If it proves too unstable, we can try to use the log-form with `gammaln` instead. This saves us from having to maintain another Op and its grads, plus missing C/JAX implementations. We can also add a user-facing helper in pytensor/pytensor/tensor/special.py (lines 739 to 744 in 39bda72).
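A minimal sketch of the two forms being discussed, written with `pytensor.tensor` functions (an illustration of the suggestion, not the PR's final code):

```python
import pytensor.tensor as pt


def beta(a, b):
    # Naive form: fine for moderate arguments, overflows for large ones.
    return (pt.gamma(a) * pt.gamma(b)) / pt.gamma(a + b)


def betaln(a, b):
    # Log-space form: stable for large positive arguments.
    return pt.gammaln(a) + pt.gammaln(b) - pt.gammaln(a + b)


def beta_stable(a, b):
    # exp(betaln) is the form suggested for use inside the scalar gradients.
    return pt.exp(betaln(a, b))
```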
@ricardoV94 enjoy your holiday time, no worries about it. I delayed the progress of this PR for so long that it would make no sense on my end to expect a quick response here 🥲 Nice suggestions! I will make the changes in the code shortly, and for safety I'll try to verify that the proposed solution is not unstable in any way. I will keep you posted on the progress.
@ricardoV94, I investigated the stability of using the proposed form of the beta function. Since the connection between the beta and gamma functions is only valid when both arguments are positive, I split the findings into two cases.

When both arguments are positive:
- The approximation is great (only one tiny difference found over a relatively large range of arguments).
- It becomes unstable when the arguments have "high" values.

When x < 0 and/or y < 0:
- Things start to go a bit wrong if we want to keep consistency between this implementation and SciPy's beta function: results can change sign depending on the argument values.
- The difference can also be large when the values are high (only when one or more arguments are < 0).

Given this information, should we go ahead and use your suggestion? I have already worked on the changes, so whichever path you want to take is good for me. Let me know and I can push it at any time. 😄
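For illustration, a small NumPy/SciPy check of the kind described above (the values here are my own picks, not the ones from the original screenshots):

```python
import numpy as np
import scipy.special as sp


def beta_via_gamma(a, b):
    # Naive form suggested above, valid for positive arguments.
    return sp.gamma(a) * sp.gamma(b) / sp.gamma(a + b)


# Moderate arguments: the two agree closely.
print(beta_via_gamma(2.5, 3.5), sp.beta(2.5, 3.5))

# "High" arguments: gamma overflows to inf and the naive form returns nan,
# while scipy.special.beta stays finite.
print(beta_via_gamma(200.0, 5.0), sp.beta(200.0, 5.0))
```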
That first sign flip doesn't seem too bad, the answer is pretty much 0. I didn't look carefully at the examples that followed. We should think about the stability and range in the context that first motivated these Ops: use in the gradients of this PR. Do the gradients remain stable in a reasonable parameter space of the original functions? Also, the form of beta we use for the gradients need not have anything to do with the one we offer users in the tensor module. For the user-facing one I would probably offer the naive way with gamma and also offer betaln with gammaln.
Aren't negative values some special analytic extension? Or is it standard to support them? Also, for assessing closeness it's common to look not only at the absolute error but also at the relative error. Those 1e+16 values may not be so unreasonable if they correspond to a small relative error.
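To make that point concrete (illustrative numbers, not the ones from the comparison above):

```python
import numpy as np

approx, exact = 1.000001e16, 1.0e16
abs_err = abs(approx - exact)    # 1e10: looks alarming on its own
rel_err = abs_err / abs(exact)   # 1e-6: actually a very close match
print(np.isclose(approx, exact, rtol=1e-5))  # True
```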
My gut feeling is we should use exp(beta_ln) in the scalar grads and offer beta and betaln to users in tensor.math. Let's see if the gradients don't blow up for some reasonable values of the original functions. You can test locally with verify_grad.
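A minimal sketch of that local check, assuming the new Op is exposed as `pt.betaincinv` with SciPy's argument order (a, b, y); the values are arbitrary:

```python
import numpy as np
import pytensor.tensor as pt
from pytensor.gradient import verify_grad

rng = np.random.RandomState(42)

# Check the gradient with respect to the last argument only, with the
# shape parameters held fixed at reasonable values.
verify_grad(
    lambda y: pt.betaincinv(2.0, 3.0, y),
    [np.array(0.3)],
    rng=rng,
)
```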
Thanks for the suggestions and pointers on where to look! I will evaluate all the mentioned points.
Good call, I will check it and be mindful about the objective of the PR.
I will research more about it. Since the function is well defined only for positive values, I imagine it is kind of a special use, but let me find out in which circumstances negative values are used before stating anything.
Most of the errors are sign changes, so the error is close to zero. In a few cases the errors are large, but that only happens when the values are high as well, so I will evaluate whether this is really a concern or whether the large errors happen only where the functions are themselves unusable.
Thank you so much for the hint! I will need some time to check the gradients and become more familiar with how the gradient computation is traced; having a tip on where to start is very valuable 😄
@ricardoV94, after checking the review comments, here is my suggestion:
Given the context of the PR, using the beta function in the gradient computation for the inverse of the regularized incomplete beta function will not present us with negative arguments (they are the parameters of the Beta distribution, which are positive). Moreover, I could not find use cases for negative-valued arguments to the beta function. So, if any use case comes up in the future, we can revise the user-facing helper in the tensor module.
For safety and consistency, I followed the approach of the previously implemented functions.
I have done that as well! Thanks for suggesting it. I just pushed the latest code. I hope everything is good, but if not, do not hesitate to point out improvements; I never tire of improving things and keeping the codebase as neat as possible. 😄
@amyoshino this looks great, I left a tiny tiny comment above. Otherwise it seems good to merge!
Default pre-commit to multi-line
Co-authored-by: Ricardo Vieira <[email protected]>
@ricardoV94 As always, thanks for guiding me through this PR. I learned a lot about the codebase and tests with this one.
My pleasure, and thanks for taking this one on. It will be really useful for the icdf methods!
Motivation for these changes
Closes #413
Implement `betaincinv` and `gammaincinv` functions.

The implementation of these functions will allow us to implement quantile/ICDF functions for distributions like the Beta, Gamma, ChiSquared, and StudentT distributions, being fundamental to the issue pymc-devs/pymc#6845.
Implementation details
Following other `scalar.math` function implementations, the code is extended to call the functions from `scipy.special`. Tests were added to check that the results match the SciPy implementation.
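As a rough usage illustration (assuming the new functions are exposed under `pytensor.tensor` with the same argument order as SciPy):

```python
import numpy as np
import scipy.special as sp
import pytensor
import pytensor.tensor as pt

a, b, y = pt.dscalars("a", "b", "y")
f = pytensor.function([a, b, y], pt.betaincinv(a, b, y))

# The PyTensor Op should match scipy.special.betaincinv.
np.testing.assert_allclose(f(2.0, 3.0, 0.6), sp.betaincinv(2.0, 3.0, 0.6))
```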
Checklist
Major / Breaking Changes
New features
Bugfixes
Documentation
Maintenance