-
-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Full Rank Approximation #289
Conversation
Codecov Report
@@ Coverage Diff @@
## master #289 +/- ##
==========================================
+ Coverage 89.50% 90.02% +0.52%
==========================================
Files 32 33 +1
Lines 2400 2447 +47
==========================================
+ Hits 2148 2203 +55
+ Misses 252 244 -8
|
Hi @fonnesbeck |
Mean field ADVI does not run at all. Fails with:
|
Seems like a shape issue. Can you share a reproducible code snippet? |
There are some concerns about transformations not being stable |
Same behavior for mean-field ADVI |
Yes. Some transformations were missing for Bounded Distributions. Highlighting the same, I had opened an issue #283 some time back. Commit d61431b adds the remaining transformations. Before this commit, passing But I do not feel like its a good approach to manually add transformations to respective distributions. I tried to make it more general with a single class - class Interval(BackwardTransform):
name = "interval"
def __init__(self, lower_limit, upper_limit):
transform = tf.cond(
tf.math.is_inf(lower_limit),
lambda: tf.cond(
tf.math.is_inf(upper_limit),
lambda: tfb.Identity(),
lambda: tfb.Chain([tfb.Shift(upper_limit), tfb.Scale(-1), tfb.Exp()]), # upper - exp(x)
),
lambda: tf.cond(
tf.math.is_inf(upper_limit),
lambda: tfb.Chain([tfb.Shift(lower_limit), tfb.Exp()]), # exp(x) + lower
lambda: tfb.Sigmoid(low=lower_limit, high=upper_limit), # interval
),
)
super().__init__(transform) But this leads to many eager execution issues for lambda functions. Please suggest if there is a way to improve existing approach. |
@fonnesbeck, what is park bias model? |
@ferrine its a relatively large model that I was using to test ADVI, ported over from PyMC3. Its got a hierarchical component as well as Gaussian proccesses. Samples very slowly, so a good candidate for ADVI. |
Hi @fonnesbeck |
With the current PR, I am getting the following failure:
|
Can you rebase to current master so that all the covariance functions are available? |
Completed FullRank VI interface
Changed variable flattening to model flattening
dtype = dtype_util.common_dtype( | ||
self.state.all_unobserved_values.values(), dtype_hint=tf.float64 | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figuring out reasons of Cholesky Decomposition failure, this comment has been very helpful. Here are my few observations experimenting with VI and Gaussian Processes -
- High learning rates (order of 1e-2) leads to cholesky decomposition errors.
- Both Mean Field and Full Rank ADVI also lead to decomposition error with
tf.float32
even if we add a small WhiteNoise to the diagonal. But testing ADVI withtf.float64
leads to greater numerical stability of covariance matrix. So, I changed thedtype_hint
to usetf.float64
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. We had a long conversation figuring out the source of instabilities of park_bias model and it looks like smth is wrong with Cholesky implementation in float32 mode. We double-checked possible sources of numerical instabilities in the algorithm itself and failed to spot an obvious error. I'd rather leave this investigation on an ongoing basis. I hope most of the models do not suffer from these problems or we can spot the source of the problem later
and I also leave this experiment by @Sayam753 |
In another PR I would add full rank to the example NB, which also still reads "Full Rank ADVI - Coming Soon". |
This PR adds Full Rank ADVI interface. It is ready to review. Just to experiment with the API, I have created a gist inspired from @ColCarroll's notebook. I look forward to include mixture comparisons as well.