Feature/param mixture #592

Merged · 18 commits merged from feature/ParamMixture into master on Apr 4, 2017
Conversation

matthewdhoffman (Collaborator)

Adds the ParamMixture distribution to edward.models. This class implements a mixture model where all of the mixture components are from the same family, but with (possibly) different parameters. (Contrast with Mixture, which allows the mixture components to be of different families.)
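
A minimal usage sketch (the parameter values are hypothetical, and the constructor is assumed to take mixing weights, a dict of parameters shared across components, and the component distribution class, as discussed below):

```python
import tensorflow as tf
from edward.models import Normal, ParamMixture

# Hypothetical 3-component Gaussian mixture: every component is a Normal,
# differing only in its parameters.
pi = tf.constant([0.3, 0.5, 0.2])     # mixing weights
mu = tf.constant([-2.0, 0.0, 3.0])    # per-component means
sigma = tf.constant([0.5, 1.0, 0.8])  # per-component stddevs

# 50 i.i.d. draws from the mixture; x.cat holds the sampled mixture
# assignment for each draw.
x = ParamMixture(pi, {'loc': mu, 'scale': sigma}, Normal, sample_shape=50)
```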

@dustinvtran force-pushed the feature/ParamMixture branch 2 times, most recently from e004e6a to fd096f9, on March 28, 2017 14:45
@dustinvtran (Member) commented Mar 29, 2017

@matthewdhoffman Ready for you to take another look.

I made the implementation more robust, specifically in how it handles sample shape vs. batch shape vs. event shape. I wasn't sure how to implement mean/variance/stddev for arbitrary batch shapes, so I made them raise an error if the batch shape is non-scalar.

I also added another unit test. The previous unit test checked component-wise means/variances by identifying from x.cat which mixture assignment a given draw came from (see the sketch below). I wasn't sure whether this was possible for the general _sample_n case, so I removed it.
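
For reference, a sketch of the kind of check described above, using the Gaussian example from the PR description (assumes `x` with a large sample_shape, a tf.Session `sess`, and NumPy arrays `mu_val`/`sigma_val` holding the true component parameters; all hypothetical names):

```python
import numpy as np

# Group draws by their sampled mixture assignment and compare each group's
# empirical moments to the corresponding component's parameters.
x_draws, assignments = sess.run([x.value(), x.cat.value()])
for k in range(len(mu_val)):
  draws_k = x_draws[assignments == k]
  np.testing.assert_allclose(draws_k.mean(), mu_val[k], atol=0.1)
  np.testing.assert_allclose(draws_k.var(), sigma_val[k] ** 2, atol=0.1)
```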

@dustinvtran (Member)

I also had a question about the log prob. Is it computing the log of the summed density, marginalizing over the individual component probabilities, or is it using just one component's density? It seems like the latter.

@matthewdhoffman (Collaborator, Author) commented Apr 4, 2017

Interesting. I find the idea of a batch_size a little confusing in this context, but I think it makes sense to support it.

The most important thing I did in these commits is to restore the correspondence between cat and value. That relationship is necessary for inference: if log_prob() doesn't depend on cat, then there's no good way to infer which component generated which observation. Without it, the motivating example in PR #588 won't work. As a bonus, we can now restore some of the tests in test_param_mixture_stats.py.

log_prob() is intended to give the conditional log p(x | z), not the marginal log p(x). Again, this is useful for inference. I added a marginal_log_prob() method that computes the marginal. (And added tests for both.) At some point we could restore mean(), std() etc. using similar logic.
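
Schematically, for the Gaussian example above (a sketch only; `x_draws` stands in for a tensor of observed values):

```python
import tensorflow as tf
from edward.models import Normal

# Conditional: log p(x | z), the density under the component picked by x.cat.
# Per the discussion above, this is what log_prob() returns.
log_cond = x.log_prob(x_draws)

# Marginal: log p(x) = logsumexp_k [log pi_k + log p(x | z = k)], summing
# out the assignment; marginal_log_prob() is meant to compute this.
log_comp = Normal(loc=mu, scale=sigma).log_prob(tf.expand_dims(x_draws, -1))
log_marg = tf.reduce_logsumexp(tf.log(pi) + log_comp, axis=-1)
```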

@dustinvtran (Member)

Got it. All makes sense. Merging.

@dustinvtran merged commit 01e6f16 into master on Apr 4, 2017
@dustinvtran deleted the feature/ParamMixture branch on April 4, 2017 at 23:18