
Feature/simpler conjugacy #588

Merged
dustinvtran merged 57 commits into master from feature/simpler_conjugacy on Apr 11, 2017

Conversation

@matthewdhoffman
Collaborator

This is a new, improved, rewritten version of the feature/conjugacy module I started months ago.

It mostly boils down to supporting one function: ed.complete_conditional(). This function takes as input rv (a RandomVariable) and blanket (a collection of other RandomVariables), and tries to do some exponential-family algebra to work out the conditional distribution of rv given blanket. (Behavior is unchanged if rv is itself in blanket.)
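A minimal usage sketch (Beta-Bernoulli, the case exercised by the beta_bernoulli_conjugate.py example mentioned below; constructor argument names may differ across Edward/TensorFlow versions):

import edward as ed
from edward.models import Bernoulli, Beta

pi = Beta(a=1.0, b=1.0)               # prior on the Bernoulli probability
x = Bernoulli(p=pi, sample_shape=10)  # 10 conditionally iid observations

# p(pi | x): complete_conditional should recognize the conjugate pair and
# return a Beta RandomVariable whose parameters are a function of x
pi_cond = ed.complete_conditional(pi, [pi, x])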

The other user-facing change is adding a sample_shape kwarg to RandomVariable, which brings Edward syntax a bit more up to date with the tf.contrib.distributions syntax and reduces the amount of algebra the system has to do (compared with syntax like Normal(mu + tf.zeros([5]))). This supersedes PR #323.
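As a rough illustration (constructor argument order and names depend on the Edward version), both snippets below describe five draws with the same scalar mean, but the second keeps the parameter algebra trivial:

import tensorflow as tf
from edward.models import Normal

mu = Normal(0.0, 1.0)

# old style: broadcast the scalar mean up to the batch shape by hand
x_old = Normal(mu + tf.zeros([5]), tf.ones([5]))

# new style: keep the parameters scalar and request five draws via sample_shape
x_new = Normal(mu, 1.0, sample_shape=5)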

The high-level algorithm is:

  1. Build a TF node that computes the full log-joint distribution of rv and all of the (other) RandomVariables in blanket.
  2. Crawl down the TF graph depth-first from the log-joint node until we hit either
    a. A member of blanket (which truncates the search) or
    b. rv or a nonlinear function of rv (which truncates the search and adds the node that stopped the search to a list of sufficient-statistic nodes).
  3. Compute an s-expression-like intermediate representation of all sufficient-statistic nodes, and do some algebra on those nodes to simplify them (e.g., log(mul(a, b)) becomes add(log(a), log(b)); see the first sketch after this list).
  4. Look at the set of sufficient statistic nodes (and the support of rv) to see if there's a match in our table of exponential-family distributions—if not, we're done.
  5. If there's a match, copy the log-joint node to a scratch namespace, replacing all sufficient-statistic nodes with placeholders.
  6. Compute natural parameters in that scratch namespace by calling tf.gradients() with respect to the sufficient-statistic placeholders on the copied log-joint. This trick, which exploits the trivial identity ∂/∂t (η•t) = η, saves us from worrying about shapes (see the second sketch after this list).
  7. Copy those natural parameters back to the original namespace, getting rid of any placeholders, and construct a new exponential-family RandomVariable with those natural parameters.
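A toy version (illustrative only, not the module's code) of the kind of step-3 rewrite rule, operating on tuples of the form ('op', arg0, arg1):

def expand_log_of_product(expr):
  # rewrite log(mul(a, b)) as add(log(a), log(b)); leave anything else alone
  if (isinstance(expr, tuple) and expr[0] == 'log' and
      isinstance(expr[1], tuple) and expr[1][0] == 'mul'):
    a, b = expr[1][1], expr[1][2]
    return ('add', ('log', a), ('log', b))
  return expr

print(expand_log_of_product(('log', ('mul', 'a', 'b'))))
# ('add', ('log', 'a'), ('log', 'b'))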
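And a minimal, self-contained sketch of the step-6 trick (illustrative names only): if the copied log-joint is affine in a sufficient-statistic placeholder t, i.e. log_joint = η•t + const, then tf.gradients hands back η with the right shape for free.

import tensorflow as tf

t = tf.placeholder(tf.float32, [3])        # stand-in for a sufficient-statistic node
eta = tf.constant([0.5, -1.0, 2.0])        # the natural parameter we want to recover
log_joint = tf.reduce_sum(eta * t) + 7.0   # affine in t, so d(log_joint)/dt == eta

natural_parameter = tf.gradients(log_joint, t)[0]

with tf.Session() as sess:
  # the fed value of t is irrelevant: the gradient of an affine function is constant
  print(sess.run(natural_parameter, feed_dict={t: [0.0, 0.0, 0.0]}))  # [ 0.5 -1.  2.]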

@matthewdhoffman
Collaborator Author

matthewdhoffman commented Mar 26, 2017

Thanks for the comments. I'm also excited to revisit the empirical question of what (if anything) conjugacy buys us over more black-box methods.

I'll pull out sample_shape and ParamMixture as separate PRs, make blanket optional, and move/refactor support and conjugate_log_prob. Regarding other questions:

  1. The copy/copyback approach is annoying (and slow, because graph manipulation in TF is slow), and ideally I'd like to find a way to get rid of it. The problem it solves is that sometimes sufficient-statistic nodes can depend on other sufficient-statistic nodes. E.g., log(x) depends on x, and so taking the gradient w.r.t. x gives you the wrong answers. I tried solving this problem a few different ways, and this was the only one that worked as generally as I needed it to. In particular, we want to let the user say things like
mu = Normal(0., 1.)
x = Normal(3 * mu, 1.)

but if Normal.conjugate_log_prob() doesn't know that its input is a function of a random variable, then there's no way to write it so that (3 * mu)^2 is not treated as a function of mu.value().

It might be possible to revisit this at some point—for example, we could use some of the logic in simplify.py to give the conjugate_log_prob() functions more knowledge about their inputs, or we could write a version of tf.gradients() with a stop_nodes parameter that says where to pretend there's a stop_gradient() node. But it'd probably be a significant undertaking.
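For context, a tiny illustration (not the module's code) of the underlying issue: tf.gradients follows every path from the output back to x, so when one sufficient statistic (log(x)) is built on top of another (x), the only way to cut the extra path today is to have inserted tf.stop_gradient when the graph was built.

import tensorflow as tf

x = tf.constant(2.0)
z = x + tf.log(x)                        # both x and log(x) depend on x
g_both = tf.gradients(z, x)[0]           # 1 + 1/x = 1.5

x2 = tf.constant(2.0)
z2 = x2 + tf.stop_gradient(tf.log(x2))   # path through log(x2) cut at build time
g_cut = tf.gradients(z2, x2)[0]          # 1.0

with tf.Session() as sess:
  print(sess.run([g_both, g_cut]))       # [1.5, 1.0]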

  2. Simplifying simplify.py: Good suggestions. I'll double-check the implementation and decide whether as_float() and NodeWrapper are really necessary/helpful.

@dustinvtran
Member

That makes sense. Too bad tf.gradients doesn't already let one specify stop nodes, rather than requiring tf.stop_gradient when building the graph. I agree the alternatives sound like a significant undertaking.

Once you tease out sample_shape/ParamMixture I can add commits to those branches to help. I won't push any commits to this feature/simpler_conjugacy branch (yet).

@dustinvtran force-pushed the feature/simpler_conjugacy branch 2 times, most recently from 7f85baf to b5acb1f on April 5, 2017 15:20
@dustinvtran left a comment (Member)

I merged master to bring the branch up to date (e.g., following ParamMixture).

Following the previous comments, can you also document the utility functions in simplify.py and conjugacy.py? I had trouble following the internal details of the functions when trying to understand complete_conditional.

Other comments below.

def complete_conditional(rv, blanket, log_joint=None):
  with tf.name_scope('complete_conditional_%s' % rv.name) as scope:
    # log_joint holds all the information we need to get a conditional.
    extended_blanket = copy(blanket)
Member

Is this shallow copy of the list needed?

Collaborator Author

Good catch; there's a bug here. I meant to add rv to blanket. It's fixed now.

return '_log_joint_of_' + ('&'.join([i.name[:-1] for i in blanket])) + '_'


def get_log_joint(blanket):
Member

My understanding is that get_log_joint returns the existing log-joint tensor, rather than forming a new one, if it already exists. If we call complete_conditional twice on two nodes that share variables in their blanket, it would redo the full joint. Is that correct?

Collaborator Author

That's right. I introduced this caching for other reasons (which aren't all that relevant now), but it still speeds up graph construction.

But maybe it makes sense to move that caching a level up, so that each RandomVariable caches its conjugate_log_prob()?

(The term "cache" is a little funny here, since the graph remembers everything regardless. We're just caching a pointer.)
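Roughly the pattern under discussion, as a hypothetical helper (not the module's code): the "cache" is just a dict from a blanket-derived name to the already-built log-joint tensor.

_log_joint_cache = {}

def get_or_build_log_joint(blanket, build_fn):
  # key mirrors the naming scheme above: one name per blanket member
  key = '&'.join(sorted(rv.name for rv in blanket))
  if key not in _log_joint_cache:
    _log_joint_cache[key] = build_fn(blanket)   # build the TF subgraph once
  return _log_joint_cache[key]                  # later calls just reuse the pointer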

swap_back = {}
for s_stat in s_stat_exprs.keys():
  s_stat_shape = s_stat_exprs[s_stat][0][0].get_shape()
  s_stat_placeholder = tf.placeholder(np.float32, s_stat_shape)
Member

Is the float32 for placeholders because the log joint is float32?

Collaborator Author

It's because we want to take the gradient w.r.t. the placeholders, and TF doesn't like taking gradients w.r.t. non-floats.

@mariru
Contributor

mariru commented Apr 10, 2017

@matthewdhoffman very much looking forward to this feature! I checked out the branch to play around with the examples. Direct import wasn't working, so I added a line to the __init__.py file.

A minor comment: it would be nice if the beta_bernoulli_conjugate.py example printed the name of the distribution, i.e., on line 38 substitute

print('p(pi | x) type:', pi_cond.parameters['name'])

with

print('p(pi | x) type:', pi_cond.__class__.__name__)

or do the equivalent replacement in the return statement of the complete_conditional() function.

@dustinvtran
Member

Is there anything this PR is still waiting on? Happy to merge now. I have some minor suggestions, but I'll submit another PR after this one.

@matthewdhoffman
Collaborator Author

Nope, I think it's at a good checkpoint—let's merge. I haven't done the conjugate_log_prob() refactoring yet, but I probably won't have much time to look at it in the next couple of days and it can happen separately.

I made blanket optional, but added a warning about using that feature multiple times. (It'll throw a gnarly error if the user tries to do that; we could probably add a check that warns more clearly.)

Also, it turns out that get_blanket() doesn't really do what we need here, since it only looks at the Markov blanket of the directed model, not the moralized graph. It might be nice at some point to add a moralization routine or Bayes ball or something to figure out what we actually need to condition on, but it shouldn't have much of a performance impact except possibly on graph creation.

@dustinvtran merged commit df5c913 into master on Apr 11, 2017
@dustinvtran deleted the feature/simpler_conjugacy branch on April 11, 2017 23:59
@dawenl
Member

dawenl commented Apr 12, 2017

@matthewdhoffman you will gradually make all of our jobs meaningless with this PR :)

@dustinvtran mentioned this pull request on Apr 14, 2017