Make draw_values draw from the joint distribution #3214
Conversation
Upstream fetch
…d properly before starting the fix...
This is a pretty big change, but I think introducing …
I'm sorry about the test coverage and test failures. Due to my bad upstream fetch, the local tests that I ran were outdated. I'm debugging the PR now to try to get it up to date and passing.
Thanks @lucianopaz, this is quite an impressive PR. We should definitely fix this bug. Without having thought about this deeply enough, two questions come to mind in terms of possible simplifications: …
This is very interesting! I think Thomas' intuition is right - I am working on a similar PR with #3213, but will be refactoring that class to get functionality like you have. I will make more substantive comments when I can have a closer look at this.
…h. Two failures remain: `TestNutsCheckTrace.test_bad_init` raises a `RuntimeError` instead of a `ValueError`, but that seems to be the fault of `parallel_sampling.py` chaining the exception. Then `test_step_exact` for SMC gives different traces. Finally, sometimes, `test_glm_link_func2` gets into a deadlock; I'm not sure why. I'll check whether they pop up in TravisCI.
@twiecki, I'm not familiar with … About the ancestry information: as I had not fetched the right upstream, I had not seen what @ColCarroll had done with the ancestry in …
Agree on everything you said, @lucianopaz - my strategy seems more error-prone and pretty opaque, but your strategy means having 2 copies of the graph (one in theano, one in the DAG) which may not agree. Interested in other opinions on that.
While I agree from an API standpoint, the amount of code added here is non-trivial, which is my main concern.
…dded `conditional_on` attribute of every distribution. This function does a breadth-first search on the node's `logpt` or `transformed.logpt` graph, looking for named nodes that are different from the root node (or the node's transformed counterpart) and are also not a `TensorConstant` or `SharedVariable`. Each branch is searched until the first named node is found. This way, the parent conditionals of the searched `root` node, which are only one step away from it in the Bayesian network, are returned.

However, this ran into a problem with `Mixture` classes. These add to the `logpt` graph another `logpt` graph from the `comp_dists`. This leads to the problem that the inner `logpt`'s first-level conditionals are also seen as if they were first-level conditionals of the `root`. Furthermore, many copies of nodes made by the added `logpt` ended up being inserted into the computed `conditional_on`. This led to a very strange error, in which loops appeared in the DAG and the depths started to be wrong; in particular, there were no depth-0 nodes. My view is that the explicit `conditional_on` attribute prevents problems like this one from happening, so I left it as is, to discuss.

Other changes in this commit: `test_exact_step` for the SMC uses `draw_values` on a hierarchy, and given that `draw_values`'s behavior changed in hierarchy situations, the exact trace values had to be adjusted as well. Finally, `test_bad_init` was changed to run on one core, so that the parallel exception chaining does not change the exception type.
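For reference, a minimal sketch of the breadth-first search described above, under the stated assumptions (the helper name is illustrative, not pymc3 API):

```python
from collections import deque

import theano


def first_level_conditionals(root):
    """Walk root's graph breadth-first; stop each branch at the first
    named node that is neither the root nor a constant/shared variable.
    Those nodes are taken as the root's first-level conditionals."""
    conditionals = set()
    queue = deque(root.owner.inputs if root.owner else [])
    while queue:
        node = queue.popleft()
        if isinstance(node, (theano.tensor.TensorConstant,
                             theano.tensor.sharedvar.SharedVariable)):
            continue  # constants and shared variables are not parents
        if getattr(node, 'name', None) is not None and node is not root:
            conditionals.add(node)  # stop this branch at a named node
        elif node.owner is not None:
            queue.extend(node.owner.inputs)  # keep descending the graph
    return conditionals
```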
…n distribution.py.
@twiecki, @ColCarroll, I implemented a function … It worked well on all distributions except on the …
@lucianopaz Thanks for giving that a try. I would imagine that there is likely to be some way to get it to work; @ColCarroll mentioned he might have some ideas. My bigger concern is with the DAG class, however. Did you take a look at the way the networkx PR handles this? I would imagine they have all the required functionality in there. Perhaps we could even get @ColCarroll's work to create the networkx graph and then just use that for the graph traversal. Just to make sure my higher-level point is understood: while I think this is a bug, it doesn't seem like a major one. To add a large amount of code that we will have to maintain for solving it (and nothing else) is very costly, so this should be evaluated thoroughly. Finally, it seems like your editor adds automatic line breaks to existing code, which we don't require.
I focused on solving the bugs and still have not taken a look at the networkx implementation. I'll go into that now.
I took another go at removing the … I also reduced the custom code in …
…ss of networkx.DiGraph
I've been thinking about how to completely remove the …
I think this would work with the class we already have in …
… is now stored in a networkx.DiGraph. networkx is then used to compute subgraphs and to perform a topological sort on the graph during `draw_values` execution.
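A toy sketch of what this storage scheme enables (the variable names mirror the minimal example further down in this thread; this is not the PR's actual code):

```python
import networkx as nx

# Dependencies go into a DiGraph: an edge a -> b means b is conditional
# on a. networkx then provides the subgraph and ordering operations.
G = nx.DiGraph()
G.add_edge('mu0', 'mu1')        # mu1 is conditional on mu0
G.add_edge('sigma0', 'sigma1')  # sigma1 is conditional on sigma0
G.add_edge('mu1', 'y')
G.add_edge('sigma1', 'y')

# To draw y jointly, take the subgraph of y and its ancestors and draw
# in topological order, so parents are always drawn before children.
needed = nx.ancestors(G, 'y') | {'y'}
order = list(nx.topological_sort(G.subgraph(needed)))
print(order)  # one valid order: ['mu0', 'mu1', 'sigma0', 'sigma1', 'y']
```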
@twiecki, @ColCarroll, I finished the implementation that completely eliminates the … However, I think that there are 5 things that we would need to add to the …
If you consider that we should discard the …
pymc3/util.py (Outdated)

```diff
@@ -60,14 +67,16 @@ def is_transformed_name(name):
     Returns
     -------
     bool
-        Boolean, whether the string could have been produced by `get_transormed_name`
+        Boolean, whether the string could have been produced by …
```
These shouldn't show up here.
```diff
@@ -1518,3 +1493,324 @@ def all_continuous(vars):
         return False
     else:
         return True
+
+
+class ConstantNodeException(Exception):
```
Maybe move these to a separate file, `graph.py`?
`not_shared_or_constant_variable` needs `model.FreeRV`, `model.MultiObservedRV` and `model.TransformedRV`. If we move it away, we'll get into a circular import problem. That's why I left it in `model.py`.
I like this implementation much better. If we require graph traversal logic, which seems very clear given the purpose of this package, networkx is a perfectly reasonable dependency. Moreover, it's well maintained and packaged, so I don't see many problems with that. I don't know enough about … Finally, the line breaks add a lot of diff noise and are unnecessary under our current style rules; can you please revert those?
@twiecki, sorry I was unable to work on this last week. I'll try to get an implementation of @aseyboldt's suggestion to work. The main difficulty will be that we cannot require the model context to be known: as I pointed out in an earlier reply, almost all tests in the test suite call …
@lucianopaz I think in that case we should rather change the tests to have a model in the context. Almost everything in pymc3 is centered around having access to the model. Not having thought about this part of the code base a lot before, I was surprised to learn that it wasn't required here, so for consistency it makes sense to require it here as well.

My main concern here is code complexity, though. While a DAG sounds nice to have, if we don't need to have it, we shouldn't add it. Moreover, the actual building of the DAG is a fairly simple part of this PR; most complexity comes from traversing it in the right way. If that could be saved by offloading it to theano, that would be amazing. So my suggestion is, if you want to, try @aseyboldt's approach without thinking about the tests.

Finally, sorry if I'm coming across as very critical and difficult here. We greatly appreciate your work, but I think it is important to get this part right, and it seems like there were some design problems from the start that we're bumping up against here; for our future sanity, it'd be best to try and fix those.
…, even in the presence of plates of the same shape.
@twiecki, I understand. I'll start with @aseyboldt's approach now. I wanted to first push a small commit to leave a fix I had worked on before last week, to get …
@twiecki, @aseyboldt, I've started to work on storing the order in which each variable was defined while constructing the model. I ran into a problem early on and wanted to ask for your opinion. My idea was to store a list called …
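Judging from the `draw_values` excerpt later in this thread, the list ended up being named `var_construction_order`; a minimal sketch of the bookkeeping idea (illustrative, not the PR's actual code):

```python
# Illustrative sketch only: how a model could record absolute creation
# order across variable kinds. The per-kind lists (free_RVs,
# deterministics, ...) lose the global order between their members.
class Model:
    def __init__(self):
        self.free_RVs = []
        self.deterministics = []
        self.var_construction_order = []

    def add_random_variable(self, var):
        self.free_RVs.append(var)
        self.var_construction_order.append(var)  # keep global order

    def add_deterministic(self, var):
        self.deterministics.append(var)
        self.var_construction_order.append(var)  # keep global order
```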
@lucianopaz …
To clarify, my naive idea of what is required at minimum (pseudo-code):

```python
curr_point = model.test_point
for var in model.vars:
    sample_func = create_sample_func_with_givens(var, givens=curr_point)
    curr_point[var] = sample_func()
```

Not sure how much I'm missing there, but theano should handle all the rest correctly.
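For readers unfamiliar with the mechanism: `givens` lets theano substitute a value for an intermediate node at compile time, so downstream variables can be evaluated without computing the substituted node. A toy illustration, independent of pymc3:

```python
import theano
import theano.tensor as tt

x = tt.scalar('x')
y = 2 * x
z = y + 1

# Substituting a constant for y removes x from the compiled graph
# entirely, so z can be evaluated with no inputs at all.
f = theano.function([], z, givens=[(y, tt.constant(5.0))])
print(f())  # -> 6.0
```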
I understand. My concern is that the deterministics, for example the …
Ran into another hurdle. What I'm attempting to do inside … is (pseudo-code):

```python
# pseudo-code for the intended queue-based drawing order
queue = list(params that are numbers, numpy arrays, TensorConstants or SharedVariables)
if len(params) > len(queue):
    queue.extend(model.vars + remaining params)
for var in queue:
    value = _draw_value(var, ...)
    # update point and givens with value
    output[var] = value
```

The issue with this is that I'm running into a recursion limit, because sometimes I'm trying to … Furthermore, I'm having trouble thinking of how to make … Do you have any thoughts on these problems?
They are in …
@lucianopaz Does that recursion have any purpose in this new scenario? Maybe we can just drop it.
@junpenglao, now I realize that my post was not clear. When the RV is added to the model, it's appended to one or more of the lists `free_rvs`, `observed_rvs`, `deterministics`, `potentials`, etc. The absolute order between members of `free_rvs` and `deterministics` is lost.
I understand that the recursion comes from how the distribution's … If the distribution's parameters are themselves RVs that come from another distribution, the …
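Schematically, the call chain being described is the standard pattern in pymc3 distributions; a stripped-down sketch (simplified from the real `Normal.random`):

```python
import scipy.stats as stats

from pymc3.distributions import Continuous
from pymc3.distributions.distribution import draw_values, generate_samples


class SimpleNormal(Continuous):
    """Stripped-down Normal, keeping only the part relevant here:
    random() resolves its parameters through draw_values, and
    draw_values may in turn call random() on the parameters' own
    distributions -- hence the mutual recursion discussed above."""

    def __init__(self, mu=0., sd=1., *args, **kwargs):
        super(SimpleNormal, self).__init__(*args, **kwargs)
        self.mu = mu
        self.sd = sd

    def random(self, point=None, size=None):
        mu, sd = draw_values([self.mu, self.sd], point=point)
        return generate_samples(stats.norm.rvs, loc=mu, scale=sd,
                                dist_shape=self.shape, size=size)
```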
While trying to fix the tests to make them all call `draw_values` from inside a model's context, I came across a condition that made me wonder whether we can actually force a model on top of `draw_values`. One can create a distribution's instance outside of a model's context by doing something like `d = Distribution.dist()`. If we now force `draw_values` to be called from inside a model's context, then `d.random()` will not work outside a context anymore. I'm confused about this, because it looks like we're partially breaking `Distribution.dist` by forcing a model on top of it. Can we really force `draw_values` to always be inside a model's context?
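For concreteness, the pattern in question works today without any model on the context stack:

```python
import pymc3 as pm

# A standalone distribution, created and sampled with no model context
d = pm.Normal.dist(mu=0., sd=1.)
print(d.random(size=5))  # five draws, no pm.Model() anywhere in sight
```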
@lucianopaz Should …
I can't look into this more closely right now; my computer broke down and it will take till next week before I can get it fixed.

It seems to me that much of this trouble is because we don't distinguish properly between sampling from a distribution and sampling from a model. The former *should* happen in the random method, and should be well defined without a model; it only makes sense if all parameters of a distribution are given explicitly. The latter only makes sense if we know what model we are talking about. `draw_values` tries to do both at the same time.

Maybe we can work around that mess without too much breakage if we split `draw_values` into two functions. One we still call `draw_values`, so that we don't break distribution implementations (even though that name won't really match what it is doing then?); this one should use theano to compute the parameters but should *not* call other sampling functions, and if the parameters cannot be computed deterministically from the input it should just throw an error. A second function will walk through the variables in creation order and call the appropriate random functions (checking, if necessary, whether they are transformed vars or not).

I think you are right that we need to keep an additional list in the model.
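A hedged sketch of the proposed split (both function names and the `var_construction_order` list are illustrative, not an existing pymc3 API):

```python
import theano
import theano.tensor as tt


def compute_from_givens(params, givens):
    """Deterministic half: evaluate params with theano from explicitly
    given values only. theano raises MissingInputError at compile time
    if something would have to be sampled instead."""
    replacements = [(var, tt.as_tensor_variable(val))
                    for var, val in givens.items()]
    fn = theano.function([], params, givens=replacements,
                         on_unused_input='ignore')
    return fn()


def sample_in_creation_order(model, size=None):
    """Sampling half: walk model variables in creation order and call
    each random method, feeding forward the values drawn so far."""
    point = {}
    for var in model.var_construction_order:  # proposed bookkeeping list
        point[var.name] = var.distribution.random(point=point, size=size)
    return point
```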
@twiecki, I agree that it shouldn't call … @aseyboldt, I think it will be very hard to be backwards compatible while splitting … Again, I'm inclined to explicitly exploit the DAG, as it currently stands in this PR. The current working idea is that when we enter …
@lucianopaz I think it's worth it to still find out why the recursion occurs; it might uncover a bug. If we get to the intended behavior and there will never be recursion, the method using givens should work fine, no?
I've narrowed the origin of the recursion down to the following minimal example:

```python
import pymc3 as pm
from pymc3.distributions.distribution import draw_values

with pm.Model():
    mu0 = pm.Normal('mu0', mu=0., tau=1e-3)
    sigma0 = pm.Gamma('sigma0', alpha=1., beta=1., transform=None)
    mu1 = pm.Normal('mu1', mu=mu0, tau=1e-3)
    sigma1 = pm.Gamma('sigma1', mu=sigma0, sd=1., transform=None)
    y = pm.Normal('y', mu=mu1, sd=sigma1)
    # Call 0
    mu1_draw = draw_values([mu1])
    # Call 1
    sigma1_draw = draw_values([sigma1])
    # Call 2
    y_draw = draw_values([y], point={'sigma1': 1})
```

The `draw_values` implementation I'm testing is:

```python
import numbers

import numpy as np
import theano

from pymc3.model import modelcontext
# _draw_value is the existing helper in pymc3/distributions/distribution.py


def draw_values(params, point=None, size=None, model=None):
    if point is None:
        point = {}
    model = modelcontext(model)
    queue = []
    last = []
    drawn_value_map = {}
    counter = 0
    for i, p in enumerate(params):
        p_name = getattr(p, 'name', None)
        # First put the things we can get a value from directly without
        # evaluating a theano function or calling a random method
        if isinstance(p, (numbers.Number,
                          np.ndarray,
                          theano.tensor.TensorConstant,
                          theano.tensor.sharedvar.SharedVariable)):
            drawn_value_map[i] = counter
            counter += 1
            queue.append(p)
        elif p_name is not None and p_name in point:
            drawn_value_map[i] = counter
            counter += 1
            queue.append(p)
        else:
            last.append((i, p))
    # If params contained model rvs, these should go according to their
    # creation order. If params contained other things, these go at the end.
    if last:
        # var_construction_order is a list with all model rvs,
        # deterministics, etc., placed in the order in which they were
        # added to the model. We only add the variables that are not
        # already in the queue.
        queue.extend([v for v in model.var_construction_order
                      if v not in queue])
        counter = len(queue)
        for i, p in last:
            try:
                # If the param is already in the queue, get its index
                ind = queue.index(p)
            except ValueError:
                # If the param is not in the queue, it is added at the end
                ind = counter
                counter += 1
                queue.append(p)
            drawn_value_map[i] = ind
    # Init drawn values and the updatable point and givens
    drawn = {}
    givens = []
    nodes_missing_inputs = {}
    for ind, param in enumerate(queue):
        try:
            value = _draw_value(param, point=point, givens=givens, size=size)
        except theano.gof.fg.MissingInputError as e:
            # This deals with eventual auto transformed rvs that miss
            # their input value
            nodes_missing_inputs[ind] = e
            continue
        drawn[ind] = value
        givens.append((param, value))
        param_name = getattr(param, 'name', None)
        if param_name is not None:
            point[param_name] = value
    output = []
    # Get the output in the correct order
    for ind, param in enumerate(params):
        qind = drawn_value_map[ind]  # index of param in the queue
        value = drawn.get(qind, None)
        if value is None:
            if qind in nodes_missing_inputs:
                raise nodes_missing_inputs[qind]
            raise RuntimeError('Failed to draw the value from parameter '
                               '{}.'.format(param))
        output.append(value)
    return output
```

Basically, …
The example's call 0 works, but calls 1 and 2 enter the infinite recursion loop and die. The reason call 1 dies is that …

This makes … The reason call 2 dies is that, for some reason, … These problems point to the fact that the …
@lucianopaz what's your email?
@twiecki, my work email is [email protected]. You can send me a message there and then, if you prefer, I can send you my personal gmail account. |
…n PR pymc-devs#3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariable`s or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling: Ed2 tensors fix a `_values` attribute after they first call `sample`, and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object-oriented paradigm cannot set a fixed `_values`; it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.
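A rough sketch of the context-manager idea described in this commit message (simplified; the real implementation is in #3273):

```python
from contextlib import contextmanager

# Every value drawn during one top-level draw_values call is cached,
# and nested calls reuse the cache instead of re-drawing their parents
# independently (which is what made draws inconsistent).
_draw_context_stack = []


@contextmanager
def draw_context():
    # reuse the outermost cache if we are already inside a draw
    cache = _draw_context_stack[-1] if _draw_context_stack else {}
    _draw_context_stack.append(cache)
    try:
        yield cache
    finally:
        _draw_context_stack.pop()


def draw_value_cached(param, draw_fn):
    # draw_fn is whatever actually samples param (e.g. its random method)
    with draw_context() as cache:
        if param not in cache:
            cache[param] = draw_fn(param)
        return cache[param]
```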
Close in favor of #3273
* Fix for #3225. Made the Triangular `c` attribute be handled consistently with scipy.stats. Added a test and updated the example code.
* Fix for #3210, which uses a completely different approach than PR #3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariable`s or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling: Ed2 tensors fix a `_values` attribute after they first call `sample`, and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object-oriented paradigm cannot set a fixed `_values`; it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.
* Removed a leftover print statement.
* Added release notes, and added the draw-values context managers to the mixture and multivariate distributions that make many calls to `draw_values` or to other distributions' random methods within their own `random`.
This is a fix I worked on for #3210. The idea is what I discussed in that issue's thread.
Sorry that the commit history is a mess. I had mistakenly thought I had fetched the remote upstream, and did my changes on a really old commit, and had a bit of trouble with the merge process.
With the proposed mechanics for building the variable dependency DAG, `draw_values` is now able to draw from the joint probability distribution. With the issue's example code, the expected joint distribution is:

(plot omitted)

and now the result from `draw_values` is:

(plot omitted)