
Make draw_values draw from the joint distribution #3214

Closed
wants to merge 18 commits into from

Conversation

lucianopaz
Contributor

This is a fix I worked on for #3210. The idea is what I discussed in that issue's thread.

Sorry that the commit history is a mess. I had mistakenly thought I had fetched the remote upstream, made my changes on a really old commit, and had a bit of trouble with the merge process.

With the proposed mechanics for building the variable dependency DAG, draw_values is now able to draw from the joint probability distribution.

With the issue's example code, the expected joint distribution is:

expected_output

and now the result from draw_values is:

draw_values_output

@junpenglao
Member

This is a pretty big change, but I think introducing conditional_on could potentially be really useful.
@fonnesbeck, @twiecki what do you think?

@lucianopaz
Contributor Author

I'm sorry about the test coverage and test failures. Due to my bad upstream fetch, the local tests that I ran were outdated. I'm debugging the PR now to try to get it up to date and passing.

@twiecki
Member

twiecki commented Sep 26, 2018

Thanks @lucianopaz, this is quite an impressive PR.

We should definitely fix this bug. Without having thought about this deeply enough, two questions come to mind in terms of possible simplifications:

@ColCarroll
Member

This is very interesting! I think Thomas' intuition is right - I am working on a similar PR with #3213, but will be refactoring that class to get functionality like you have. I will make more substantive comments when I can have a closer look at this.

…h. Two failures remain: TestNutsCheckTrace.test_bad_init raises a RuntimeError instead of a ValueError, but that seems to be the fault of parallel_sampling.py chaining the exception. Then test_step_exact for SMC gives different traces. Finally, test_glm_link_func2 sometimes gets into a deadlock. I'm not sure why. Will check in TravisCI if they pop up.
@lucianopaz
Contributor Author

@twiecki, I'm not familiar with networkx. I took a quick look at it before writing the DependenceDAG class, but I did not find an operation like the right outer join I wanted to use. I'm sure that networkx must have a more efficient data structure for the graph than I do, but as I didn't find the right function, and also thought that it would clobber the dependencies, it seemed better to write a minimal class from scratch.

About the ancestry information, as I had not fetched the right upstream, I had not seen what @ColCarroll had done with the ancestry in model_graph.py. I think that the way he gets the conditional dependents in #3213 is really nice. I think that could be used instead of a conditional_on attribute, although I think that having an explicit conditional_on is nice and clean.

@ColCarroll
Member

Agree with everything you said, @lucianopaz - my strategy seems more error-prone and pretty opaque, but your strategy means having two copies of the graph (one in theano, one in the DAG) which may not agree. Interested in other opinions on that.

@twiecki
Member

twiecki commented Sep 26, 2018

I think that having an explicit conditional_on is nice and clean.

While I agree from an API stand-point, the amount of code added here is non-trivial, which is my main concern.

…dded `conditional_on` attribute of every distribution. This function does a breadth-first search on the node's `logpt` or `transformed.logpt` graph, looking for named nodes which are different from the root node or the node's transformed, and are also not a `TensorConstant` or `SharedVariable`. Each branch was searched until the first named node was found. This way, the parent conditionals of the searched `root` node, which were only one step away from it in the Bayesian network, were returned. However, this ran into a problem with `Mixture` classes. These add to the `logpt` graph another `logpt` graph from the `comp_dists`. This leads to the problem that the `logpt`'s first-level conditionals will also be seen as if they were first-level conditionals of the `root`. Furthermore, many copies of nodes made by the added `logpt` ended up being inserted into the computed `conditional_on`. This led to a very strange error, in which loops appeared in the DAG, and depths started to be wrong. In particular, there were no depth 0 nodes. My view is that the explicit `conditional_on` attribute prevents problems like this one from happening, so I left it as is, to discuss. Other changes done in this commit are that `test_exact_step` for the SMC uses `draw_values` on a hierarchy, and given that `draw_values`'s behavior changed in hierarchy situations, the exact trace values must also be adjusted. Finally, `test_bad_init` was changed to run on one core, so that the parallel exception chaining does not change the exception type.
@lucianopaz
Contributor Author

@twiecki, @ColCarroll, I implemented a function, get_first_level_conditionals, in model.py in order to try to get rid of the added conditional_on attribute of every distribution. This function follows @ColCarroll's idea of traversing the logpt graph. It does a breadth-first search on the node's logpt or transformed.logpt graph, looking for named nodes which are different from the root node or the root's transformed, and are also not a TensorConstant or SharedVariable. Each branch of the logpt graph is searched until the first named node is found. This way, the parent conditionals of the searched root node, which are only one step away from it in the Bayesian network, are returned. My idea was that this function should return something like the value of the conditional_on attribute I had added. The reason I made the search stop on named nodes was that, if I let it continue, it becomes difficult to tell whether the nodes are deterministically or conditionally connected.
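The search described here can be sketched in plain Python. The `Node` class and the toy model below are stand-ins (not theano's or pymc3's types), assuming only that unnamed intermediate expressions sit between named variables:

```python
# Hypothetical sketch of a "first-level conditionals" search: a
# breadth-first walk over a toy expression graph that stops each branch
# at the first *named* ancestor. Node and the model below are stand-ins.
from collections import deque

class Node:
    def __init__(self, name=None, parents=()):
        self.name = name          # None for intermediate (unnamed) expressions
        self.parents = list(parents)

def first_level_conditionals(root):
    """Return the names of the nearest named ancestors of ``root``."""
    found = set()
    queue = deque(root.parents)
    seen = set()
    while queue:
        node = queue.popleft()
        if id(node) in seen:
            continue
        seen.add(id(node))
        if node.name is not None and node is not root:
            found.add(node.name)  # stop this branch at the first named node
        else:
            queue.extend(node.parents)
    return found

# mu -> (unnamed transform) -> y ; sigma -> y
mu = Node('mu')
exp_mu = Node(parents=[mu])       # unnamed deterministic transform
sigma = Node('sigma')
y = Node('y', parents=[exp_mu, sigma])
print(sorted(first_level_conditionals(y)))  # ['mu', 'sigma']
```

The walk passes through `exp_mu` (unnamed) to reach `mu`, but stops at `sigma` immediately, which is exactly the "one step away in the Bayesian network" behavior described above.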

It worked well on all distributions except the Mixture class. Instances of this class add to the logpt graph another logpt graph from the comp_dists. This leads to the problem that the comp_dists' first-level conditionals will also be seen as if they were first-level conditionals of the root node. Furthermore, many copies of nodes made by the added logpt ended up being inserted into the computed conditional_on. This led to a very strange error, in which loops appeared in the DAG, and depths started to be wrong. In particular, there were no depth 0 nodes. With the explicit conditional_on attribute, this problem was averted. For this reason, I decided to leave the implementation of get_first_level_conditionals, but commented out the call I made to it on line 1873 of model.py, leaving lines 1874-1877 uncommented. So for the current commit, the conditional_on attribute is still used, but we can discuss how to solve the problem with comp_dists in get_first_level_conditionals.

@twiecki
Member

twiecki commented Sep 27, 2018

@lucianopaz Thanks for giving that a try. I would imagine that there is likely to be some way to get it to work, @ColCarroll mentioned he might have some ideas.

My bigger concern is with the DAG class, however. Did you take a look at the way the networkx PR handles this? I would imagine they have all the required functionality in there. Perhaps we could even get @ColCarroll's work to create the networkx graph and then just use that for the graph traversal.

Just to make sure my higher level point is understood: While I think this is a bug, it doesn't seem like a major one. To add a large amount of code that we will have to maintain for solving it (and nothing else) is very costly, so this should be evaluated thoroughly.

Finally, seems like your editor adds auto-line breaks to existing code which we don't require.

@lucianopaz
Contributor Author

I focused on solving the bugs and still have not taken a look at the networkx implementation. I'll go into that now.

@lucianopaz
Contributor Author

I took another go at removing the conditional_on attribute and replacing it completely with get_first_level_conditionals. I managed to pinpoint the origin of the strange error in the Mixture distribution, so now it's working using @ColCarroll's approach to compute the conditionals on the fly.

I also reduced the custom code in DependenceDAG by making it a subclass of networkx.DiGraph. I left it as a class because it's easier to encapsulate the add function, which has the important logic: walk down the theano ownership graph, check for conditional dependence, and add the edge with the proper attribute value. It also allows the implementation of get_sub_dag's custom operation, which makes draw_values's job easier. Finally, it has the get_nodes_in_depth_layers function, which returns the nodes in a list of lists according to their depth in the Bayesian network. In all, the implementation is now much more compact, and could be used with other networkx functions.
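The depth-layer grouping mentioned here (`get_nodes_in_depth_layers`) can be sketched without networkx as Kahn-style peeling over a plain adjacency dict. Everything below is illustrative, not the PR's actual code:

```python
# Sketch of grouping a DAG's nodes into depth layers: a node's layer is
# only assigned once all of its parents have been peeled off, so the
# layer index equals the node's depth in the Bayesian network.
def nodes_in_depth_layers(edges):
    """``edges`` maps each node to the set of its children (parent -> child)."""
    nodes = set(edges) | {c for cs in edges.values() for c in cs}
    indeg = {n: 0 for n in nodes}
    for cs in edges.values():
        for c in cs:
            indeg[c] += 1
    layers = []
    current = sorted(n for n in nodes if indeg[n] == 0)  # the roots: depth 0
    while current:
        layers.append(current)
        nxt = []
        for n in current:
            for c in edges.get(n, ()):
                indeg[c] -= 1
                if indeg[c] == 0:   # all parents already placed in a layer
                    nxt.append(c)
        current = sorted(nxt)
    return layers

# mu0 -> mu1 -> y ; sigma0 -> sigma1 -> y
edges = {'mu0': {'mu1'}, 'sigma0': {'sigma1'},
         'mu1': {'y'}, 'sigma1': {'y'}}
print(nodes_in_depth_layers(edges))
# [['mu0', 'sigma0'], ['mu1', 'sigma1'], ['y']]
```

A forward sampling pass can then visit the layers in order, guaranteeing every parent is drawn before its children.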

@lucianopaz
Contributor Author

I've been thinking about how to completely remove the DependenceDAG class and replace it with a pure DiGraph. I'll get it implemented on Monday.

@ColCarroll
Member

I think this would work with the class we already have in model_graph.py - adding networkx as a dependency seems heavy.

… is now stored in a networkx.DiGraph. Networkx is then used to compute subgraphs, and perform topological_sort on the graph during draw_values execution.
@lucianopaz
Contributor Author

@twiecki, @ColCarroll, I finished the implementation that completely eliminates the DependenceDAG class, and represents the DAG with a networkx.DiGraph. I agree with you, @ColCarroll, I think that networkx may be a heavy dependency.

However, I think that there are 5 things we would need to add to the ModelGraph class in order to use it for traversal in draw_values:

  1. draw_values can be called outside of a model's context. This should still be possible, so ModelGraph instances would have to add the functionality for not knowing the context's model while creating the graph.
  2. The compute graph in ModelGraph stores the ancestors of each variable in a dictionary. It would be convenient to also have the reversed representation (the children dictionary), that is, a dictionary with the nodes that need the value of the key's variable to be able to compute or sample from them. This would enable O(1) lookup of compute children from a given node, which is used in draw_values.
  3. In my opinion, aside from the networkx.DiGraph class, the main algorithm that we can profit from is topological_sort. This gives the nodes in the graph ordered according to their "depth" in the Bayesian network. It's also nice to have the is_directed_acyclic_graph function for sanity checks. If we want to stop using topological_sort, we absolutely need to extend the ModelGraph class to get each variable's depth in the hierarchy.
  4. Something that I think is important is knowing whether the variables are deterministically or conditionally linked. In my first commit I wrote that I think that deterministic relations should take precedence over conditional relations. What I mean is that if we have, for example, a = pm.HalfNormal('a', sd=b), then both a_log__ and b would be ancestors of a. If both a_log__ and b were given values in point, I would expect a == exp(a_log__), ignoring b's values completely. To enforce this, in draw_values I check the deterministic children and ancestors to see if we can compute a variable's value from its deterministic parents, and go straight to _compute_value instead of passing through the _draw_value logic. I may be mistaken and this could be unnecessary, but if I'm not, ModelGraph should discriminate between deterministic and conditional dependencies, which I tried to do with get_first_level_conditionals and walk_down_ownership.
  5. Finally, draw_values is called frequently, so I assumed we could profit from having a model's precomputed graph, and then getting subgraph copies out of it, like what I implemented in get_sub_dag. In my implementation, the nodes which were left outside of the original graph are then forcibly added into it, wrapped to be hashable if need be. These arbitrary nodes, which are not contained in model.var_names, and which may not even have names, can be troublesome to add into the existing ModelGraph.
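The precedence rule in point 4 can be shown with a toy sketch. The names (`a_log__`, `value_of_a`) and the lookup logic are illustrative only, not pymc3's API:

```python
# Toy illustration of deterministic precedence: if the transformed
# variable's value is present in ``point``, the untransformed variable
# is computed deterministically from it, and the value of its
# distribution parameter b is ignored entirely.
import math

def value_of_a(point):
    if 'a_log__' in point:
        # deterministic link takes precedence: a = exp(a_log__)
        return math.exp(point['a_log__'])
    # otherwise we would have to sample a | b instead (not shown)
    raise KeyError('a_log__ not in point')

point = {'a_log__': 0.0, 'b': 123.0}   # b plays no role here
print(value_of_a(point))  # 1.0 == exp(0.0)
```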

If you consider that we should discard the networkx dependency, we could try to merge the ModelGraph and DependenceDAG classes from commit 890ae74. The bad part is that I think the resulting class would have to live entirely in model.py, because the model should have its graph instance as an attribute.

pymc3/util.py Outdated
@@ -60,14 +67,16 @@ def is_transformed_name(name):
Returns
-------
bool
Boolean, whether the string could have been produced by `get_transormed_name`
Boolean, whether the string could have been produced by
Member

These shouldn't show up here.

@@ -1518,3 +1493,324 @@ def all_continuous(vars):
return False
else:
return True


class ConstantNodeException(Exception):
Member

@twiecki twiecki Oct 1, 2018

Maybe move these to a separate file, graph.py?

Contributor Author

not_shared_or_constant_variable needs model.FreeRV, model.MultiObservedRV and model.TransformedRV. If we move it away, we'll get into a circular import problem. That's why I left it in model.py.

@twiecki
Member

twiecki commented Oct 1, 2018

I like this implementation much better. If we require graph traversal logic, which seems very clear given the purpose of this package, networkx is a perfectly reasonable dependency. Moreover, it's well maintained and packaged, so I don't see many problems with that.

I don't know enough about ModelGraph, but could that too use this new networkx representation? Or more generally, are there other parts of the code base that could be simplified now that we have this new representation?

Finally, the line breaks add a lot of diff noise and are unnecessary under our current style rules; can you please revert those?

@twiecki twiecki mentioned this pull request Oct 27, 2018
@lucianopaz
Contributor Author

@twiecki, sorry I was unable to work on this last week. I'll try to get an implementation of @aseyboldt's suggestion to work. The main difficulty will be that we cannot require the model context to be known; as I pointed out in an earlier reply, almost all tests in the test suite call draw_values outside of a model's context. I'll try to find a way to circumvent the problem, but I still think that the cleanest and clearest solution is to build the DAG.

@twiecki
Member

twiecki commented Oct 31, 2018

@lucianopaz I think in that case we should rather change the tests to have a model in the context. Almost everything in pymc3 is centered around having access to the model. Not having thought about this part of the code base a lot before, I was surprised to learn that it wasn't required here. So for consistency it makes sense to require it here as well.

My main concern here is code complexity though. While a DAG sounds nice to have, if we don't need to have it, we shouldn't add it. Moreover, the actual building of the DAG is a fairly simple part of this PR, most complexity comes from traversing it in the right way. If that could be saved by offloading it to theano, that would be amazing.

So my suggestion is, if you want to, try @aseyboldt's approach without thinking about the tests. Finally, sorry if I'm coming across as very critical and difficult here; we greatly appreciate your work, but I think it is important to get this part correct, and it seems like there were some design problems from the start that we're bumping up against here. For our future sanity, it'd be best to try and fix those.

…, even in the presence of plates of the same shape.
@lucianopaz
Contributor Author

@twiecki, I understand. I'll start with @aseyboldt's approach now. I wanted to first push a small commit with a fix I had worked on before last week, to get ModelGraph to use networkx for more accurate plate detection.

@lucianopaz
Contributor Author

@twiecki, @aseyboldt, I've started to work on storing the order in which each variable was defined while constructing the model, and I ran into a problem early on that I wanted to ask your opinion on. My idea was to store a list called var_construction_order inside each model, and append each var to it inside add_random_variable. If we want to unpickle a model that was created before the addition of var_construction_order, how should we set its value? Should we get the DAG, and then infer the creation order?

@twiecki
Member

twiecki commented Oct 31, 2018

@lucianopaz model.vars should already have the right order.

@twiecki
Member

twiecki commented Nov 1, 2018

To clarify, my naive idea of what is required at minimum (pseudo-code):

curr_point = model.test_point
for var in model.vars:
    sample_func = create_sample_func_with_givens(var, givens=curr_point)
    curr_point[var] = sample_func()

Not sure how much I'm missing there but theano should handle all the rest correctly.

@lucianopaz
Contributor Author

I understand. My concern is that the deterministics, for example the TransformedRVs, won't be in model.vars, and their order relative to other FreeRVs will be lost. I'll keep the var_construction_order list for now, and try to initialize it approximately right, for old stored models, in __setstate__ using model.vars.

@lucianopaz
Contributor Author

lucianopaz commented Nov 1, 2018

Ran into another hurdle. What I'm attempting to do inside draw_values is the following (as pseudo-code):

queue = list(params that are numbers, numpy arrays, TensorConstant or SharedVariables)
if len(params) > len(queue):
    queue.extend(model.vars + remaining params)

for var in queue:
    value = _draw_value(var, ...)
    # update point and givens with value
    output[var] = value

The issue with this is that I'm running into a recursion limit, because sometimes I'm trying to draw_values from an RV that should call random. This then calls draw_values again, which resets the queue with the entire list of model.vars, and so on. The solution would be to set the queue only with the subset of model.vars that were defined "before" the conflicting RV. This could be done more or less easily if the supplied parameter was itself an RV, but if it's a deterministic, a potential, or just a theano expression, we would need to traverse their ownership graphs to get the relevant subset of model.vars. The thing is that this is really similar to what is already being done with the DAG logic that is in place.

Furthermore, I'm having trouble thinking of how to make _draw_value correctly determine that a TransformedRV should evaluate its theano expression instead of calling its random method in the case the transformed variable was given a value.

Do you have any thought on these problems?

@junpenglao
Member

I understand. My concern is that the deterministics, for example the TransformedRVs, won't be in model.vars, and their order relative to other FreeRVs will be lost.

They are in model.named_vars

@twiecki
Member

twiecki commented Nov 1, 2018

@lucianopaz Does that recursion have any purpose in this new scenario? Maybe we can just drop it.

@lucianopaz
Contributor Author

@junpenglao, now I realize that my post was not clear. When an RV is added to the model, it's appended to one or more of the lists free_rvs, observed_rvs, deterministics, potentials, etc. The absolute order between members of free_rvs and deterministics is lost. named_vars inherits from dict and doesn't preserve the order of key insertion, at least in Python 2. That's why I was planning on still having a dedicated list to store the vars in order.

@lucianopaz
Contributor Author

@lucianopaz Does that recursion have any purpose in this new scenario? Maybe we can just drop it.

I understand that the recursion comes from how the distribution's random method is structured. Imagine that we just want to draw from a single RV in the model. The draw_values call could trigger a call to random, and the general structure of every distribution's random method is:

  1. Call draw_values to get the distribution's parameter values.
  2. Generate the actual samples from the distributions.

If the distribution's parameters are themselves RVs that come from another distribution, draw_values could trigger a separate call to random, which will have the same structure. I think that we cannot avoid entering the recursion. The key thing is to make draw_values smart enough not to attempt to draw from all the model's RVs every time. In my opinion, that means trimming the queue of model.vars.
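The mutual recursion described in those two steps can be sketched with toy classes (everything below is invented for illustration; none of it is pymc3's actual code). The lookup in `point` is what keeps the recursion finite:

```python
# Toy sketch of the draw_values <-> random mutual recursion: each RV's
# ``random`` first calls ``draw_values`` on its parameters, which may
# call ``random`` on parent RVs in turn. A value found in ``point`` is
# returned directly, short-circuiting the recursion.
import random as _rng

class ToyRV:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent        # a ToyRV whose draw is this RV's mean

    def random(self, point):
        # step 1: get parameter values (may recurse into parent.random)
        (mu,) = draw_values([self.parent], point)
        # step 2: generate the actual sample
        return mu + _rng.gauss(0, 1)

def draw_values(params, point):
    out = []
    for p in params:
        if p is None:
            out.append(0.0)                 # root RV: fixed mean
        elif p.name in point:
            out.append(point[p.name])       # short-circuit: no recursion
        else:
            value = p.random(point)         # recurse one level up the graph
            point[p.name] = value           # remember the draw for later params
            out.append(value)
    return out

mu0 = ToyRV('mu0')
mu1 = ToyRV('mu1', parent=mu0)
y = ToyRV('y', parent=mu1)
point = {'mu1': 10.0}
draw = draw_values([y], point)[0]  # uses mu1 = 10.0; mu0 is never sampled
```

Drawing `y` here stops at `mu1` because its value is already in `point`, which is the trimming behavior the comment argues for.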

@lucianopaz
Contributor Author

lucianopaz commented Nov 1, 2018

While trying to fix the tests to make them all call draw_values from inside a model's context, I came across a condition that made me wonder if we can actually force a model on top of draw_values.

One can create a distribution instance outside of a model's context by doing something like d = Distribution.dist(). If we now force draw_values to be called from inside a model's context, then d.random() will no longer work outside a context. I'm confused about this, because it looks like we're partially breaking Distribution.dist by forcing a model on top of it. Can we really force draw_values to always be inside a model's context?

@twiecki
Member

twiecki commented Nov 1, 2018

@lucianopaz Should draw_values() even do anything if the params exist in point? It seems like in that case it should just return whatever value is in point, as that's what we want to fix it to. In that case, we would never go more than one level deep into the recursion, as every point will already be present.

@aseyboldt
Member

aseyboldt commented Nov 1, 2018 via email

@lucianopaz
Contributor Author

@twiecki, I agree that it shouldn't call random again when the params are already in point. I think that _draw_value already does that, because point[param.name] is returned with precedence over param.random. I'm still tracing the precise origin of the recursion, but I'm sure it has to do with the re-appearance of variables in draw_values because of the loop over all the model.vars. That's why I think that the main problem is that we cannot use the list of all the model.vars in creation order each time we go into draw_values. We have to get the subset of relevant model.vars involved in each call to draw_values.

@aseyboldt, I think it will be very hard to stay backwards compatible while splitting draw_values into two separate functions that work as you described. The main problem is that a distribution can take anything as a parameter: a number, an array, a theano expression, or another RV or distribution instance. The current implementation of draw_values in master very elegantly deals with any of these cases, and naturally traverses the probability graph backwards because of the nested calls to param.random and the chosen structure of the distributions' random methods. I think that draw_values should stay as flexible as it currently is.

Again, I'm inclined to explicitly exploit the DAG, as it currently stands in this PR. The current working idea is that when we enter draw_values, the graph of ancestors of the supplied params is constructed. This is either done by copying a subgraph from the model's precomputed dependence_dag, or by retracing the entire DAG (this is done by get_sub_dag). The forward pass through the DAG is guaranteed by the topological_sort, and thanks to networkx.subgraph we don't get into any recursion in the context of a model. It handles out-of-context calls to draw_values, and it at least tries to minimize the number of draws it makes with the forward pass. I'm sure that the forward pass could be improved later on, to only sample from the branches in the DAG that are strictly necessary, whereas now it samples every node in the subgraph of ancestors of params, even if their values end up being ignored.

@twiecki
Member

twiecki commented Nov 2, 2018

@lucianopaz I think it's worth it to still find out why the recursion occurs, it might uncover a bug. If we get to the intended behavior and there will never be recursion, the method using givens should work fine, no?

@lucianopaz
Contributor Author

I've narrowed the origin of the recursion down to the following minimal example:

import pymc3 as pm
from pymc3.distributions.distribution import draw_values

with pm.Model():
    mu0 = pm.Normal('mu0', mu=0., tau=1e-3)
    sigma0 = pm.Gamma('sigma0', alpha=1., beta=1., transform=None)
    mu1 = pm.Normal('mu1', mu=mu0, tau=1e-3)
    sigma1 = pm.Gamma('sigma1', mu=sigma0, sd=1., transform=None)
    y = pm.Normal('y', mu=mu1, sd=sigma1)
    # Call 0
    mu1_draw = draw_values([mu1])
    # Call 1
    sigma1_draw = draw_values([sigma1])
    # Call 2
    y_draw = draw_values([y], point={'sigma1': 1})

The draw_values I'm working with now looks like this:

def draw_values(params, point=None, size=None, model=None):
    if point is None:
        point = {}
    model = modelcontext(model)
    queue = []
    last = []
    drawn_value_map = {}
    counter = 0
    for i, p in enumerate(params):
        p_name = getattr(p, 'name', None)
        # First put the things we can get a value from directly without
        # evaluating a theano function or calling a random method 
        if isinstance(p, (numbers.Number,
                          np.ndarray,
                          theano.tensor.TensorConstant,
                          theano.tensor.sharedvar.SharedVariable)):
            drawn_value_map[i] = counter
            counter += 1
            queue.append(p)
        elif p_name is not None and p_name in point:
            drawn_value_map[i] = counter
            counter += 1
            queue.append(p)
        else:
            last.append((i, p))
    # If params contained model rvs, these should go according
    # to their creation order.
    # If params contained other things, these should go at the end
    if last:
        # var_construction_order is a list with all model rvs, deterministics, etc
        # placed in the order in which they were added to the model
        # We only add the variables that are not already in the queue
        queue.extend([v for v in model.var_construction_order
                      if v not in queue])
        counter = len(queue)
        for i, p in last:
            try:
                # If the param is already in the queue, get its index
                ind = queue.index(p)
            except Exception:
                ind = None
            # If the param is not in the queue, it is added at the end
            if ind is None:
                ind = counter
                counter += 1
                queue.append(p)
            drawn_value_map[i] = ind

    # Init drawn values and updatable point and givens
    drawn = {}
    givens = []
    nodes_missing_inputs = {}
    for ind, param in enumerate(queue):
        try:
            value = _draw_value(param, point=point, givens=givens, size=size)
        except theano.gof.fg.MissingInputError as e:
            # This deals with eventual auto transformed rvs that miss their input
            # value
            nodes_missing_inputs[ind] = e
            continue
        drawn[ind] = value
        givens.append((param, value))
        param_name = getattr(param, 'name', None)
        if param_name is not None:
            point[param_name] = value

    output = []
    # Get the output in the correct order
    for ind, param in enumerate(params):
        value = drawn.get(drawn_value_map[ind], None)
        if value is None:
            if ind in nodes_missing_inputs:
                raise nodes_missing_inputs[ind]
            else:
                raise RuntimeError('Failed to draw the value from parameter '
                                   '{}.'.format(param))
        output.append(value)
    return output

Basically, draw_values does hardly any special handling when initializing the queue. It only says: if it can get the value of a param's element without fancy theano.function or random calls, that element goes first in the queue. If anything else remains in params, it puts all the model variables in creation order next, and then finally the rest of params, which are mostly theano expressions. There is no conditional dependence sorting, no graph exploration; it just hopes that the creation order will work. The list that holds the variables in creation order is model.var_construction_order, and for the example I wrote at the beginning it looks like this:

model.var_construction_order = [mu0, sigma0, mu1, sigma1, y]

The example's call 0 works, but calls 1 and 2 enter the infinite recursion loop and die.

The reason that call 1 dies is that sigma1's distribution parameters are theano expressions, so they are added at the end of the queue, even after sigma1 itself. So the queue looks like:

queue = [some_constants, mu0, sigma0, mu1, sigma1, y, Elemwise{mul,no_inplace}.0, Elemwise{mul,no_inplace}.0]

This makes draw_values always fall into the loop of drawing the distribution parameters for sigma1.

The reason call 2 dies is that, for some reason, Normal.random calls draw_values([mu, tau, sd], ...). sd is sigma1, and available in point, but tau is an expression, which is added after the model variables in creation order, so we fall into the loop of trying to draw the values of y's distribution forever.

These problems point to the fact that the queue cannot just be initialized with all the RVs in creation order, because some theano expressions NEED to be placed before certain RVs. To rearrange the queue correctly, we would need to explore the theano expression graphs and place them right after their ancestors appear in the list of model variables. It's like doing the topological sort ourselves. Furthermore, the actual order of variable creation is lost in unpickled old models, so we could break saved data. I'm ever more convinced that the DAG is the simplest and most robust solution to correctly handle draw_values.
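The "doing the topological sort ourselves" point can be sketched for the minimal example's dependency graph. The `expr_*` names below stand in for the unnamed Elemwise expressions, the graph is hand-written, and the code is a generic Kahn sort, not this PR's implementation:

```python
# Kahn-style topological sort over a parent-map: it places the unnamed
# expressions feeding sigma1 and y *before* those variables, which the
# plain creation-order queue cannot do.
from collections import deque

def topological_sort(parents):
    """``parents`` maps each node to the list of nodes it depends on."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    indeg = {n: len(parents.get(n, [])) for n in nodes}
    children = {n: [] for n in nodes}
    for n, ps in parents.items():
        for p in ps:
            children[p].append(n)
    queue = deque(sorted(n for n in nodes if indeg[n] == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for c in sorted(children[n]):
            indeg[c] -= 1
            if indeg[c] == 0:       # all of c's inputs are now available
                queue.append(c)
    return order

parents = {
    'mu1': ['mu0'],
    'expr_sigma1': ['sigma0'],      # unnamed expressions parameterizing sigma1
    'sigma1': ['expr_sigma1'],
    'expr_y': ['mu1', 'sigma1'],    # unnamed expressions parameterizing y
    'y': ['expr_y'],
}
order = topological_sort(parents)
assert order.index('expr_sigma1') < order.index('sigma1')
assert order.index('expr_y') < order.index('y')
print(order)
```

Any order produced this way is a valid queue: every expression is drawn or computed only after its ancestors.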

@twiecki
Member

twiecki commented Nov 11, 2018

@lucianopaz what's your email?

@lucianopaz
Contributor Author

@twiecki, my work email is [email protected]. You can send me a message there and then, if you prefer, I can send you my personal gmail account.

lucianopaz added a commit to lucianopaz/pymc that referenced this pull request Nov 27, 2018
…n PR pymc-devs#3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariables` or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling. Ed2 tensors fix a `_values` attribute after they first call `sample` and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object-oriented paradigm cannot set a fixed `_values`; it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.
@junpenglao
Member

Close in favor of #3273

@junpenglao junpenglao closed this Nov 27, 2018
junpenglao pushed a commit that referenced this pull request Dec 3, 2018
* Fix for #3225. Made Triangular `c` attribute be handled consistently with scipy.stats. Added test and updated example code.

* Fix for #3210 which uses a completely different approach than PR #3214. It uses a context manager inside `draw_values` that makes all the values drawn from `TensorVariables` or `MultiObservedRV`s available to nested calls of the original call to `draw_values`. It is partly inspired by how Edward2 approaches the problem of forward sampling. Ed2 tensors fix a `_values` attribute after they first call `sample` and then only return that. They can do it because of their functional scheme, where the entire graph is recreated each time the generative function is called. Our object-oriented paradigm cannot set a fixed `_values`; it has to know it is in the context of a single `draw_values` call. That is why I opted for context managers to store the drawn values.

* Removed leftover print statement

* Added release notes and draw values context managers to mixture and multivariate distributions that make many calls to draw_values or other distributions random methods within their own random.
6 participants