Add calcExpectation function - when we figure out Distributions #720

mnwhite · 2020-06-19T15:48:02Z

Added a function to HARK.distribution that automates taking expectations over a function. Basic syntax:

calcExpectation(func,values,dstn)

values can be a list of 1D arrays or ND arrays. If the former, the function generates a tensor grid over them. func must take the deterministic values in order as its first M arguments, followed by the stochastic arguments in dstn as its next N arguments; user will likely need to make a wrapper function.

ConsAggShockModel.py has a new solver called solveConsAggShockNEW that replicates the old version but uses calcExpectation, as a demonstration of what the code would look like using this new feature. Solves about 3% slower due to not knowing how to save on a bit of arithmetic (because of automation).

Wrote first version of non-symbol expectation-evaluating function, completely untested. Also fixed a couple typos in distribution.py.

Still need to test alternate input versions.

calcExpectations now tested on more advanced cases. Fixed dumb typo (copy-pasted from other line) and changed colons to ellipses in axis-construction step. Also wrote new version of ConsAggShock solver that uses calcExpectation, as a demonstration of what the code would look like.

sbenthall · 2020-06-22T14:36:15Z

Hi @mnwhite . This is exciting work.

A few comments. I expect much of this you know already as this is just a first draft, but I'm making complete notes.

Vectorized func

I'm interested in how the func argument of calcExpectation explicitly operates over arrays. I have learned from looking at Dolo code that this is called a vectorized function. There is numpy support for vectorization. In the EconForge stack there is a numba optimized vectorization function.

Given that vectorization of a function is an easy add, I wonder if calcExpectation should take func in a non-vectorized way, and vectorize it as part of the expectation taking process.

Tensor operations on the values?

Since there is only one use case modeled in this PR, I'm having a hard time seeing why there is so much flexibility built into the values argument. I'm having a hard time following the rationale from the comment alone.

It would simplify the calcExpectation method if it only took values in a standardized way.

The tensor manipulations used to reshape values arguments into an appropriate form can be functionalized out into a separate method. That would allow the reader to focus on the expectation calculation.

Looking to see if there is any lower level numpy or scipy support for tensor products, I found this numpy function. I wonder if it's relevant to this code.

Argument order

If I'm not mistaken, the values argument should be strictly speaking optional, since taking the expectation of a function over a distribution does not require that there's non-stochastic arguments. So it would make sense for this argument to be last.

I know @llorracc is keen on matching either the Mathematica or Sympy syntaxes ultimately, and so for the sake of discussion I'll link to those here:

https://reference.wolfram.com/language/ref/Expectation.html

https://docs.sympy.org/latest/modules/stats.html?highlight=expectation#sympy.stats.Expectation

Neither of these methods supports non-stochastic values and it's a bit awkward including them, in my view, though I can see how this is motivated by compatibility with the current time-varying parameter architecture (I think?--I'm honestly not sure I'm following the design.)

But I wanted to note a different way of doing this, which would be to restrict the Expectation-taking method to operate only on distributions, but have the Distribution class have methods to modify a distribution by giving it additional dimensions. Alternatively, the non-stochastic values could be curried into the function/expression before it is passed into the Expectation taking code. That way the expectation-taking code could be more faithful to the mathematical operation.

Performance

In my opinion, a 3% performance hit is not bad. It indicates that there is a small constant factor time increase, but that is negligible in computational terms. There's a famous Donald Knuth quote, "Premature optimization is the root of all evil." By concentrating this operation into one function, it will save in developer time, which is far more expensive than computational time, down the road. It will also make it easier to optimize this function, so e.g. Numba-based optimizations will ripple over the whole library quickly.

Finalizing the PR

I think that before this gets merged in, it should be made as a replacement to solveConsAggShock and not an additional solveConsAggShockNEW

The new method should also be worked into the other solvers where expectation-taking is done.

Functionality like this should have an automated unit test. That would also help document the functionality.

mnwhite · 2020-06-22T15:51:21Z

How `func` deals with vectorization seems like an issue for `func`, not `calcExpectation`. If there's a way to somehow say "this function might not be written with vectorized code, so do that automatically now", then we can do that, but I'm not familiar with it. All of the HARK code I've written uses numpy array operations; there are essentially no loops over values except where mathematically necessary or numerically insignificant. In a few places I wrote some very small steps using a loop to fill in an array because it's much easier to read-- I could have done it with array operations, but it would speed up the code by 0.01% and make it mindbending to read. I might be misinterpreting what you mean by "vectorized code", however, so I'm not sure I understand this comment, particularly "Given that vectorization of a function is an easy add, I wonder if calcExpectation should take func in a non-vectorized way, and vectorize it as part of the expectation taking process." There might be a built in numpy function to create a tensor grid, and we can use it if we identify the right one. np.ufunc.outer works on numpy universal functions, not just a generic function. But I'm not going to remove functionality/flexibility just because I didn't (yet) include an example of it being used. In fact, the pastebin "toddler" file I posted uses the input syntax that isn't used by solveConsAggShockNEW. The point of this function is to evaluate mathematical expressions of the form $\int g(x,y) dF(y)$; g is `func`, x is `values`, and F(y) is the CDF of `dstn`. The user wants this integral/expectation computed at many different values of x, and there are two clear uses that come up in HARK when using the endogenous grid method. In some models, the relevant end-of-period states at which we want to compute future realizations of (marginal) value are simply a rectilinear grid / tensor set of independent grids of each dimension; calcExpectation saves the user the time (and the reader the mental energy) of doing the tiling/reshaping for them. In other models, like ConsAggShock, the relevant end-of-period states *don't* line up in a simple tensor grid, because the bounds of the state space are curved. In this case, the user can construct the arrays of values they're interested in. If there aren't any non-stochastic arguments, then taking the expectation of a function over a DiscreteDistribution instance is simply `np.dot(func(*my_dstn.X), my_dstn.pmf)`, that's it. I don't think we should rewrite the function and the expected format of `func` so that it's "prettier" when a user wants to do a trivial case. I'm editing the function so it will work as is if they just pass an empty list as `values`. I think this basically does follow the syntax of Mathematica, except that our expression is array-valued. I don't think we should try to use sympy notation when we have literally no symbolic math in HARK, and I have an aversion to "trying to make it like Mathematica" for personal history reasons. Including non-stochastic values in the function isn't because of the time-varying structure. It's because the user wants to compute the expectation at multiple function values simultaneously without having to write a new `expr` object every time, as they do in Mathematica. I think mma can probably do multiple inputs simultaneously using Table, but that's *literally* a loop operation. I thought the code in solveConsAggShockNEW would make it clear how the function works. I don't actually understand your suggested alternate way of programming this kind of function, so I can't evaluate whether/how it would be "more faithful to the mathematical operation". Yeah, I made a new function so that it's easy to both compare performance of the old and new functions and to read and compare the code. To me, the NEW version is very, very clean: the functions calcAggObjects and vPnextFunc define how we get from today's end-of-period state to next period's state, and then line 791 takes the expectation of end-of-period marginal value of assets at arrays of relevant states. I can write unit tests.

…

On Mon, Jun 22, 2020 at 10:36 AM Sebastian Benthall < ***@***.***> wrote: Hi @mnwhite <https://github.com/mnwhite> . This is exciting work. A few comments. I expect much of this you know already as this is just a first draft, but I'm making complete notes. *Vectorized func* I'm interested in how the func argument of calcExpectation explicitly operates over arrays. I have learned from looking at Dolo code that this is called a vectorized function. There is numpy support for vectorization <https://numpy.org/doc/1.18/reference/generated/numpy.vectorize.html>. In the EconForge stack there is a numba optimized vectorization function. Given that vectorization of a function is an easy add, I wonder if calcExpectation should take func in a non-vectorized way, and vectorize it as part of the expectation taking process. *Tensor operations on the values?* Since there is only one use case modeled in this PR, I'm having a hard time seeing why there is so much flexibility built into the values argument. I'm having a hard time following the rationale from the comment alone. It would simplify the calcExpectation method if it only took values in a standardized way. The tensor manipulations used to reshape values arguments into an appropriate form can be functionalized out into a separate method. That would allow the reader to focus on the expectation calculation. Looking to see if there is any lower level numpy or scipy support for tensor products, I found this numpy function <https://numpy.org/doc/stable/reference/generated/numpy.ufunc.outer.html#numpy.ufunc.outer>. I wonder if it's relevant to this code. *Argument order* If I'm not mistaken, the values argument should be strictly speaking optional, since taking the expectation of a function over a distribution does not require that there's non-stochastic arguments. So it would make sense for this argument to be last. I know @llorracc <https://github.com/llorracc> is keen on matching either the Mathematica or Sympy syntaxes ultimately, and so for the sake of discussion I'll link to those here: https://reference.wolfram.com/language/ref/Expectation.html https://docs.sympy.org/latest/modules/stats.html?highlight=expectation#sympy.stats.Expectation Neither of these methods supports non-stochastic values and it's a bit awkward including them, in my view, though I can see how this is motivated by compatibility with the current time-varying parameter architecture (I think?--I'm honestly not sure I'm following the design.) But I wanted to note a different way of doing this, which would be to restrict the Expectation-taking method to operate only on distributions, but have the Distribution class have methods to modify a distribution by giving it additional dimensions. Alternatively, the non-stochastic values could be curried into the function/expression before it is passed into the Expectation taking code. That way the expectation-taking code could be more faithful to the mathematical operation. *Performance* In my opinion, a 3% performance hit is not bad. It indicates that there is a small constant factor time increase, but that is negligible in computational terms. There's a famous Donald Knuth quote, "Premature optimization is the root of all evil." By concentrating this operation into one function, it will save in developer time, which is far more expensive than computational time, down the road. It will also make it easier to optimize this function, so e.g. Numba-based optimizations will ripple over the whole library quickly. *Finalizing the PR* I think that before this gets merged in, it should be made as a replacement to solveConsAggShock and not an additional solveConsAggShockNEW The new method should also be worked into the other solvers where expectation-taking is done. Functionality like this should have an automated unit test. That would also help document the functionality. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFIVKKQJJZAABZBIWT3RX5T65ANCNFSM4OC24YZA> .

sbenthall · 2020-06-22T16:18:58Z

RE: Vectorization.

If I'm not mistaken, where HARK uses vectorized functions elsewhere, it is creating them as instantiations of particular classes. Like here:
https://github.com/econ-ark/HARK/blob/master/HARK/interpolation.py#L61

It would be useful if all vectorized functions in HARK were subclasses of some common class (something more specific than HARKObject). Then testing to see if a function was vectorized or not could be done with a type check.

I'm not sure how the transition to Dolo's interpolation code effects this, but I expect it makes it easier. I think a little more thought was put into the type architecture there.

Another possibility is that HARK's vectorized functions could be written as numpy ufuncs. This seems to accomplish something similar.

Re: tensor functionality. A complete unit test suite would have a test for both control flows/use cases for the method. It would thereby provide examples of how the flexibility plays out in practice.

I strongly disagree with you about code style on one point. As a rule, I think that a method that represents a general mathematical expression should always work for the trivial case. That is in part what allows it to be perfectly general. It would make it easier to translate from math to code and back. Optional method arguments are easy to implement in Python, are very Pythonic, etc. No reason not to have them.

If the point of the method is to take the integral over some values, as opposed to more generically computing the expectation over a distribution, then maybe it would be worth wrapping the step where the integration is taken explicitly as a separate step. Maybe my calculus to rusty here though. I'm sure I'll have a clearer idea of what's going on when I see the unit tests.

mnwhite · 2020-06-22T16:23:43Z

I'm using integral and expectation interchangeably here, with "integral" being shorthand for "definite integral over the domain of a distribution". The expectations-calculating step is just a single line using np.dot; everything else is just setting the arrays up for a dot product.

…

On Mon, Jun 22, 2020 at 12:19 PM Sebastian Benthall < ***@***.***> wrote: RE: Vectorization. If I'm not mistaken, where HARK uses vectorized functions elsewhere, it is creating them as instantiations of particular classes. Like here: https://github.com/econ-ark/HARK/blob/master/HARK/interpolation.py#L61 It would be useful if all vectorized functions in HARK were subclasses of some common class (something more specific than HARKObject). Then testing to see if a function was vectorized or not could be done with a type check. I'm not sure how the transition to Dolo's interpolation code effects this, but I expect it makes it easier. I think a little more thought was put into the type architecture there. Another possibility is that HARK's vectorized functions could be written as numpy ufuncs <https://numpy.org/doc/stable/user/c-info.ufunc-tutorial.html>. This seems to accomplish something similar. Re: tensor functionality. A complete unit test suite would have a test for both control flows/use cases for the method. It would thereby provide examples of how the flexibility plays out in practice. I strongly disagree with you about code style on one point. As a rule, I think that a method that represents a general mathematical expression should always work for the trivial case. That is in part what allows it to be perfectly general. It would make it easier to translate from math to code and back. Optional method arguments are easy to implement in Python, are very Pythonic, etc. No reason not to have them. If the point of the method is to take the integral over some values, as opposed to more generically computing the expectation over a distribution, then maybe it would be worth wrapping the step where the integration is taken explicitly as a separate step. Maybe my calculus to rusty here though. I'm sure I'll have a clearer idea of what's going on when I see the unit tests. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFL2O5JJ2JCL5MVMAYDRX6AABANCNFSM4OC24YZA> .

llorracc · 2020-06-22T19:05:45Z

I know @llorracc is keen on matching either the Mathematica or Sympy syntaxes ultimately,

It's not so much the syntaxes that I want to match, it's the conceptual structure of how these operations are conceived and expressed. (I'd add Dolo to the list here).

The Mma, SymPy, and Dolo people are very smart, and have put a LOT of thought into creating maximally flexible and useful ways of doing things. All three (and I think STAN, too; probably other languages as well) have ended up in the same place:

The thing whose expectation is being taken is written as a function with the language's usual syntax, but where the syntax identifies certain variables explicitly as being stochastic draws from a distribution.
Each stochastic variable is required to be a ProbabilityDistribution object (to use Mma's syntax), where the nature of the ProbabilityDistribution determines how the expectation is numerically computed.
ProbabilityDistribution-type computational objects are assumed (and required) to satisfy the mathematical properties associated with a probability distribution:
- There is some explicitly configured interval (or set) over which they are defined
- The probabilities of all of the separate outcomes sum to 1
- Any other aspects of the distribution that are required to define it, must have values. (Like, a NormalDistribution object must have a mean and a variance).

I don't think we should try to use sympy notation when we have literally no symbolic math in HARK

On the contrary, that is a powerful reason TO adopt sympy (or, probably, Dolo) notation: There's no existing code that will need to be rewritten if we adopt this standard going forward.

I have an aversion to "trying to make it like Mathematica"

The goal is not to try to "make it like Mathematica" in any sense other than understanding the mathematical/computational reasons Mathematica (and SymPy and Dolo (and STAN)) have done things the way that they have. Those are likely to be very good reasons, some of which we might not think of ourselves if invent our own ways of doing things. Indeed, the painful history of the first 20 years of Mathematica, in which it seemed that every new release broke all my code, testifies to the extent to which it is hard even for very smart people to make design decisions that will turn out to be adequate for purposes you do not envision when you originally start.

Dolo has one additional feature that I like a lot and think we should adopt:

The distribution can be specified abstractly (say, as a NormalDistribution with mean 0 and variance 1), separately from its being discretized. After being discretized, the discretization constitutes another attribute (or set of attributes) of the object.
- I have proposed a modest modification to the way Dolo does things, which is that the discretized object should keep track, inside itself, of the way it was made. Like, it should know that it is a 7 point equiprobable discretization of a censored normal distribution with censoring points at 2 standard deviations above and below the mean. That way, we can pass the discretizations created in one place to be used somewhere else, without losing track of how the object was created.

SB: "I strongly disagree with you about code style on one point. As a rule, I think that a method that represents a general mathematical expression should always work for the trivial case."

Matt was not saying that his method would not work for the trivial case; he was saying that, stylistically, we should not require users to use the "calcExpectation" tool when the expectation can be directly computed by an expression like np.dot(func(*my_dstn.X), my_dstn.pmf).

Of course we cannot stop users from doing something like that. But I think it should be explicitly deprecated, in the interest of code readability and easy mapping between mathematical and computational representation. When you see the word 'Expectation' or 'calcExpectation' or whatever, you do not need to stare further at the objects and examine closely whether the operation being performed is actually an expectation or whether it is some other kind of matrix or vector calculation.

sbenthall · 2020-06-22T19:23:14Z

@llorracc I think I see what you mean now about wanting to go more in the direction of Mma, SymPy, and Dolo. I think much of what you're asking for is out of scope for the current PR, as it involves more deep work in the Distribution code. But I think it's definitely good that it's being brought up here and I think we should continue to chart this sort of thing out as we converge on a 1.0 API.

Maybe these ideas should be split into separate issues.

The distribution can be specified abstractly (say, as a NormalDistribution with mean 0 and variance 1), separately from its being discretized. After being discretized, the discretization constitutes another attribute (or set of attributes) of the object.

This is already in HARK: https://github.com/econ-ark/HARK/blob/master/HARK/distribution.py#L10

`

he was saying that, stylistically, we should not require users to use the "calcExpectation" tool when the expectation can be directly computed by an expression like np.dot(func(*my_dstn.X), my_dstn.pmf).

Of course we cannot stop users from doing something like that. But I think it should be explicitly deprecated, in the interest of code readability and easy mapping between mathematical and computational representation.

I'm not following.

In my view, the best design would be for this to not be a standalone function at all.
Rather, it would be a method on the Distribution class.

So you could do something like:

>>> X = Normal(mu = 1.0, sigma = 1.0)
>>> X.expectation()
1
>>> X.approx(10).expectation(func = lambda x : x * 2)
2

This would be closer to the math:

$E_X[x^2]$

Nobody requires mathematicians to write out $\sum_{x \in X} x P(x)$ every time they want to represent the expected value of a variable.

mnwhite · 2020-06-22T19:33:52Z

To me, the expectation operator is being represented by my `calcExpectation` function. We *can* put something like calcExpectation as a method of DiscreteDistribution, but that's not how I think about the math. I read a statement like `calcExpectation(func,values,dstn)` as being pretty intuitive: you're calculating the expectation of a function at some values subject to a distribution. The docstring is explicit about the order of inputs for the function.

…

On Mon, Jun 22, 2020 at 3:23 PM Sebastian Benthall ***@***.***> wrote: @llorracc <https://github.com/llorracc> I think I see what you mean now about wanting to go more in the direction of Mma, SymPy, and Dolo. I think much of what you're asking for is out of scope for the current PR, as it involves more deep work in the Distribution code. But I think it's definitely good that it's being brought up here and I think we should continue to chart this sort of thing out as we converge on a 1.0 API. Maybe these ideas should be split into separate issues. 1. The distribution can be specified abstractly (say, as a NormalDistribution with mean 0 and variance 1), separately from its being discretized. After being discretized, the discretization constitutes another attribute (or set of attributes) of the object. This is already in HARK: https://github.com/econ-ark/HARK/blob/master/HARK/distribution.py#L10 ` he was saying that, stylistically, we should not require users to use the "calcExpectation" tool when the expectation can be directly computed by an expression like np.dot(func(*my_dstn.X), my_dstn.pmf). Of course we cannot stop users from doing something like that. But I think it should be explicitly deprecated, in the interest of code readability and easy mapping between mathematical and computational representation. I'm not following. In my view, the best design would be for this to not be a standalone function at all. Rather, it would be a method on the Distribution class. So you could do something like: >>> X = Normal(mu = 1.0, sigma = 1.0) >>> X.expectation() 1 >>> X.approx(10).expectation(func = lambda x : x * 2) 2 This would be closer to the math: $E_X[x^2]$ Nobody requires mathematicians to write out $\sum_{x \in X} x P(x)$ every time they want to represent the expected value of a variable. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFOFXLCL3HP2L2LBBI3RX6VS7ANCNFSM4OC24YZA> .

sbenthall · 2020-06-22T19:42:51Z

Ok, but why not order the arguments in descending order of necessity, and make the unnecessary ones optional?

mnwhite · 2020-06-22T19:50:44Z

Because this input is only "optional" if you want to do the trivial case that takes one very easy line and doesn't require a separate function call. And it's not a huge burden for the user who really wants to do this strange trivial case using calcExpectation rather than just writing the expectation inline with np.dot to simply add the argument []. And because the argument order is currently aligned the way you would read it in math: $\int g(x,y) dF(y)$ or $E[ g(x,y) | y ~ F(y)]$. If it's important to you, I'll change the argument order. But it feels unnatural and backward.

…

On Mon, Jun 22, 2020 at 3:43 PM Sebastian Benthall ***@***.***> wrote: Ok, but why not order the arguments in descending order of necessity, and make the unnecessary ones optional? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFN2UC2YKWESOIKSZ4TRX6X4RANCNFSM4OC24YZA> .

calcExpectation now correctly handles passing a null argument (empty list) for values, representing a function with no deterministic arguments.

sbenthall · 2020-06-22T20:57:28Z

Yes, it is important to me. Thanks Matt.

I think we should enable users to use our code like they would use math as much as possible (allowing, for example, $E[X]$ and $E[g(X)]$ as well as whatever is going on with the non-stochastic values), but I think it's also, in Python code, important to use Python conventions for letting users leave options out when there is a degenerate case that's a reasonable default. In this case, that would be that func is the identity function, and there are no non-stochastic values in the express.

sbenthall · 2020-06-22T21:02:10Z

Actually, I see one wrinkle in what I just said which might be important.

$E[X]$ isn't defined when X is a multivariate function?

sbenthall · 2020-06-22T21:06:19Z

Ah, no, it's well defined as the vector of expected values: https://en.wikipedia.org/wiki/Multivariate_random_variable#Expected_value

mnwhite · 2020-06-22T21:06:57Z

Wait wait, now you've changed it again. Before you wanted to be able to omit `values`, now you want to be able to omit `values` *and* `func`, and claim that the identity function is the default for `func`. As you just noted, this won't work for multivariate distributions, at least not with the expected return. And it strains credulity that any user would possibly want to use calcExpectation(dstn) when they can just do np.dot(dstn.X, dstn.pmf). I really don't think these are reasonable trivial cases at all. Heck, if we're using the word generate, we might as well have dstn default to a degenerate distribution at zero.

…

On Mon, Jun 22, 2020 at 4:57 PM Sebastian Benthall ***@***.***> wrote: Yes, it is important to me. Thanks Matt. I think we should enable users to use our code like they would use math as much as possible (allowing, for example, $E[X]$ and $E[g(X)]$ as well as whatever is going on with the non-stochastic values), but I think it's also, in Python code, to use Python conventions for letting users leave options our when there is a degenerate case that's a reasonable default. In this case, that would be that func is the identity function, and there are no non-stochastic values in the express. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFMK6VEBBFMXQI3CBYDRX7AUNANCNFSM4OC24YZA> .

mnwhite · 2020-06-22T21:10:13Z

So the function should return different types of output depending on which combination of inputs are passed?

…

On Mon, Jun 22, 2020 at 5:06 PM Sebastian Benthall ***@***.***> wrote: Ah, no, it's well defined as the vector of expected values: https://en.wikipedia.org/wiki/Multivariate_random_variable#Expected_value — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFLBL2WRTGNFAXIAKMTRX7BVRANCNFSM4OC24YZA> .

sbenthall · 2020-06-22T21:16:44Z

Matt, if you can't be bothered to do this, then I can do it as a separate PR later.

I don't think it's a strain on incredulity that uses would want to use an expectation function rather than write out the matrix operations, since in mathematical notation there's many times when you would want to use the expectation operation rather than write it out.

One reason why this would be useful would because in case our underlying data representation for Distributions changes (we already have a few different reasons why it would), then downstream code would not need to change.

I suppose I'd argue that the expectation operator should return whatever type the draw operator returns.

If you think I'm being unreasonable, I'd encourage you to look at what other libraries do for this. Just a couple examples:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html#scipy.stats.rv_continuous

https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html#scipy.stats.rv_discrete.expect

mnwhite · 2020-06-22T21:26:31Z

If this were an expectation method on a distribution, this would be fine: my_dstn.expectation() should return something that's the same shape as my_dstn.draw(). But this function is taking an expectation over a transformation (or function) of a random variable, so we can specify that the return is an array of reals. To me, this is a function that uses a distribution as an input, not an innate feature of the distribution itself. It doesn't have to be a separate PR, we can do it in this branch.

…

On Mon, Jun 22, 2020 at 5:16 PM Sebastian Benthall ***@***.***> wrote: Matt, if you can't be bother to do this, then I can do it as a separate PR later. I don't think it's a strain on incredulity that uses would want to use an expectation function rather than write out the matrix operations, since in mathematical notation there's many times when you would want to use the expectation operation rather than write it out. One reason why this would be useful would because in case our underlying data representation for Distributions changes (we already have a few different reasons why it would), then downstream code would not need to change. I suppose I'd argue that the expectation operator should return whatever type the draw operator returns. If you think I'm being unreasonable, I'd encourage you to look at what other libraries do for this. Just a couple examples: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html#scipy.stats.rv_continuous https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html#scipy.stats.rv_discrete.expect — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ADKRAFMCH5HYJARELY77YO3RX7C4TANCNFSM4OC24YZA> .

dstn is now first

llorracc · 2020-06-23T03:32:03Z

Not sure I understand this question. You can ask Mathematica to evaluate x \[Times] 2. If x is a scalar (x=4), the answer will be 8. If x is a vector (x={4,8}) the answer will be {8,16}. That is an attractive way of doing things. It may not be worth the effort, but it is by no means crazy. On Mon, Jun 22, 2020 at 5:10 PM Matthew N. White <[email protected]> wrote:

…

So the function should return different types of output depending on which combination of inputs are passed? On Mon, Jun 22, 2020 at 5:06 PM Sebastian Benthall < ***@***.***> wrote: > Ah, no, it's well defined as the vector of expected values: > https://en.wikipedia.org/wiki/Multivariate_random_variable#Expected_value > > — > You are receiving this because you were mentioned. > Reply to this email directly, view it on GitHub > <#720 (comment)>, or > unsubscribe > < https://github.com/notifications/unsubscribe-auth/ADKRAFLBL2WRTGNFAXIAKMTRX7BVRANCNFSM4OC24YZA > > . > — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#720 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAKCK77OFMQYW4MWH42WNV3RX7CEFANCNFSM4OC24YZA> .

-- - Chris Carroll

llorracc · 2020-06-23T03:39:59Z

If this were an expectation method on a distribution, this would be fine:
my_dstn.expectation() should return something that's the same shape as
my_dstn.draw(). But this function is taking an expectation over a
transformation (or function) of a random variable, so we can specify that
the return is an array of reals. To me, this is a function that uses a
distribution as an input, not an innate feature of the distribution itself.

Yes, Matt's right about this.

Again, the syntax in both Mma and SymPy requires you, for EACH stochastic variable in an expectation/expression, to define its distribution. So you could have

Expectation[x [Times] y, {x [Distributed] NormalDistribution[0,1], y[Distributed] UniformDistribution[-1,1]}].

The expectation is not a slave or an attribute of the distribution. The expectation needs to know the distribution of EACH random variable in the expression (and, if they are correlated, it needs the joint distribution).

PS. Also, it's important to realize that the math of combining distributions requires you to compute a Jacobian if the expression is any more complicated than simple addition. Mma handles this with the TransformedDistribution tool. Am sure sympy does something similar. Another illustration of why we should stand on the shoulders of giants, rather than constructing our own methods to break down a function and compute its Jacobian, when computing the expectation.

sbenthall · 2020-06-23T12:29:57Z

Again, the syntax in both Mma and SymPy requires you, for EACH stochastic variable in an expectation/expression, to define its distribution.

Does it assume that these variables are independent?
Does Mma have a representation of a multivariate distribution?

The expectation needs to know the distribution of EACH random variable in the expression (and, if they are correlated, it needs the joint distribution).

Currently, HARK distributions can be multivariate. So I'm not sure the objection here is substantive.

Expectation[x [Times] y, {x [Distributed] NormalDistribution[0,1], y[Distributed] UniformDistribution[-1,1]}].

One thing that may be attracting you to this syntax is the way the Mathematica interpreter handles variable binding.

What this is doing in effect is allowing the user to pass a scope (binding of symbols to values, here, distributions) that the expression can be evaluated in.

This way of handling scope in Mathematica is a nice feature. Dolang is accomplishing something a bit similar. But Python does not do things this way very easily--a new "domain specific language" has to be built on top of it. To build towards this, we would need to take the challenge of designing that seriously.

That challenge is far more complex than the task of providing an expectation function that refactors the current code.

From my vantage point, the somewhat lighter weight options are:

Using the function's argument order, and the order of values provided by a multivariate distribution, in order to establish the binding.
Using something like a dictionary, for multiple values, or something equivalent for a multivariate value, to give its values names, and using the argument names of the function to do the binding. This is what I've veered toward in the GenericModel work.

I don't think we should underestimate how by going down this road, we are breaking/setting user expectations about how programming syntax works. These are not trivial issues to introduce.

it's important to realize that the math of combining distributions requires you to compute a Jacobian if the expression is any more complicated than simple addition

If I'm not mistaken, the current implementation being discussed is for discretized distributions. So we can sidestep this problem in the short term? We have multivariate support.

I agree with @mnwhite that currently there is no symbolic analysis in HARK, so I'm not sure what you're after. It would be nice to have symbolic expressions one day.

The scipy solution here appears to be numerical:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html#scipy.stats.rv_discrete.expect

It does not provide an expectation function for continuous random variables. Maybe because this requires more symbolic manipulation than scipy does under the hood.

llorracc · 2020-06-23T13:52:09Z

Again, the syntax in both Mma and SymPy requires you, for EACH stochastic variable in an expectation/expression, to define its distribution.

Does it assume that these variables are independent?
Does Mma have a representation of a multivariate distribution?

If you specify each variable independently, it assumes they are independent:

Expectation[f[x]+g[y],{x\[Distributed]Normal[0,1],y\[Distributed]Normal[0,1]}]]

If you specify the variables as following a multivariate distribution, it uses that correctly:

Expectation[f[x]+g[y],{{x,y}\[Distributed]MultivariateNormal[{mu_x,mu_y},CovMat}]

One thing that may be attracting you to this syntax is the way the Mathematica interpreter handles variable binding.

Honestly, truly, it's NOT the syntax. That's why I keep mentioning Dolo and SymPy as equivalents. All three of them (and Stan) have settled on the same logical structure: In order to take the expectation of something with respect to a variable, you have to give a proper definition of the distribution of the variable. It's not an accident that all four of the leading platforms we have looked at do this the same way. They do it that way because it is the compelling mathematical logic for how to handle taking expectations it in a general-purpose, as opposed to one-off ad-hoc, way.

If we're going to redo how we represent expectations at all, I don't see the point in redoing them in a way that will inevitably require us to redo them again at some point in the not-too-distant future. We may as well just do it right - especially since we can just borrow the "doing it right" architecture from SymPy or Dolo (with preference given to Dolo).

From my vantage point, the somewhat lighter weight options are:
Using the function's argument order, and the order of values provided by a multivariate distribution, in order to establish the binding.
Using something like a dictionary, for multiple values, or something equivalent for a multivariate value, to give its values names, and using the argument names of the function to do the binding.

Why make up our own new idiosyncratic way of doing this, which we will have to break again down the road, when there are ready-made high-quality solutions (one of which we will eventually have to adopt) on the shelf?

I don't think we should underestimate how by going down this road, we are breaking/setting user expectations about how programming syntax works. These are not trivial issues to introduce.

Yes, it's precisely BECAUSE they are nontrivial that I want to be careful not to TREAT them trivially by just making things up on the fly that will work for the present moment but will not stand the test of time. This is something we should do right once, and be done.

it's important to realize that the math of combining distributions requires you to compute a Jacobian if the expression is any more complicated than simple addition

If I'm not mistaken, the current implementation being discussed is for discretized distributions. So we can sidestep this problem in the short term? We have multivariate support.

This is a perfect example of why I want to "do it right" now. If we choose the wrong syntax now (one that cannot support the computation of a Jacobian), we have ruled out EVER including continuous distributions (or ever including them in a way that doesn't break existing usages requiring a complete revamping of the codebase).

... we can sidestep this problem in the short term

My proposal to adopt SymPy or Dolo's approach as our well-defined standard going forward also allows us to "sidestep" this issue, since nothing I'm proposing would break any existing code. It would define a "preferred" way to do things going forward.

It would be nice to have symbolic expressions one day.

Maybe; I'm not sure it will ever be worth doing. Allowing continuous distributions is much more likely to be valuable. But what IS worth doing is, at low (current) cost, accommodating all of these possibilities by adopting a carefully designed structure that others have conveniently thought through for us.

The scipy solution here appears to be numerical:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html#scipy.stats.rv_discrete.expect

The SymPy, Dolo, and Mathematica schemes can represent either discrete or continuous RV's. SciPy may indeed accommodate only numerical ones; I'm not sure. But if so, I don't see any reason why that should mean that, starting from scratch as we are, we should adopt the SciPy approach ...

sbenthall · 2020-06-23T18:01:17Z

That's why I keep mentioning Dolo and SymPy as equivalents. All three of them (and Stan) have settled on the same logical structure: In order to take the expectation of something with respect to a variable, you have to give a proper definition of the distribution of the variable.

I'm confused.

In SymPy, you can assign a distribution to a variable and then pass it to the expectation operation:

https://docs.sympy.org/latest/modules/stats.html

>>> x = Symbol('x')
>>> pdf = exp(-x) # pdf of the Continuous Distribution
>>> Z = ContinuousRV(x, pdf, set=Interval(0, oo))
>>> E(Z)
1

I don't believe Dolo has a representation of a probability distribution per se. The closest thing are its processes definitions. I don't believe there's an expectation function implemented on those yet.

Maybe if you could write an example of how you would like to see the expectation taking function work in HARK, that would clarify things.

You've both convince me now that it's worthwhile to have an expectation function that is not a method on a particular distribution, though I think it would be very easy to have both if we have one, and that that would be nice.

I think it would be nice, but more difficult, for the standalone Expectation operator to take multiple distributions. I'm not sure how that would look though.

Here's what happens in SymPy when you try to take the expectation of a formula of two distributions.

>>> A = ContinuousRV(x, pdf, set=Interval(0, oo))
>>> E(Z + A)
2

I'm not 100% sure what's happening here, but it looks like they've overwritten the "+" operator so that it combines the two distributions into a common "expression" object:

https://github.com/sympy/sympy/blob/b92b971eb4712e9ccfbdae993c42b3563ce8f86d/sympy/stats/symbolic_probability.py#L56-L166

This is what I'm talking about: SymPy does a lot of pretty heavy manipulations of the standard Python syntax in order to handle these. It's also very slow to evaluate. SymPy, according to its creators, is an "embedded domain specific programming language".

llorracc · 2020-06-23T18:52:16Z

The closest thing are its processes definitions.

Yes, that's the same thing. "Processes" are just distributions for variables whose realizations occur at a well defined moment in the evolution of a model over time. Since we will always be dealing with dynamic problems (that is, ones that are defined over time), "Processes" are what we will need.

I don't believe there's an expectation function implemented on those yet.

The calculation of expectations is built into the structure of Dolo models.

In SymPy, you can assign a distribution to a variable and then pass it to the expectation operation:

Same thing in Dolo, except it is called a process.

Mathematica is a bit more complex syntactically, but the idea amounts to the same thing.

I'm not 100% sure what's happening here, but it looks like they've overwritten the "+" operator so that it combines the two distributions into a common "expression" object:

I'm sure that's what they must have done, because when you are combining continuous distributions with each other you need to properly use the Jacobian. I'm sure that the "expression" object is just the proper rewriting of the expression using the Jacobian.

In Mathematica it is a bit more explicit: You are supposed to define a TransformedDistribution object, which amounts to the same thing as the SymPy expression:

TransformedDistribution[ x + b y,{x [Distributed] NormalDistribution[0,1], y [Distributed] LogNormalDistribution[0,1]}] is what would correspond to the expression that SymPy constructs on the fly.

SymPy, according to its creators, is an "embedded domain specific programming language".

Where the domain is "Mathematics." And the subdomain is "Statistics and Probability."

You've both convince me now that it's worthwhile to have an expectation function that is not a method on a particular distribution, though I think it would be very easy to have both if we have one, and that that would be nice.

My view is that we should not proliferate ways of doing things; somebody who learns the command for a single distribution will be confused when they have a multivariate distribution. It's not like it is hard to do it for a single distribution. Assuming you have imported the relevant parts of sympy (which, if we were using sympy, we would do anyway), it's just

Z = Normal('Z', 0, 1)
E(Z)
1

sbenthall · 2020-06-23T20:42:14Z

we should be careful how we structure that syntax/machinery so that it won't have to be completely rewritten

My impression from this discussion is that there are a variety of 'reasonable' implementations, all of which would be an improvement on what's there, and which would be rather easy to refactor into each other once that first big lift is done. I'll defer to @mnwhite on the first implementation.

sbenthall · 2020-11-19T17:51:47Z

Bumping this PR -- what's needed to get this PR finished? @mnwhite

llorracc · 2020-12-03T17:03:27Z

This is something we should consult with @albop and Normann about

sbenthall · 2020-12-09T03:13:14Z

Assigning this to myself to resolve conflicts and bring to completion.

Wrote first version of non-symbol expectation-evaluating function, completely untested. Also fixed a couple typos in distribution.py.

Still need to test alternate input versions.

calcExpectations now tested on more advanced cases. Fixed dumb typo (copy-pasted from other line) and changed colons to ellipses in axis-construction step. Also wrote new version of ConsAggShock solver that uses calcExpectation, as a demonstration of what the code would look like.

calcExpectation now correctly handles passing a null argument (empty list) for values, representing a function with no deterministic arguments.

dstn is now first

…Expectation

Calc expectation

mnwhite added 3 commits June 18, 2020 17:54

Added draft of calcExpectation

fa9854b

Wrote first version of non-symbol expectation-evaluating function, completely untested. Also fixed a couple typos in distribution.py.

Fixed typos, basic evalExpectations works

609ec18

Still need to test alternate input versions.

Merge branch 'master' into calcExpectation

a674f51

sbenthall mentioned this pull request Jun 22, 2020

distributions tracking discretizations #724

Closed

calcExpectation handles missing values

e3527a7

calcExpectation now correctly handles passing a null argument (empty list) for values, representing a function with no deterministic arguments.

Change order of arguments

abda8aa

dstn is now first

llorracc changed the title ~~Add calcExpectation function~~ [WIP] Add calcExpectation function - when we figure out Distributions Jul 2, 2020

sbenthall self-assigned this Dec 9, 2020

sbenthall changed the title ~~[WIP] Add calcExpectation function - when we figure out Distributions~~ Add calcExpectation function - when we figure out Distributions Dec 10, 2020

sbenthall added this to the 1.0.0 milestone Dec 10, 2020

sbenthall and others added 7 commits December 15, 2020 11:31

Merge branch 'master' into calcExpectation

78ea75d

Added draft of calcExpectation

5f7cb75

Wrote first version of non-symbol expectation-evaluating function, completely untested. Also fixed a couple typos in distribution.py.

Fixed typos, basic evalExpectations works

0f0cb3f

Still need to test alternate input versions.

calcExpectation handles missing values

18f3368

calcExpectation now correctly handles passing a null argument (empty list) for values, representing a function with no deterministic arguments.

Change order of arguments

83513e5

dstn is now first

fixing the merge

3801a24

sbenthall mentioned this pull request Dec 15, 2020

Calc expectation #884

Merged

3 tasks

sbenthall closed this Dec 15, 2020

sbenthall added 6 commits December 15, 2020 13:17

Merge branch 'calcExpectation' into calcExpectation

36564bc

adding CHANGELOG update

2d61292

Merge branch 'calcExpectation' of github.com:sbenthall/HARK into calc…

ff4a782

…Expectation

adding automated test for calcExpectation

e138d0e

fixing the imports

df82892

Merge pull request #884 from sbenthall/calcExpectation

ccb45f3

Calc expectation

sbenthall mentioned this pull request Dec 15, 2020

Simplify expectation taking code #625

Closed

sbenthall reopened this Dec 22, 2020

Merge branch 'master' into calcExpectation

25f18fe

sbenthall merged commit 498e6af into master Dec 22, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add calcExpectation function - when we figure out Distributions #720

Add calcExpectation function - when we figure out Distributions #720

mnwhite commented Jun 19, 2020

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

llorracc commented Jun 22, 2020

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020 •

edited

Loading

sbenthall commented Jun 22, 2020

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020 •

edited

Loading

mnwhite commented Jun 22, 2020 via email

llorracc commented Jun 23, 2020 via email

llorracc commented Jun 23, 2020

sbenthall commented Jun 23, 2020

llorracc commented Jun 23, 2020

sbenthall commented Jun 23, 2020

llorracc commented Jun 23, 2020 •

edited

Loading

sbenthall commented Jun 23, 2020

sbenthall commented Nov 19, 2020

llorracc commented Dec 3, 2020

sbenthall commented Dec 9, 2020

Add calcExpectation function - when we figure out Distributions #720

Add calcExpectation function - when we figure out Distributions #720

Conversation

mnwhite commented Jun 19, 2020

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

llorracc commented Jun 22, 2020

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020 • edited Loading

sbenthall commented Jun 22, 2020

sbenthall commented Jun 22, 2020

mnwhite commented Jun 22, 2020 via email

mnwhite commented Jun 22, 2020 via email

sbenthall commented Jun 22, 2020 • edited Loading

mnwhite commented Jun 22, 2020 via email

llorracc commented Jun 23, 2020 via email

llorracc commented Jun 23, 2020

sbenthall commented Jun 23, 2020

llorracc commented Jun 23, 2020

sbenthall commented Jun 23, 2020

llorracc commented Jun 23, 2020 • edited Loading

sbenthall commented Jun 23, 2020

sbenthall commented Nov 19, 2020

llorracc commented Dec 3, 2020

sbenthall commented Dec 9, 2020

sbenthall commented Jun 22, 2020 •

edited

Loading

sbenthall commented Jun 22, 2020 •

edited

Loading

llorracc commented Jun 23, 2020 •

edited

Loading