Add calcExpectation function - when we figure out Distributions #720
Conversation
Wrote first version of non-symbolic expectation-evaluating function, completely untested. Also fixed a couple of typos in distribution.py.
Still need to test alternate input versions.
calcExpectation now tested on more advanced cases. Fixed a dumb typo (copy-pasted from another line) and changed colons to ellipses in the axis-construction step. Also wrote a new version of the ConsAggShock solver that uses calcExpectation, as a demonstration of what the code would look like.
Hi @mnwhite. This is exciting work. A few comments. I expect much of this you know already as this is just a first draft, but I'm making complete notes.

*Vectorized func*

I'm interested in how the `func` argument of `calcExpectation` explicitly operates over arrays. I have learned from looking at Dolo code that this is called a vectorized function. There is numpy support for vectorization (https://numpy.org/doc/1.18/reference/generated/numpy.vectorize.html). In the EconForge stack there is a numba-optimized vectorization function. Given that vectorization of a function is an easy add, I wonder if `calcExpectation` should take `func` in a non-vectorized way, and vectorize it as part of the expectation-taking process.

*Tensor operations on the values?*

Since there is only one use case modeled in this PR, I'm having a hard time seeing why there is so much flexibility built into the `values` argument. I'm having a hard time following the rationale from the comment alone. It would simplify the `calcExpectation` method if it only took `values` in a standardized way. The tensor manipulations used to reshape `values` arguments into an appropriate form can be functionalized out into a separate method. That would allow the reader to focus on the expectation calculation. Looking to see if there is any lower-level numpy or scipy support for tensor products, I found this numpy function: https://numpy.org/doc/stable/reference/generated/numpy.ufunc.outer.html#numpy.ufunc.outer. I wonder if it's relevant to this code.

*Argument order*

If I'm not mistaken, the `values` argument should be, strictly speaking, optional, since taking the expectation of a function over a distribution does not require non-stochastic arguments. So it would make sense for this argument to be last. I know @llorracc is keen on matching either the Mathematica or SymPy syntaxes ultimately, and so for the sake of discussion I'll link to those here:

https://reference.wolfram.com/language/ref/Expectation.html
https://docs.sympy.org/latest/modules/stats.html?highlight=expectation#sympy.stats.Expectation

Neither of these methods supports non-stochastic values, and it's a bit awkward including them, in my view, though I can see how this is motivated by compatibility with the current time-varying parameter architecture (I think? I'm honestly not sure I'm following the design). But I wanted to note a different way of doing this, which would be to restrict the expectation-taking method to operate only on distributions, but have the Distribution class have methods to modify a distribution by giving it additional dimensions. Alternatively, the non-stochastic values could be curried into the function/expression before it is passed into the expectation-taking code. That way the expectation-taking code could be more faithful to the mathematical operation.

*Performance*

In my opinion, a 3% performance hit is not bad. It indicates that there is a small constant-factor time increase, but that is negligible in computational terms. There's a famous Donald Knuth quote: "Premature optimization is the root of all evil." By concentrating this operation into one function, it will save developer time, which is far more expensive than computational time, down the road. It will also make it easier to optimize this function, so e.g. Numba-based optimizations will ripple over the whole library quickly.

*Finalizing the PR*

I think that before this gets merged in, it should be made a replacement for `solveConsAggShock` and not an additional `solveConsAggShockNEW`. The new method should also be worked into the other solvers where expectation-taking is done. Functionality like this should have an automated unit test. That would also help document the functionality.
How `func` deals with vectorization seems like an issue for `func`, not
`calcExpectation`. If there's a way to somehow say "this function might
not be written with vectorized code, so do that automatically now", then we
can do that, but I'm not familiar with it. All of the HARK code I've
written uses numpy array operations; there are essentially no loops over
values except where mathematically necessary or numerically insignificant.
In a few places I wrote some very small steps using a loop to fill in an
array because it's much easier to read-- I could have done it with array
operations, but it would speed up the code by 0.01% and make it mindbending
to read. I might be misinterpreting what you mean by "vectorized code",
however, so I'm not sure I understand this comment, particularly "Given
that vectorization of a function is an easy add, I wonder if
calcExpectation should
take func in a non-vectorized way, and vectorize it as part of the
expectation taking process."
There might be a built-in numpy function to create a tensor grid, and we
can use it if we identify the right one. np.ufunc.outer works on numpy
universal functions, not on generic functions. But I'm not going to
remove functionality/flexibility just because I didn't (yet) include an
example of it being used. In fact, the pastebin "toddler" file I posted
uses the input syntax that isn't used by solveConsAggShockNEW. The point
of this function is to evaluate mathematical expressions of the form $\int
g(x,y) dF(y)$; g is `func`, x is `values`, and F(y) is the CDF of `dstn`.
The user wants this integral/expectation computed at many different values
of x, and there are two clear uses that come up in HARK when using the
endogenous grid method. In some models, the relevant end-of-period states
at which we want to compute future realizations of (marginal) value are
simply a rectilinear grid / tensor set of independent grids of each
dimension; calcExpectation saves the user the time (and the reader the
mental energy) of doing the tiling/reshaping for them. In other models,
like ConsAggShock, the relevant end-of-period states *don't* line up in a
simple tensor grid, because the bounds of the state space are curved. In
this case, the user can construct the arrays of values they're interested
in.
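To make this concrete, here is a minimal sketch of the semantics described above (the name calc_expectation_sketch and its signature are illustrative, not the PR's actual implementation):

```python
import numpy as np

def calc_expectation_sketch(func, values, atoms, pmf):
    """Approximate E[func(x, y)] = integral of func(x, y) dF(y) at many x.

    values: list of 1D arrays; a tensor grid is built over them (the
    "rectilinear grid" use case above). atoms/pmf: the discrete support
    and probabilities of the random variable y.
    """
    grids = np.meshgrid(*values, indexing="ij") if values else []
    result = 0.0
    for y_j, p_j in zip(atoms, pmf):               # loop over discrete outcomes
        result = result + p_j * func(*grids, y_j)  # func must broadcast over arrays
    return result

# E[(x + y)^2] where y is -1 or +1 with equal probability, at x in {0, 1, 2}:
g = lambda x, y: (x + y) ** 2
print(calc_expectation_sketch(g, [np.array([0.0, 1.0, 2.0])],
                              atoms=[-1.0, 1.0], pmf=[0.5, 0.5]))
# [1. 2. 5.]   (analytically, E[(x + y)^2] = x^2 + 1)
```

The ConsAggShock-style case would instead pass pre-built ND arrays of states and skip the tensor-grid step.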
If there aren't any non-stochastic arguments, then taking the expectation
of a function over a DiscreteDistribution instance is simply
`np.dot(func(*my_dstn.X), my_dstn.pmf)`, that's it. I don't think we
should rewrite the function and the expected format of `func` so that it's
"prettier" when a user wants to do a trivial case. I'm editing the
function so it will work as is if they just pass an empty list as
`values`. I think this basically does follow the syntax of Mathematica,
except that our expression is array-valued. I don't think we should try to
use sympy notation when we have literally no symbolic math in HARK, and I
have an aversion to "trying to make it like Mathematica" for personal
history reasons.
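For concreteness, that trivial case looks like this (with hypothetical stand-ins for a DiscreteDistribution's X and pmf attributes):

```python
import numpy as np

X = [np.array([0.9, 1.0, 1.1])]    # hypothetical discrete atoms of the shock
pmf = np.array([0.25, 0.5, 0.25])  # matching probabilities (sums to 1)

func = lambda theta: theta ** 2
expectation = np.dot(func(*X), pmf)  # sum_j func(x_j) * pmf_j, a scalar
```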
Including non-stochastic values in the function isn't because of the
time-varying structure. It's because the user wants to compute the
expectation at multiple function values simultaneously without having to
write a new `expr` object every time, as they do in Mathematica. I think
mma can probably do multiple inputs simultaneously using Table, but that's
*literally* a loop operation. I thought the code in solveConsAggShockNEW
would make it clear how the function works. I don't actually understand
your suggested alternate way of programming this kind of function, so I
can't evaluate whether/how it would be "more faithful to the mathematical
operation".
Yeah, I made a new function so that it's easy to both compare performance
of the old and new functions and to read and compare the code. To me, the
NEW version is very, very clean: the functions calcAggObjects and
vPnextFunc define how we get from today's end-of-period state to next
period's state, and then line 791 takes the expectation of end-of-period
marginal value of assets at arrays of relevant states. I can write unit
tests.
RE: Vectorization. If I'm not mistaken, where HARK uses vectorized functions elsewhere, it is creating them as instantiations of particular classes. Like here: https://github.com/econ-ark/HARK/blob/master/HARK/interpolation.py#L61

It would be useful if all vectorized functions in HARK were subclasses of some common class (something more specific than HARKObject). Then testing to see if a function was vectorized or not could be done with a type check. I'm not sure how the transition to Dolo's interpolation code affects this, but I expect it makes it easier. I think a little more thought was put into the type architecture there. Another possibility is that HARK's vectorized functions could be written as numpy ufuncs (https://numpy.org/doc/stable/user/c-info.ufunc-tutorial.html). This seems to accomplish something similar.

Re: tensor functionality. A complete unit test suite would have a test for both control flows/use cases for the method. It would thereby provide examples of how the flexibility plays out in practice.

I strongly disagree with you about code style on one point. As a rule, I think that a method that represents a general mathematical expression should always work for the trivial case. That is in part what allows it to be perfectly general. It would make it easier to translate from math to code and back. Optional method arguments are easy to implement in Python, are very Pythonic, etc. No reason not to have them.

If the point of the method is to take the integral over some values, as opposed to more generically computing the expectation over a distribution, then maybe it would be worth wrapping the step where the integration is taken explicitly as a separate step. Maybe my calculus is too rusty here though. I'm sure I'll have a clearer idea of what's going on when I see the unit tests.
I'm using integral and expectation interchangeably here, with "integral"
being shorthand for "definite integral over the domain of a distribution".
The expectations-calculating step is just a single line using np.dot;
everything else is just setting the arrays up for a dot product.
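Schematically, with made-up shapes (not the actual implementation):

```python
import numpy as np

# f_vals[i, j, k] = func evaluated at deterministic grid point (i, j) and
# discrete shock outcome k; the tiling/reshaping steps build this array.
f_vals = np.ones((20, 30, 7))      # placeholder values for the schematic
pmf = np.full(7, 1.0 / 7.0)        # probabilities of the 7 shock outcomes

expectation = np.dot(f_vals, pmf)  # dot along the last axis -> shape (20, 30)
```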
It's not so much the syntaxes that I want to match, it's the conceptual structure of how these operations are conceived and expressed. (I'd add Dolo to the list here). The Mma, SymPy, and Dolo people are very smart, and have put a LOT of thought into creating maximally flexible and useful ways of doing things. All three (and I think STAN, too; probably other languages as well) have ended up in the same place:
On the contrary, that is a powerful reason TO adopt sympy (or, probably, Dolo) notation: There's no existing code that will need to be rewritten if we adopt this standard going forward.
The goal is not to try to "make it like Mathematica" in any sense other than understanding the mathematical/computational reasons Mathematica (and SymPy and Dolo (and STAN)) have done things the way that they have. Those are likely to be very good reasons, some of which we might not think of ourselves if we invent our own ways of doing things. Indeed, the painful history of the first 20 years of Mathematica, in which it seemed that every new release broke all my code, testifies to the extent to which it is hard even for very smart people to make design decisions that will turn out to be adequate for purposes you do not envision when you originally start. Dolo has one additional feature that I like a lot and think we should adopt: the distribution can be specified abstractly (say, as a NormalDistribution with mean 0 and variance 1), separately from its being discretized. After being discretized, the discretization constitutes another attribute (or set of attributes) of the object.
Matt was not saying that his method would not work for the trivial case; he was saying that, stylistically, we should not require users to use the "calcExpectation" tool when the expectation can be directly computed by an expression like np.dot(func(*my_dstn.X), my_dstn.pmf). Of course we cannot stop users from doing something like that. But I think it should be explicitly deprecated, in the interest of code readability and easy mapping between mathematical and computational representation. When you see the word 'Expectation' or 'calcExpectation' or whatever, you do not need to stare further at the objects and examine closely whether the operation being performed is actually an expectation or whether it is some other kind of matrix or vector calculation.
@llorracc I think I see what you mean now about wanting to go more in the direction of Mma, SymPy, and Dolo. I think much of what you're asking for is out of scope for the current PR, as it involves deeper work in the Distribution code. But I think it's definitely good that it's being brought up here, and I think we should continue to chart this sort of thing out as we converge on a 1.0 API. Maybe these ideas should be split into separate issues.
This is already in HARK: https://github.com/econ-ark/HARK/blob/master/HARK/distribution.py#L10
I'm not following. In my view, the best design would be for this to not be a standalone function at all. Rather, it would be a method on the Distribution class. So you could do something like:

>>> X = Normal(mu = 1.0, sigma = 1.0)
>>> X.expectation()
1
>>> X.approx(10).expectation(func = lambda x : x * 2)
2

This would be closer to the math: $E_X[2X]$. Nobody requires mathematicians to write out $\sum_{x \in X} x P(x)$ every time they want to represent the expected value of a variable.
To me, the expectation operator is being represented by my
`calcExpectation` function. We *can* put something like calcExpectation as
a method of DiscreteDistribution, but that's not how I think about the math.
I read a statement like `calcExpectation(func,values,dstn)` as being pretty
intuitive: you're calculating the expectation of a function at some values
subject to a distribution. The docstring is explicit about the order of
inputs for the function.
Ok, but why not order the arguments in descending order of necessity, and make the unnecessary ones optional?
Because this input is only "optional" if you want to do the trivial case
that takes one very easy line and doesn't require a separate function
call. And it's not a huge burden for the user who really wants to do this
strange trivial case using calcExpectation rather than just writing the
expectation inline with np.dot to simply add the argument []. And because
the argument order is currently aligned with the way you would read it in math:
$\int g(x,y)\, dF(y)$ or $E[\, g(x,y) \mid y \sim F(y)\,]$.
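Written out with the discretization, that is

$$
E\big[g(x,y)\big] = \int g(x,y)\, dF(y) \approx \sum_{j=1}^{N} g(x, y_j)\, p_j ,
$$

evaluated at every deterministic point $x$ supplied in `values`.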
If it's important to you, I'll change the argument order. But it feels
unnatural and backward.
calcExpectation now correctly handles passing a null argument (empty list) for values, representing a function with no deterministic arguments.
Yes, it is important to me. Thanks Matt. I think we should enable users to use our code like they would use math as much as possible (allowing, for example, $E[X]$ and $E[g(X)]$ as well as whatever is going on with the non-stochastic values), but I think it's also important, in Python code, to use Python conventions for letting users leave options out when there is a degenerate case that's a reasonable default. In this case, that would be that func is the identity function, and there are no non-stochastic values in the expression.
Actually, I see one wrinkle in what I just said which might be important.
Ah, no, it's well defined as the vector of expected values: https://en.wikipedia.org/wiki/Multivariate_random_variable#Expected_value
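Spelled out, that definition is just the componentwise expectation:

$$
E[X] = \big(E[X_1], E[X_2], \ldots, E[X_n]\big)^{\top}.
$$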
Wait wait, now you've changed it again. Before you wanted to be able to
omit `values`, now you want to be able to omit `values` *and* `func`, and
claim that the identity function is the default for `func`. As you just
noted, this won't work for multivariate distributions, at least not with
the expected return. And it strains credulity that any user would possibly
want to use calcExpectation(dstn) when they can just do np.dot(dstn.X,
dstn.pmf). I really don't think these are reasonable trivial cases at
all. Heck, if we're using the word generate, we might as well have dstn
default to a degenerate distribution at zero.
So the function should return different types of output depending on which
combination of inputs is passed?
Matt, if you can't be bothered to do this, then I can do it as a separate PR later. I don't think it's a strain on credulity that users would want to use an expectation function rather than write out the matrix operations, since in mathematical notation there are many times when you would want to use the expectation operation rather than write it out.

One reason why this would be useful is that if our underlying data representation for Distributions changes (we already have a few different reasons why it would), then downstream code would not need to change. I suppose I'd argue that the expectation operator should return whatever type the draw operator returns.

If you think I'm being unreasonable, I'd encourage you to look at what other libraries do for this. Just a couple of examples:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_continuous.html#scipy.stats.rv_continuous
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.rv_discrete.expect.html#scipy.stats.rv_discrete.expect
If this were an expectation method on a distribution, this would be fine:
my_dstn.expectation() should return something that's the same shape as
my_dstn.draw(). But this function is taking an expectation over a
transformation (or function) of a random variable, so we can specify that
the return is an array of reals. To me, this is a function that uses a
distribution as an input, not an innate feature of the distribution itself.
It doesn't have to be a separate PR, we can do it in this branch.
dstn is now first
Not sure I understand this question.
You can ask Mathematica to evaluate x \[Times] 2.
If x is a scalar (x=4), the answer will be 8.
If x is a vector (x={4,8}) the answer will be {8,16}.
That is an attractive way of doing things. It may not be worth the effort,
but it is by no means crazy.
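The numpy analogue of that Mathematica behavior is broadcasting:

```python
import numpy as np

f = lambda x: x * 2
print(f(4))                 # scalar in, scalar out: 8
print(f(np.array([4, 8])))  # vector in, vector out: [ 8 16]
```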
Yes, Matt's right about this. Again, the syntax in both Mma and SymPy requires you, for EACH stochastic variable in an expectation/expression, to define its distribution. So you could have Expectation[x \[Times] y, {x \[Distributed] NormalDistribution[0,1], y \[Distributed] UniformDistribution[-1,1]}]. The expectation is not a slave or an attribute of the distribution. The expectation needs to know the distribution of EACH random variable in the expression (and, if they are correlated, it needs the joint distribution).

PS. It's also important to realize that the math of combining distributions requires you to compute a Jacobian if the expression is any more complicated than simple addition. Mma handles this with the TransformedDistribution tool. I am sure SymPy does something similar. Another illustration of why we should stand on the shoulders of giants, rather than constructing our own methods to break down a function and compute its Jacobian, when computing the expectation.
Does it assume that these variables are independent?
Currently, HARK distributions can be multivariate. So I'm not sure the objection here is substantive.
One thing that may be attracting you to this syntax is the way the Mathematica interpreter handles variable binding. What this is doing in effect is allowing the user to pass a scope (binding of symbols to values, here, distributions) that the expression can be evaluated in. This way of handling scope in Mathematica is a nice feature. Dolang is accomplishing something a bit similar. But Python does not do things this way very easily--a new "domain specific language" has to be built on top of it. To build towards this, we would need to take the challenge of designing that seriously. That challenge is far more complex than the task of providing an expectation function that refactors the current code. From my vantage point, the somewhat lighter weight options are:
I don't think we should underestimate how by going down this road, we are breaking/setting user expectations about how programming syntax works. These are not trivial issues to introduce.
If I'm not mistaken, the current implementation being discussed is for discretized distributions. So we can sidestep this problem in the short term? We have multivariate support. I agree with @mnwhite that currently there is no symbolic analysis in HARK, so I'm not sure what you're after. It would be nice to have symbolic expressions one day. The scipy solution here appears to be numerical: it does not provide an expectation function for continuous random variables. Maybe because this requires more symbolic manipulation than scipy does under the hood.
If you specify each variable independently, it assumes they are independent. If you specify the variables as following a multivariate distribution, it uses that correctly.
Honestly, truly, it's NOT the syntax. That's why I keep mentioning Dolo and SymPy as equivalents. All three of them (and Stan) have settled on the same logical structure: in order to take the expectation of something with respect to a variable, you have to give a proper definition of the distribution of the variable. It's not an accident that all four of the leading platforms we have looked at do this the same way. They do it that way because it is the compelling mathematical logic for how to handle taking expectations in a general-purpose, as opposed to one-off, ad hoc, way. If we're going to redo how we represent expectations at all, I don't see the point in redoing them in a way that will inevitably require us to redo them again at some point in the not-too-distant future. We may as well just do it right - especially since we can just borrow the "doing it right" architecture from SymPy or Dolo (with preference given to Dolo).
Why make up our own new idiosyncratic way of doing this, which we will have to break again down the road, when there are ready-made high-quality solutions (one of which we will eventually have to adopt) on the shelf?
Yes, it's precisely BECAUSE they are nontrivial that I want to be careful not to TREAT them trivially by just making things up on the fly that will work for the present moment but will not stand the test of time. This is something we should do right once, and be done.
This is a perfect example of why I want to "do it right" now. If we choose the wrong syntax now (one that cannot support the computation of a Jacobian), we have ruled out EVER including continuous distributions (or ever including them in a way that doesn't break existing usages requiring a complete revamping of the codebase).
My proposal to adopt SymPy or Dolo's approach as our well-defined standard going forward also allows us to "sidestep" this issue, since nothing I'm proposing would break any existing code. It would define a "preferred" way to do things going forward.
Maybe; I'm not sure it will ever be worth doing. Allowing continuous distributions is much more likely to be valuable. But what IS worth doing is, at low (current) cost, accommodating all of these possibilities by adopting a carefully designed structure that others have conveniently thought through for us.
The SymPy, Dolo, and Mathematica schemes can represent either discrete or continuous RVs. SciPy may indeed accommodate only numerical ones; I'm not sure. But if so, I don't see any reason why that should mean that, starting from scratch as we are, we should adopt the SciPy approach ...
I'm confused. In SymPy, you can assign a distribution to a variable and then pass it to the expectation operation: https://docs.sympy.org/latest/modules/stats.html
I don't believe Dolo has a representation of a probability distribution per se. The closest thing are its processes definitions. I don't believe there's an expectation function implemented on those yet. Maybe if you could write an example of how you would like to see the expectation-taking function work in HARK, that would clarify things. You've both convinced me now that it's worthwhile to have an expectation function that is not a method on a particular distribution, though I think it would be very easy to have both if we have one, and that would be nice. I think it would also be nice, but more difficult, for the standalone Expectation operator to take multiple distributions. I'm not sure how that would look though. Here's what happens in SymPy when you try to take the expectation of a formula of two distributions.
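For reference, taking the expectation of a formula of two random variables with sympy.stats looks roughly like this (a sketch, not the exact snippet that was lost here; the output depends on the SymPy version):

```python
from sympy.stats import E, Normal, Uniform

x = Normal("x", 0, 1)    # x ~ N(0, 1)
y = Uniform("y", -1, 1)  # y ~ U(-1, 1), independent of x
expr = x + y             # a single SymPy expression over two random symbols
print(E(expr))           # SymPy integrates over both distributions -> 0
```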
I'm not 100% sure what's happening here, but it looks like they've overridden the "+" operator so that it combines the two distributions into a common "expression" object. This is what I'm talking about: SymPy does a lot of pretty heavy manipulation of the standard Python syntax in order to handle these. It's also very slow to evaluate. SymPy, according to its creators, is an "embedded domain specific programming language".
Yes, that's the same thing. "Processes" are just distributions for variables whose realizations occur at a well-defined moment in the evolution of a model over time. Since we will always be dealing with dynamic problems (that is, ones that are defined over time), "Processes" are what we will need.
The calculation of expectations is built into the structure of Dolo models.
Same thing in Dolo, except it is called a process. Mathematica is a bit more complex syntactically, but the idea amounts to the same thing.
I'm sure that's what they must have done, because when you are combining continuous distributions with each other you need to properly use the Jacobian. I'm sure that the "expression" object is just the proper rewriting of the expression using the Jacobian. In Mathematica it is a bit more explicit: you are supposed to define a TransformedDistribution, and TransformedDistribution[x + b y, {x \[Distributed] NormalDistribution[0,1], y \[Distributed] LogNormalDistribution[0,1]}] is what would correspond to the SymPy expression above.
Where the domain is "Mathematics." And the subdomain is "Statistics and Probability."
My view is that we should not proliferate ways of doing things; somebody who learns the command for a single distribution will be confused when they have a multivariate distribution. It's not like it is hard to do it for a single distribution. Assuming you have imported the relevant parts of sympy (which, if we were using sympy, we would do anyway), it's just E(X).
My impression from this discussion is that there are a variety of 'reasonable' implementations, all of which would be an improvement on what's there, and which would be rather easy to refactor into each other once that first big lift is done. I'll defer to @mnwhite on the first implementation.
Bumping this PR -- what's needed to get this PR finished? @mnwhite
This is something we should consult with @albop and Normann about.
Assigning this to myself to resolve conflicts and bring to completion.
Calc expectation
Added a function to HARK.distribution that automates taking expectations over a function. Basic syntax:
calcExpectation(func,values,dstn)
values can be a list of 1D arrays or ND arrays. If the former, the function generates a tensor grid over them. func must take the deterministic values in order as its first M arguments, followed by the stochastic arguments in dstn as its next N arguments; the user will likely need to write a wrapper function, as in the sketch below.
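For instance, the wrapper might look like this (a hypothetical sketch of the calling convention just described, not code from the PR; vP_next and the shock structure are made up):

```python
import numpy as np

vP_next = lambda m: m ** (-2.0)  # stand-in for next period's marginal value function

# Deterministic arguments come first, then the shocks contained in dstn:
def vP_of_states(a_nrm, R, perm_shk, tran_shk):
    m_next = R * a_nrm / perm_shk + tran_shk     # next period's market resources
    return perm_shk ** (-2.0) * vP_next(m_next)  # hypothetical CRRA=2 adjustment

# expected_vP = calcExpectation(vP_of_states, [a_grid, R_grid], income_dstn)
```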
ConsAggShockModel.py has a new solver called solveConsAggShockNEW that replicates the old version but uses calcExpectation, as a demonstration of what the code would look like using this new feature. It solves about 3% slower, due to not being able to save on a bit of arithmetic (because of the automation).