-
-
Notifications
You must be signed in to change notification settings - Fork 199
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve speed of ConsGenIncProcessModel solvers #567
Comments
Why not just have them only solve for a few periods? Just set cycles=1 or
5 or something.
…On Thu, Mar 12, 2020 at 5:46 PM Sebastian Benthall ***@***.***> wrote:
The ConsGenIncProcessModel solvers, with default values, are currently the
longest running automated tests.
Maybe there is a way to improve the performance of these solvers.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#567>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFMEFDIRJ5XTDLIGWE3RHFJ2RANCNFSM4LGV5CGQ>
.
|
Hmmm. Ok, I'm trying that, So, with With
|
I guess I'll go with I wonder if you could help me understand what's going on here though. |
Sure. The cycles attribute indicates how many times the sequence of
periods described by the time-varying parameters will "happen" to the
agent, with 0 standing in for infinity. For any value of cycles other than
0, it's a finite horizon or lifecycle model-- there is *some* definite end
point that the agent cannot exist past. Basically *every* model that
everyone ever works with would be specified as either cycles=0 or cycles=1;
having cycles >= 2 isn't completely useless, but it's best used for
teaching (or sometimes debugging). For example, you can quickly plot how
an infinite horizon solution converges by setting cycles=50 and solving a
problem with a 1 period cycle (the policy function *visually* converges in
about 50 periods).
The default dictionary for ConsGenIncProcess model specifies a cycle that
is T_cycles=1 periods long. If you set cycles=0, the model solves the
problem of "what should I do if I will live out this kind of period over
and over again?". If you set cycles to 1 with a one period cycle, you are
asking it to solve the problem of "what should I do if I live this period,
and then it's the terminal period"? The solution isn't trying to converge,
you're just asking it to solve a problem with one non-terminal period and
one (trivial) terminal period. It's a lifecycle model for a mayfly.
Setting cycles=5 works fine for *most* models in HARK, but
ConsGenIncProcess is a special case because of the pLvlGrid. This
attribute is constructed in one of the methods called by update; it
represents the grid of permanent income levels at which to solve for
optimal consumption, with the consumption function interpolated in
between. It's constructed by simulation, literally drawing out the
population of pLvl at each time index of the cycle and then grabbing
percentiles as given in the primitive attribute pLvlPctiles. In the
default dictionary, it captures the middle 99.8% of the distribution. If
cycles=1, this just means looping over each period of the lifecycle and
finding the distribution of permanent income at each age. If cycles=0, the
code assumes that 1000 periods is a "long time" and simulates the
population for that long, then uses the distribution of pLvl in that
fake-ergodic distribution at each t of the cycle.
But if cycles >= 2, this method won't work. The distribution of pLvl is
different on successive cycles; there is not a one-to-one mapping between
t_age and t_cycle, nor is there a meaningful "long run" distribution.
There are ways around this, but I didn't take a stand on how we should
handle this non-standard case when I wrote that method. As is, if you
really wanted to have a sequence of periods (or just one period) repeat a
finite number of times, you would have to write the parameter dictionary to
have that loop in every time-varying primitive parameter, and then set
cycles=1. For example, PermGroFac would change from being [1.02] to
5*[1.02].
…On Fri, Mar 13, 2020 at 12:35 PM Sebastian Benthall < ***@***.***> wrote:
I guess I'll go with cycles=1.
I wonder if you could help me understand what's going on here though.
Why is does cycles=1 mean that the model is a lifecycle model?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#567 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFIIXBZB2H2UXZK7IE3RHJOFVANCNFSM4LGV5CGQ>
.
|
Thanks for explaining all that. So if I understand correctly...
And I've made some changes to #527 that set But maybe there is a way to improve the speed of the ConsGenIncProcessModel solver by introducing a threshold that would interrupt the solver's internal simulation if the model converges before 1000 periods. It sounds like 50 is sufficient to prove the concept? |
Not quite. Both T_cycle and cycles are instance-level attributes. The
model that an AgentType subclass solves is agnostic to T_cycle and cycles;
the specific instance of that model (a "problem") fills in those
time-defining attributes.
T_sim as an attribute represents the maximum number of periods you would
ever be interested in simulating for this instance. It's used to
initialize the arrays that will track idiosyncratic variable histories of
the attributes named in track_vars. When simulate() is called, it will
simulate for T_sim period by default, but you can also specify an argument
T <= T_sim for it to simulate for. This is useful for doing things like
"simulate 100 periods, then screw with the state of the world, then
simulate 300 more periods".
ConsGenIncProcess uses backward iteration methods to solve the model, just
like all our other solvers. It simply uses simulation to *construct the
grid the problem will be solved on*, in the pre-solution step. The 1000
periods thing is just an approximation to the "long run" distribution of
permanent income. I.e. if a model period is a year, it assumes that after
1000 years, the population distribution of pLvl is roughly distributed the
same as it would after 10,000 years or 10B years.
"Making sure the code doesn't break" is a really, really important test.
Having the solver try a 1 or 2 period problem and then check to see if a
few points of the policy function match the solution from the prior version
of the code is pretty much the best we can do with our model tests.
Your last paragraph is incorrectly premised for the reasons above.
…On Fri, Mar 13, 2020 at 2:33 PM Sebastian Benthall ***@***.***> wrote:
Thanks for explaining all that.
So if I understand correctly...
- T_cycles is the number of periods in a cycle, a property of the
*model*
- cycles is the number (probably infinite or 1) of times the agent
lives through the cycle, for the purpose of the *solver*
- T_sim is the number of *periods* to be simulated when simulate() is
called
And ConsGenIncProcess doesn't use dynamic programming in the solver in
the same way as the other models, but uses simulation internally, with a
1000 period limit.
I've made some changes to #527 <#527>
that set cycles=1 for the tests, but these tests at this point just make
sure the code doesn't break; it doesn't get at the internals at all.
But maybe there is a way to improve the speed of the
ConsGenIncProcessModel solver by introducing a threshold that would
interrupt the solver's internal simulation if the model converges before
1000 periods. It sounds like 50 is sufficient to prove the concept?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#567 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFJDEP5TJGGWNCTNBLTRHJ36XANCNFSM4LGV5CGQ>
.
|
Also, I just noticed that our *documentation* notebooks are in the
/examples folder (at least on the branch I'm working on), and the actual
*examples* have been deleted and replaced with bizarro notebook output. Is
this intended? I thought the whole point of the /examples directory is to
have small, minimal .py files that implement our models?
…On Fri, Mar 13, 2020 at 2:48 PM Matthew White ***@***.***> wrote:
Not quite. Both T_cycle and cycles are instance-level attributes. The
model that an AgentType subclass solves is agnostic to T_cycle and cycles;
the specific instance of that model (a "problem") fills in those
time-defining attributes.
T_sim as an attribute represents the maximum number of periods you would
ever be interested in simulating for this instance. It's used to
initialize the arrays that will track idiosyncratic variable histories of
the attributes named in track_vars. When simulate() is called, it will
simulate for T_sim period by default, but you can also specify an argument
T <= T_sim for it to simulate for. This is useful for doing things like
"simulate 100 periods, then screw with the state of the world, then
simulate 300 more periods".
ConsGenIncProcess uses backward iteration methods to solve the model, just
like all our other solvers. It simply uses simulation to *construct the
grid the problem will be solved on*, in the pre-solution step. The 1000
periods thing is just an approximation to the "long run" distribution of
permanent income. I.e. if a model period is a year, it assumes that after
1000 years, the population distribution of pLvl is roughly distributed the
same as it would after 10,000 years or 10B years.
"Making sure the code doesn't break" is a really, really important test.
Having the solver try a 1 or 2 period problem and then check to see if a
few points of the policy function match the solution from the prior version
of the code is pretty much the best we can do with our model tests.
Your last paragraph is incorrectly premised for the reasons above.
On Fri, Mar 13, 2020 at 2:33 PM Sebastian Benthall <
***@***.***> wrote:
> Thanks for explaining all that.
>
> So if I understand correctly...
>
> - T_cycles is the number of periods in a cycle, a property of the
> *model*
> - cycles is the number (probably infinite or 1) of times the agent
> lives through the cycle, for the purpose of the *solver*
> - T_sim is the number of *periods* to be simulated when simulate() is
> called
>
> And ConsGenIncProcess doesn't use dynamic programming in the solver in
> the same way as the other models, but uses simulation internally, with a
> 1000 period limit.
>
> I've made some changes to #527
> <#527> that set cycles=1 for the
> tests, but these tests at this point just make sure the code doesn't break;
> it doesn't get at the internals at all.
>
> But maybe there is a way to improve the speed of the
> ConsGenIncProcessModel solver by introducing a threshold that would
> interrupt the solver's internal simulation if the model converges before
> 1000 periods. It sounds like 50 is sufficient to prove the concept?
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> <#567 (comment)>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/ADKRAFJDEP5TJGGWNCTNBLTRHJ36XANCNFSM4LGV5CGQ>
> .
>
|
Nevermind, I found it one folder down.
Open up examples/ConsumptionSaving/example_ConsGenIncProcessModel.py ... At
lines 38 and 130 you can add the argument True to the call to solve() if
you want to see it run cycle by cycle. These examples only take about 6-8
seconds to solve; this doesn't seem like a long time to me.
Also, it looks like process_time is giving weird numbers. Can we use
time.time instead?
…On Fri, Mar 13, 2020 at 2:56 PM Matthew White ***@***.***> wrote:
Also, I just noticed that our *documentation* notebooks are in the
/examples folder (at least on the branch I'm working on), and the actual
*examples* have been deleted and replaced with bizarro notebook output. Is
this intended? I thought the whole point of the /examples directory is to
have small, minimal .py files that implement our models?
On Fri, Mar 13, 2020 at 2:48 PM Matthew White ***@***.***> wrote:
> Not quite. Both T_cycle and cycles are instance-level attributes. The
> model that an AgentType subclass solves is agnostic to T_cycle and cycles;
> the specific instance of that model (a "problem") fills in those
> time-defining attributes.
>
> T_sim as an attribute represents the maximum number of periods you would
> ever be interested in simulating for this instance. It's used to
> initialize the arrays that will track idiosyncratic variable histories of
> the attributes named in track_vars. When simulate() is called, it will
> simulate for T_sim period by default, but you can also specify an argument
> T <= T_sim for it to simulate for. This is useful for doing things like
> "simulate 100 periods, then screw with the state of the world, then
> simulate 300 more periods".
>
> ConsGenIncProcess uses backward iteration methods to solve the model,
> just like all our other solvers. It simply uses simulation to *construct
> the grid the problem will be solved on*, in the pre-solution step. The
> 1000 periods thing is just an approximation to the "long run" distribution
> of permanent income. I.e. if a model period is a year, it assumes that
> after 1000 years, the population distribution of pLvl is roughly
> distributed the same as it would after 10,000 years or 10B years.
>
> "Making sure the code doesn't break" is a really, really important test.
> Having the solver try a 1 or 2 period problem and then check to see if a
> few points of the policy function match the solution from the prior version
> of the code is pretty much the best we can do with our model tests.
>
> Your last paragraph is incorrectly premised for the reasons above.
>
> On Fri, Mar 13, 2020 at 2:33 PM Sebastian Benthall <
> ***@***.***> wrote:
>
>> Thanks for explaining all that.
>>
>> So if I understand correctly...
>>
>> - T_cycles is the number of periods in a cycle, a property of the
>> *model*
>> - cycles is the number (probably infinite or 1) of times the agent
>> lives through the cycle, for the purpose of the *solver*
>> - T_sim is the number of *periods* to be simulated when simulate()
>> is called
>>
>> And ConsGenIncProcess doesn't use dynamic programming in the solver in
>> the same way as the other models, but uses simulation internally, with a
>> 1000 period limit.
>>
>> I've made some changes to #527
>> <#527> that set cycles=1 for the
>> tests, but these tests at this point just make sure the code doesn't break;
>> it doesn't get at the internals at all.
>>
>> But maybe there is a way to improve the speed of the
>> ConsGenIncProcessModel solver by introducing a threshold that would
>> interrupt the solver's internal simulation if the model converges before
>> 1000 periods. It sounds like 50 is sufficient to prove the concept?
>>
>> —
>> You are receiving this because you commented.
>> Reply to this email directly, view it on GitHub
>> <#567 (comment)>,
>> or unsubscribe
>> <https://github.com/notifications/unsubscribe-auth/ADKRAFJDEP5TJGGWNCTNBLTRHJ36XANCNFSM4LGV5CGQ>
>> .
>>
>
|
@sbenthall please confer with @MridulS who wrote the excellent |
Keeping to the original question:
So, suppose one wanted to write a test for this functionality, but did not want to wait for it to go through 1000 steps. Maybe there is or could be a way to limit that number of steps to 50, for the purpose of testing. |
As written, no. The idea that 1000 periods is a "long time" in model terms
is hard coded. It's not pulled from an attribute.
That could be changed with 2 lines, but there's not really a point. That
construction step takes 1 second or so, probably much less. So there isn't
really a point in speeding it up for testing purposes. It's literally just
drawing 10,000,000 or so lognormal shocks, executing 10,000,000
multiplications, sorting 10,000 numbers, and pulling a few percentiles.
This is not a burdensome step.
…On Fri, Mar 13, 2020, 7:19 PM Sebastian Benthall ***@***.***> wrote:
Keeping to the original question:
ConsGenIncProcess uses backward iteration methods to solve the model, just
like all our other solvers. It simply uses simulation to
*construct the grid the problem will be solved on*, in the pre-solution
step. The 1000
periods thing is just an approximation to the "long run" distribution of
permanent income. I.e. if a model period is a year, it assumes that after
1000 years, the population distribution of pLvl is roughly distributed the
same as it would after 10,000 years or 10B years.
So, suppose one wanted to write a test for *this* functionality, but did
not want to wait for it to go through 1000 steps.
Maybe there is or could be a way to limit that number of steps to 50, for
the purpose of testing.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#567 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFK4DM7LENYREOCWILTRHK5PHANCNFSM4LGV5CGQ>
.
|
Ah, I was under the impression that this simulation step was the performance bottleneck. The next thing I'll try is performance profiling tools to see what's slowing the solver down. |
How long is the solver taking for you? It should only take a few seconds if
it's running the default dictionary.
Any profiler will tell you that the most expensive line is whichever one
evaluates vPfuncNext. That's where literally 95% of any computation will
be, as it's evaluating a linear interpolation over 1D interpolators for
about 48*56*24 realizations of mNrmNext (if I remember gridsizes in the
default dictionary correctly).
…On Fri, Mar 13, 2020, 9:09 PM Sebastian Benthall ***@***.***> wrote:
Ah, I was under the impression that this simulation step was the
performance bottleneck.
I may be mistaken.
The next thing I'll try is performance profiling tools to see what's
slowing the solver down.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#567 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFPPCKODVDTQDHV4ULDRHLKLTANCNFSM4LGV5CGQ>
.
|
Another use case for the comparator-type tool that @MridulS is working on.
Actually, there are a number of potential tests. These notes derive a number of propositions about the long run distribution of a process like the most important of which is probably: $\sigma^{2}{p} = \left(\frac{\phi}{1-\phi^{2}}\right) \sigma^{2}{\epsilon}$ and another of which is that Actually, it might be better simply to construct this distribution at the outset and then see whether after some number of periods of simulation (50?) the distribution's mean and variance are "close enough" to the analytical values (where "close enough" would be, say, only one chance in 10000 that the mean of the resulting distribution would be farther away than |
We can put in tests like that, but they shouldn't go in
ConsGenIncProcessModel. The point of this model is that (log) permanent
income can follow *any* process, not just p_{t+1} = \gamma + \phi_t +
\epsilon_{t+1} (random walk with drift; replacement/death events also
occur). That special process is covered by ConsIndShockModel, and it
allows us to normalize out permanent income when solving the model. And as
Chris notes, he has derived closed form solutions for the long run
distribution of p_t for that special process. If anywhere, these kind of
tests should be built from that model.
ConsGenIncProcess exists to handle the generic case: P_{t+1} =
SomeFunction(P_t) * \psi_{t+1}. Note the change in notation from lower
(logs) to capital (levels) for permanent income, and the switch from
\epsilon to \psi. If we are completely ignorant about SomeFunction, then
there is no way to know what the long run distribution of P_t or p_t should
be, and thus no tests we should run. There are special cases where this
can be derived, like an AR1 in logs, but there's no generic result.
…On Sat, Mar 14, 2020 at 7:59 AM Christopher Llorracc Carroll < ***@***.***> wrote:
Maybe there is or could be a way to limit that number of steps to 50, for
the purpose of testing.
Another use case for the comparator-type tool that @MridulS
<https://github.com/MridulS> is working on.
So, suppose one wanted to write a test for this functionality, but did not
want to wait for it to go through 1000 steps.
Actually, there are a number of potential tests. These notes
<http://halweb.uc3m.es/esp/Personal/personas/amalonso/esp/TSAtema4.pdf>
derive a number of propositions about the long run distribution of a
process like
$p_{t+1} = \gamma + \phi p_{t} + \epsilon_{t+1}$
the most important of which is probably:
$\sigma^{2}*{p} = \left(\frac{\phi}{1-\phi^{2}}\right) \sigma^{2}*
{\epsilon}$
and another of which is that $p$ itself should be distributed according to
$p \sim N\left(\left(\frac{\gamma}{1-\phi}\right),\sigma^{2}_{p}\right)$
Actually, it might be better simply to construct this distribution at the
outset and then see whether after some number of periods of simulation
(50?) the distribution's mean and variance are "close enough" to the
analytical values (where "close enough" would be, say, only one chance in
10000 that the mean of the resulting distribution would be farther away
than $x$ from its analytical value. If the number of periods is $n$ then
the process of simulating $n$ periods is to add $n$ normally distributed
shocks to the process, and then to do a test of whether after those
shocks the data look like they are still normally distributed according to
the
original, supposedly stationary, distribution. (There are standard tests
for
whether a distribution is normal with a given variance -- I think a
$\chi^{2}$ test.)
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#567 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFKVX2HGIFYTKQUZPR3RHNWRVANCNFSM4LGV5CGQ>
.
|
Running this test file: is taking 62.7 seconds on my machine. |
It looks to me like those tests are equivalent to what's being run in
example_ConsGenIncProcess, but with a shorter simulation period. Those two
examples take 9.2 seconds and 9.5 seconds to solve on my computer, and the
simulation should take about half a second (it takes 10 seconds to simulate
500 periods).
For all of our solvers, I recommend against testing them by solving an
infinite horizon problem. We should set cycles to something like 2 or 5,
and target our tests off of that small-ish model. These are tests of
whether the solver produces the same result after a code change.
…On Sat, Mar 14, 2020 at 9:41 AM Sebastian Benthall ***@***.***> wrote:
How long is the solver taking for you? It should only take a few seconds
if it's running the default dictionary.
Running this test file:
https://github.com/econ-ark/HARK/blob/master/HARK/ConsumptionSaving/tests/test_ConsGenIncProcessModel.py
is taking 62.7 seconds on my machine.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#567 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADKRAFJXURZ67H6LBT3GRLDRHOCQJANCNFSM4LGV5CGQ>
.
|
Because GitHub markdown does not display math, and raw LaTeX code looks terrible, @mnwhite made a guess about what I was probably saying, but it's not what I was actually saying. My math was for the specialization of the general income process that Matt mentions at the end of his comment: log p_{t+1} = \gamma + \phi \log p_{t} + \epsilon_{t+1}. This would be a good default for simulating income dynamics because there is a closed form solution for the steady state distribution (per my math -- copy and paste into a markdown editor that can handle LaTeX and then it becomes readable!).
Actually, for the special process with purely permanent shocks and a positive probability of death, the formula in cstwMPC is for the variance of the square of P; I don't have a formula for the steady state distribution (which is unbounded above but not lognormal). But it would be easy enough to test whether the simulation produces something close to the derived variance. (I did that when I originally derived the formula to make sure my math was right, and confirmed the result). |
Oops, finger slipped and hit "close and comment" rather than just "comment". Reopening. |
The ConsGenIncProcessModel solvers, with default values, are currently the longest running automated tests.
Maybe there is a way to improve the performance of these solvers.
The text was updated successfully, but these errors were encountered: