
Improve speed of ConsGenIncProcessModel solvers #567

Closed
sbenthall opened this issue Mar 12, 2020 · 19 comments

Comments

@sbenthall
Contributor

The ConsGenIncProcessModel solvers, with default values, are currently the longest running automated tests.

Maybe there is a way to improve the performance of these solvers.

@mnwhite
Contributor

mnwhite commented Mar 12, 2020 via email

@sbenthall
Contributor Author

Hmmm. OK, I'm trying that.

So, with cycles=1, it looks like the solver does not make much progress toward converging on a solution.

With cycles=5, I get errors in the updatepLvlGrid method, because of an assertion that cycles equals either 0 or 1:

assert False, "Can only handle cycles=0 or cycles=1!"

@sbenthall
Contributor Author

I guess I'll go with cycles=1.

I wonder if you could help me understand what's going on here though.
Why does cycles=1 mean that the model is a lifecycle model?

@mnwhite
Contributor

mnwhite commented Mar 13, 2020 via email

@sbenthall
Contributor Author

Thanks for explaining all that.

So if I understand correctly...

  • T_cycles is the number of periods in a cycle, a property of the model
  • cycles is the number (probably infinite or 1) of times the agent lives through the cycle, for the purpose of the solver
  • T_sim is the number of periods to be simulated when simulate() is called

And ConsGenIncProcess doesn't use dynamic programming in the solver in the same way as the other models, but uses simulation internally, with a 1000 period limit.

I've made some changes to #527 that set cycles=1 for the tests, but these tests at this point just make sure the code doesn't break; it doesn't get at the internals at all.

But maybe there is a way to improve the speed of the ConsGenIncProcessModel solver by introducing a threshold that would interrupt the solver's internal simulation if the model converges before 1000 periods. It sounds like 50 is sufficient to prove the concept?
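A minimal sketch of that early-stopping idea, in plain Python rather than HARK code. It assumes an AR(1) process for log permanent income and a hypothetical helper, `simulate_until_stable`; the parameter names, defaults, tolerance, and stopping rule are all illustrative assumptions, not the solver's actual pre-solution step.

```python
import random
import statistics


def simulate_until_stable(n_agents=2000, max_periods=1000, tol=0.01,
                          phi=0.9, sigma=0.1, seed=0):
    """Simulate log permanent income; stop once its spread stops changing.

    Every name and default here is hypothetical; this is not HARK's API.
    """
    rng = random.Random(seed)
    p = [0.0] * n_agents  # every agent starts at log pLvl = 0
    prev_sd = 0.0
    for t in range(1, max_periods + 1):
        # One period of the assumed AR(1) process: p' = phi * p + shock
        p = [phi * x + rng.gauss(0.0, sigma) for x in p]
        sd = statistics.pstdev(p)
        # Stop when the cross-sectional std. dev. changed by less than tol (relative)
        if prev_sd > 0.0 and abs(sd - prev_sd) / prev_sd < tol:
            return t, sd
        prev_sd = sd
    return max_periods, prev_sd


periods_used, final_sd = simulate_until_stable()
print(periods_used, round(final_sd, 3))
```

The stopping rule is deliberately crude: it watches a single moment, so it can stop slightly before the distribution has fully settled. A stricter version might compare several percentiles of the distribution instead.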

@mnwhite
Contributor

mnwhite commented Mar 13, 2020 via email

@mnwhite
Contributor

mnwhite commented Mar 13, 2020 via email

@mnwhite
Contributor

mnwhite commented Mar 13, 2020 via email

@llorracc
Collaborator

@sbenthall please confer with @MridulS who wrote the excellent configurator.py tool for trying out various configurations of parameter values for the CGMPortfolio REMARK. This was intended as a prototype for a more general purpose tool that I hope we can construct that will define a standard procedure for configuring and running our models. That way, there could be a special configuration for DemARK testing that, for example, had just a 5 or 10 period configuration for ConsGenIncProcessModel instead of its default. We could have special "error-testing" configurations for all DemARKs (and for that matter REMARKs) that would be constructed to run fast while still doing at least a bit of a workout of the code.

@sbenthall
Contributor Author

Keeping to the original question:

ConsGenIncProcess uses backward iteration methods to solve the model, just like all our other solvers. It simply uses simulation to construct the grid the problem will be solved on, in the pre-solution step. The 1000 periods thing is just an approximation to the "long run" distribution of permanent income. I.e. if a model period is a year, it assumes that after 1000 years, the population distribution of pLvl is roughly distributed the same as it would after 10,000 years or 10B years.

So, suppose one wanted to write a test for this functionality, but did not want to wait for it to go through 1000 steps.

Maybe there is or could be a way to limit that number of steps to 50, for the purpose of testing.

@mnwhite
Contributor

mnwhite commented Mar 14, 2020 via email

@sbenthall
Contributor Author

Ah, I was under the impression that this simulation step was the performance bottleneck.
I may be mistaken.

The next thing I'll try is performance profiling tools to see what's slowing the solver down.
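One way to do this with only the standard library is `cProfile` together with `pstats`. The sketch below profiles a stand-in function, `slow_solver_stub`, which is a hypothetical placeholder for the real solve call; both helper functions are invented for illustration.

```python
import cProfile
import io
import pstats


def build_grid(n):
    # Stand-in for a cheap pre-solution step (hypothetical)
    return [i / n for i in range(n)]


def slow_solver_stub(n=200):
    # Stand-in for the expensive part: a nested loop over the grid (hypothetical)
    grid = build_grid(n)
    return sum(x * y for x in grid for y in grid)


profiler = cProfile.Profile()
profiler.runcall(slow_solver_stub)

# Collect the five entries with the largest cumulative time into a string
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time shows which top-level step dominates; sorting by `"tottime"` instead points at the individual functions where the time is actually spent.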

@mnwhite
Contributor

mnwhite commented Mar 14, 2020 via email

@llorracc
Collaborator

Maybe there is or could be a way to limit that number of steps to 50, for the purpose of testing.

Another use case for the comparator-type tool that @MridulS is working on.

So, suppose one wanted to write a test for this functionality, but did not want to wait for it to go through 1000 steps.

Actually, there are a number of potential tests. These notes derive a number of propositions about the long run distribution of a process like

$p_{t+1} = \gamma + \phi p_{t} + \epsilon_{t+1}$

the most important of which is probably:

$\sigma^{2}_{p} = \left(\frac{1}{1-\phi^{2}}\right) \sigma^{2}_{\epsilon}$

and another of which is that $p$ itself should be distributed according to

$p \sim N\left(\left(\frac{\gamma}{1-\phi}\right),\sigma^{2}_{p}\right)$

Actually, it might be better simply to construct this distribution at the outset and then see whether, after some number of periods of simulation (50?), the distribution's mean and variance are "close enough" to the analytical values (where "close enough" would mean, say, only one chance in 10000 that the mean of the resulting distribution would be farther than $x$ from its analytical value). If the number of periods is $n$, then simulating $n$ periods amounts to adding $n$ normally distributed shocks to the process, and then testing whether after those shocks the data still look like they are normally distributed according to the original, supposedly stationary, distribution. (There are standard tests for whether a distribution is normal with a given variance -- I think a $\chi^{2}$ test.)
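A minimal sketch of such a check in plain Python, assuming the AR(1) process above with normal shocks. Agents are drawn from the analytical stationary distribution, with mean $\gamma/(1-\phi)$ and the standard AR(1) stationary variance $\sigma^{2}_{\epsilon}/(1-\phi^{2})$, then simulated for 50 periods, and the sample moments are compared with the analytical ones. The parameter values, sample size, and tolerance bands are illustrative assumptions, and a normal approximation stands in for the exact $\chi^{2}$ test of the variance.

```python
import math
import random
import statistics

gamma, phi, sigma_eps = 0.02, 0.9, 0.1  # illustrative parameter values
n_agents, n_periods = 10000, 50

# Analytical stationary moments of p' = gamma + phi * p + eps
mean_p = gamma / (1.0 - phi)
var_p = sigma_eps ** 2 / (1.0 - phi ** 2)

rng = random.Random(12345)
# Start the population at the (supposedly) stationary distribution
p = [rng.gauss(mean_p, math.sqrt(var_p)) for _ in range(n_agents)]
for _ in range(n_periods):
    p = [gamma + phi * x + rng.gauss(0.0, sigma_eps) for x in p]

sim_mean = statistics.fmean(p)
sim_var = statistics.pvariance(p)

# Standard errors of the sample mean and, for normal data, the sample variance
se_mean = math.sqrt(var_p / n_agents)
se_var = var_p * math.sqrt(2.0 / (n_agents - 1))

# About 3.9 standard errors is roughly a 1-in-10000 two-sided band
z = 3.9
mean_ok = abs(sim_mean - mean_p) < z * se_mean
var_ok = abs(sim_var - var_p) < z * se_var
print(mean_ok, var_ok)
```

Because the population starts at the stationary distribution, any drift in the mean or variance after 50 periods would indicate a problem in the simulation step rather than slow convergence.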

@mnwhite
Contributor

mnwhite commented Mar 14, 2020 via email

@sbenthall
Contributor Author

How long is the solver taking for you? It should only take a few seconds if it's running the default dictionary.

Running this test file:
https://github.com/econ-ark/HARK/blob/master/HARK/ConsumptionSaving/tests/test_ConsGenIncProcessModel.py

is taking 62.7 seconds on my machine.

@mnwhite
Contributor

mnwhite commented Mar 14, 2020 via email

@sbenthall sbenthall added this to the 1.0.0 milestone Mar 14, 2020
@llorracc
Collaborator

Because GitHub markdown does not display math, and raw LaTeX code looks terrible, @mnwhite made a guess about what I was probably saying, but it's not what I was actually saying. My math was for the specialization of the general income process that Matt mentions at the end of his comment: log p_{t+1} = \gamma + \phi \log p_{t} + \epsilon_{t+1}. This would be a good default for simulating income dynamics because there is a closed form solution for the steady state distribution (per my math -- copy and paste into a markdown editor that can handle LaTeX and then it becomes readable!).

And as Chris notes, he has derived closed form solutions for the long run distribution of p_t for that special process.

Actually, for the special process with purely permanent shocks and a positive probability of death, the formula in cstwMPC is for the variance of the square of P; I don't have a formula for the steady state distribution (which is unbounded above but not lognormal). But it would be easy enough to test whether the simulation produces something close to the derived variance. (I did that when I originally derived the formula to make sure my math was right, and confirmed the result).

@llorracc llorracc reopened this Mar 14, 2020
@llorracc
Collaborator

Oops, finger slipped and hit "close and comment" rather than just "comment". Reopening.
