Automated tests for functionality in DemARKs #543

Closed
sbenthall opened this issue Feb 24, 2020 · 18 comments

@sbenthall
Contributor

The DemARKs take a long time to execute.

But smaller tests of their functionality could be set up as automated tests in HARK.

That would increase test coverage in a meaningful way (i.e., we could catch more errors that might break DemARKs before they are committed).

@sbenthall sbenthall self-assigned this Feb 24, 2020
@sbenthall
Contributor Author

sbenthall commented Feb 26, 2020

These are all the DemARKs:

  • Alternative-Combos-Of-Parameter-Values.py
  • ChangeLiqConstr.py
  • Chinese-Growth.py
  • ConsPortfolioModelDoc.py
  • DCEGM-Upper-Envelope.py
  • DiamondOLG.py
  • FisherTwoPeriod.py
  • GenIncProcessModel.py
  • Gentle-Intro-To-HARK-Buffer-Stock-Model.py
  • Gentle-Intro-To-HARK-PerfForesightCRRA.py
  • HoweWeSolveIndShockConsumerType.py
  • IncExpectationExample.py
  • IndShockConsumerType.py
  • KeynesFriedmanModigliani.py
  • KinkedRconsumerType.py
  • KrusellSmith.py
  • LifecycleModelExample.py
  • Micro-and-Macro-Implications-of-Very-Impatient-HHs.py
  • MPC-Out-of-Credit-vs-MPC-Out-of-Income.py
  • Nondurables-During-Great-Recession.py
  • PerfForesightConsumerType.py
  • Structural-Estimates-From-Empirical-MPCs-Fagereng-et-al.py
  • TractableBufferStock-Interactive.py
  • Uncertainty-and-the-Saving-Rate.py

@sbenthall
Contributor Author

Oh, whoops--several of these have been moved to examples/ already.
However, this is all the more reason to have automated tests for them--it's functionality explicitly supported by HARK.

@sbenthall
Contributor Author

There is already a test based on BufferStock.

@llorracc
Collaborator

llorracc commented Feb 27, 2020 via email

@MridulS
Member

MridulS commented Feb 27, 2020

Had a script which did the timing.

Notebook                                                    Time (s)

Structural-Estimates-From-Empirical-MPCs-Fagereng-et-al.py  169.72158694267273
Nondurables-During-Great-Recession.py                       82.00251603126526
DCEGM-Upper-Envelope.py                                     4.082727909088135
IndShockConsumerType.py                                     3.7925097942352295
TractableBufferStock-Interactive.py                         1.7823729515075684
GenIncProcessModel.py                                       97.28985691070557
DiamondOLG.py                                               3.757750988006592
KeynesFriedmanModigliani.py                                 6.499566078186035
Micro-and-Macro-Implications-of-Very-Impatient-HHs.py       44.61026406288147
IncExpectationExample.py                                    440.81260800361633
Gentle-Intro-To-HARK-Buffer-Stock-Model.py                  4.099589824676514
KinkedRconsumerType.py                                      4.050242900848389
KrusellSmith.py                                             568.4164762496948
ConsPortfolioModelDoc.py                                    5.891296148300171
ChangeLiqConstr.py                                          1.9524343013763428
PerfForesightConsumerType.py                                2.528262138366699
Gentle-Intro-To-HARK-PerfForesightCRRA.py                   2.402021884918213
Chinese-Growth.py                                           149.02293300628662
Alternative-Combos-Of-Parameter-Values.py                   8.333052158355713
LifecycleModelExample.py                                    4.598069906234741
HoweWeSolveIndShockConsumerType.py                          1.5691490173339844
Uncertainty-and-the-Saving-Rate.py                          279.39345693588257
FisherTwoPeriod.py                                          1.9162158966064453
MPC-Out-of-Credit-vs-MPC-Out-of-Income.py                   1.6958949565887451
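
The timing script itself is not shown in this thread. As a rough sketch of how such numbers might be collected, the following runs each script in a fresh interpreter and records wall-clock seconds. The function name `time_scripts` and the `notebooks` directory are hypothetical, chosen only for illustration.

```python
import subprocess
import sys
import time
from pathlib import Path


def time_scripts(script_paths):
    """Run each script in a fresh interpreter and return {file name: seconds}."""
    timings = {}
    for path in script_paths:
        path = Path(path)
        start = time.perf_counter()
        # check=False: a failing script is still timed instead of aborting the run.
        subprocess.run(
            [sys.executable, path.name],
            cwd=path.parent,  # run from the script's own directory
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        timings[path.name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    # Assumption: the script-format notebooks live in a local "notebooks" directory.
    scripts = sorted(Path("notebooks").glob("*.py"))
    for name, seconds in time_scripts(scripts).items():
        print(name, seconds)
```

Wall-clock timings like these include interpreter startup and imports, so they are only a rough guide to which notebooks are expensive.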

@sbenthall
Contributor Author

Thanks @MridulS that's super helpful

@llorracc
Collaborator

llorracc commented Feb 27, 2020 via email

@sbenthall
Contributor Author

I am going to make tests based on the DemARKs on a case-by-case basis, roughly in order of ascending execution time, with ties broken alphabetically.

Writing an automated test suite is not just a matter of copy-and-paste.
I'm going to try to design them as high quality tests. That involves:

  • separating different functionality into different tests
  • checking a variety of test cases for a unit of functionality
  • trying to get good overall coverage of the functionality
  • reducing the amount of time it takes to run the test, while maintaining the functional coverage

As there are a lot of DemARKs, I expect to make progress on this gradually over time.

But my goal here is to put in place a high quality test suite that covers the functionality of the code without taking an excessive amount of time.
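
To illustrate the design points above, here is a minimal `unittest` sketch. It does not call HARK. `crra_utility` and `crra_marginal_utility` are hypothetical stand-ins for two separate units of library functionality, so that the example stays self-contained.

```python
import math
import unittest


def crra_utility(c, rho):
    """Hypothetical stand-in for a library function: CRRA utility of consumption c."""
    if rho == 1.0:
        return math.log(c)  # log utility is the rho -> 1 limit
    return c ** (1.0 - rho) / (1.0 - rho)


def crra_marginal_utility(c, rho):
    """Hypothetical stand-in for a second, separate unit of functionality."""
    return c ** (-rho)


class TestUtility(unittest.TestCase):
    def test_utility_cases(self):
        # Several (c, rho, expected) cases for one unit of functionality.
        cases = [(1.0, 2.0, -1.0), (2.0, 2.0, -0.5), (1.0, 1.0, 0.0), (4.0, 0.5, 4.0)]
        for c, rho, expected in cases:
            with self.subTest(c=c, rho=rho):
                self.assertAlmostEqual(crra_utility(c, rho), expected)


class TestMarginalUtility(unittest.TestCase):
    # Different functionality goes in a different test.
    def test_marginal_utility_cases(self):
        cases = [(1.0, 2.0, 1.0), (2.0, 2.0, 0.25), (4.0, 0.5, 0.5)]
        for c, rho, expected in cases:
            with self.subTest(c=c, rho=rho):
                self.assertAlmostEqual(crra_marginal_utility(c, rho), expected)
```

Saved to a file, tests like these can be run with `python -m unittest`. Each case is cheap, so coverage grows without adding much run time.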

@MridulS
Member

MridulS commented Feb 27, 2020 via email

@llorracc
Collaborator

llorracc commented Feb 27, 2020 via email

@sbenthall
Contributor Author

One other part of my approach that I didn't mention, @llorracc, is that if I see that a DemARK is using functionality that is already tested, I won't write a test for it.

Basically, I'm just aiming to apply common sense and software engineering best practices here.

@MridulS I see what you are saying about unit, functional, and more advanced test suites, and thanks for pointing towards that example you wrote. I think your point about how writing good tests can lead to bug discovery is right on. We don't know what bugs may be in the existing code because we don't have tests covering all the functionality!

@sbenthall
Contributor Author

FisherTwoPeriod is a cool demo of Jupyter widgets but does not appear to use any currently uncovered HARK library features. I'll check it off.

@sbenthall
Contributor Author

As an example of how to speed up the tests:

  • GenIncProcessModel.py, with a runtime of 97 seconds, has a simulation with T_sim = 500
  • Lowering T_sim to a smaller value reduces the test time dramatically while covering the same functionality.
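
A toy sketch of the idea in the bullets above, not using HARK: `simulate_wealth` is a hypothetical stand-in for a simulation whose cost scales with the number of simulated periods. Only the name `T_sim` comes from the DemARK. The test calls the same code path with a short horizon.

```python
import random


def simulate_wealth(T_sim, agent_count=100, seed=0):
    """Toy stand-in for a simulation loop; returns mean wealth for each period."""
    rng = random.Random(seed)
    wealth = [1.0] * agent_count
    history = []
    for _ in range(T_sim):
        # The same per-period code runs whether T_sim is 5 or 500.
        wealth = [0.9 * w + rng.uniform(0.0, 1.0) for w in wealth]
        history.append(sum(wealth) / agent_count)
    return history


def test_simulation_runs_with_short_horizon():
    # A short horizon exercises the loop body at a fraction of the cost.
    history = simulate_wealth(T_sim=5)
    assert len(history) == 5
    assert all(h > 0 for h in history)
```

One caveat with this approach: a short horizon checks that the simulation code runs, but it may not be enough for assertions that depend on the simulation having converged.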

@sbenthall
Contributor Author

DemARKs that depend on a lot of custom code are not appropriate for being turned into automated tests for the core libraries.
I'm going to check off some of these. Current examples: Chinese-Growth and IncExpectationExample.

@sbenthall
Contributor Author

Uncertainty-and-the-Saving-Rate looks to be a demonstration of cstwMPC functionality.
The cstwMPC code is slated for some heavy refactoring soon, and parts of it may be removed from the HARK library. See #334 #449 #522

As @llorracc has ownership over that code at the moment, I'll remove automated testing of it from the scope of this issue.

@sbenthall
Contributor Author

The KrusellSmith case is complicated by the fact that the core classes don't yet have functioning default behavior: #557

Checking it off the list for now. That means all DemARKs have either been ticketed, tested in #547, or exempted.

@llorracc
Collaborator

llorracc commented Mar 6, 2020 via email

@llorracc
Collaborator

llorracc commented Mar 13, 2020

> FisherTwoPeriod is a cool demo of Jupyter widgets but does not appear to use any currently uncovered HARK library features. I'll check it off.

@sbenthall, yes. A few of the items in DemARK are basically just lecture notes for my first-year course. Almost all of them could in principle be modified to use HARK, and I hope eventually to do that, but for the moment several of them are just plain Jupyter/Python. So, you are right to exclude them.
