The change in how distributions are handled seems to have messed simulations up #750

Mv77 · 2020-07-02T20:57:32Z

In my restoration of the CGMPortfolio remark, I think I've run into a couple of issues with RNG in simulations.

This PR deals with the fact that the random seed now has to be passed to the distribution's init method and not the draw method.

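However, my current simulated income paths look like this, which does not seem right:

[image: https://user-images.githubusercontent.com/27739595/86408046-40f93400-bc84-11ea-8536-32e5521708e3.png]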

It looks as if there were individual-level variation in growth rates, but the rates stayed constant over time?

I'm looking into it but the income distribution RNG is more complicated so I wanted to ask if there is some obvious answer to what is going on there.

mnwhite · 2020-07-02T21:02:41Z

Yeah, my best guess is that the RNG changes weren't fully/properly implemented in the models. Specifically, I suspect the same seed is being passed to a new RNG in every single period, so the exact same (underlying) shocks are being given to each index in each period. Let me take a look.


mnwhite · 2020-07-02T21:12:41Z

First check: IndShockConsumerType is simulating correctly, with different shocks being drawn for each agent. Second check: Simulation of the portfolio model is indeed not working correctly-- the example file hits an error when seed is passed. Let me compare how the RNG is set up between the two modules and see if there's an easy fix. This was just an oversight when the new RNG changes were made, but we should also run the other example files to see if their simulations work as intended.


mnwhite · 2020-07-02T21:30:05Z

Ha, I didn't realize this was a PR-- I misread the email I received and thought it was an issue. I made the two line fixes you have here, and the example file now works correctly for me-- I'm not getting the pattern you are.

In example_ConsPortfolioModel.py in your branch, add 'pLvlNow' to track_vars for MyType (whoops, I need to rename that first example) and solve and simulate that type (for speed/laziness, just throw a break here after the simulation so it doesn't do the rest of the examples). Then do plt.plot(MyType.history['pLvlNow'][:,0:5], '.') and you'll see some example permanent income paths like the figure you have above; they move around exactly as you'd expect.

What code are you running that gives that bad picture?

Mv77 · 2020-07-02T21:55:07Z

I just checked the example file as you suggested and it works, pLvls do vary erratically as expected.

The strange picture comes from running this file: https://github.com/Mv77/REMARK/blob/fixCGMPortfolio/REMARKs/CGMPortfolio/Code/Python/Simulations/FewAgents.py, which used to work before. Seeing that the example works now makes me think it might be a particular issue, maybe with the calibration.

I'll have a dig at it after dinner. (this is the last file that needs to work before the CGM remark is completely restored)

mnwhite · 2020-07-02T23:16:40Z

Ok, I'll try to look at that late tonight or early tomorrow morning. Calibration shouldn't be the issue, I don't think.


Mv77 · 2020-07-02T23:43:32Z

I'm getting closer to the bug. It seems to be the "draw_events" method of the income_dstn that is getting used in line 2027 of ConsIndShockModel.py. Here are the events being drawn for 5 different agents (columns) through their lives (rows)

Indeed, the same permanent shock is being drawn every period for a given agent. I will keep looking to fix it myself but wanted to share in case you do look at it.
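A quick way to confirm this symptom numerically, rather than by eye, is to check whether any agent's shock column is constant over time. The sketch below is only an illustration: `agents_with_constant_shocks` is a hypothetical helper (not part of HARK), and the arrays are synthetic stand-ins for a periods-by-agents table of drawn shocks.

```python
import numpy as np

def agents_with_constant_shocks(shocks, tol=0.0):
    """Return indices of columns (agents) whose value never changes across rows (periods).

    Hypothetical helper for illustration; `shocks` is a (periods, agents) array.
    """
    shocks = np.asarray(shocks, dtype=float)
    spread = shocks.max(axis=0) - shocks.min(axis=0)
    return np.flatnonzero(spread <= tol)

# Synthetic data standing in for simulated permanent shocks (10 periods, 5 agents).
rng = np.random.RandomState(42)
healthy = rng.lognormal(size=(10, 5))      # shocks vary over time for every agent
broken = np.tile(healthy[0], (10, 1))      # each agent repeats its first shock forever

print(agents_with_constant_shocks(healthy))
print(agents_with_constant_shocks(broken))
```

An empty result suggests the shocks vary over time; a result listing every agent matches the pattern described above.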

My hypothesis of why example_ConsPortfolioModel.py works and this one does not is that in the example, the same distribution object is being used to draw events for every simulation period. In my file, being a life-cycle problem, there is a list of distribution objects, one for every period. Maybe the seed of each object is being set to the same number. I'll investigate!

Mv77 · 2020-07-03T00:01:48Z

Yep. All are being created with the default seed (0). Fixing now!

mnwhite · 2020-07-03T00:05:02Z

Your hypothesis might be correct, but this feature isn't showing up in the lifecycle example in example_ConsIndShock.py ... Permanent income shocks jump around for each agent just as you'd expect.

mnwhite · 2020-07-03T00:14:05Z

Same thing with the portfolio lifecycle example in example_ConsPortfolioModel.py (with completely cockamamie parameters)-- income shocks draw as expected. This is weird.

mnwhite · 2020-07-03T00:16:00Z

But then why is this only happening for that one case? It's a one line fix when the IncomeDstn elements are created (which it looks like you're doing now), but this should come up in the example files as well as the CGM REMARK file.

Mv77 · 2020-07-03T00:29:15Z

My last commit seems to fix the issue.

To the best of my understanding, the issue arose when combining the transitory and permanent distributions of the income process using 'combineIndepDstns'. The method creates the combined distribution without a seed, so it just defaults to 0. I added the option to pass a seed and used it.

I tested both my file and the example and they work.
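The shape of this fix can be sketched with a toy model. Everything below is hypothetical and written only for illustration: `ToyDiscreteDstn` and `toy_combine_indep` are stand-ins, not HARK's actual classes or signatures. The assumption being illustrated is that a distribution owns an RNG seeded at construction, and that a combining function which does not forward a seed silently falls back to the default of 0.

```python
import numpy as np

class ToyDiscreteDstn:
    """Hypothetical discrete distribution that owns its RNG (seeded at construction)."""

    def __init__(self, pmf, X, seed=0):
        self.pmf = np.asarray(pmf, dtype=float)
        X = np.asarray(X, dtype=float)
        self.X = X[:, None] if X.ndim == 1 else X   # one row per outcome
        self.seed = seed
        self.RNG = np.random.RandomState(seed)

    def draw(self, N):
        idx = self.RNG.choice(self.pmf.size, size=N, p=self.pmf)
        return self.X[idx]

def toy_combine_indep(a, b, seed=0):
    """Combine two independent toy distributions; `seed` is forwarded to the result."""
    pmf = np.outer(a.pmf, b.pmf).ravel()
    X = np.array([np.concatenate([xa, xb]) for xa in a.X for xb in b.X])
    return ToyDiscreteDstn(pmf, X, seed=seed)

perm = ToyDiscreteDstn([0.5, 0.5], [0.9, 1.1])
tran = ToyDiscreteDstn([0.25, 0.75], [0.3, 1.2])

# Without forwarding a seed, every combined object starts from the same RNG state.
unseeded = [toy_combine_indep(perm, tran) for _ in range(3)]
# With a distinct seed per period, the combined objects draw independently.
seeded = [toy_combine_indep(perm, tran, seed=s) for s in (11, 22, 33)]

print([d.seed for d in unseeded], [d.seed for d in seeded])
```

In this toy, giving each period's combined distribution its own seed is what breaks the period-to-period repetition.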

Mv77 · 2020-07-03T00:33:03Z

> But then why is this only happening for that one case? It's a one line fix when the IncomeDstn elements are created (which it looks like you're doing now), but this should come up in the example files as well as the CGM REMARK file.

The example file (at least the portfolio one) works because T_cycle = 1. So there is a single distribution object. The seed starts at 0, HARK takes a draw, and next period it takes a draw from the same object. So the draws are different.

The life-cycle problem from my file creates a list of 60(?) incomeDstn's, all initialized with seed 0. So HARK takes a draw from the 1st element when t_cycle=0, and from the second when t_cycle = 1. Since both objects start out with the same seed, the draws are the same?
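The mechanism described here can be reproduced with plain NumPy. This is a minimal sketch using `numpy.random.RandomState` directly rather than HARK's distribution objects; the agent and period counts are arbitrary.

```python
import numpy as np

n_agents, n_periods = 4, 3

# Life-cycle style: one RNG per period, all created with the same seed.
per_period = [np.random.RandomState(0) for _ in range(n_periods)]
lifecycle_draws = np.array([rng.normal(size=n_agents) for rng in per_period])

# T_cycle = 1 style: a single RNG reused every period, so its state advances.
shared = np.random.RandomState(0)
shared_draws = np.array([shared.normal(size=n_agents) for _ in range(n_periods)])

# Rows are periods, columns are agents.
print(lifecycle_draws)   # every row repeats the first: each agent keeps one shock
print(shared_draws)      # rows differ: shocks change from period to period
```

Each freshly seeded generator starts from the same internal state, so its first `n_agents` draws are identical; the reused generator has already consumed those draws by the second period.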

mnwhite · 2020-07-03T00:37:12Z

I get all that. But I ran other *lifecycle* examples and they simulated correctly. That's what's confusing.


mnwhite · 2020-07-03T00:40:14Z

Anyway, your code changes look correct, and I'm glad this fixed the problem. I'll need to adjust the simulation-based targets for some tests, as the seeds have changed, then I'll merge this branch.

llorracc · 2020-07-03T01:04:15Z

> Since both objects start out with the same seed, the draws are the same?

Definitely. This is one of those things that requires a pretty deep understanding of "random" number generation. I feel (fairly strongly) that the default way of doing things should be with a seed based on the time. My preference would be the Unix default clock of "seconds since 1970", but I'm not sure if that works on Windows machines; probably there's a convention about how to construct a platform-independent time-varying seed. Like, whatever the system believes to be the current moment of time? Sophisticated and experienced people will know that they can set the seed to the same thing every time if they want to do that. But the default should NOT be that random number sequences are actually all the same (because they start from a seed of 0), because that can lead to exactly these kinds of confusions. The default should be that things work the way a naif expects, and the sophisticate will know how to change things for their purposes. (I'm working hard on substituting "they/their" for "him/his" - in principle I'm completely in favor of there being a gender-neutral possessive, but that fights against my visceral reaction that "they" is improper -- which was true, but should change.)
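For reference, a time-based default of the kind described could look like the sketch below. `make_rng` is a hypothetical helper, not something in HARK; it relies on Python's `time.time()`, which returns seconds since the epoch on Windows as well as on Unix-like systems.

```python
import time
import numpy as np

def make_rng(seed=None):
    """Return (rng, seed_used).

    Hypothetical helper: if no seed is given, fall back to whole seconds since
    the Unix epoch; passing an explicit seed keeps runs reproducible.
    """
    if seed is None:
        seed = int(time.time())
    return np.random.RandomState(seed), seed

rng_default, seed_used = make_rng()      # differs from run to run
rng_fixed, _ = make_rng(seed=7)          # identical on every run
print(seed_used)
```

One caveat under this assumption: the clock here has one-second resolution, so several generators created within the same second would still share a seed, which is the situation a list of per-period distributions built in a loop would be in.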


-- - Chris Carroll

Mv77 · 2020-07-03T01:39:37Z

> I get all that. But I ran other lifecycle examples and they simulated correctly. That's what's confusing.

Got it!

In the example, the default parameters set mortality rates quite high, to illustrate the life-cycle point. So the number of alive agents changes a lot, and the varying number of draws is what generates variation in the shocks.

You can check this by setting init_lifecycle['LivPrb'] = [1]*10 in the life cycle example.

That one was hard to catch 😩 .
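The masking effect can be reproduced in a toy setting. The sketch below is not HARK's simulation code: `simulate_shocks` is a hypothetical function that assumes one shock RNG per age, all created with the same seed, with agents who die being replaced by newborns at age 0.

```python
import numpy as np

def simulate_shocks(n_agents, n_periods, live_prb, dstn_seed=0, death_seed=123):
    """Toy life-cycle simulation with one shock RNG per age, all sharing one seed.

    Returns an (n_periods, n_agents) array of shocks. Hypothetical illustration only.
    """
    rngs = [np.random.RandomState(dstn_seed) for _ in range(n_periods)]
    death_rng = np.random.RandomState(death_seed)   # separate RNG for mortality
    age = np.zeros(n_agents, dtype=int)
    shocks = np.empty((n_periods, n_agents))
    for t in range(n_periods):
        for a in np.unique(age):
            these = age == a
            # Only agents of this age draw from this age's RNG.
            shocks[t, these] = rngs[a].normal(size=these.sum())
        dies = death_rng.uniform(size=n_agents) > live_prb
        age = np.where(dies, 0, age + 1)             # the dead are replaced by newborns
    return shocks

no_death = simulate_shocks(n_agents=50, n_periods=20, live_prb=1.0)
with_death = simulate_shocks(n_agents=50, n_periods=20, live_prb=0.5)

spread_no_death = no_death.max(axis=0) - no_death.min(axis=0)
spread_with_death = with_death.max(axis=0) - with_death.min(axis=0)
print(spread_no_death.max(), spread_with_death.max())
```

With certain survival everyone shares the same age, each age's RNG is used exactly once from its initial state, and every agent repeats the same shock. With mortality, cohort sizes and positions shift from period to period, so the repeated seeds no longer line up and the shocks appear to vary.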

llorracc · 2020-07-03T02:09:16Z

> That one was hard to catch

and a perfect example of why the default should be to always initialize with a time-based seed. Like the Spanish Inquisition, nobody expects that the results of their simulations will be importantly different if the probability of living is 0.9999 instead of 1.000

PS. But the build is still failing ...

mnwhite · 2020-07-03T02:28:30Z

Ahhhh yes, I forgot that the survival probabilities were silly. Great work. Chris: the survival probabilities are in the range of 0.1 to 0.9; there's no discontinuity at 1. The build is failing because some test targets need to be adjusted. Seeds have changed on this commit, so simulation-based tests have changed results. I *don't* think we should have seed default to a time-based value. We want our code to produce the same results every time it's run.


mnwhite · 2020-07-03T14:45:34Z

Tests will pass after latest commit, will merge. No release notes are necessary, as this just makes tiny adjustments to account for RNG changes in Seb's prior PR; release notes for that PR would encompass these changes.

That said, we might want to look through other RNG-using methods and make sure these kinds of changes are made there. In particular, the medical shocks model and the Markov shocks.

llorracc · 2020-07-03T15:24:07Z

On Thu, Jul 2, 2020 at 10:28 PM Matthew N. White wrote:

Ahhhh yes, I forgot that the survival probabilities were silly. Great work. Chris: the survival probabilities are in the range of 0.1 to 0.9; there's no discontinuity at 1. The build is failing because some test targets need to be adjusted. Seeds have changed on this commit, so simulation-based tests have changed results. I *don't* think we should have seed default to a time-based value. We want our code to produce the same results every time it's run.

Well, that's the counterargument. Problem is, very often you will not get the same result from one run to the next because it is incredibly easy to make some change that you don't realize will change the order of draws or the number or both. I'm willing to put this up to a vote as an EEP, and abide by the consensus of the rest of you.
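The point about the order and number of draws can be illustrated with a small sketch. `run` is a hypothetical function standing in for a simulation; the only difference between the two runs is one extra, unrelated draw made before the shocks.

```python
import numpy as np

def run(extra_draw):
    """Draw three shocks from a fixed-seed RNG, optionally after one unrelated draw."""
    rng = np.random.RandomState(0)
    if extra_draw:
        rng.uniform()   # e.g. a newly added mortality or initialization draw
    return rng.normal(size=3)

baseline = run(extra_draw=False)
repeat = run(extra_draw=False)
shifted = run(extra_draw=True)

print(np.array_equal(baseline, repeat))    # same code, same seed: same shocks
print(np.array_equal(baseline, shifted))   # one extra draw shifts everything after it
```

A fixed seed makes an unchanged program reproducible, but any code change that alters how many values are consumed before a given draw changes that draw.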


Fix adjust and Risky

3550c7e

Set seed of combined dstns

213f3d0

Mv77 mentioned this pull request Jul 3, 2020

Fix cgm portfolio econ-ark/REMARK#68

Merged

Adjust test targets because of seed change

dddb1a3

Mateo's fixes to the RNG seeding (obviously) changed seeds, so some simulation-based tests had their targets changed.

mnwhite merged commit d302243 into econ-ark:master Jul 3, 2020
