Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Newborns' death probabilities. #981

Merged
merged 4 commits into from
Jun 14, 2021
Merged

Conversation

Mv77
Copy link
Contributor

@Mv77 Mv77 commented Mar 9, 2021

The current implementation of sim_death finds death probabilities as DiePrb = DiePrb_by_t_cycle[self.t_cycle - 1].

For newborns, t_cycle==0 so this becomes DiePrb_by_t_cycle[-1]. Thus, newborns are getting the last survival probability, which in life-cycle calibrations is usually low. Therefore, many die immediately after being born.

Therefore, I'd say this bug results in the unnecessary deaths of many newborns. If that is not worrying enough, I think it may cause many simulation observations to be wasted, as they are spent on instantly-dying newborns.

I am not the most familiar with t_cycle vs t_age. Could someone more familiar review?

  • Tests for new functionality/models or Tests to reproduce the bug-fix in code.
  • Updated documentation of features that add new functionality.
  • Update CHANGELOG.md with major/minor changes.

@Mv77 Mv77 changed the title Newborn's death probabilities. Newborns' death probabilities. Mar 9, 2021
@mnwhite
Copy link
Contributor

mnwhite commented Mar 9, 2021 via email

@Mv77
Copy link
Contributor Author

Mv77 commented Mar 9, 2021

Thanks!

I was running into the issue yesterday. I was pretty convinced that the problem was happening, at least in the very first period of simulations. Maybe the issue is that initialize_sim() spawns newborns and then sim_one_period -> get_mortality kills them. I'm not sure now.

I'll double check soon and get more evidence.

@mnwhite
Copy link
Contributor

mnwhite commented Mar 9, 2021 via email

@sbenthall
Copy link
Contributor

Might be related to this issue
#499

@Mv77
Copy link
Contributor Author

Mv77 commented Mar 10, 2021

I've double checked.

This is an issue that happens only once. initialize_sim() creates newborns and the first thing that sim_one_period() does is call get_mortality(). That first instance is when this bug can occur.

As you said, it is harmless: a set of newborns is replaced with another. Then the question is whether you want it to happen or not.

The advantages of the proposed fix are:

  • Saves some computational cost of re-simulating a part of the first batch of newborns.
  • Eliminates this counterintuitive behavior.

The drawback:

  • It adds an extra line and its computational cost to something that gets called very often.

If you'd rather not incorporate it, this can be closed.

@Mv77 Mv77 closed this Mar 10, 2021
@Mv77 Mv77 reopened this Mar 10, 2021
@sbenthall
Copy link
Contributor

I would recommend using a performance testing tool to resolve these questions about implementation speed.

https://docs.python.org/3/library/profile.html

As a general point, using a profiler to test for performance bottlenecks would be a great thing to do at this stage. Made #982 for this

@mnwhite
Copy link
Contributor

mnwhite commented Mar 10, 2021 via email

@llorracc
Copy link
Collaborator

For things like this, the best resolution is one that prevents any future wasted HUMAN time, so (a) whatever is less confusing and (b) add a comment so any future @Mv77 will not occupy more than 30 sec grokking it

@sbenthall
Copy link
Contributor

Tests are not passing.

@llorracc
Copy link
Collaborator

@Mv77,

Could you revise the PR to just involve adding a comment or two to the existing code which, if such comment had existed 9 months ago would have prevented you from confusion? (The comment should include a link to this (revised) PR.

@codecov-commenter
Copy link

codecov-commenter commented May 27, 2021

Codecov Report

Merging #981 (680153e) into master (3870441) will increase coverage by 0.09%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #981      +/-   ##
==========================================
+ Coverage   71.67%   71.76%   +0.09%     
==========================================
  Files          63       63              
  Lines        9197     9256      +59     
==========================================
+ Hits         6592     6643      +51     
- Misses       2605     2613       +8     
Impacted Files Coverage Δ
HARK/ConsumptionSaving/ConsIndShockModel.py 85.84% <ø> (ø)
HARK/distribution.py 79.34% <0.00%> (-0.53%) ⬇️
HARK/tests/test_approxDstns.py 100.00% <0.00%> (ø)
HARK/tests/test_distribution.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3870441...680153e. Read the comment docs.

@Mv77
Copy link
Contributor Author

Mv77 commented May 27, 2021

@Mv77,

Could you revise the PR to just involve adding a comment or two to the existing code which, if such comment had existed 9 months ago would have prevented you from confusion? (The comment should include a link to this (revised) PR.

This is done in the last few commits. This PR is ready to merge now.

Sorry for throwing everyone into this rabbit hole.

@sbenthall sbenthall merged commit d5ca850 into econ-ark:master Jun 14, 2021
@Mv77 Mv77 deleted the BugFix/Newborns branch June 22, 2021 21:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants