expose_split exposures different than expose_cy when truncated #46

Closed
MatthewCaseres opened this issue May 31, 2024 · 2 comments · Fixed by #47

Comments

MatthewCaseres commented May 31, 2024

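library(actxps)
library(dplyr)

# Calendar-year exposures, truncated by the study end date
df1 <- toy_census %>%
  filter(pol_num == 1) %>%
  expose_cy(
    start_date = "2007-06-15",
    end_date = "2019-02-27",
    target_status = "Surrender"
  ) %>%
  select(pol_num, cal_yr, cal_yr_end, term_date, issue_date, exposure, status)

# Split the same records; exposure_cal should agree with exposure above
df2 <- df1 %>%
  expose_split() %>%
  select(pol_num, cal_yr, cal_yr_end, issue_date, exposure_cal, status)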

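Output (in the final, truncated year, exposure from expose_cy is 0.159, but exposure_cal after expose_split is 1):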

> df1 %>% tail(2)
Exposure data

 Exposure type: calendar_year 
 Target status: Surrender 
 Study range: 2007-06-15 to 2019-02-27

# A tibble: 2 × 7
  pol_num cal_yr     cal_yr_end term_date issue_date exposure status
    <int> <date>     <date>     <date>    <date>        <dbl> <fct> 
1       1 2018-01-01 2018-12-31 NA        2010-01-01    1     Active
2       1 2019-01-01 2019-12-31 NA        2010-01-01    0.159 Active
> df2 %>% tail(2)
Exposure data

 Exposure type: split_year 
 Target status: Surrender 
 Study range: 2007-06-15 to 2019-02-27

# A tibble: 2 × 6
  pol_num cal_yr     cal_yr_end issue_date exposure_cal status
    <int> <date>     <date>     <date>            <dbl> <fct> 
1       1 2018-01-01 2018-12-31 2010-01-01            1 Active
2       1 2019-01-01 2019-12-31 2010-01-01            1 Active

pol_num 2 shows the same problem, so the issue is not specific to policies whose issue_date falls on New Year's Day.
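For reference, the 0.159 reported by expose_cy in the final year can be reproduced with plain date arithmetic. This is only a sketch in base R, assuming exposure is the inclusive day count within the study range divided by the number of days in the calendar year; the package's internal calculation may differ.

```r
# Assumption: exposure = inclusive days exposed / days in the calendar year
days_exposed <- as.numeric(as.Date("2019-02-27") - as.Date("2019-01-01")) + 1
days_in_year <- as.numeric(as.Date("2019-12-31") - as.Date("2019-01-01")) + 1

days_exposed                               # 58
round(days_exposed / days_in_year, 3)      # 0.159
```

Under that assumption, an exposure_cal of 1 for 2019 ignores the 2019-02-27 end date.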

MatthewCaseres changed the title from "expose_split exposures different than expose_cy exposures in last year with end_date" to "expose_split exposures different than expose_cy when truncated" on May 31, 2024
MatthewCaseres (Author) commented

Something similar happens with start dates as well:

df1 <- toy_census %>%
  filter(pol_num == 1) %>%
  expose_cy(
    start_date = "2011-11-01",
    end_date = "2013-02-27",
    target_status = "Surrender"
  ) %>%
  select(pol_num, cal_yr, cal_yr_end, term_date, issue_date, exposure, status)

df2 <- df1 %>%
  expose_split() %>%
  select(pol_num, cal_yr, cal_yr_end, issue_date, exposure_cal, status)

gives

> df1
  pol_num cal_yr     cal_yr_end term_date issue_date exposure status
    <int> <date>     <date>     <date>    <date>        <dbl> <fct> 
1       1 2011-01-01 2011-12-31 NA        2010-01-01    0.167 Active
2       1 2012-01-01 2012-12-31 NA        2010-01-01    1     Active
3       1 2013-01-01 2013-12-31 NA        2010-01-01    0.159 Active
> df2
  pol_num cal_yr     cal_yr_end issue_date exposure_cal status
    <int> <date>     <date>     <date>            <dbl> <fct> 
1       1 2011-01-01 2011-12-31 2010-01-01            1 Active
2       1 2012-01-01 2012-12-31 2010-01-01            1 Active
3       1 2013-01-01 2013-12-31 2010-01-01            1 Active
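The expose_cy values in df1 are consistent with clamping each calendar year to the study window at both ends. The helper below is hypothetical (`cal_year_exposure` is not part of actxps) and assumes exposure is the inclusive day count inside the window divided by the days in that calendar year; it is a sketch for checking the numbers, not the package's implementation.

```r
# Hypothetical helper: fraction of each calendar year inside the study window
cal_year_exposure <- function(years, start_date, end_date) {
  start_date <- as.Date(start_date)
  end_date <- as.Date(end_date)
  yr_start <- as.Date(paste0(years, "-01-01"))
  yr_end <- as.Date(paste0(years, "-12-31"))

  # Clamp each year to the study window
  from <- pmax(yr_start, start_date)
  to <- pmin(yr_end, end_date)

  days_exposed <- as.numeric(to - from) + 1
  days_in_year <- as.numeric(yr_end - yr_start) + 1
  round(days_exposed / days_in_year, 3)
}

# Same study range as the example above
cal_year_exposure(2011:2013, "2011-11-01", "2013-02-27")
# 0.167 1.000 0.159, matching the exposure column of df1
```

The first year counts 61 of 365 days (Nov 1 to Dec 31) and the last counts 58 of 365 days (Jan 1 to Feb 27), so both truncations should carry through to exposure_cal.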

mattheaphy (Owner) commented Jun 23, 2024

Thanks for pointing this out. It's going to be patched in release 1.5.0. Both your examples above are now working as intended in that version. I also added new tests to the package to ensure that start / end dates are respected.

Example 1

> df2
Exposure data

 Exposure type: split_year 
 Target status: Surrender 
 Study range: 2007-06-15 to 2019-02-27

# A tibble: 2 × 6
  pol_num cal_yr     cal_yr_end issue_date exposure_cal status
    <int> <date>     <date>     <date>            <dbl> <fct> 
1       1 2018-01-01 2018-12-31 2010-01-01        1     Active
2       1 2019-01-01 2019-02-27 2010-01-01        0.159 Active

Example 2

> df2
Exposure data

 Exposure type: split_year 
 Target status: Surrender 
 Study range: 2011-11-01 to 2013-02-27

# A tibble: 3 × 6
  pol_num cal_yr     cal_yr_end issue_date exposure_cal status
    <int> <date>     <date>     <date>            <dbl> <fct> 
1       1 2011-11-01 2011-12-31 2010-01-01        0.167 Active
2       1 2012-01-01 2012-12-31 2010-01-01        1     Active
3       1 2013-01-01 2013-02-27 2010-01-01        0.159 Active
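One way to confirm the behaviour on an installed version is to compare total exposure before and after splitting for the policy used in this thread. This is a minimal sketch, not one of the tests added to the package; the variable names are illustrative, and it only uses `toy_census`, `expose_cy()`, and `expose_split()` as they appear above.

```r
library(actxps)
library(dplyr)

# Calendar-year exposures for the policy from Example 2
cy_df <- toy_census %>%
  filter(pol_num == 1) %>%
  expose_cy(
    start_date = "2011-11-01",
    end_date = "2013-02-27",
    target_status = "Surrender"
  )

# Split exposures built from the same records
split_df <- expose_split(cy_df)

# Total calendar-year exposure should agree if the study range is respected
total_cy <- sum(cy_df$exposure)
total_split <- sum(split_df$exposure_cal)
isTRUE(all.equal(total_cy, total_split))
```

Based on the outputs shown in this thread, the comparison would be expected to hold on a version containing the fix and to fail on the version the issue was reported against.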

mattheaphy linked a pull request on Jun 23, 2024 that will close this issue