From f3e0dd3ee9b5412c2ac372399411e9bea7bd9cce Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sun, 19 Nov 2023 14:03:19 -0500 Subject: [PATCH 01/18] removed starting white spaces from vignettes --- README.Rmd | 2 -- README.md | 2 -- vignettes/actxps.Rmd | 1 - vignettes/exp_summary.Rmd | 7 ------- vignettes/exposures.Rmd | 9 +-------- vignettes/misc.Rmd | 4 ---- vignettes/transactions.Rmd | 5 ----- 7 files changed, 1 insertion(+), 29 deletions(-) diff --git a/README.Rmd b/README.Rmd index 4cc01c5..7fd7260 100644 --- a/README.Rmd +++ b/README.Rmd @@ -88,7 +88,6 @@ Create a summary grouped by policy year and the presence of a guaranteed income rider. ```{r stats-grouped} - exp_res <- exposed_data |> group_by(pol_yr, inc_guar) |> exp_stats() @@ -102,7 +101,6 @@ First, attach one or more columns of expected termination rates to the exposure data. Then, pass these column names to the `expected` argument of `exp_stats()`. ```{r stats-ae} - expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3)) # using 2 different expected termination rates diff --git a/README.md b/README.md index ac0193b..54185ac 100644 --- a/README.md +++ b/README.md @@ -112,7 +112,6 @@ Create a summary grouped by policy year and the presence of a guaranteed income rider. ``` r - exp_res <- exposed_data |> group_by(pol_yr, inc_guar) |> exp_stats() @@ -147,7 +146,6 @@ exposure data. Then, pass these column names to the `expected` argument of `exp_stats()`. ``` r - expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3)) # using 2 different expected termination rates diff --git a/vignettes/actxps.Rmd b/vignettes/actxps.Rmd index 8b6ee9b..d17fc7f 100644 --- a/vignettes/actxps.Rmd +++ b/vignettes/actxps.Rmd @@ -108,7 +108,6 @@ exp_res To derive actual-to-expected rates, first attach one or more columns of expected termination rates to the exposure data. Then, pass these column names to the `expected` argument of `exp_stats()`. ```{r stats-ae} - expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3)) # using 2 different expected termination rates diff --git a/vignettes/exp_summary.Rmd b/vignettes/exp_summary.Rmd index 121b52f..933935f 100644 --- a/vignettes/exp_summary.Rmd +++ b/vignettes/exp_summary.Rmd @@ -78,7 +78,6 @@ If the data frame passed into `exp_stats()` is grouped using `dplyr::group_by()` In the following, `exposed_data` is grouped by policy year before being passed to `exp_stats()`. This results in one row per policy year found in the data. ```{r grouped-1} - exposed_data |> group_by(pol_yr) |> exp_stats() @@ -88,7 +87,6 @@ exposed_data |> Multiple grouping variables are allowed. Below, the presence of an income guarantee (`inc_guar`) is added as a second grouping variable. ```{r grouped-2} - exposed_data |> group_by(inc_guar, pol_yr) |> exp_stats() @@ -105,7 +103,6 @@ Even if the target status exists on the input data, it can be overridden. Howeve Using the example data, a total termination rate can be estimated by including both death and surrender statuses in `target_status`. To ensure exposures are accurate, an adjustment is made to fully expose deaths prior to calling `exp_stats()`^[This adjustment is not necessary on surrenders because the `expose()` function previously did this for us.]. ```{r targ-status} - exposed_data |> mutate(exposure = ifelse(status == "Death", 1, status)) |> group_by(pol_yr) |> @@ -121,7 +118,6 @@ Experience studies often weight output by key policy values. Examples include ac Our sample data contains a column called `premium` that we can weight by. When weights are supplied, the `claims`, `exposure`, and `q_obs` columns will be weighted. If expected termination rates are supplied (see below), these rates and A/E values will also be weighted.^[When weights are supplied, additional columns are created containing the sum of weights, the sum of squared weights, and the number of records. These columns are used for re-summarizing the data (see the "Summary method" section on this page).] ```{r weight-res} - exposed_data |> group_by(pol_yr) |> exp_stats(wt = 'premium') @@ -145,7 +141,6 @@ In the output, 4 new columns are created for expected rates and A/E ratios. ```{r act-exp} - expected_table <- c(seq(0.005, 0.03, length.out = 10), 0.2, 0.15, rep(0.05, 3)) # using 2 different expected termination assumption sets @@ -167,7 +162,6 @@ exp_res |> As noted above, if weights are passed to `exp_stats()` then A/E ratios will also be weighted. ```{r act-exp-wt} - exposed_data2 |> group_by(pol_yr, inc_guar) |> exp_stats(expected = c("expected_1", "expected_2"), @@ -305,7 +299,6 @@ exposed_data |> `exp_stats()` can still work when given a non-`exposed_df` data frame. However, it will be unable to infer certain attributes like the target status and the study dates. For target status, all statuses except the first level are assumed to be terminations. Since this may not be desirable, a warning message will appear informing what statuses were assumed to be terminated. ```{r not-exposed_df} - not_exposed_df <- data.frame(exposed_data) exp_stats(not_exposed_df) diff --git a/vignettes/exposures.Rmd b/vignettes/exposures.Rmd index 6d53e0f..fd373e4 100644 --- a/vignettes/exposures.Rmd +++ b/vignettes/exposures.Rmd @@ -58,7 +58,6 @@ Let's assume we're performing an experience study as of 2022-12-31 and we're int To calculate exposures, we pass our data to the `expose()` function and we specify a study `end_date`. ```{r expose-1} - exposed_data <- expose(toy_census, end_date = "2022-12-31") ``` @@ -151,7 +150,6 @@ If `cal_expo` is set to `TRUE`, calendar year exposures will be calculated. Looking at the second policy, we can see that the first year is left-censored because the policy was issued two-fifths of the way through the year, and the last period is right-censored because the policy terminated roughly seven-tenths of the way through the year. ```{r expo-cal} - exposed_cal <- toy_census |> expose(end_date = "2022-12-31", cal_expo = TRUE, target_status = "Surrender") @@ -214,7 +212,6 @@ The two exposure bases will often not match for two reasons: Some downstream functions like `exp_stats()` expect `exposed_df` objects to have a single column for exposures. For split exposures, the exposure basis must be specified using the `col_exposure` argument. ```{r, split-stats-unclear, eval = FALSE} - exp_stats(split) ``` @@ -226,15 +223,13 @@ tryCatch(exp_stats(split), ``` -```{r, split-stats-clear, eval = FALSE} - +```{r, split-stats-clear} exp_stats(split, col_exposure = "exposure_pol") ``` `expose_split()` doesn't just work with calendar year exposures. Calendar quarters, months, or weeks can also be split. For periods shorter than a year, a record is only split into pre- and post-anniversary segments if a policy anniversary appears in the middle of the period. ```{r, split-qtr} - expose_cq(toy_census, "2022-12-31", target_status = "Surrender") |> expose_split() |> filter(pol_num == 2) |> @@ -248,7 +243,6 @@ Note, however, that calendar period exposures will always be expressed in the or For machine learning feature engineering, the actxps package contains a function called `step_expose()` that is compatible with the recipes package from tidymodels. This function applies the `expose()` function within a recipe. ```{r rec-expose} - library(recipes) expo_rec <- recipe(status ~ ., toy_census) |> @@ -323,7 +317,6 @@ For example, below `exposed_data2` contains study start and end dates that are b When `vctrs::vec_rbind()` is used to combine `exposed_data` and `exposed_data2`, the result combines attributes across both objects. ```{r combine-1} - exposed_data2 <- expose(toy_census, end_date = "2023-12-31", start_date = "1890-01-01", diff --git a/vignettes/misc.Rmd b/vignettes/misc.Rmd index d2d54eb..24c43b3 100644 --- a/vignettes/misc.Rmd +++ b/vignettes/misc.Rmd @@ -31,8 +31,6 @@ The `pol_()` family of functions calculates policy years, months, quarters, week **Example**: assume a policy was issued on 2022-05-10 and we are interested in calculating various policy duration values at the end of calendar years 2022-2032. ```{r pol-dur1} - - dates <- ymd("2022-12-31") + years(0:10) # policy years @@ -53,7 +51,6 @@ pol_wk(dates, "2022-05-10") The more general `pol_interval()` function calculates any arbitrary duration. This function has a third argument where the length of the policy duration can be specified. This argument must be a period object. See `lubridate::period()` for more information. ```{r pol-dur2} - # days pol_interval(dates, "2022-05-10", days(1)) @@ -71,7 +68,6 @@ Below, a very simple logistic regression model is fit to surrender experience in The `col_expected` argument is used to rename the column(s) containing predicted values. If no names are specified, the default name is "expected". ```{r add-preds, fig.height=4, fig.width=5} - # create exposure records exposed_data <- expose(census_dat, end_date = "2019-12-31", target_status = "Surrender") |> diff --git a/vignettes/transactions.Rmd b/vignettes/transactions.Rmd index 69a0397..b868ebd 100644 --- a/vignettes/transactions.Rmd +++ b/vignettes/transactions.Rmd @@ -41,7 +41,6 @@ In this example, we'll be using the `census_dat`, `withdrawals`, and `account_va The `add_transactions()` function attaches transactions to a data frame with exposure-level records. This data frame must have the class `exposed_df`. For our example, we first need to convert `census_dat` into exposure records using the `expose()` function.^[See `vignette('exposures')` for more information on creating `exposed_df` objects.] This example will use policy year exposures. ```{r packages} - library(actxps) library(dplyr) @@ -119,7 +118,6 @@ If the data frame passed into `trx_stats()` is grouped using `dplyr::group_by()` In the following, `exposed_trx` is grouped by the presence of an income guarantee (`inc_guar`) before being passed to `trx_stats()`. This results in four rows because we have two types of transactions and two distinct values of `inc_guar`. ```{r grouped-1} - exposed_trx |> group_by(inc_guar) |> trx_stats() @@ -129,7 +127,6 @@ exposed_trx |> Multiple grouping variables are allowed. Below, policy year (`pol_yr`) is added as a second grouping variable. ```{r grouped-2} - exposed_trx |> group_by(pol_yr, inc_guar) |> trx_stats() @@ -150,7 +147,6 @@ If column names are passed to the `percent_of` argument of `trx_stats()`, the ou For our example, let's assume we're interested in examining withdrawal transactions as a percentage of account values, which are available in the `account_vals` data frame in the column `av_anniv`. ```{r pct-of} - # attach account values data exposed_trx_w_av <- exposed_trx |> left_join(account_vals, by = c("pol_num", "pol_date_yr")) @@ -206,7 +202,6 @@ exposed_trx_w_av |> The `autoplot()` and `autotable()` functions create visualizations and summary tables from `trx_df` objects. See `vignette("visualizations")` for full details on these functions. ```{r trx-plot, warning=FALSE, message=FALSE, fig.height=5.5, fig.width=7} - library(ggplot2) trx_res |> From 237617ffc8c3a18904101e2a168774f1313f1511 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Tue, 21 Nov 2023 08:30:53 -0500 Subject: [PATCH 02/18] created low level class constructor for exp_df objects --- R/exp_stats.R | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-) diff --git a/R/exp_stats.R b/R/exp_stats.R index 6256587..472e2be 100644 --- a/R/exp_stats.R +++ b/R/exp_stats.R @@ -332,7 +332,23 @@ finish_exp_stats <- function(.data, target_status, expected, .after = dplyr::last_col()) } - tibble::new_tibble(res, + new_exp_df(res, + .groups = .groups, + target_status = target_status, + start_date = start_date, + expected = expected, + end_date = end_date, + wt = wt, + credibility = credibility, + conf_level = conf_level, cred_r = cred_r, + conf_int = conf_int) +} + +# low level class constructor +new_exp_df <- function(x, .groups, target_status, start_date, expected, + end_date, wt, credibility, conf_level, + cred_r = cred_r, conf_int) { + tibble::new_tibble(x, class = "exp_df", groups = .groups, target_status = target_status, @@ -341,7 +357,8 @@ finish_exp_stats <- function(.data, target_status, expected, end_date = end_date, wt = wt, xp_params = list(credibility = credibility, - conf_level = conf_level, cred_r = cred_r, + conf_level = conf_level, + cred_r = cred_r, conf_int = conf_int)) } From 5346f2b165c3fa21792fd6b1d11a41e114a98e0d Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Tue, 21 Nov 2023 09:03:36 -0500 Subject: [PATCH 03/18] first draft of as_exp_df --- R/exp_df_helpers.R | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) create mode 100644 R/exp_df_helpers.R diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R new file mode 100644 index 0000000..8d2d8f1 --- /dev/null +++ b/R/exp_df_helpers.R @@ -0,0 +1,32 @@ +#' Exposed data frame helper functions +#' @export +as_exp_df <- function(x, expected = NULL, col_claims = "claims", + col_exposure = "exposure", + target_status = NULL, + start_date = NULL, end_date = NULL, + credibility = FALSE, + conf_level = 0.95, cred_r = 0.05, conf_int = FALSE) { + + x <- x |> + rename(exposure = {{col_exposure}}, + claims = {{col_claims}}) |> + mutate(n_claims = claims) + + new_exp_df(x, + .groups = list(), + target_status = target_status, + start_date = start_date, + expected = expected, + end_date = end_date, + wt = NULL, + credibility = credibility, + conf_level = conf_level, cred_r = cred_r, + conf_int = conf_int) + +} + +#' @export +#' @rdname as_exp_df +is_exp_df <- function(x) { + inherits(x, "exp_df") +} From 56abfb6373f51eebfca2c54c4519778464ff2a0a Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Thu, 23 Nov 2023 08:44:06 -0500 Subject: [PATCH 04/18] created reusable name checking function --- R/exposed_df_helpers.R | 19 ++++++++++++------- tests/testthat/test-exposed_df_helpers.R | 18 ++++++++++++------ 2 files changed, 24 insertions(+), 13 deletions(-) diff --git a/R/exposed_df_helpers.R b/R/exposed_df_helpers.R index dfce5a9..6cb73fb 100644 --- a/R/exposed_df_helpers.R +++ b/R/exposed_df_helpers.R @@ -119,16 +119,11 @@ as_exposed_df <- function(x, end_date, start_date = as.Date("1900-01-01"), # check required columns # pol_num, status, exposure, 2 date cols, policy period (policy expo only) - unmatched <- c("pol_num", "status", "exposure", + req_names <- c("pol_num", "status", "exposure", exp_col_pol_per, exp_cols_dates, exp_cols_trx) - unmatched <- setdiff(unmatched, names(x)) - - if (length(unmatched) > 0) { - rlang::abort(c(x = glue::glue("The following columns are missing from `x`: {paste(unmatched, collapse = ', ')}."), - i = "Hint: create these columns or use the `col_*` arguments to specify existing columns that should be mapped to these elements.")) - } + verify_col_names(names(x), req_names) if (missing(default_status)) { default_status <- most_common(x$status) @@ -612,3 +607,13 @@ verify_get_trx_types <- function(.data, required = TRUE) { } trx_types } + +# function to verify that required names exist and to send an error if not +verify_col_names <- function(x_names, required) { + unmatched <- setdiff(required, x_names) + + if (length(unmatched) > 0) { + rlang::abort(c(x = glue::glue("The following columns are missing: {paste(unmatched, collapse = ', ')}."), + i = "Hint: create these columns or use the `col_*` arguments to specify existing columns that should be mapped to these elements.")) + } +} diff --git a/tests/testthat/test-exposed_df_helpers.R b/tests/testthat/test-exposed_df_helpers.R index cbfa49d..37ee7d5 100644 --- a/tests/testthat/test-exposed_df_helpers.R +++ b/tests/testthat/test-exposed_df_helpers.R @@ -20,16 +20,20 @@ test_that("as_exposed_df works", { start = pol_date_yr, end = pol_date_yr_end) - expect_error(as_exposed_df(data.frame(a = 1:3), Sys.Date())) + expect_error(as_exposed_df(data.frame(a = 1:3), Sys.Date()), + regexp = "The following columns are missing") expect_true(is_exposed_df(as_exposed_df(expo))) expect_false(is_exposed_df(expo2)) - expect_error(as_exposed_df(expo2, end_date = "2022-12-31", expo_length = "yr")) + expect_error(as_exposed_df(expo2, end_date = "2022-12-31", + expo_length = "yr"), + regexp = "`expo_length` must be one of") expect_true(is_exposed_df(expo3)) - expect_error(as_exposed_df(expo4)) + expect_error(as_exposed_df(expo4, + regexp = "The following columns are missing")) expect_no_error(as_exposed_df(expo4, end_date = "2022-12-31", col_pol_num = "pnum")) expect_no_error(as_exposed_df(expo5, end_date = "2022-12-31", @@ -39,7 +43,7 @@ test_that("as_exposed_df works", { col_pol_per = "py", cols_dates = c("start", "end"))) - expect_error(as_exposed_df(1)) + expect_error(as_exposed_df(1), regexp = "`x` must be a data frame.") }) @@ -53,12 +57,14 @@ test_that("as_exposed_df works with transactions", { trx_amt_B = 4) expect_no_error(as_exposed_df(expo6, "2022-12-31", trx_types = c("A", "B"))) - expect_error(as_exposed_df(expo6, "2022-12-31", trx_types = c("A", "C"))) + expect_error(as_exposed_df(expo6, "2022-12-31", trx_types = c("A", "C")), + regexp = "The following columns are missing") expo7 <- expo6 |> rename(n_A = trx_n_A, n_B = trx_n_B, amt_A = trx_amt_A, amt_B = trx_amt_B) - expect_error(as_exposed_df(expo7, "2022-12-31", trx_types = c("A", "B"))) + expect_error(as_exposed_df(expo7, "2022-12-31", trx_types = c("A", "B")), + regexp = "The following columns are missing") expect_no_error(as_exposed_df(expo7, "2022-12-31", trx_types = c("A", "B"), col_trx_n_ = "n_", col_trx_amt_ = "amt_")) From 747a064d5c31d6beb91733029753686d009597de Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Thu, 23 Nov 2023 09:03:01 -0500 Subject: [PATCH 05/18] reference to expose --- R/exposed_df_helpers.R | 1 + 1 file changed, 1 insertion(+) diff --git a/R/exposed_df_helpers.R b/R/exposed_df_helpers.R index 6cb73fb..359fe8e 100644 --- a/R/exposed_df_helpers.R +++ b/R/exposed_df_helpers.R @@ -42,6 +42,7 @@ #' `as_exposed_df()`, an `exposed_df` object. #' #' @importFrom vctrs vec_ptype2 vec_cast +#' @seealso expose #' #' @export is_exposed_df <- function(x) { From 1fea4864ba74304b87ea10bc95efcddd2b8ca9b2 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Thu, 23 Nov 2023 09:34:28 -0500 Subject: [PATCH 06/18] as_exp_df checking, tests, and doc updates --- NAMESPACE | 2 + R/exp_df_helpers.R | 59 +++++++++++++++++++--- man/as_exp_df.Rd | 74 ++++++++++++++++++++++++++++ man/is_exposed_df.Rd | 3 ++ tests/testthat/test-exp_df_helpers.R | 35 +++++++++++++ 5 files changed, 166 insertions(+), 7 deletions(-) create mode 100644 man/as_exp_df.Rd create mode 100644 tests/testthat/test-exp_df_helpers.R diff --git a/NAMESPACE b/NAMESPACE index c1cd702..38c945a 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -67,6 +67,7 @@ export(add_predictions) export(add_transactions) export(anti_join) export(arrange) +export(as_exp_df) export(as_exposed_df) export(autoplot) export(autotable) @@ -88,6 +89,7 @@ export(full_join) export(group_by) export(groups) export(inner_join) +export(is_exp_df) export(is_exposed_df) export(is_split_exposed_df) export(left_join) diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R index 8d2d8f1..a54b865 100644 --- a/R/exp_df_helpers.R +++ b/R/exp_df_helpers.R @@ -1,16 +1,61 @@ -#' Exposed data frame helper functions +#' Termination summary helper functions +#' +#' Test for and coerce to the `exp_df` class. +#' +#' `is_exp_df()` will return `TRUE` if `x` is an `exp_df` object. +#' +#' `as_exp_df()` will coerce a data frame to an `exp_df` object if that +#' data frame has columns for exposures and claims. +#' +#' `as_exp_df()` is most useful for converting existing aggregate summaries of +#' experience where individual policy information is not available. +#' +#' @param x An object. For `as_exp_df()`, `x` must be a data frame. +#' @param expected A character vector containing column names in x with +#' expected values +#' @param col_claims Optional. Name of the column in `x` containing claims. The +#' assumed default is "claims". +#' @param col_exposure Optional. Name of the column in `x` containing exposures. +#' The assumed default is "exposure". +#' @param credibility If `TRUE`, future calls to [summary()] will include +#' partial credibility weights and credibility-weighted termination rates. +#' @param conf_level Confidence level used for the Limited Fluctuation +#' credibility method and confidence intervals +#' @param cred_r Error tolerance under the Limited Fluctuation credibility +#' method +#' @param conf_int If `TRUE`, future calls to [summary()] will include +#' confidence intervals around the observed termination rates and any +#' actual-to-expected ratios. +#' @inheritParams expose +#' +#' @return For `is_exp_df()`, a length-1 logical vector. For `as_exp_df()`, +#' an `exp_df` object. +#' +#' @seealso exp_stats +#' #' @export -as_exp_df <- function(x, expected = NULL, col_claims = "claims", - col_exposure = "exposure", +as_exp_df <- function(x, expected = NULL, col_claims, col_exposure, target_status = NULL, start_date = NULL, end_date = NULL, credibility = FALSE, conf_level = 0.95, cred_r = 0.05, conf_int = FALSE) { - x <- x |> - rename(exposure = {{col_exposure}}, - claims = {{col_claims}}) |> - mutate(n_claims = claims) + if (is_exp_df(x)) return(x) + + if (!is.data.frame(x)) { + rlang::abort("`x` must be a data frame.") + } + + # column name alignment + if (!missing(col_exposure)) x <- x |> rename(exposure = {{col_exposure}}) + if (!missing(col_claims)) x <- x |> rename(claims = {{col_claims}}) + + # check required columns + # pol_num, status, exposure, 2 date cols, policy period (policy expo only) + req_names <- c("exposure", "claims") + verify_col_names(names(x), req_names) + + x <- x |> mutate(n_claims = claims) new_exp_df(x, .groups = list(), diff --git a/man/as_exp_df.Rd b/man/as_exp_df.Rd new file mode 100644 index 0000000..8e225e1 --- /dev/null +++ b/man/as_exp_df.Rd @@ -0,0 +1,74 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/exp_df_helpers.R +\name{as_exp_df} +\alias{as_exp_df} +\alias{is_exp_df} +\title{Termination summary helper functions} +\usage{ +as_exp_df( + x, + expected = NULL, + col_claims, + col_exposure, + target_status = NULL, + start_date = NULL, + end_date = NULL, + credibility = FALSE, + conf_level = 0.95, + cred_r = 0.05, + conf_int = FALSE +) + +is_exp_df(x) +} +\arguments{ +\item{x}{An object. For \code{as_exp_df()}, \code{x} must be a data frame.} + +\item{expected}{A character vector containing column names in x with +expected values} + +\item{col_claims}{Optional. Name of the column in \code{x} containing claims. The +assumed default is "claims".} + +\item{col_exposure}{Optional. Name of the column in \code{x} containing exposures. +The assumed default is "exposure".} + +\item{target_status}{Character vector of target status values. Default value += \code{NULL}.} + +\item{start_date}{Experience study start date. Default value = 1900-01-01.} + +\item{end_date}{Experience study end date} + +\item{credibility}{If \code{TRUE}, future calls to \code{\link[=summary]{summary()}} will include +partial credibility weights and credibility-weighted termination rates.} + +\item{conf_level}{Confidence level used for the Limited Fluctuation +credibility method and confidence intervals} + +\item{cred_r}{Error tolerance under the Limited Fluctuation credibility +method} + +\item{conf_int}{If \code{TRUE}, future calls to \code{\link[=summary]{summary()}} will include +confidence intervals around the observed termination rates and any +actual-to-expected ratios.} +} +\value{ +For \code{is_exp_df()}, a length-1 logical vector. For \code{as_exp_df()}, +an \code{exp_df} object. +} +\description{ +Test for and coerce to the \code{exp_df} class. +} +\details{ +\code{is_exp_df()} will return \code{TRUE} if \code{x} is an \code{exp_df} object. + +\code{as_exp_df()} will coerce a data frame to an \code{exp_df} object if that +data frame has columns for exposures and claims. + +\code{as_exp_df()} is most useful for converting existing aggregate summaries of +experience where individual policy information is not available. +} +\seealso{ +exp_stats +} diff --git a/man/is_exposed_df.Rd b/man/is_exposed_df.Rd index 7c34770..4c7a07a 100644 --- a/man/is_exposed_df.Rd +++ b/man/is_exposed_df.Rd @@ -92,3 +92,6 @@ policy periods (for policy exposures only), and exposure start / end dates. Optionally, if \code{x} has transaction counts and amounts by type, these can be specified without calling \code{\link[=add_transactions]{add_transactions()}}. } +\seealso{ +expose +} diff --git a/tests/testthat/test-exp_df_helpers.R b/tests/testthat/test-exp_df_helpers.R new file mode 100644 index 0000000..9ae7759 --- /dev/null +++ b/tests/testthat/test-exp_df_helpers.R @@ -0,0 +1,35 @@ +res <- expose(toy_census, "2022-12-31", target_status = "Surrender") |> + exp_stats() + +test_that("is_exp_df works", { + expect_true(is_exp_df(res)) + expect_false(is_exp_df(mtcars)) +}) + +res2 <- as.data.frame(res) + +test_that("as_exp_df works", { + + + res3 <- as_exp_df(res2) + res4 <- res2 |> + rename(expo = exposure) + res5 <- res4 |> + rename(clms = claims) + + expect_error(as_exp_df(data.frame(a = 1:3), Sys.Date()), + regexp = "The following columns are missing") + + expect_true(is_exp_df(as_exp_df(res))) + + expect_false(is_exp_df(res2)) + + expect_true(is_exp_df(res3)) + + expect_error(as_exp_df(res4, regexp = "The following columns are missing")) + expect_no_error(as_exp_df(res4, col_exposure = "expo")) + expect_no_error(as_exp_df(res5, col_exposure = "expo", col_claims = "clms")) + + expect_error(as_exp_df(1), regexp = "`x` must be a data frame.") + +}) From ee98ecb9e0fee5781fd3e63da71629f0402c9c6f Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Thu, 23 Nov 2023 18:03:31 -0500 Subject: [PATCH 07/18] weight supported added to as_exp_df --- R/exp_df_helpers.R | 60 +++++++++++++++++++++++++--- man/as_exp_df.Rd | 48 +++++++++++++++++++++- tests/testthat/test-exp_df_helpers.R | 2 +- 3 files changed, 102 insertions(+), 8 deletions(-) diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R index a54b865..e280a11 100644 --- a/R/exp_df_helpers.R +++ b/R/exp_df_helpers.R @@ -7,16 +7,52 @@ #' `as_exp_df()` will coerce a data frame to an `exp_df` object if that #' data frame has columns for exposures and claims. #' -#' `as_exp_df()` is most useful for converting existing aggregate summaries of +#' `as_exp_df()` is most useful for working with existing aggregate summaries of #' experience where individual policy information is not available. #' +#' If nothing is passed to `wt`, the data frame `x` must include columns +#' containing: +#' +#' - Exposures (`exposure`) +#' - Claim counts (`claims`) +#' +#' If `wt` is passed, the data must include columns containing: +#' +#' - Weighted exposures (`exposure`) +#' - Weighted claims (`claims`) +#' - Claim counts (`n_claims`) +#' - The raw sum of weights **NOT** multiplied by exposures +#' - Exposure record counts (`.weight_n`) +#' - The raw sum of squared weights (`.weight_sq`) +#' +#' The names in parentheses above are expected column names. If the data +#' frame passed to `as_exp_df()` uses different column names, these can be +#' specified using the `col_*` arguments. +#' +#' When a column name is passed to `wt`, the columns `.weight`, `.weight_n`, +#' and `.weight_sq` are used to calculate credibility and confidence intervals. +#' If credibility and confidence intervals aren't required, then it is not +#' necessary to pass anything to `wt`. The results of `as_exp_df()` and any +#' downstream summaries will still be weighted as long as the exposures and +#' claims are pre-weighted. +#' #' @param x An object. For `as_exp_df()`, `x` must be a data frame. #' @param expected A character vector containing column names in x with #' expected values +#' @param wt Optional. Length 1 character vector. Name of the column in `x` +#' containing weights to use in the calculation of claims, exposures, partial +#' credibility, and confidence intervals. #' @param col_claims Optional. Name of the column in `x` containing claims. The #' assumed default is "claims". #' @param col_exposure Optional. Name of the column in `x` containing exposures. #' The assumed default is "exposure". +#' @param col_n_claims Optional and only used used when `wt` is passed. Name of +#' the column in `x` containing the number of claims. +#' @param col_weight_sq Optional and only used used when `wt` is passed. Name of +#' the column in `x` containing the sum of squared weights (also not multiplied +#' by exposures). +#' @param col_weight_n Optional and only used used when `wt` is passed. Name of +#' the column in `x` containing exposure record counts. #' @param credibility If `TRUE`, future calls to [summary()] will include #' partial credibility weights and credibility-weighted termination rates. #' @param conf_level Confidence level used for the Limited Fluctuation @@ -34,7 +70,9 @@ #' @seealso exp_stats #' #' @export -as_exp_df <- function(x, expected = NULL, col_claims, col_exposure, +as_exp_df <- function(x, expected = NULL, wt = NULL, + col_claims, col_exposure, + col_n_claims, col_weight_sq, col_weight_n, target_status = NULL, start_date = NULL, end_date = NULL, credibility = FALSE, @@ -50,12 +88,22 @@ as_exp_df <- function(x, expected = NULL, col_claims, col_exposure, if (!missing(col_exposure)) x <- x |> rename(exposure = {{col_exposure}}) if (!missing(col_claims)) x <- x |> rename(claims = {{col_claims}}) + if (is.null(wt)) { + req_names <- c("exposure", "claims") + } else { + req_names <- c("exposure", "claims", "n_claims", ".weight", + ".weight_sq", ".weight_n") + if (!missing(col_n_claims)) x <- x |> rename(n_claims = {{col_n_claims}}) + x <- x |> rename(.weight = {{wt}}) + if (!missing(col_weight_sq)) x <- x |> + rename(.weight_sq = {{col_weight_sq}}) + if (!missing(col_weight_n)) x <- x |> rename(.weight_n = {{col_weight_n}}) + } + # check required columns - # pol_num, status, exposure, 2 date cols, policy period (policy expo only) - req_names <- c("exposure", "claims") verify_col_names(names(x), req_names) - x <- x |> mutate(n_claims = claims) + if (is.null(wt)) x$n_claims <- x$claims new_exp_df(x, .groups = list(), @@ -63,7 +111,7 @@ as_exp_df <- function(x, expected = NULL, col_claims, col_exposure, start_date = start_date, expected = expected, end_date = end_date, - wt = NULL, + wt = wt, credibility = credibility, conf_level = conf_level, cred_r = cred_r, conf_int = conf_int) diff --git a/man/as_exp_df.Rd b/man/as_exp_df.Rd index 8e225e1..9a62dc3 100644 --- a/man/as_exp_df.Rd +++ b/man/as_exp_df.Rd @@ -8,8 +8,12 @@ as_exp_df( x, expected = NULL, + wt = NULL, col_claims, col_exposure, + col_n_claims, + col_weight_sq, + col_weight_n, target_status = NULL, start_date = NULL, end_date = NULL, @@ -27,12 +31,26 @@ is_exp_df(x) \item{expected}{A character vector containing column names in x with expected values} +\item{wt}{Optional. Length 1 character vector. Name of the column in \code{x} +containing weights to use in the calculation of claims, exposures, partial +credibility, and confidence intervals.} + \item{col_claims}{Optional. Name of the column in \code{x} containing claims. The assumed default is "claims".} \item{col_exposure}{Optional. Name of the column in \code{x} containing exposures. The assumed default is "exposure".} +\item{col_n_claims}{Optional and only used used when \code{wt} is passed. Name of +the column in \code{x} containing the number of claims.} + +\item{col_weight_sq}{Optional and only used used when \code{wt} is passed. Name of +the column in \code{x} containing the sum of squared weights (also not multiplied +by exposures).} + +\item{col_weight_n}{Optional and only used used when \code{wt} is passed. Name of +the column in \code{x} containing exposure record counts.} + \item{target_status}{Character vector of target status values. Default value = \code{NULL}.} @@ -66,8 +84,36 @@ Test for and coerce to the \code{exp_df} class. \code{as_exp_df()} will coerce a data frame to an \code{exp_df} object if that data frame has columns for exposures and claims. -\code{as_exp_df()} is most useful for converting existing aggregate summaries of +\code{as_exp_df()} is most useful for working with existing aggregate summaries of experience where individual policy information is not available. + +If nothing is passed to \code{wt}, the data frame \code{x} must include columns +containing: +\itemize{ +\item Exposures (\code{exposure}) +\item Claim counts (\code{claims}) +} + +If \code{wt} is passed, the data must include columns containing: +\itemize{ +\item Weighted exposures (\code{exposure}) +\item Weighted claims (\code{claims}) +\item Claim counts (\code{n_claims}) +\item The raw sum of weights \strong{NOT} multiplied by exposures +\item Exposure record counts (\code{.weight_n}) +\item The raw sum of squared weights (\code{.weight_sq}) +} + +The names in parentheses above are expected column names. If the data +frame passed to \code{as_exp_df()} uses different column names, these can be +specified using the \verb{col_*} arguments. + +When a column name is passed to \code{wt}, the columns \code{.weight}, \code{.weight_n}, +and \code{.weight_sq} are used to calculate credibility and confidence intervals. +If credibility and confidence intervals aren't required, then it is not +necessary to pass anything to \code{wt}. The results of \code{as_exp_df()} and any +downstream summaries will still be weighted as long as the exposures and +claims are pre-weighted. } \seealso{ exp_stats diff --git a/tests/testthat/test-exp_df_helpers.R b/tests/testthat/test-exp_df_helpers.R index 9ae7759..5ee8ca0 100644 --- a/tests/testthat/test-exp_df_helpers.R +++ b/tests/testthat/test-exp_df_helpers.R @@ -26,7 +26,7 @@ test_that("as_exp_df works", { expect_true(is_exp_df(res3)) - expect_error(as_exp_df(res4, regexp = "The following columns are missing")) + expect_error(as_exp_df(res4), regexp = "The following columns are missing") expect_no_error(as_exp_df(res4, col_exposure = "expo")) expect_no_error(as_exp_df(res5, col_exposure = "expo", col_claims = "clms")) From 41ee3cd908ffadfe17b079bb4cf567e5fbec55e2 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Fri, 24 Nov 2023 08:19:05 -0500 Subject: [PATCH 08/18] tests for as_exp_df with weights --- tests/testthat/test-exp_df_helpers.R | 44 ++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/tests/testthat/test-exp_df_helpers.R b/tests/testthat/test-exp_df_helpers.R index 5ee8ca0..74236f4 100644 --- a/tests/testthat/test-exp_df_helpers.R +++ b/tests/testthat/test-exp_df_helpers.R @@ -33,3 +33,47 @@ test_that("as_exp_df works", { expect_error(as_exp_df(1), regexp = "`x` must be a data frame.") }) + +# weighted tests +res_wt <- expose(census_dat, "2019-12-31", target_status = "Surrender") |> + mutate(ex = 0.05) |> + group_by(pol_yr, product) |> + exp_stats(wt = "premium", expected = "ex", + conf_int = TRUE, credibility = TRUE) + +res_wt2 <- as.data.frame(res_wt) |> + rename(premium = .weight) +res_wt3 <- as_exp_df(res_wt2, wt = "premium", expected = "ex", + conf_int = TRUE, credibility = TRUE) + +test_that("as_exp_df with weights works", { + + + res_wt4 <- res_wt2 |> + rename(expo = exposure) + res_wt5 <- res_wt4 |> + rename(clms = claims, + n = n_claims, + sq = .weight_sq) + + expect_true(is_exp_df(as_exp_df(res_wt))) + expect_true(is_exp_df(res_wt3)) + + expect_error(as_exp_df(res_wt5, wt = "premium"), + regexp = "The following columns are missing") + expect_no_error(as_exp_df(res_wt4, wt = "premium", col_exposure = "expo")) + expect_no_error(as_exp_df(res_wt5, wt = "premium", + col_exposure = "expo", col_claims = "clms", + col_weight_sq = "sq", col_n_claims = "n")) + +}) + +test_that("as_exp_df summary matches an object created by exp_stats", { + x <- summary(res_wt, product) |> select(-product) + y <- summary(res_wt3, product) |> select(-product) + expect_true(dplyr::near(x - y, 0) |> all()) + + x <- summary(res_wt, pol_yr) |> select(-pol_yr) + y <- summary(res_wt3, pol_yr) |> select(-pol_yr) + expect_true(dplyr::near(x - y, 0) |> all()) +}) From 33dd5246ecc2886751ce50ec1547d5da2eccc1b8 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Fri, 24 Nov 2023 08:29:25 -0500 Subject: [PATCH 09/18] documentation update --- R/exp_df_helpers.R | 15 +++++++++++---- R/expose_split.R | 3 ++- R/exposed_df_helpers.R | 3 ++- man/as_exp_df.Rd | 15 +++++++++++---- man/expose_split.Rd | 3 ++- man/is_exposed_df.Rd | 3 ++- 6 files changed, 30 insertions(+), 12 deletions(-) diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R index e280a11..f106558 100644 --- a/R/exp_df_helpers.R +++ b/R/exp_df_helpers.R @@ -1,14 +1,17 @@ #' Termination summary helper functions #' -#' Test for and coerce to the `exp_df` class. +#' Convert aggregate experience studies to the `exp_df` class. #' #' `is_exp_df()` will return `TRUE` if `x` is an `exp_df` object. #' #' `as_exp_df()` will coerce a data frame to an `exp_df` object if that #' data frame has columns for exposures and claims. #' -#' `as_exp_df()` is most useful for working with existing aggregate summaries of -#' experience where individual policy information is not available. +#' `as_exp_df()` is most useful for working with aggregate summaries of +#' experience that were not created by actxps where individual policy +#' information is not available. After converting the data to the `exp_df` +#' class, [summary()] can be used to summarize data by any grouping variables, +#' and [autoplot()] and [autotable()] are available for reporting. #' #' If nothing is passed to `wt`, the data frame `x` must include columns #' containing: @@ -36,6 +39,9 @@ #' downstream summaries will still be weighted as long as the exposures and #' claims are pre-weighted. #' +#' `target_status`, `start_date`, and `end_date` are optional arguments that are +#' only used for printing the resulting `exp_df` object. +#' #' @param x An object. For `as_exp_df()`, `x` must be a data frame. #' @param expected A character vector containing column names in x with #' expected values @@ -67,7 +73,8 @@ #' @return For `is_exp_df()`, a length-1 logical vector. For `as_exp_df()`, #' an `exp_df` object. #' -#' @seealso exp_stats +#' @seealso [exp_stats()] for information on how `exp_df` objects are typically +#' created from individual exposure records. #' #' @export as_exp_df <- function(x, expected = NULL, wt = NULL, diff --git a/R/expose_split.R b/R/expose_split.R index 7892cc4..7cb77e1 100644 --- a/R/expose_split.R +++ b/R/expose_split.R @@ -37,7 +37,8 @@ #' @examples #' toy_census |> expose_cy("2022-12-31") |> expose_split() #' -#' @seealso [expose()] +#' @seealso [expose()] for information on creating exposure records from census +#' data. #' #' @export expose_split <- function(.data) { diff --git a/R/exposed_df_helpers.R b/R/exposed_df_helpers.R index 359fe8e..330304f 100644 --- a/R/exposed_df_helpers.R +++ b/R/exposed_df_helpers.R @@ -42,7 +42,8 @@ #' `as_exposed_df()`, an `exposed_df` object. #' #' @importFrom vctrs vec_ptype2 vec_cast -#' @seealso expose +#' @seealso [expose()] for information on how `exposed_df` objects are typically +#' created from census data. #' #' @export is_exposed_df <- function(x) { diff --git a/man/as_exp_df.Rd b/man/as_exp_df.Rd index 9a62dc3..7546c45 100644 --- a/man/as_exp_df.Rd +++ b/man/as_exp_df.Rd @@ -76,7 +76,7 @@ For \code{is_exp_df()}, a length-1 logical vector. For \code{as_exp_df()}, an \code{exp_df} object. } \description{ -Test for and coerce to the \code{exp_df} class. +Convert aggregate experience studies to the \code{exp_df} class. } \details{ \code{is_exp_df()} will return \code{TRUE} if \code{x} is an \code{exp_df} object. @@ -84,8 +84,11 @@ Test for and coerce to the \code{exp_df} class. \code{as_exp_df()} will coerce a data frame to an \code{exp_df} object if that data frame has columns for exposures and claims. -\code{as_exp_df()} is most useful for working with existing aggregate summaries of -experience where individual policy information is not available. +\code{as_exp_df()} is most useful for working with aggregate summaries of +experience that were not created by actxps where individual policy +information is not available. After converting the data to the \code{exp_df} +class, \code{\link[=summary]{summary()}} can be used to summarize data by any grouping variables, +and \code{\link[=autoplot]{autoplot()}} and \code{\link[=autotable]{autotable()}} are available for reporting. If nothing is passed to \code{wt}, the data frame \code{x} must include columns containing: @@ -114,7 +117,11 @@ If credibility and confidence intervals aren't required, then it is not necessary to pass anything to \code{wt}. The results of \code{as_exp_df()} and any downstream summaries will still be weighted as long as the exposures and claims are pre-weighted. + +\code{target_status}, \code{start_date}, and \code{end_date} are optional arguments that are +only used for printing the resulting \code{exp_df} object. } \seealso{ -exp_stats +\code{\link[=exp_stats]{exp_stats()}} for information on how \code{exp_df} objects are typically +created from individual exposure records. } diff --git a/man/expose_split.Rd b/man/expose_split.Rd index abafe87..02d0b36 100644 --- a/man/expose_split.Rd +++ b/man/expose_split.Rd @@ -54,5 +54,6 @@ toy_census |> expose_cy("2022-12-31") |> expose_split() } \seealso{ -\code{\link[=expose]{expose()}} +\code{\link[=expose]{expose()}} for information on creating exposure records from census +data. } diff --git a/man/is_exposed_df.Rd b/man/is_exposed_df.Rd index 4c7a07a..575dda1 100644 --- a/man/is_exposed_df.Rd +++ b/man/is_exposed_df.Rd @@ -93,5 +93,6 @@ Optionally, if \code{x} has transaction counts and amounts by type, these can be specified without calling \code{\link[=add_transactions]{add_transactions()}}. } \seealso{ -expose +\code{\link[=expose]{expose()}} for information on how \code{exposed_df} objects are typically +created from census data. } From e582fd22bbe8ca2fc841b6dda76ecbd172066102 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Fri, 24 Nov 2023 09:36:19 -0500 Subject: [PATCH 10/18] - added new aggregated sim data set - examples for as_exp_df --- R/agg_sim_data.R | 32 +++++++++++++++++++++++++++++++ R/exp_df_helpers.R | 29 ++++++++++++++++++++++++++++ R/sim_data.R | 2 +- data-raw/create_data.R | 17 +++++++++++++++++ data/agg_sim_dat.rda | Bin 0 -> 7388 bytes man/agg_sim_data.Rd | 42 +++++++++++++++++++++++++++++++++++++++++ man/as_exp_df.Rd | 30 +++++++++++++++++++++++++++++ man/sim_data.Rd | 3 +++ 8 files changed, 154 insertions(+), 1 deletion(-) create mode 100644 R/agg_sim_data.R create mode 100644 data/agg_sim_dat.rda create mode 100644 man/agg_sim_data.Rd diff --git a/R/agg_sim_data.R b/R/agg_sim_data.R new file mode 100644 index 0000000..7c36867 --- /dev/null +++ b/R/agg_sim_data.R @@ -0,0 +1,32 @@ +#' Aggregate simulated annuity data +#' +#' A pre-aggregated version of surrender and withdrawal experience from the +#' simulated data sets `census_dat`, `withdrawals`, and `account_vals`. This +#' data is theoretical only and does not represent the experience on any +#' specific product. +#' +#' @format A data frame containing summarized experience study results grouped +#' by policy year, income guarantee presence, tax-qualified status, and product. +#' +#' @details +#' +#' \describe{ +#' \item{pol_yr}{Policy year} +#' \item{inc_guar}{Indicates whether the policy was issued with an income guarantee} +#' \item{qual}{Indicates whether the policy was purchased with tax-qualified funds} +#' \item{product}{Product: a, b, or c} +#' \item{exposure_n}{Sum of policy year exposures by count} +#' \item{claims_n}{Sum of claim counts} +#' \item{wd}{Sum of partial withdrawal transactions} +#' \item{av}{Sum of account value} +#' \item{exposure_amt}{Sum of policy year exposures weighted by account value} +#' \item{claims_amt}{Sum of claims weighted by account value} +#' \item{av_sq}{Sum of squared account values} +#' \item{n}{Sum of records} +#' } +#' @seealso [census_dat] +#' @name agg_sim_data + +NULL +#' @rdname agg_sim_data +"agg_sim_dat" diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R index f106558..8aabec8 100644 --- a/R/exp_df_helpers.R +++ b/R/exp_df_helpers.R @@ -76,6 +76,35 @@ #' @seealso [exp_stats()] for information on how `exp_df` objects are typically #' created from individual exposure records. #' +#' @examples +#' # convert pre-aggregated experience into an exp_df object +#' dat <- as_exp_df(agg_sim_dat, col_exposure = "exposure_n", +#' col_claims = "claims_n", +#' target_status = "Surrender", +#' start_date = 2005, end_date = 2019, +#' conf_int = TRUE) +#' dat +#' is_exp_df(dat) +#' +#' # summary by policy year +#' summary(dat, pol_yr) +#' +#' # repeat the prior exercise on a weighted basis +#' dat_wt <- as_exp_df(agg_sim_dat, wt = "av", +#' col_exposure = "exposure_amt", +#' col_claims = "claims_amt", +#' col_n_claims = "claims_n", +#' col_weight_sq = "av_sq", +#' col_weight_n = "n", +#' target_status = "Surrender", +#' start_date = 2005, end_date = 2019, +#' conf_int = TRUE) +#' dat_wt +#' +#' # summary by policy year +#' summary(dat_wt, pol_yr) +#' +#' #' @export as_exp_df <- function(x, expected = NULL, wt = NULL, col_claims, col_exposure, diff --git a/R/sim_data.R b/R/sim_data.R index e15b61b..45f4575 100644 --- a/R/sim_data.R +++ b/R/sim_data.R @@ -43,7 +43,7 @@ #' \item{av_anniv}{Account value on the policy anniversary date} #' } #' - +#' @seealso [census_dat] #' @name sim_data NULL diff --git a/data-raw/create_data.R b/data-raw/create_data.R index 0fbb36c..da3ceb6 100644 --- a/data-raw/create_data.R +++ b/data-raw/create_data.R @@ -23,3 +23,20 @@ source("data-raw/simulate_data.R") usethis::use_data(census_dat, overwrite = TRUE) usethis::use_data(withdrawals, overwrite = TRUE) usethis::use_data(account_vals, overwrite = TRUE) + +agg_sim_dat <- expose_py(census_dat, "2019-12-31", + target_status = "Surrender") |> + add_transactions(withdrawals) |> + left_join(account_vals, by = c("pol_num", "pol_date_yr")) |> + group_by(pol_yr, inc_guar, qual, product) |> + summarize(exposure_n = sum(exposure), + claims_n = sum(status == "Surrender"), + wd = sum(trx_amt_Rider) + sum(trx_amt_Base), + av = sum(av_anniv), + exposure_amt = sum(exposure * av_anniv), + claims_amt = sum((status == "Surrender") * av_anniv), + av_sq = sum(av_anniv ^ 2), + n = n(), + .groups = "drop") + +usethis::use_data(agg_sim_dat, overwrite = TRUE) diff --git a/data/agg_sim_dat.rda b/data/agg_sim_dat.rda new file mode 100644 index 0000000000000000000000000000000000000000..fb07a3bf747abca566b2d5e9cd9e4a4ef7409af0 GIT binary patch literal 7388 zcmV<293$gGT4*^jL0KkKS>s5XcmNXBfB*mg|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0 z|NsC0|Nr0<-#&f!r{6w4XD@x95CHk*_sQ04&gkagJp;|>u5CA6ySC2g!(h&7o!@q~ zZ+Cmxxn<{tQ_xK&Kxv|SnKGZGC+RYtO)_F?Y)wx|>7kVNjY;OF)YQ#2HjE~w$u!aG zHkxGhJyTB#dY`G9n5M~;*)hQN4NoRajEn%J{WUz987aIcp&pGB0%X9-V@*s6wGB3!HBSIk z{ZCC2dOVY0lWitQ%}i=~o=irG>UyW?Ptl5nQ_xK`1klMEdQA-`@l2kCG@5=*GMP4} zN$f_c>7?5v$)}KNU}UGM`cu)Q^lD;ZkdJCLJrU@bXfzr&keR5_h7qTv3FtI5PZJ@d zB6>AF8bTovPh`ks29wo2Q}n0elPBoa9?>;C6nPQl81)nKL~NRw4=L(4rjHcUBhq^+ zF{YE+nreEQ1IONb^g*L)15G_Z zJx7XZp`i5uGCf9y)HDDdrk+rGgHKQdi6l)lDW{~<(4L}tdZ)Ed6V${b)bt4UQawkh zlWJ&fCyIJyKU8=~c~25yH1$0b(^0aUBWgy;l6sz#XxURt(-iiSJery5Ow{t8)gP*P zH8!V{Qz7Ps57h^h(Dgku#T!FYonHlvWSlUB;vw;z4X+P@ETfn)aYEK)N|qZD4k8;+ z;ll>o5@P|moSSY_+=}LcuNt6>4ZPb)NuW=_?s-*rcRuRWHKH0rz<0tKLLn@a0si1-fQy`KITe1COYK@US zeBRg4dP?pDc~xt8ocOe{e9V>T?XN9$GH&^^I*bEb-4(*Yj_vto_%O_xOPMbGo<2HB zGAjH2?ao-oX7N|K7d>gZD(cN)r(I8hou%ADxJ{k|9zFG59c@AN9Z_Z~SEp6mma%#TZq@E5Y>>?yXp|@{q zF$nhk5XnCO_U+eVFROl6yKmB_XnmN<&OfeFHT{(@uZCsuxmb56qYb~fVeIVlyfZHn zb=RnV85my>1iQ+>Engf6xk#K_o=tAp2jczTiV*uv5+}QG~7zsKGz_iON zz`)K9>@meX91>l+Qy303f_NMf^pKpBAvQ1@+`%XkIsA11B%okR6B&gq&^eNynoqzC z36teMDKa=^NDN~lk>oOnLIqWhq z7BkC?1O~NU(SuRQGg5(O2gIK7x(gdKSt{$rSwLcDxMXW(F=PY;AtoIBs7@&Ol+_h< zuJ22;mFBkhTVt`ewyyGf4X?VL>1*`)S6JVQ5uxJQrF{Z$5Ltrkr z`e|$Wwq!k&XVZ6Fq?XksTaYtB>*x=A6~S%aW_64#8V*7_!>$$VU!+q88?FepudS66 zuy+_gj57Mh3#d5+CcHx3fQQ4mH+->0)iYIX4Bxu}01zMo9)r5LT}eko9p8mS5F?45 zg@Ar_P2-Q%bDVB9wLkXa_aA?6rK+hwBWvVFBIBXTW;J_8Fy!IMMv4k3mAwBoPssG3 z@Ln9jM^zjsPyi?hT0sGze|UYFW&FJIyoK=T-t#+pdl8cq9ScFh6Al0X6}^efF`YS9 z;{3PK`Vi2*atY!s`>(&&joCs#01$Dm{^iu-Zn8VvUMJ`L&f0nc<(Rf?^__mBX*mec zJ@=XPq$E<@buvjFyX7&$s0MWBs&cvKFhZG=&OePZvGFPesYH9|1 zg$Wv>gH%+WX#i@P)Im0mQL6N{eBs+>Lazm>JRngF~i#M1}_1$#&YUN;`+WnG!hX1g3vXsGpc zljl#F#*Uy|(*zh+04$g>gK4ywCfGK?At-1KivU1mSdtKGJLo*NgSlx1Cs_eKr$Pf~ z_rC;^LNh2yh(^3gq}ZXAO(cUZ(dsZjiY@yvgLK3AFuPIVNm=mWXSaI(c@qF4t{f22 z+EshXf@PKVu(ANfPJ{Ko1>xVtYI6jGnrObXNr!^VcMp8m*J!!#BGr@HuzP)uvu`k~ zS_7YA1A|Mb(@b^TzRQW%p&Tu7g6j z-3v1US@%XjEBJ1CXBm=MgRyA z7?pvl;#y#LT}8SAj-ZNYsbBzsOdT#2K1EiKInv#9ju3+ehQ|FGv4vE{k&-Gonf*0n zbsCjI5Ioc$&ct4sR#N8Ja}dx?N-3W~bHdrjFNGVs?3Cfc32De7-7AGItjg_z1#;T8 z%5K?SBe*hX3qkm28}@zpx6a_jnY@IJtT+H_i)kTd#o+ju@8OK&%#`wyM!TU=fImla z&{D~a000BL&5ZrwXc7(Ed%D(DmnV(sHg=1fH{Puh9Ih60`3nk2c-(_;Wu{c+XMO|( z3WL3_&d+8I+wEiio3+cs$W>nLmBGjAXbcC?&gpgw5Y>e84xXJmih@e`q5}+8px1mX z#b7~tp|)7ds_t>CekfiUj#hxYS!SDD9<~x`Z73GDLt$~M6UP$FOA>CTyR=Z2 zPS(cZVPa^M^I1kZu9#(eUk&9lc4w|DD=Aw{$}gLHCHp#BBzBBT?l*hnG)wEp_=O7A ztce)#h%GOLVj7Us>3ZkjN27>&b}&L*DplpvgauzI zMY9R{@4BGoIIh?1|pq6>Y&8*_#LC!)a zMGh>*74aJ73WG2&AqVg35~}mY5t86wLQ%O8gtJ2b)Y>@h)7EtbzoZW(8xfgDce8 z%S_dC>G05JUdomxej~kYpr*ucYPw!WNqeE&{c4F`UU}-Ct)DjU3f~%IT~2C7d6b*J){pz+?QAeLMjn{_`aH+-_>>9Bg-Rq(duE($6Qs<93Z5}ncti!$ z=>VvYES+2($2Z>|&$YxV=onDjht3|DY^xYG#tsE&w0A=taR0RJ5{1nHB|bU5>8lw= zBq61A6|)iAjPjV(o{VydgYRqK%TZy@O1LjZq=ScP@~fdM4Bv?2a1n3eQu@AI}25T<7rM+l4ss$si(+b{wis+8|nM5u-Iu1K04|IPb0n52z zwqTxTb-GN)hZ0{Nil*cpHM!kq2LHO3k!|a*g!EiJ$WHI7ukt|{$vXAfHni#5osUM3 z9uc{`FFQo5qOb@^8lS5n=Cd9@AA9X#3W$%_N$ahd9DAhVpL}U`Qf#Z(XLKpw{5BY> zLiKIVSk0g0%vMbgIM)fuo#a59T_aGzkABEwHEe{?#@iWd3KMONY)z)n+kIRpZ6=u0 zX|b`kHri$VkT7kw+iODLOzqqzG%>W;*wNECV+PoU+iYz%XkrshxANg+q(W^r+ii?s z*w{8UG}G+iN1p(-&Rvs_0h$RwS~Q?(HXZ)kH(NE6p4%Ur6`!&eS;zRQvYP?sLgli- zm<+sAyW=#Exo5Yyj*8Ye7~6ulBg>3ig~S!8><1+yU-XxL_h;3X*mWa0(Ev=hdTT=6 z59b&=ADY#&v(d?ZJ3sFZF_$r0|2x=`f_MDC+mTzBv5;gMw$bCgZFKA- z{`e@!usCnD@nXA+A3H^FZ+@JPWygJjo1kf8{4G?xj)w;$3Hcz%X)uyLOIeIO8P2O) zEsNP}V@|`7(fV>)tkigax;QOmxG|D-x*pm`0(Tx&48ctu*MS?3WD0WyN1}kWXRgxA zGzoRAOEAQMvtV@A9e=6G+|Ih&RIjy(yK~(J8tuQKkNvuAbj!c-wSR@Of1L(v89Q|< zu(a_NK)01~?IU-F-Z8k=S5$MSGjcJ>EU9Pt;-2cYHi| zf6h*b&fhY;tixX(njd}1Y^(ZKOX~e3U~H5@hc{+jWImeDYrZNsR?4l%^uPwiQB_h< z*!_ann!_vWaMc6st9^RlYge`WIu;+@F+CL9{#0{Tn!kJFn{u|fl47xv?>EciUOwxS z;{7X=Tea65Yf}Ds>I3-8mw47tc0{Di>iiSZQ!v~1&s5zfgAXV`?X=fSNn>AhZF_K0 z`d@o|l(+cm^}O9|rs^+R|H_tIYd$*&F6GGywde4vDh$4tnR~5ujx&b@jbx$kN)B2B zyWwMyi|LJKGfwP4t&%z$I7ZxscL>r^U|DAUH}qW&X{j35%pO{_mE#lYh0ssiue(>0 zj?>@_VxoFti))AVbL%FR*O&3W$`%=_*1j_yVdb@`kugioRZpXikl0x6x9fX=!7H89 z`}eiDMf30#3>|vn#_{x~5})oStZ>4H?j_|KE=Xb=L-3}C8wBjBR`$0yp9P2zzXb+zYu_phkqoLzzK<2>qQVNSoDeV0ByA3T<3EFi&ndTWj$*_#?F5F2vV2N&;l;OqQhYdltLT*sv+lsPa^F4)!U5Te z?;TL4hYxP=587ABNG=E)CRy!orrWf94m`R&4{Wn)c2hsr=WwO|dR&^=B-3V;O3)3% z?kYT&%gCWt&>Zx0OH9?Rz-!lE>R)j)(B4ntGblwe4iMQvs?o!c-#d%5-l2a(0=xemKAzEO<0T^q; zpc6>9K07)7Beza(g_dXZblyX+CCZU8IlfRt_A$?qq!*>dJU#(Dt1hk-O$ z{2GGj31$2BxMz!;Ii67CRk$0qhcG=dFmSG&_F*PhW1{?O#VJYSuekkIqGd(#r6LOL z^NlQjRG(-+?U9?91LUxwk$a2)SsmpV)$t2LFwvj1xi6Q{J3?NAtD+YAiilgf@rBH|le9^inCjR~Ghk&N zFMX5FV5~lMR$yR4hhpth12tZ1APRrYmD#+ASVwyQEeX#ypHDT^^{o7Jg~4+-kmW(< zY1=XS&#C^!AzaqW&)B{&$sqImS7K-6@F6(oqZ%L{&KqXaYsW0cdZp^G17dm63}CD+ zFP=SFaPqs$@4+_S$`3<}%gW$wI}e8c*oFlQxNPw@lXiTz+qQfy28axEQNGt*iyY2O zLLtzo=1A!NDO_Fg-ROJJ>OKI_?H31B?cir<^=)!N^}E1Q$^EnfiM%r95KKKLi~I8r6w zoDXBxZ)}xMFrs-%1-iod-EpR1N}~uqZe38MtJ*JNkCnqy8ok}!f6d3wDj?n2QxwFZ zti%ZL)^<51FC1}nU}-GmYgl)T*_o{W$KBUi1&~#PCq7)@Hl*=>87Vkf-Q~9@arzQwyuGo@|LT}zrL`6 z*bDeUl=dKT3^Df7KDPa+mqb~rvlQoncZOU$}DZ8Ok22S#L}k-Lgh@AFBbsKP`^zqyxn^wMCwr zM{C0uwsQw6J5?#t936m_vJ*xL@b+t8-y7f3_`j4!th>~iL4)BK^#!6&38(WM5P$Nr zxutbyEBxqSS@p-OSmvtxZ>Sh^^vYQT2C@xS4dX!O@h(4}fqF2|_>f-L0f26TFSmk- zo`}L!wmAC8vY611y%?9jvNb2A+oCuVuDt8t)01M3U82vUn=>NnTiS{*>*Tg;L*?+| zsOF_jyGWmSRfDIxhDxF}#LdeO?D*#==_q$JM^<6Q-^buaI?h%0-0t&}K{@_0(T@pbh{4 zIl1E%GC3&Xl$-Bx*hiY*OfyCc1q9XjQ6BFDl8n` zdDR)7c9J!0M1|yxAXtzLc1+x4<&uHQQJNJxQ;P-$bwH0tszrpJzrYwCi2J#}F!!y( zQW*f4*xpJpw|wG0LmRSWvm0kn<+*505J3oe6YJ0B$ph$m=US%qDu=>!2YnsIv8oCL=v7y>La49!K(AsDP4rcxT>Rp9F>@1>-yBj zGag6ce7F{zSqk#SC{PH8+7Np#X)=GTxRKO?m^@gTZ~8caba=kfnEgv44(Zdz8QtqHbycOa-W1MEufGgYh8X6*i_HSniky%x7Je383KdCl z6dfRn-hfN;!UX6@KsGN7=iO5nA*Bc{bCVqxuhO0{U(ghq?S8J}mZ8}{`V#^6NQP#! z2*4gj$?ZE5e7(LF#R)CZN;WITJG@{StbGWdaS6 z0ss?@S-WA22Rv@&P{%jd5w>$hY(fecQ59Bc>c&roGxK zke_JvK2B*Zf>r^Ua!yTlZk&0EVRD#iijis#kU<3Nw^m%}B9ABqWjafLuGF5eXD@Vg z_f>b1SHF-xgU2jhd!y2AD)?0RK4wuXrI1U>=uW-j?=CTviRoBNroCQ0;V8YDWkrBE zgPzE%QP@hw#?QS{=y<){oeTV54wdxu@!RRmCk8*u|Nd=ub!cG6j{Z^-qYKoHY`=(ZQlZUEcb z^3*L3uJW3vAh2soJU2ob8jBm0#JZu=J$7~{LcXW>dg0MzRDF8FktCJq=i6Z-rZe8duCJI^o(sbnrlT-g94jGrv)6irqJu#rX2SmcyQ_m>cA#1yu5Ds4Cky^t&Iu(8=tDvt zn(6_3HYJ7Dz0M8gpIUnKpF$mcVir5=l%#;-X%38p#QFbL)0LGlXKu3>V+We!gw^54 z2l=X&Tk5bX^(k7Xp&`L98xZ9yW8YUOi3WX~Y}vI4lH>kIw54n7MLc7;w{8`aE>)-h Oi@744C`fUnO}qdD+D=&j literal 0 HcmV?d00001 diff --git a/man/agg_sim_data.Rd b/man/agg_sim_data.Rd new file mode 100644 index 0000000..06e549c --- /dev/null +++ b/man/agg_sim_data.Rd @@ -0,0 +1,42 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/agg_sim_data.R +\docType{data} +\name{agg_sim_data} +\alias{agg_sim_data} +\alias{agg_sim_dat} +\title{Aggregate simulated annuity data} +\format{ +A data frame containing summarized experience study results grouped +by policy year, income guarantee presence, tax-qualified status, and product. + +An object of class \code{tbl_df} (inherits from \code{tbl}, \code{data.frame}) with 180 rows and 12 columns. +} +\usage{ +agg_sim_dat +} +\description{ +A pre-aggregated version of surrender and withdrawal experience from the +simulated data sets \code{census_dat}, \code{withdrawals}, and \code{account_vals}. This +data is theoretical only and does not represent the experience on any +specific product. +} +\details{ +\describe{ +\item{pol_yr}{Policy year} +\item{inc_guar}{Indicates whether the policy was issued with an income guarantee} +\item{qual}{Indicates whether the policy was purchased with tax-qualified funds} +\item{product}{Product: a, b, or c} +\item{exposure_n}{Sum of policy year exposures by count} +\item{claims_n}{Sum of claim counts} +\item{wd}{Sum of partial withdrawal transactions} +\item{av}{Sum of account value} +\item{exposure_amt}{Sum of policy year exposures weighted by account value} +\item{claims_amt}{Sum of claims weighted by account value} +\item{av_sq}{Sum of squared account values} +\item{n}{Sum of records} +} +} +\seealso{ +\link{census_dat} +} +\keyword{datasets} diff --git a/man/as_exp_df.Rd b/man/as_exp_df.Rd index 7546c45..dbc6178 100644 --- a/man/as_exp_df.Rd +++ b/man/as_exp_df.Rd @@ -120,6 +120,36 @@ claims are pre-weighted. \code{target_status}, \code{start_date}, and \code{end_date} are optional arguments that are only used for printing the resulting \code{exp_df} object. +} +\examples{ +# convert pre-aggregated experience into an exp_df object +dat <- as_exp_df(agg_sim_dat, col_exposure = "exposure_n", + col_claims = "claims_n", + target_status = "Surrender", + start_date = 2005, end_date = 2019, + conf_int = TRUE) +dat +is_exp_df(dat) + +# summary by policy year +summary(dat, pol_yr) + +# repeat the prior exercise on a weighted basis +dat_wt <- as_exp_df(agg_sim_dat, wt = "av", + col_exposure = "exposure_amt", + col_claims = "claims_amt", + col_n_claims = "claims_n", + col_weight_sq = "av_sq", + col_weight_n = "n", + target_status = "Surrender", + start_date = 2005, end_date = 2019, + conf_int = TRUE) +dat_wt + +# summary by policy year +summary(dat_wt, pol_yr) + + } \seealso{ \code{\link[=exp_stats]{exp_stats()}} for information on how \code{exp_df} objects are typically diff --git a/man/sim_data.Rd b/man/sim_data.Rd index ee12973..b202945 100644 --- a/man/sim_data.Rd +++ b/man/sim_data.Rd @@ -63,4 +63,7 @@ does not represent the experience on any specific product. } } +\seealso{ +\link{census_dat} +} \keyword{datasets} From ff92121f9085a21b33564807fa07332a91d2a485 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Fri, 24 Nov 2023 12:41:41 -0500 Subject: [PATCH 11/18] new class contructor for trx_df --- R/trx_stats.R | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-) diff --git a/R/trx_stats.R b/R/trx_stats.R index 829f0e0..043880b 100644 --- a/R/trx_stats.R +++ b/R/trx_stats.R @@ -348,14 +348,32 @@ finish_trx_stats <- function(.data, trx_types, percent_of, relocate(trx_amt_sq, .after = dplyr::last_col()) } - tibble::new_tibble(res, + new_trx_df(res, + .groups = .groups, + trx_types = trx_types, + start_date = start_date, + percent_of = percent_of, + end_date = end_date, + conf_level = conf_level, + conf_int = conf_int) + +} + +# low level class constructor +new_trx_df <- function(x, .groups, trx_types, + start_date, percent_of, end_date, + conf_level, conf_int) { + + tibble::new_tibble(x, class = "trx_df", - groups = .groups, trx_types = trx_types, + groups = .groups, + trx_types = trx_types, start_date = start_date, percent_of = percent_of, end_date = end_date, xp_params = list(conf_level = conf_level, conf_int = conf_int)) + } verify_trx_df <- function(.data) { From d5cb16d3fc02bd46f9780d342a8c01016d8b5066 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 07:54:20 -0500 Subject: [PATCH 12/18] doc update --- R/exp_df_helpers.R | 5 ++--- man/as_exp_df.Rd | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R index 8aabec8..3c99800 100644 --- a/R/exp_df_helpers.R +++ b/R/exp_df_helpers.R @@ -1,6 +1,6 @@ #' Termination summary helper functions #' -#' Convert aggregate experience studies to the `exp_df` class. +#' Convert aggregate termination experience studies to the `exp_df` class. #' #' `is_exp_df()` will return `TRUE` if `x` is an `exp_df` object. #' @@ -55,8 +55,7 @@ #' @param col_n_claims Optional and only used used when `wt` is passed. Name of #' the column in `x` containing the number of claims. #' @param col_weight_sq Optional and only used used when `wt` is passed. Name of -#' the column in `x` containing the sum of squared weights (also not multiplied -#' by exposures). +#' the column in `x` containing the sum of squared weights. #' @param col_weight_n Optional and only used used when `wt` is passed. Name of #' the column in `x` containing exposure record counts. #' @param credibility If `TRUE`, future calls to [summary()] will include diff --git a/man/as_exp_df.Rd b/man/as_exp_df.Rd index dbc6178..130711b 100644 --- a/man/as_exp_df.Rd +++ b/man/as_exp_df.Rd @@ -45,8 +45,7 @@ The assumed default is "exposure".} the column in \code{x} containing the number of claims.} \item{col_weight_sq}{Optional and only used used when \code{wt} is passed. Name of -the column in \code{x} containing the sum of squared weights (also not multiplied -by exposures).} +the column in \code{x} containing the sum of squared weights.} \item{col_weight_n}{Optional and only used used when \code{wt} is passed. Name of the column in \code{x} containing exposure record counts.} @@ -76,7 +75,7 @@ For \code{is_exp_df()}, a length-1 logical vector. For \code{as_exp_df()}, an \code{exp_df} object. } \description{ -Convert aggregate experience studies to the \code{exp_df} class. +Convert aggregate termination experience studies to the \code{exp_df} class. } \details{ \code{is_exp_df()} will return \code{TRUE} if \code{x} is an \code{exp_df} object. From 2f7d48234d7168511e7971138f6e06f7861ebf06 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 09:50:03 -0500 Subject: [PATCH 13/18] more transaction columns added to agg_sim_dat --- R/{agg_sim_data.R => agg_sim_dat.R} | 20 ++++++++++++++------ data-raw/create_data.R | 6 +++++- data/agg_sim_dat.rda | Bin 7388 -> 9778 bytes man/{agg_sim_data.Rd => agg_sim_dat.Rd} | 23 +++++++++++++++-------- 4 files changed, 34 insertions(+), 15 deletions(-) rename R/{agg_sim_data.R => agg_sim_dat.R} (70%) rename man/{agg_sim_data.Rd => agg_sim_dat.Rd} (72%) diff --git a/R/agg_sim_data.R b/R/agg_sim_dat.R similarity index 70% rename from R/agg_sim_data.R rename to R/agg_sim_dat.R index 7c36867..1488247 100644 --- a/R/agg_sim_data.R +++ b/R/agg_sim_dat.R @@ -12,21 +12,29 @@ #' #' \describe{ #' \item{pol_yr}{Policy year} -#' \item{inc_guar}{Indicates whether the policy was issued with an income guarantee} -#' \item{qual}{Indicates whether the policy was purchased with tax-qualified funds} +#' \item{inc_guar}{Indicates whether the policy was issued with an income +#' guarantee} +#' \item{qual}{Indicates whether the policy was purchased with tax-qualified +#' funds} #' \item{product}{Product: a, b, or c} #' \item{exposure_n}{Sum of policy year exposures by count} #' \item{claims_n}{Sum of claim counts} -#' \item{wd}{Sum of partial withdrawal transactions} #' \item{av}{Sum of account value} #' \item{exposure_amt}{Sum of policy year exposures weighted by account value} #' \item{claims_amt}{Sum of claims weighted by account value} #' \item{av_sq}{Sum of squared account values} -#' \item{n}{Sum of records} +#' \item{n}{Number of exposure records} +#' \item{wd}{Sum of partial withdrawal transactions} +#' \item{wd_n}{Count of partial withdrawal transactions} +#' \item{wd_flag}{Count of exposure records with partial withdrawal +#' transactions} +#' \item{wd_sq}{Sum of squared partial withdrawal transactions} +#' \item{av_w_wd}{Sum of account value for exposure records with partial +#' withdrawal transactions} #' } #' @seealso [census_dat] -#' @name agg_sim_data +#' @name agg_sim_dat NULL -#' @rdname agg_sim_data +#' @rdname agg_sim_dat "agg_sim_dat" diff --git a/data-raw/create_data.R b/data-raw/create_data.R index da3ceb6..452c32d 100644 --- a/data-raw/create_data.R +++ b/data-raw/create_data.R @@ -31,12 +31,16 @@ agg_sim_dat <- expose_py(census_dat, "2019-12-31", group_by(pol_yr, inc_guar, qual, product) |> summarize(exposure_n = sum(exposure), claims_n = sum(status == "Surrender"), - wd = sum(trx_amt_Rider) + sum(trx_amt_Base), av = sum(av_anniv), exposure_amt = sum(exposure * av_anniv), claims_amt = sum((status == "Surrender") * av_anniv), av_sq = sum(av_anniv ^ 2), n = n(), + wd = sum(trx_amt_Rider) + sum(trx_amt_Base), + wd_n = sum(trx_n_Rider) + sum(trx_n_Base), + wd_flag = sum(trx_amt_Rider > 0 | trx_amt_Base > 0), + wd_sq = sum(trx_amt_Rider ^ 2) + sum(trx_amt_Base ^ 2), + av_w_wd = sum(av_anniv[trx_amt_Rider > 0 | trx_amt_Base > 0]), .groups = "drop") usethis::use_data(agg_sim_dat, overwrite = TRUE) diff --git a/data/agg_sim_dat.rda b/data/agg_sim_dat.rda index fb07a3bf747abca566b2d5e9cd9e4a4ef7409af0..77c106efe744854a94548f51f21e432a00f657fb 100644 GIT binary patch literal 9778 zcmV-2Ce7JGT4*^jL0KkKS%FQXYXBQP|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0 z|NsC0|Nr0`|2};BkGqe#@Zk@4zEs|LfCX2c`LXw(ef9C`eRq2LyZ`_fxk~!k_ipO8 zW?Q>Wp|;+-Uh{Qnjjwktsn+f89=!Hj!4%Ljxq@R;Unlxma zLU^8&BU90-vYtaJ`cu;+-h=?rs(V78k?NhTz6DOb=n^ix^Oq!kp zL8htg8fXMeQ}P;3sLFmOhNq=GLt!#ZXf-uIAkm1S6v+chda3x+N$D^NlWJixs(NEV zPbz<)28o868URcaQ!yD$Pt$2B;K@(ZQ)JpHWGnd+uvC#4t(MM4myi zQ#Cv(`jcvUgj2*OlTAz!N0i!)Jv4*RX{7R+Xk_%tdY-19nwc_a(Wb;rJxpn+(At|O zqh&P2WZIiiiLjcUnnp*Y^qz#xO)@miH5iPWLs6zrNv4@INNr6TZKzD1ph%KL$e5J# zMACRor1eeeZ5mBaO{7f7nWl{s8kz+$nYcTvJFuCRZ?va9SQH*kaxUo$94-YkwiX*y8t4h_Fx~mT(*xJbrD;JgVt!Pf{-V;SKy?!XdU4L_2*uccRmAZcJ z4u+ahbxC@y%KBNI(=US`Q$@0GNi<^idX#<*N?0c{v3PvAL~wH5wC)$o-`;~8y?cPd zX%Y*r4sWW{Gfo$nxvcdb^3225ZbmhZ3}OR6$Ycl}k0~RZ4K`WnvffCaqBr z%*|6a>JhA>VIC#n>GH3h&#%iS@LFNFJe6Biw!~@*DHfT+L^7kJTf6ny=0AIoh+^rm z(|9AUiV4XlXNU?a}w!t9PjV zsAcwC-j2Q~L~)v1(IK9R&3z2eP=RdDC!fStr{W8;i=I951R*qc!+t15G!!F+b zFgfY%53)qOXZ=2sG*?hohifo41_!w}1d7z%mgb31wL_n6qknNdhTCX$a3iB%zWcxwdoCQx>y< zB|*^9+05eLQ&1oh5h74&8hNBOIboc55DHzcO5y_5 zMVn-Rf(UApB8eMP2nI^GU(hyz((IX@Fdz_6U`jQH7+FGS1OtHqr&xDyUHcA9K^HT^ z0s)(~S6SGytC*e9;ACcvUR%hwNcGX5l*c0T&OPc9OQ3D^R!|BA$RLM+sV9~U=vp?q zdb$IYaLmHDH5ZCY*UMF1(q=C`uPL;a3iFYt-B)t-H+5R9PU{_G%+{eKEE^IGQxLnh zB)L`s<=G7+p@5-)Hjp+UHc6$-Fqx8KCaZ*k$$=&!W}%JFWH!_<1aL_PRt6Yvk-J3jE5^I!c)JgvHligZ$Ldu3A%-*^TkoUi*x-S$zrZ?Y-7lMs62qK>$Dj z0bC#tZ|l-azj=0&mX0$gx;e!VIJtfH93HJ1eaHGVmTrEU=T~?&aS&R6MejEtcpEJN zmZ-IyF0#z4r|i4zsjkR(H!cUk&2t|B7CL-3H~`B;V+ z0AS?n2O1XH(;T+Ff<0EhX57J+({GiKxtYXrf@_jCgsw;6Xtck3KQ)|-^1uKrG*w&T zXhnP~oe!ZF(Y9I==DF_o#;gprWOGe*k3ffi}GFX8)@Y4w6AV7md zh$lAh9laN=sN6ST=@^GV@#8Km#1?{Z&FG*_&=EaC&`uGWgb6Mv%v&^q5DWthFbkGq zNeZh~eE$8R;c2V!{SlUB+M16iN>QIeXB!%XRImkvX#leT#)iSP3=?b{V1$$~N8o5c zh10YM0Pq>zG;Nr&j0-l8HK8gzydh9(O+%$FON?qOpLL|f;i$*?Bwd~h3n`SM}ZHCWMB7kXetRXip#nx9j&&fPuUpVql#B~hEbP@w7o6aGAoUu~h`b4uPv z>=N0D8|U&qEsz9OWhlBW`JCEWj4Yw=480c(RuRZ>4!XQzm^MKG<~WfouVJ5NC+dMS zbyu0;T#-!pXg*WGkn@>-7^Y_TM7QZu`(!Am;D%{l5Z9k(b6Be$^RfnZf*Jxlqqx`N zMgN*3FSZzmkAgt{UmjKDqA=UY; zg&FO|ZJ~V~D}u^9BG3Q|2wi5)2J;s_tfaR4rgwqUgtW7LQM zG9D|vY6@}2IzoG%V);9946cBZ0b9h?+qDJC{;oL!>AM`a=QQ6-y^Vtl-yuYqb_*Zp z5*w<;HMNuFrW+>}@@n0y?%a#{J|F=Ieh`8H0v3|YG6T_C#tgL$9{+0^OE~wQ+e5JB zrm|~q*TK+LVdQ1}EEJL2Qy$EXA6u%C@-we~E6sm6zOw)2Q&{siEN`JhuXxK?rtzS@ zvdxE`!;99D#Q5-K-SYiTP{@}Z0t~`TVC`P&kjC3|S`iEIg?C5xtn($*8cZxqE3~dl zrx1(IP=g{MHN25Kp>&1Bb5QEq$~D5vFT3_+8ig=|rMA?aV1SBAggV6&p5xV&5Sl-` z<%DsBup=T+LAw=&GY}yy)}bBno20sT1{=A!7)1vhR{~Bfy>zSY=G6-(St~JzWK{wo zhQEINq{KuDerHIhv#A7x+GcJ=3;2wW#95vU=gSPR_=GXzpg3`G(q#8i#V#!?5Prgl z6n`XlUfA~>r%p2i(>40lShxbLx}qaot?-E44w74|9j)?b6HMX-p#xwcS~^JCbGMRx zHFqVv2p0r0kuHccFj)RIZ4bBYE^C@ulZU3yjOT1R?2-rUWab8_wW}aBVgU8G4}ASz zK*I!8kPeCd9F1$sGcDA|$Bc}tET10x%}+JiUHu#>U4^19^7?nWJk|%!>ex z7l=>qL-Pb5&R%=zu!TNY*n&&>Ba|krL1&hLfLzxqAZV&i9DBs-^V-&}iwN-lVhuaB zriq%>x?B$0N}m~R?ag_^f9x&pc7n<-6)+d4`kAn2n%1)=+1BvH;N3FC`h0g|EsI{S zZ;Iw_bB_B18uoEBv~n(OQ1)~j9{Q`Jmc9-5f~IeI1O6C5++>x#)fSXNLT{lZ_yol8 zq*xf8zoW=&%4~v4xX7N`az%9>ZMSTejv@2$wGHYkS`HqJ z|3Ybm76MIoMH3vc)iEM*@_SVZPlUuf0y(?o$cwUXe>6yHKE@~ESskK;DEHpRtPax*nJs02Zo=0aji_m4@ z_t@^=eJ1Zc7v(x$a>JnV-66hYos2|UDNqI6lvVJb%xSx2g`Dw%$rHU6Nd*aq)I4F5 zh`<$RTyW%gM+kuh$d$ZVpv)=al;IxnFN>foa)c1OB}Arf@>7EMkkzfSLqsf8M#vp- z=AdPzvPlorSeK6bP9J|F^3v@+SU+Ok70H|TV^z}l6J))3K9h&5d=L2^J$xfV*UPQz zrMk?(ZC)+dyLNJUnAEcz_8#<9og^;x;hNbx-UcrhG5lW{X&1(7{Fn}VQwYy+I+=1W zpXWF-7a2wl>;l7RVSj+ikUE;7rZ9OlV_iv9Y5B;fxz%8*Q<)*<}!# zZM<|Wq=*L7ZMN9PhQ`p?*wY8mKu4DdT6Zpm!J}r0$&0~=NsI>4>~|U8sWGHv_S$d7 zcgcU52T;iPuDVINx=Qa7U_KTuRfTiQvZ3DEEZ%Xh6JDb)&&Xlx<6~UIaLemPvf}_< z9<8Nl`3J|f+dWIS<6}oz0^%Alj^NchNFCQJZnZxZ)a zv5#mzCTJa9-Z77+X?+j8Ykb3Q%xC(F$Mds3oMLy|h$AJb_gQCEPFVEkBJ6`UA0bL=_{ zykZ*sypFP7GuORi?ie2hf0-n$I=IsM&~I8gZR{$hs~whV19;T$Y6^B>9L&yW;Rm47 zN3_yN>zS2_4f^uE0_3;)qm%e*sdSaw>PNn(4LjFGI13>ubTHo zk@bT#5EojU@*--}ogm3`1ckKLF6s%6|b}zo=ZgH@BF!!$9c9mThmtIcBwo zXT9dgXPJw;XndMkH9rSs83zx>Jd-+8xEPV|XNg6)em+j0ky>BW9 z^GWi(Y8%F*1LJJ|h7HWv^}p)#8NZ;{S7uJ)SmNOO9RnY=lr$vOx7=%M9nC+gK9?n> zN_W5Pzor)l1luhqfmd2efB$Db9C7ltr(l7t4b;wWRoTY-cx*6EPc}o-(^APUC^Co_ z)t=elFB8WnyTx_oF?wUJm8$)9GMZF6ag`|JPK?cDRC^7I{O8nUe-;(%$#OITR*O~O z*f?VQ?t6Y|dB0+gevL|J#TL`IyxnV@m~Nq4Vok%{MX(pkMrp=Tq#kSxwFDSvOtDA) zEi7824{%e~mgYIY=KkL))W-WMm4Bz155dEmKlnt2lM~aPS3YPr-sfoj) z8E!;p@ZPk0fU?s|q;=^{yd4o{tLfh?J8UQ8pbE}IxY+yY@R4-g`td!^8ZHT~j?-qc z3{gR2Ke#YW5F!q&jC5MX6RKxu>FdK0&X2h>Y4#=KV|AgvPfcq33U;}gNkn~@6u*Sr z=3~64!lS;eqwGc9w_K-m;au}uL-g22PJp2JI|DfBmftL2rSOIYLp(i@@9qOd-B2glt?DO)nk^eq7h+57t^8MtQ!s$%B> z%TPu)bXjNj_s`2Fph^KsH_m&y|1;x2y~X2mmbi-^N?D~xSBch@K@;inq?u2=OqOB9 z=S&nvNpTn)v&Yo`d3Z{AXHM{8ONKZ0%3Xy8km8gXaPkYX#$KN3@>dgYZmfzw=M~D! zF>=C2AI7m4T)rmKL&ELzD6~7C-*a7FnfGeF?>7l2+BnGQ1JkM2=Y1m3miEZs;At$z6MQ6X;w%ORrC6-SQ_(om5^h$eU1g5NA=Radf zI@!%Ul^uQ<`aCB=Bv{&hoBIc*Q%=|s5!4IRko(+FU&}a6LYC7Pwhaz^SfbWN%^#rxxx6F ztM*!p!S=J@%$w~A4WIz^Fb%BT9kCQYs)tX}i9b-1VXKiw%5ZzT%JNQ+%Tb{uWfrfT zO74B{HRDF@=~*2|S7*R8sa*KF{9s+hW|S&(PYA8`5lS*M^b1(gNDY{c*kv#aENf!I z1;*!|>jiG?)E(@TTv>YngN`579@BLC;nu*GbXoz5g-&d`e|rc8y3OrFh zr<8N9ahr<6OC*!{Md^c8OA~v_PmjU))N-V^VLv#aILBO?-%km6Huh{mDi}erQ$Kz1XD3a9#Av4gbrjJIC8iyW-(kj8)$t z;c`A(WyNirg%0nQTA|R^)5-rf+wWV1$IbQSc_*?>F%{W#^NeW_eU;2Nji7)E3EfWM z=8oKOn?d1(z9 zfF~;d>pNMoiw&6!wEZg)j;l%F<%n(H zFmR+0)1qop&K8@9Q=IQ%kc<_6$Cr#thUN;~oyLaX;uYS}VZQ#y1j6pnYlxYDHl(j# z(;6GIPH&OlZq;cK;bLj9Y9J)eg!>KTAuqxSM9;Mvcf$eKPp%_CD-m?%72HG=+}1sI z2M1P9XBWS$;M2z1(BqZ9wPJL7PD}~bg^(7&`+C*maXS35vQ2^jZ9w`dPr;)g6QKiq zPB}K)cxrd(y7>Qo0MuzjZQFtZr06G@8e2huyJ3KJ1-%dkiqOMm!wnitatlHygcC`J z8@+1sjFA0>E#PTzWhsM#w}wauJgg3e@jh8&c=&^_(fU(JtYRrOgciZs=Tr?ZG$1#I zuq3__xstY+blutKJ3hwM=~P()uljO^2cRHwJHjA}gh*M5evz-EDDo_Yh=%3~F~J!D z!4A_Z{}K=+Rnik7Zb+USluJ+| z6k;@G*Yu*wBR7hvsR1{oeb9t60KM>EEXDgDf9Qk3fLHw12_mB=0F!bYl;wwzg_rm@kGXgEUCQC-rmECmW-wxA-_rY^TpA9Sb2HrQB4}ZVF0U5Zl2-h6 zPz_S!_8CptKRx2o|igi@XiO6K-|FF(I(@esCFh)ShHi9+ae?0yY+G;X1>&C zl0s^eS@AO%6D%1M%aquW z(^^f&j^kCb$8XG)TKZvW87C$nRq$ZpoUQAt5XaQjNu73|k&^0Vm^}u)i}md?co5XI zLl>U*-cg~dVz9EaG6yAW`}NLOxka>GdjuWa>{7kg5iL8i_77z9489ZN9&fq!p==X( zFnbGcpn0fpC|K;y7n54?)>bQgPucfkUZaM^O7H>j{l4D6OTCDCr9`xLeNc43Y1=u|sqzS9`T8&B84r}ty zMYpNH@bWc)1JIIfBu|paA-tmZhylAJ@C|%uukrrO6g!%dDkKW{Q?Tl0fyTp&#)*ro7W%I<4L=14LZOslFdHz?4k<}%a(9<-v7jj}DhjkxSzIhJwgtvK<3ORpq zYV9`8ilFQA-+|QI+zj4iEzx;r%BfjrWqYN}zx7^} z8BoK%!}jOqipH_s?;u{10>5#P`EKN@e|qyD7rw|k(Hyo(BeQdey8>j!ik)Cgev)O$gEMH=4A+2xd9G+y`wq-vYY2l6Zd_xpt zRVF>g43xO=A z4g;Nn0_1(YDIuuNJLMiA9fi9ejfRYSc8(qaIg|^HmaCOwSTWEMp5O%8R4A~d2`N|R zv94ra_%9M8s}S7t?2b2UPIu92rP06T+wrF4@oeE^UFwhp#Ud9!B$#GG22g{>%Dx7P zo9=Hjgr{m$K>3G1fhQEo_L6;dgo(5+WoLp4-MK-4 z2i6T&bYv<95nmbna~%IXb~?P;k`RFt;FLe$QgiMMJZsR+k`3w!&(B74my0m8E2~Iy zLfe4Lb!w+jc!91|_7`l+XGG#39RIBSn7~6iPJD_HQ4?LVZB`KCf z3%>M4fU>Q@;9TMHS_*msk8q);wd-${;69;lMpYEHC3w+BX!W>bQ*^_M`L}GMdI1Cu z|KBT#=qv6;=qvK4$giAj$gYeKv&PMmqs&2_JvtiF@`@rsKrlJn-MAbF6U7Za7it4|jAqOIEZx z-M3!+!z83g_Zb4FhR?4)lGDR!W&TLY&F|3FBpiKmC^}t>Gn~D|Dnut`bau|y#LcWP z5ms!W*4n)he>7oS_!Oo#PlKsI&+uTaG-nx^7gN80srtnHkuY*dBfl*^n%(=D_H@nv z7MD=KIy30~1+E=VV=oTOpg%NL;)ljaC=px6*5`YRWp6gEO>^?Na=6cH&GK&!iM9ql zLSV5^Fg-T57O6Y}auUv;4Uq(b0Fsu!(HsXkI{F?&F$oG62w4Sl+Adcbn#;$&WE>3o?PL-$|-cPExzqDe#@g$ zr6rQ*B|Jt4O-(gg{q;#lprW6pg<>zYbxx(o>x*;qb{&Vg?V57-Yfkx`JswNjwM+oR z++7AL=<$=6`JPLyP-Qzkz0D;>+U2d8LK{MR6O1yT`cO0n5x%U(a(ReQdrwHhJF$^I?N4m11d|Fz#1Qwz*UK=nK6u z>~=SXg`emN{&~1_Xljj8dp%lb^RUP;#6KYonuD|ODHB-w>ON(69`)s5XcmNXBfB*mg|NsC0|NsC0|NsC0|NsC0|NsC0|NsC0 z|NsC0|Nr0<-#&f!r{6w4XD@x95CHk*_sQ04&gkagJp;|>u5CA6ySC2g!(h&7o!@q~ zZ+Cmxxn<{tQ_xK&Kxv|SnKGZGC+RYtO)_F?Y)wx|>7kVNjY;OF)YQ#2HjE~w$u!aG zHkxGhJyTB#dY`G9n5M~;*)hQN4NoRajEn%J{WUz987aIcp&pGB0%X9-V@*s6wGB3!HBSIk z{ZCC2dOVY0lWitQ%}i=~o=irG>UyW?Ptl5nQ_xK`1klMEdQA-`@l2kCG@5=*GMP4} zN$f_c>7?5v$)}KNU}UGM`cu)Q^lD;ZkdJCLJrU@bXfzr&keR5_h7qTv3FtI5PZJ@d zB6>AF8bTovPh`ks29wo2Q}n0elPBoa9?>;C6nPQl81)nKL~NRw4=L(4rjHcUBhq^+ zF{YE+nreEQ1IONb^g*L)15G_Z zJx7XZp`i5uGCf9y)HDDdrk+rGgHKQdi6l)lDW{~<(4L}tdZ)Ed6V${b)bt4UQawkh zlWJ&fCyIJyKU8=~c~25yH1$0b(^0aUBWgy;l6sz#XxURt(-iiSJery5Ow{t8)gP*P zH8!V{Qz7Ps57h^h(Dgku#T!FYonHlvWSlUB;vw;z4X+P@ETfn)aYEK)N|qZD4k8;+ z;ll>o5@P|moSSY_+=}LcuNt6>4ZPb)NuW=_?s-*rcRuRWHKH0rz<0tKLLn@a0si1-fQy`KITe1COYK@US zeBRg4dP?pDc~xt8ocOe{e9V>T?XN9$GH&^^I*bEb-4(*Yj_vto_%O_xOPMbGo<2HB zGAjH2?ao-oX7N|K7d>gZD(cN)r(I8hou%ADxJ{k|9zFG59c@AN9Z_Z~SEp6mma%#TZq@E5Y>>?yXp|@{q zF$nhk5XnCO_U+eVFROl6yKmB_XnmN<&OfeFHT{(@uZCsuxmb56qYb~fVeIVlyfZHn zb=RnV85my>1iQ+>Engf6xk#K_o=tAp2jczTiV*uv5+}QG~7zsKGz_iON zz`)K9>@meX91>l+Qy303f_NMf^pKpBAvQ1@+`%XkIsA11B%okR6B&gq&^eNynoqzC z36teMDKa=^NDN~lk>oOnLIqWhq z7BkC?1O~NU(SuRQGg5(O2gIK7x(gdKSt{$rSwLcDxMXW(F=PY;AtoIBs7@&Ol+_h< zuJ22;mFBkhTVt`ewyyGf4X?VL>1*`)S6JVQ5uxJQrF{Z$5Ltrkr z`e|$Wwq!k&XVZ6Fq?XksTaYtB>*x=A6~S%aW_64#8V*7_!>$$VU!+q88?FepudS66 zuy+_gj57Mh3#d5+CcHx3fQQ4mH+->0)iYIX4Bxu}01zMo9)r5LT}eko9p8mS5F?45 zg@Ar_P2-Q%bDVB9wLkXa_aA?6rK+hwBWvVFBIBXTW;J_8Fy!IMMv4k3mAwBoPssG3 z@Ln9jM^zjsPyi?hT0sGze|UYFW&FJIyoK=T-t#+pdl8cq9ScFh6Al0X6}^efF`YS9 z;{3PK`Vi2*atY!s`>(&&joCs#01$Dm{^iu-Zn8VvUMJ`L&f0nc<(Rf?^__mBX*mec zJ@=XPq$E<@buvjFyX7&$s0MWBs&cvKFhZG=&OePZvGFPesYH9|1 zg$Wv>gH%+WX#i@P)Im0mQL6N{eBs+>Lazm>JRngF~i#M1}_1$#&YUN;`+WnG!hX1g3vXsGpc zljl#F#*Uy|(*zh+04$g>gK4ywCfGK?At-1KivU1mSdtKGJLo*NgSlx1Cs_eKr$Pf~ z_rC;^LNh2yh(^3gq}ZXAO(cUZ(dsZjiY@yvgLK3AFuPIVNm=mWXSaI(c@qF4t{f22 z+EshXf@PKVu(ANfPJ{Ko1>xVtYI6jGnrObXNr!^VcMp8m*J!!#BGr@HuzP)uvu`k~ zS_7YA1A|Mb(@b^TzRQW%p&Tu7g6j z-3v1US@%XjEBJ1CXBm=MgRyA z7?pvl;#y#LT}8SAj-ZNYsbBzsOdT#2K1EiKInv#9ju3+ehQ|FGv4vE{k&-Gonf*0n zbsCjI5Ioc$&ct4sR#N8Ja}dx?N-3W~bHdrjFNGVs?3Cfc32De7-7AGItjg_z1#;T8 z%5K?SBe*hX3qkm28}@zpx6a_jnY@IJtT+H_i)kTd#o+ju@8OK&%#`wyM!TU=fImla z&{D~a000BL&5ZrwXc7(Ed%D(DmnV(sHg=1fH{Puh9Ih60`3nk2c-(_;Wu{c+XMO|( z3WL3_&d+8I+wEiio3+cs$W>nLmBGjAXbcC?&gpgw5Y>e84xXJmih@e`q5}+8px1mX z#b7~tp|)7ds_t>CekfiUj#hxYS!SDD9<~x`Z73GDLt$~M6UP$FOA>CTyR=Z2 zPS(cZVPa^M^I1kZu9#(eUk&9lc4w|DD=Aw{$}gLHCHp#BBzBBT?l*hnG)wEp_=O7A ztce)#h%GOLVj7Us>3ZkjN27>&b}&L*DplpvgauzI zMY9R{@4BGoIIh?1|pq6>Y&8*_#LC!)a zMGh>*74aJ73WG2&AqVg35~}mY5t86wLQ%O8gtJ2b)Y>@h)7EtbzoZW(8xfgDce8 z%S_dC>G05JUdomxej~kYpr*ucYPw!WNqeE&{c4F`UU}-Ct)DjU3f~%IT~2C7d6b*J){pz+?QAeLMjn{_`aH+-_>>9Bg-Rq(duE($6Qs<93Z5}ncti!$ z=>VvYES+2($2Z>|&$YxV=onDjht3|DY^xYG#tsE&w0A=taR0RJ5{1nHB|bU5>8lw= zBq61A6|)iAjPjV(o{VydgYRqK%TZy@O1LjZq=ScP@~fdM4Bv?2a1n3eQu@AI}25T<7rM+l4ss$si(+b{wis+8|nM5u-Iu1K04|IPb0n52z zwqTxTb-GN)hZ0{Nil*cpHM!kq2LHO3k!|a*g!EiJ$WHI7ukt|{$vXAfHni#5osUM3 z9uc{`FFQo5qOb@^8lS5n=Cd9@AA9X#3W$%_N$ahd9DAhVpL}U`Qf#Z(XLKpw{5BY> zLiKIVSk0g0%vMbgIM)fuo#a59T_aGzkABEwHEe{?#@iWd3KMONY)z)n+kIRpZ6=u0 zX|b`kHri$VkT7kw+iODLOzqqzG%>W;*wNECV+PoU+iYz%XkrshxANg+q(W^r+ii?s z*w{8UG}G+iN1p(-&Rvs_0h$RwS~Q?(HXZ)kH(NE6p4%Ur6`!&eS;zRQvYP?sLgli- zm<+sAyW=#Exo5Yyj*8Ye7~6ulBg>3ig~S!8><1+yU-XxL_h;3X*mWa0(Ev=hdTT=6 z59b&=ADY#&v(d?ZJ3sFZF_$r0|2x=`f_MDC+mTzBv5;gMw$bCgZFKA- z{`e@!usCnD@nXA+A3H^FZ+@JPWygJjo1kf8{4G?xj)w;$3Hcz%X)uyLOIeIO8P2O) zEsNP}V@|`7(fV>)tkigax;QOmxG|D-x*pm`0(Tx&48ctu*MS?3WD0WyN1}kWXRgxA zGzoRAOEAQMvtV@A9e=6G+|Ih&RIjy(yK~(J8tuQKkNvuAbj!c-wSR@Of1L(v89Q|< zu(a_NK)01~?IU-F-Z8k=S5$MSGjcJ>EU9Pt;-2cYHi| zf6h*b&fhY;tixX(njd}1Y^(ZKOX~e3U~H5@hc{+jWImeDYrZNsR?4l%^uPwiQB_h< z*!_ann!_vWaMc6st9^RlYge`WIu;+@F+CL9{#0{Tn!kJFn{u|fl47xv?>EciUOwxS z;{7X=Tea65Yf}Ds>I3-8mw47tc0{Di>iiSZQ!v~1&s5zfgAXV`?X=fSNn>AhZF_K0 z`d@o|l(+cm^}O9|rs^+R|H_tIYd$*&F6GGywde4vDh$4tnR~5ujx&b@jbx$kN)B2B zyWwMyi|LJKGfwP4t&%z$I7ZxscL>r^U|DAUH}qW&X{j35%pO{_mE#lYh0ssiue(>0 zj?>@_VxoFti))AVbL%FR*O&3W$`%=_*1j_yVdb@`kugioRZpXikl0x6x9fX=!7H89 z`}eiDMf30#3>|vn#_{x~5})oStZ>4H?j_|KE=Xb=L-3}C8wBjBR`$0yp9P2zzXb+zYu_phkqoLzzK<2>qQVNSoDeV0ByA3T<3EFi&ndTWj$*_#?F5F2vV2N&;l;OqQhYdltLT*sv+lsPa^F4)!U5Te z?;TL4hYxP=587ABNG=E)CRy!orrWf94m`R&4{Wn)c2hsr=WwO|dR&^=B-3V;O3)3% z?kYT&%gCWt&>Zx0OH9?Rz-!lE>R)j)(B4ntGblwe4iMQvs?o!c-#d%5-l2a(0=xemKAzEO<0T^q; zpc6>9K07)7Beza(g_dXZblyX+CCZU8IlfRt_A$?qq!*>dJU#(Dt1hk-O$ z{2GGj31$2BxMz!;Ii67CRk$0qhcG=dFmSG&_F*PhW1{?O#VJYSuekkIqGd(#r6LOL z^NlQjRG(-+?U9?91LUxwk$a2)SsmpV)$t2LFwvj1xi6Q{J3?NAtD+YAiilgf@rBH|le9^inCjR~Ghk&N zFMX5FV5~lMR$yR4hhpth12tZ1APRrYmD#+ASVwyQEeX#ypHDT^^{o7Jg~4+-kmW(< zY1=XS&#C^!AzaqW&)B{&$sqImS7K-6@F6(oqZ%L{&KqXaYsW0cdZp^G17dm63}CD+ zFP=SFaPqs$@4+_S$`3<}%gW$wI}e8c*oFlQxNPw@lXiTz+qQfy28axEQNGt*iyY2O zLLtzo=1A!NDO_Fg-ROJJ>OKI_?H31B?cir<^=)!N^}E1Q$^EnfiM%r95KKKLi~I8r6w zoDXBxZ)}xMFrs-%1-iod-EpR1N}~uqZe38MtJ*JNkCnqy8ok}!f6d3wDj?n2QxwFZ zti%ZL)^<51FC1}nU}-GmYgl)T*_o{W$KBUi1&~#PCq7)@Hl*=>87Vkf-Q~9@arzQwyuGo@|LT}zrL`6 z*bDeUl=dKT3^Df7KDPa+mqb~rvlQoncZOU$}DZ8Ok22S#L}k-Lgh@AFBbsKP`^zqyxn^wMCwr zM{C0uwsQw6J5?#t936m_vJ*xL@b+t8-y7f3_`j4!th>~iL4)BK^#!6&38(WM5P$Nr zxutbyEBxqSS@p-OSmvtxZ>Sh^^vYQT2C@xS4dX!O@h(4}fqF2|_>f-L0f26TFSmk- zo`}L!wmAC8vY611y%?9jvNb2A+oCuVuDt8t)01M3U82vUn=>NnTiS{*>*Tg;L*?+| zsOF_jyGWmSRfDIxhDxF}#LdeO?D*#==_q$JM^<6Q-^buaI?h%0-0t&}K{@_0(T@pbh{4 zIl1E%GC3&Xl$-Bx*hiY*OfyCc1q9XjQ6BFDl8n` zdDR)7c9J!0M1|yxAXtzLc1+x4<&uHQQJNJxQ;P-$bwH0tszrpJzrYwCi2J#}F!!y( zQW*f4*xpJpw|wG0LmRSWvm0kn<+*505J3oe6YJ0B$ph$m=US%qDu=>!2YnsIv8oCL=v7y>La49!K(AsDP4rcxT>Rp9F>@1>-yBj zGag6ce7F{zSqk#SC{PH8+7Np#X)=GTxRKO?m^@gTZ~8caba=kfnEgv44(Zdz8QtqHbycOa-W1MEufGgYh8X6*i_HSniky%x7Je383KdCl z6dfRn-hfN;!UX6@KsGN7=iO5nA*Bc{bCVqxuhO0{U(ghq?S8J}mZ8}{`V#^6NQP#! z2*4gj$?ZE5e7(LF#R)CZN;WITJG@{StbGWdaS6 z0ss?@S-WA22Rv@&P{%jd5w>$hY(fecQ59Bc>c&roGxK zke_JvK2B*Zf>r^Ua!yTlZk&0EVRD#iijis#kU<3Nw^m%}B9ABqWjafLuGF5eXD@Vg z_f>b1SHF-xgU2jhd!y2AD)?0RK4wuXrI1U>=uW-j?=CTviRoBNroCQ0;V8YDWkrBE zgPzE%QP@hw#?QS{=y<){oeTV54wdxu@!RRmCk8*u|Nd=ub!cG6j{Z^-qYKoHY`=(ZQlZUEcb z^3*L3uJW3vAh2soJU2ob8jBm0#JZu=J$7~{LcXW>dg0MzRDF8FktCJq=i6Z-rZe8duCJI^o(sbnrlT-g94jGrv)6irqJu#rX2SmcyQ_m>cA#1yu5Ds4Cky^t&Iu(8=tDvt zn(6_3HYJ7Dz0M8gpIUnKpF$mcVir5=l%#;-X%38p#QFbL)0LGlXKu3>V+We!gw^54 z2l=X&Tk5bX^(k7Xp&`L98xZ9yW8YUOi3WX~Y}vI4lH>kIw54n7MLc7;w{8`aE>)-h Oi@744C`fUnO}qdD+D=&j diff --git a/man/agg_sim_data.Rd b/man/agg_sim_dat.Rd similarity index 72% rename from man/agg_sim_data.Rd rename to man/agg_sim_dat.Rd index 06e549c..f221ecb 100644 --- a/man/agg_sim_data.Rd +++ b/man/agg_sim_dat.Rd @@ -1,15 +1,14 @@ % Generated by roxygen2: do not edit by hand -% Please edit documentation in R/agg_sim_data.R +% Please edit documentation in R/agg_sim_dat.R \docType{data} -\name{agg_sim_data} -\alias{agg_sim_data} +\name{agg_sim_dat} \alias{agg_sim_dat} \title{Aggregate simulated annuity data} \format{ A data frame containing summarized experience study results grouped by policy year, income guarantee presence, tax-qualified status, and product. -An object of class \code{tbl_df} (inherits from \code{tbl}, \code{data.frame}) with 180 rows and 12 columns. +An object of class \code{tbl_df} (inherits from \code{tbl}, \code{data.frame}) with 180 rows and 16 columns. } \usage{ agg_sim_dat @@ -23,17 +22,25 @@ specific product. \details{ \describe{ \item{pol_yr}{Policy year} -\item{inc_guar}{Indicates whether the policy was issued with an income guarantee} -\item{qual}{Indicates whether the policy was purchased with tax-qualified funds} +\item{inc_guar}{Indicates whether the policy was issued with an income +guarantee} +\item{qual}{Indicates whether the policy was purchased with tax-qualified +funds} \item{product}{Product: a, b, or c} \item{exposure_n}{Sum of policy year exposures by count} \item{claims_n}{Sum of claim counts} -\item{wd}{Sum of partial withdrawal transactions} \item{av}{Sum of account value} \item{exposure_amt}{Sum of policy year exposures weighted by account value} \item{claims_amt}{Sum of claims weighted by account value} \item{av_sq}{Sum of squared account values} -\item{n}{Sum of records} +\item{n}{Number of exposure records} +\item{wd}{Sum of partial withdrawal transactions} +\item{wd_n}{Count of partial withdrawal transactions} +\item{wd_flag}{Count of exposure records with partial withdrawal +transactions} +\item{wd_sq}{Sum of squared partial withdrawal transactions} +\item{av_w_wd}{Sum of account value for exposure records with partial +withdrawal transactions} } } \seealso{ From ec712191c29a38fd67b332939e7fd3441acb3b7a Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 09:51:20 -0500 Subject: [PATCH 14/18] added as_trx_df helper functions --- NAMESPACE | 2 + R/trx_df_helpers.R | 153 +++++++++++++++++++++++++++++++++++++++++++++ man/as_trx_df.Rd | 133 +++++++++++++++++++++++++++++++++++++++ 3 files changed, 288 insertions(+) create mode 100644 R/trx_df_helpers.R create mode 100644 man/as_trx_df.Rd diff --git a/NAMESPACE b/NAMESPACE index 38c945a..31c3118 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -69,6 +69,7 @@ export(anti_join) export(arrange) export(as_exp_df) export(as_exposed_df) +export(as_trx_df) export(autoplot) export(autotable) export(bake) @@ -92,6 +93,7 @@ export(inner_join) export(is_exp_df) export(is_exposed_df) export(is_split_exposed_df) +export(is_trx_df) export(left_join) export(mutate) export(plot_actual_to_expected) diff --git a/R/trx_df_helpers.R b/R/trx_df_helpers.R new file mode 100644 index 0000000..624cf95 --- /dev/null +++ b/R/trx_df_helpers.R @@ -0,0 +1,153 @@ +#' Transaction summary helper functions +#' +#' Convert aggregate transaction experience studies to the `trx_df` class. +#' +#' `is_trx_df()` will return `TRUE` if `x` is a `trx_df` object. +#' +#' `as_trx_df()` will coerce a data frame to a `trx_df` object if that +#' data frame has the required columns for transaction studies listed below. +#' +#' `as_trx_df()` is most useful for working with aggregate summaries of +#' experience that were not created by actxps where individual policy +#' information is not available. After converting the data to the `trx_df` +#' class, [summary()] can be used to summarize data by any grouping variables, +#' and [autoplot()] and [autotable()] are available for reporting. +#' +#' At a minimum, the following columns are required: +#' +#' - Transaction amounts (`trx_amt`) +#' - Transaction counts (`trx_n`) +#' - The number of exposure records with transactions (`trx_flag`). This number +#' is not necessarily equal to transaction counts. If multiple transactions +#' are allowed per exposure period, `trx_flag` will be less than `trx_n`. +#' - Exposures (`exposure`) +#' +#' If transactions amounts should be expressed as a percentage of another +#' variable (i.e. to calculate utilization rates or actual-to-expected ratios), +#' additional columns are required: +#' +#' - A denominator "percent of" column. For example, the sum of account value. +#' - A denominator "percent of" column for exposure records with transactions. +#' For example, the sum of account value across all records with non-zero +#' transaction amounts. +#' +#' If confidence intervals are desired and "percent of" columns are passed, an +#' additional column for the sum of squared transaction amounts (`trx_amt_sq`) +#' is also required. +#' +#' The names in parentheses above are expected column names. If the data +#' frame passed to `as_trx_df()` uses different column names, these can be +#' specified using the `col_*` arguments. +#' +#' `start_date`, and `end_date` are optional arguments that are +#' only used for printing the resulting `trx_df` object. +#' +#' Unlike [trx_stats()], `as_trx_df` only permits a single transaction type and +#' a single `percent_of` column. +#' +#' @param x An object. For `as_trx_df()`, `x` must be a data frame. +#' @param col_trx_amt Optional. Name of the column in `x` containing transaction +#' amounts. +#' @param col_trx_n Optional. Name of the column in `x` containing transaction +#' counts. +#' @param col_trx_flag Optional. Name of the column in `x` containing the number +#' of exposure records with transactions. +#' @param col_exposure Optional. Name of the column in `x` containing exposures. +#' @param col_percent_of Optional. Name of the column in `x` containing a +#' numeric variable to use in "percent of" calculations. +#' @param col_percent_of_w_trx Optional. Name of the column in `x` containing a +#' numeric variable to use in "percent of" calculations with transactions. +#' @param col_trx_amt_sq Optional and only required when `col_percent_of` is +#' passed. Name of the column in `x` containing squared transaction amounts. +#' @param conf_int If `TRUE`, future calls to [summary()] will include +#' confidence intervals around the observed utilization rates and any +#' `percent_of` output columns. +#' @param conf_level Confidence level for confidence intervals +#' @inheritParams expose +#' +#' @return For `is_trx_df()`, a length-1 logical vector. For `as_trx_df()`, +#' a `trx_df` object. +#' +#' @seealso [trx_stats()] for information on how `trx_df` objects are typically +#' created from individual exposure records. +#' +#' @examples +#' # convert pre-aggregated experience into a trx_df object +#' dat <- as_trx_df(agg_sim_dat, +#' col_exposure = "n", +#' col_trx_amt = "wd", +#' col_trx_n = "wd_n", +#' col_trx_flag = "wd_flag", +#' col_percent_of = "av", +#' col_percent_of_w_trx = "av_w_wd", +#' col_trx_amt_sq = "wd_sq", +#' start_date = 2005, end_date = 2019, +#' conf_int = TRUE) +#' dat +#' is_trx_df(dat) +#' +#' # summary by policy year +#' summary(dat, pol_yr) +#' +#' @export +as_trx_df <- function(x, + col_trx_amt = "trx_amt", + col_trx_n = "trx_n", + col_trx_flag = "trx_flag", + col_exposure = "exposure", + col_percent_of = NULL, + col_percent_of_w_trx = NULL, + col_trx_amt_sq = "trx_amt_sq", + start_date = NULL, + end_date = NULL, + conf_int = FALSE, + conf_level = 0.95) { + + if (is_trx_df(x)) return(x) + + if (!is.data.frame(x)) { + rlang::abort("`x` must be a data frame.") + } + + # column name alignment + req_names <- c("exposure", "trx_amt", "trx_n", "trx_flag") + if (!missing(col_exposure)) x <- x |> rename(exposure = {{col_exposure}}) + if (!missing(col_trx_amt)) x <- x |> rename(trx_amt = {{col_trx_amt}}) + if (!missing(col_trx_n)) x <- x |> rename(trx_n = {{col_trx_n}}) + if (!missing(col_trx_flag)) x <- x |> rename(trx_flag = {{col_trx_flag}}) + + if (conf_int && !missing(col_percent_of)) { + req_names <- c(req_names, "trx_amt_sq") + if (!missing(col_trx_amt_sq)) x <- x |> + rename(trx_amt_sq = {{col_trx_amt_sq}}) + } + + if (!missing(col_percent_of)) { + req_names <- c(req_names, col_percent_of, paste0(col_percent_of, "_w_trx")) + } + if (!missing(col_percent_of_w_trx)) { + if (missing(col_percent_of)) { + rlang::abort("`col_percent_of_w_trx` was supplied without passing anything to `col_percent_of`") + } + pct_w_trx_name <- rlang::parse_expr(paste0(col_percent_of, "_w_trx")) + x <- x |> rename(!!pct_w_trx_name := {{col_percent_of_w_trx}}) + } + + # check required columns + verify_col_names(names(x), req_names) + + new_trx_df(x |> mutate(trx_type = col_trx_amt), + .groups = list(), + trx_types = col_trx_amt, + start_date = start_date, + percent_of = col_percent_of, + end_date = end_date, + conf_level = conf_level, + conf_int = conf_int) +} + +#' @export +#' @rdname as_trx_df +is_trx_df <- function(x) { + inherits(x, "trx_df") +} diff --git a/man/as_trx_df.Rd b/man/as_trx_df.Rd new file mode 100644 index 0000000..e87aac1 --- /dev/null +++ b/man/as_trx_df.Rd @@ -0,0 +1,133 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/trx_df_helpers.R +\name{as_trx_df} +\alias{as_trx_df} +\alias{is_trx_df} +\title{Transaction summary helper functions} +\usage{ +as_trx_df( + x, + col_trx_amt = "trx_amt", + col_trx_n = "trx_n", + col_trx_flag = "trx_flag", + col_exposure = "exposure", + col_percent_of = NULL, + col_percent_of_w_trx = NULL, + col_trx_amt_sq = "trx_amt_sq", + start_date = NULL, + end_date = NULL, + conf_int = FALSE, + conf_level = 0.95 +) + +is_trx_df(x) +} +\arguments{ +\item{x}{An object. For \code{as_trx_df()}, \code{x} must be a data frame.} + +\item{col_trx_amt}{Optional. Name of the column in \code{x} containing transaction +amounts.} + +\item{col_trx_n}{Optional. Name of the column in \code{x} containing transaction +counts.} + +\item{col_trx_flag}{Optional. Name of the column in \code{x} containing the number +of exposure records with transactions.} + +\item{col_exposure}{Optional. Name of the column in \code{x} containing exposures.} + +\item{col_percent_of}{Optional. Name of the column in \code{x} containing a +numeric variable to use in "percent of" calculations.} + +\item{col_percent_of_w_trx}{Optional. Name of the column in \code{x} containing a +numeric variable to use in "percent of" calculations with transactions.} + +\item{col_trx_amt_sq}{Optional and only required when \code{col_percent_of} is +passed. Name of the column in \code{x} containing squared transaction amounts.} + +\item{start_date}{Experience study start date. Default value = 1900-01-01.} + +\item{end_date}{Experience study end date} + +\item{conf_int}{If \code{TRUE}, future calls to \code{\link[=summary]{summary()}} will include +confidence intervals around the observed utilization rates and any +\code{percent_of} output columns.} + +\item{conf_level}{Confidence level for confidence intervals} +} +\value{ +For \code{is_trx_df()}, a length-1 logical vector. For \code{as_trx_df()}, +a \code{trx_df} object. +} +\description{ +Convert aggregate transaction experience studies to the \code{trx_df} class. +} +\details{ +\code{is_trx_df()} will return \code{TRUE} if \code{x} is a \code{trx_df} object. + +\code{as_trx_df()} will coerce a data frame to a \code{trx_df} object if that +data frame has the required columns for transaction studies listed below. + +\code{as_trx_df()} is most useful for working with aggregate summaries of +experience that were not created by actxps where individual policy +information is not available. After converting the data to the \code{trx_df} +class, \code{\link[=summary]{summary()}} can be used to summarize data by any grouping variables, +and \code{\link[=autoplot]{autoplot()}} and \code{\link[=autotable]{autotable()}} are available for reporting. + +At a minimum, the following columns are required: +\itemize{ +\item Transaction amounts (\code{trx_amt}) +\item Transaction counts (\code{trx_n}) +\item The number of exposure records with transactions (\code{trx_flag}). This number +is not necessarily equal to transaction counts. If multiple transactions +are allowed per exposure period, \code{trx_flag} will be less than \code{trx_n}. +\item Exposures (\code{exposure}) +} + +If transactions amounts should be expressed as a percentage of another +variable (i.e. to calculate utilization rates or actual-to-expected ratios), +additional columns are required: +\itemize{ +\item A denominator "percent of" column. For example, the sum of account value. +\item A denominator "percent of" column for exposure records with transactions. +For example, the sum of account value across all records with non-zero +transaction amounts. +} + +If confidence intervals are desired and "percent of" columns are passed, an +additional column for the sum of squared transaction amounts (\code{trx_amt_sq}) +is also required. + +The names in parentheses above are expected column names. If the data +frame passed to \code{as_trx_df()} uses different column names, these can be +specified using the \verb{col_*} arguments. + +\code{start_date}, and \code{end_date} are optional arguments that are +only used for printing the resulting \code{trx_df} object. + +Unlike \code{\link[=trx_stats]{trx_stats()}}, \code{as_trx_df} only permits a single transaction type and +a single \code{percent_of} column. +} +\examples{ +# convert pre-aggregated experience into a trx_df object +dat <- as_trx_df(agg_sim_dat, + col_exposure = "n", + col_trx_amt = "wd", + col_trx_n = "wd_n", + col_trx_flag = "wd_flag", + col_percent_of = "av", + col_percent_of_w_trx = "av_w_wd", + col_trx_amt_sq = "wd_sq", + start_date = 2005, end_date = 2019, + conf_int = TRUE) +dat +is_trx_df(dat) + +# summary by policy year +summary(dat, pol_yr) + +} +\seealso{ +\code{\link[=trx_stats]{trx_stats()}} for information on how \code{trx_df} objects are typically +created from individual exposure records. +} From 91f7c5895b8fcc00eadc2ef6becf605f27c5cf54 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 09:55:02 -0500 Subject: [PATCH 15/18] doc update --- R/trx_df_helpers.R | 8 ++++---- man/as_trx_df.Rd | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/R/trx_df_helpers.R b/R/trx_df_helpers.R index 624cf95..726d58c 100644 --- a/R/trx_df_helpers.R +++ b/R/trx_df_helpers.R @@ -22,13 +22,13 @@ #' are allowed per exposure period, `trx_flag` will be less than `trx_n`. #' - Exposures (`exposure`) #' -#' If transactions amounts should be expressed as a percentage of another +#' If transaction amounts should be expressed as a percentage of another #' variable (i.e. to calculate utilization rates or actual-to-expected ratios), #' additional columns are required: #' -#' - A denominator "percent of" column. For example, the sum of account value. +#' - A denominator "percent of" column. For example, the sum of account values. #' - A denominator "percent of" column for exposure records with transactions. -#' For example, the sum of account value across all records with non-zero +#' For example, the sum of account values across all records with non-zero #' transaction amounts. #' #' If confidence intervals are desired and "percent of" columns are passed, an @@ -42,7 +42,7 @@ #' `start_date`, and `end_date` are optional arguments that are #' only used for printing the resulting `trx_df` object. #' -#' Unlike [trx_stats()], `as_trx_df` only permits a single transaction type and +#' Unlike [trx_stats()], `as_trx_df()` only permits a single transaction type and #' a single `percent_of` column. #' #' @param x An object. For `as_trx_df()`, `x` must be a data frame. diff --git a/man/as_trx_df.Rd b/man/as_trx_df.Rd index e87aac1..60cbf2b 100644 --- a/man/as_trx_df.Rd +++ b/man/as_trx_df.Rd @@ -84,13 +84,13 @@ are allowed per exposure period, \code{trx_flag} will be less than \code{trx_n}. \item Exposures (\code{exposure}) } -If transactions amounts should be expressed as a percentage of another +If transaction amounts should be expressed as a percentage of another variable (i.e. to calculate utilization rates or actual-to-expected ratios), additional columns are required: \itemize{ -\item A denominator "percent of" column. For example, the sum of account value. +\item A denominator "percent of" column. For example, the sum of account values. \item A denominator "percent of" column for exposure records with transactions. -For example, the sum of account value across all records with non-zero +For example, the sum of account values across all records with non-zero transaction amounts. } @@ -105,7 +105,7 @@ specified using the \verb{col_*} arguments. \code{start_date}, and \code{end_date} are optional arguments that are only used for printing the resulting \code{trx_df} object. -Unlike \code{\link[=trx_stats]{trx_stats()}}, \code{as_trx_df} only permits a single transaction type and +Unlike \code{\link[=trx_stats]{trx_stats()}}, \code{as_trx_df()} only permits a single transaction type and a single \code{percent_of} column. } \examples{ From f24a367d2699378efef912410d5cf7d28cd2ff04 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 09:58:57 -0500 Subject: [PATCH 16/18] updated start_date default --- R/exp_df_helpers.R | 2 +- R/trx_df_helpers.R | 2 +- man/as_exp_df.Rd | 2 +- man/as_trx_df.Rd | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/R/exp_df_helpers.R b/R/exp_df_helpers.R index 3c99800..535bb38 100644 --- a/R/exp_df_helpers.R +++ b/R/exp_df_helpers.R @@ -109,7 +109,7 @@ as_exp_df <- function(x, expected = NULL, wt = NULL, col_claims, col_exposure, col_n_claims, col_weight_sq, col_weight_n, target_status = NULL, - start_date = NULL, end_date = NULL, + start_date = as.Date("1900-01-01"), end_date = NULL, credibility = FALSE, conf_level = 0.95, cred_r = 0.05, conf_int = FALSE) { diff --git a/R/trx_df_helpers.R b/R/trx_df_helpers.R index 726d58c..69417bc 100644 --- a/R/trx_df_helpers.R +++ b/R/trx_df_helpers.R @@ -98,7 +98,7 @@ as_trx_df <- function(x, col_percent_of = NULL, col_percent_of_w_trx = NULL, col_trx_amt_sq = "trx_amt_sq", - start_date = NULL, + start_date = as.Date("1900-01-01"), end_date = NULL, conf_int = FALSE, conf_level = 0.95) { diff --git a/man/as_exp_df.Rd b/man/as_exp_df.Rd index 130711b..06a6e5d 100644 --- a/man/as_exp_df.Rd +++ b/man/as_exp_df.Rd @@ -15,7 +15,7 @@ as_exp_df( col_weight_sq, col_weight_n, target_status = NULL, - start_date = NULL, + start_date = as.Date("1900-01-01"), end_date = NULL, credibility = FALSE, conf_level = 0.95, diff --git a/man/as_trx_df.Rd b/man/as_trx_df.Rd index 60cbf2b..c7128be 100644 --- a/man/as_trx_df.Rd +++ b/man/as_trx_df.Rd @@ -14,7 +14,7 @@ as_trx_df( col_percent_of = NULL, col_percent_of_w_trx = NULL, col_trx_amt_sq = "trx_amt_sq", - start_date = NULL, + start_date = as.Date("1900-01-01"), end_date = NULL, conf_int = FALSE, conf_level = 0.95 From 8269e6041781fc7123477c74b1fd2be4fa424e65 Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 10:21:31 -0500 Subject: [PATCH 17/18] tests for as_exp_df, is_exp_df --- tests/testthat/test-exp_df_helpers.R | 3 +- tests/testthat/test-trx_df_helpers.R | 46 ++++++++++++++++++++++++++++ 2 files changed, 47 insertions(+), 2 deletions(-) create mode 100644 tests/testthat/test-trx_df_helpers.R diff --git a/tests/testthat/test-exp_df_helpers.R b/tests/testthat/test-exp_df_helpers.R index 74236f4..2d980da 100644 --- a/tests/testthat/test-exp_df_helpers.R +++ b/tests/testthat/test-exp_df_helpers.R @@ -17,7 +17,7 @@ test_that("as_exp_df works", { res5 <- res4 |> rename(clms = claims) - expect_error(as_exp_df(data.frame(a = 1:3), Sys.Date()), + expect_error(as_exp_df(data.frame(a = 1:3)), regexp = "The following columns are missing") expect_true(is_exp_df(as_exp_df(res))) @@ -48,7 +48,6 @@ res_wt3 <- as_exp_df(res_wt2, wt = "premium", expected = "ex", test_that("as_exp_df with weights works", { - res_wt4 <- res_wt2 |> rename(expo = exposure) res_wt5 <- res_wt4 |> diff --git a/tests/testthat/test-trx_df_helpers.R b/tests/testthat/test-trx_df_helpers.R new file mode 100644 index 0000000..6c9fd2d --- /dev/null +++ b/tests/testthat/test-trx_df_helpers.R @@ -0,0 +1,46 @@ +res <- expose(census_dat, "2019-12-31", target_status = "Surrender") |> + add_transactions(withdrawals) |> + left_join(account_vals, by = c("pol_num", "pol_date_yr")) |> + group_by(pol_yr, inc_guar) |> + trx_stats(percent_of = "av_anniv", trx_types = "Base", conf_int = TRUE) + +test_that("is_trx_df works", { + expect_true(is_trx_df(res)) + expect_false(is_trx_df(mtcars)) +}) + +res2 <- as.data.frame(res) +res3 <- as_trx_df(res2, col_percent_of = "av_anniv", conf_int = TRUE) + +test_that("as_trx_df works", { + + res4 <- res2 |> + rename(expo = exposure) + res5 <- res4 |> + rename(tamt = trx_amt, + tn = trx_n) + + expect_error(as_trx_df(data.frame(a = 1:3)), + regexp = "The following columns are missing") + + expect_true(is_trx_df(as_trx_df(res))) + + expect_false(is_trx_df(res2)) + + expect_true(is_trx_df(res3)) + + expect_error(as_trx_df(res4), regexp = "The following columns are missing") + expect_no_error(as_trx_df(res4, col_exposure = "expo")) + expect_no_error(as_trx_df(res5, col_exposure = "expo", col_trx_amt = "tamt", + col_trx_n = "tn")) + + expect_error(as_trx_df(1), regexp = "`x` must be a data frame.") + +}) + + +test_that("as_trx_df summary matches an object created by trx_stats", { + x <- summary(res, inc_guar) |> select(-inc_guar, -trx_type) + y <- summary(res3, inc_guar) |> select(-inc_guar, -trx_type) + expect_true(dplyr::near(x - y, 0) |> all()) +}) From 50938818b0f2d88336e0ebc523a9c2814126156a Mon Sep 17 00:00:00 2001 From: Matt Heaphy Date: Sat, 25 Nov 2023 15:00:14 -0500 Subject: [PATCH 18/18] news and misc vignette update --- NEWS.md | 7 +++++++ vignettes/misc.Rmd | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+) diff --git a/NEWS.md b/NEWS.md index 3d1d0d0..3b7c0d7 100644 --- a/NEWS.md +++ b/NEWS.md @@ -18,6 +18,13 @@ when `exp_stats()` is passed a weighting variable. - Added a `summary()` method for `exposed_df` objects that calls `exp_stats()`. - The assumed default status in `expose()` functions was changed from the first observed status to the most common status. +- The functions `as_exp_df()` and `as_trx_df()` were added to convert +pre-aggregated experience studies to the `exp_df` and `trx_df` formats, +respectively. +- `agg_sim_dat` - a new simulated data set of pre-aggregated experience was +added for testing `as_exp_df()` and `as_trx_df()`. +- `is_exp_df()` and `as_trx_df()` were added to test for the `exp_df` and +`trx_df` classes. # actxps 1.3.0 diff --git a/vignettes/misc.Rmd b/vignettes/misc.Rmd index 24c43b3..5c3e53a 100644 --- a/vignettes/misc.Rmd +++ b/vignettes/misc.Rmd @@ -24,6 +24,42 @@ library(actxps) library(lubridate) ``` +## Working with aggregate experience data + +Seriatim-level policy experience data is often not available for analysis. This is almost always the case with industry studies that contain experience data submitted by multiple parties. In these cases, experience is grouped by a several common policy attributes and aggregated accordingly. + +The typical workflow in actxps of `expose() |> exp_stats()` for termination studies or `expose() |> add_transactions() |> trx_stats()` for transaction studies doesn't apply if the starting data is aggregated. That is because another party has already gone through the steps of creating exposure records and performing an initial level of aggregation. + +Actxps provides two functions designed to work with aggregate experience data. + +- For termination studies, `as_exp_df()` converts a data frame of aggregate experience into an `exp_df` object, which is the class returned by `exp_stats()` += For transaction studies, `as_trx_df()` converts a data frame of aggregate experience into a `trx_df` object, which is the class returned by `trx_stats()` + +Both object classes have a `summary()` method which summarizes experience across any grouping variables passed to the function. The output of `summary()` will always be another `exp_df` (or `trx_df`) object, and will look just like the results of `exp_stats()` (or `trx_stats()`). For downstream reporting, summary results can be passed to the visualization functions `autoplot()` and `autotable()`. + +The `agg_sim_dat` data set contains aggregate experience on a theoretical block of deferred annuity contracts. Below, `as_exp_df()` is used to convert the data to an `exp_df`, and `summary()` is called using multiple grouping variables. + +```{r agg-exp-1} +agg_sim_exp_df <- agg_sim_dat |> + as_exp_df(col_exposure = "exposure_n", col_claims = "claims_n", + conf_int = TRUE, + start_date = 2005, end_date = 2019, target_status = "Surrender") +``` + +Results summarized by policy year + +```{r agg-exp-2} +summary(agg_sim_exp_df, pol_yr) +``` + +Results summarized by income guarantee presence and product + +```{r agg-exp-3} +summary(agg_sim_exp_df, inc_guar, product) +``` + +`as_exp_df()` and `as_trx_df()` contain several arguments for optional calculations like confidence intervals, expected values, weighting variables, and more. These arguments mirror the functionality in `exp_stats()` and `trx_stats()`. Both functions also contain multiple arguments for specifying column names associated with required values like exposures and claims. + ## Policy duration functions The `pol_()` family of functions calculates policy years, months, quarters, weeks, or any other arbitrary duration. Each function accepts a vector of dates and a vector of issue dates.