Refactor for predict_delay_dpar and add_mean_sd functions #471

Open
athowes opened this issue Nov 22, 2024 · 2 comments

Comments

@athowes
Collaborator

athowes commented Nov 22, 2024

In issues #428 and #241 we discuss changes to our predict_delay_dpar and add_mean_sd functions. In this issue we summarise our plan for dealing with both of these somewhat duplicate issues.

Current status:

  • add_mean_sd uses S3 to dispatch on the family in order to get analytic solutions for the mean and variance
  • tidybayes can do expected mean .epred (from samples) but doesn't currently have expected variance
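
As a rough illustration of the S3 approach described above, the sketch below dispatches on a class tag that records the family and applies the closed-form mean and standard deviation. The names `sketch_mean_sd`, `lognormal_draws` and `gamma_draws` are hypothetical and are not the package's actual implementation. The gamma method assumes a mean/shape parameterisation (`mu`, `shape`).

```r
# Hypothetical generic: dispatch on a class tag that records the family.
sketch_mean_sd <- function(draws, ...) UseMethod("sketch_mean_sd")

# Lognormal with meanlog `mu` and sdlog `sigma`.
sketch_mean_sd.lognormal_draws <- function(draws, ...) {
  draws$mean <- exp(draws$mu + draws$sigma^2 / 2)
  draws$sd <- draws$mean * sqrt(exp(draws$sigma^2) - 1)
  draws
}

# Gamma, assuming a mean (`mu`) and shape (`shape`) parameterisation.
sketch_mean_sd.gamma_draws <- function(draws, ...) {
  draws$mean <- draws$mu
  draws$sd <- draws$mu / sqrt(draws$shape)
  draws
}

# Families without a closed form fall through to an informative error.
sketch_mean_sd.default <- function(draws, ...) {
  stop("No analytic solution available for this family")
}

# Toy posterior draws of the distributional parameters (made-up values).
ln <- data.frame(mu = c(1.0, 1.2), sigma = c(0.5, 0.4))
class(ln) <- c("lognormal_draws", class(ln))
ln_summary <- sketch_mean_sd(ln)
```

The default method is where this approach runs out: any family without a closed-form solution needs either a new method or a sample-based fallback.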

Plan:

  • We alter our predict_delay_parameters function to be in line with tidybayes style, renaming it add_delay_parameter_draws, and ensure that its outputs follow the tidybayes format
  • We move add_mean_sd to a generic, sample-based approach that works for all families. Ideally, we retain a way to use the analytic solutions when we have them
  • We extend add_mean_sd to become add_summaries, which also computes quantiles etc. (depending on options)
  • To avoid users generating many samples under identical conditions (e.g. an intercept-only model with the default newdata, which is the data used to fit the model), we provide some functionality for "unique strata in model". This was previously implemented in Issue 213: Create functionality for passing in newdata #231. It remains to work out how to guide users to do this without imposing too much structure on top of the packages we rely on (tidybayes). This could be done here or in a new issue.
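
A minimal sketch of what the sample-based summaries in the plan above could look like, using base R only. The function name `sketch_sample_summaries`, its arguments, and the helper `rlnorm_row` are all hypothetical. For each row of parameter draws it simulates delays and summarises them empirically, so it would work for any family that has a random number generator.

```r
# Hypothetical helper: summarise simulated delays for each row of parameter
# draws. `rng(n, row)` must return `n` simulated delays for one row.
sketch_sample_summaries <- function(draws, rng, n = 1000,
                                    probs = c(0.05, 0.5, 0.95)) {
  summaries <- lapply(seq_len(nrow(draws)), function(i) {
    x <- rng(n, draws[i, , drop = FALSE])
    q <- stats::quantile(x, probs = probs, names = FALSE)
    names(q) <- paste0("q", probs * 100)
    c(mean = mean(x), sd = stats::sd(x), q)
  })
  # One row of summaries per row of draws, bound alongside the parameters.
  cbind(draws, do.call(rbind, summaries))
}

set.seed(1)
# Toy parameter draws (made-up values) and a lognormal simulator.
draws <- data.frame(mu = c(1.0, 1.2), sigma = c(0.5, 0.4))
rlnorm_row <- function(n, row) {
  stats::rlnorm(n, meanlog = row$mu, sdlog = row$sigma)
}
sample_summary <- sketch_sample_summaries(draws, rlnorm_row)
```

The trade-off against analytic solutions is Monte Carlo error and extra computation per draw, which is one reason to keep the closed forms available where they exist.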

Uncertainties:

  • What does it mean for something to be the delay parameters?
  • At the moment predict_delay_parameters computes the distributional parameters and also adds the summaries. Is this too many steps for one function? How hard is it for users to get the samples of the distributional parameters themselves?
@athowes
Collaborator Author

athowes commented Nov 22, 2024

Here is the function used in #231 which would do unique strata. I would like to extend this to be model agnostic, and suggest doing so by extracting the LHS parts of the formula from bterms.

#' Generate newdata to predict on all unique strata in the model
#'
#' @param fit A model fit with `epidist::epidist`
#' @family postprocess
#' @autoglobal
#' @export
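all_strata_newdata <- function(fit) {
  # Parse the model formula into its distributional parameter formulas
  bterms <- brms::brmsterms(fit$formula)
  # Collect every variable used in any distributional parameter formula
  vars <- lapply(bterms$dpars, function(x) all.vars(x$formula))
  vars <- unique(unlist(vars))
  # `[[` returns a plain vector for both data.frames and tibbles
  var_values <- lapply(vars, function(var) unique(fit$data[[var]]))
  names(var_values) <- vars
  # One row per combination of the unique values of each variable
  newdata <- expand.grid(var_values)
  # Placeholder values for the model-specific columns
  newdata$delay_central <- 0
  newdata$obs_t <- NA
  newdata$pwindow_upr <- NA
  newdata$swindow_upr <- NA
  return(newdata)
}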
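
To make the strata step above concrete without needing a fitted model, here is a small base R sketch on a made-up data frame. The data and the names `toy`, `all_combinations` and `observed_only` are illustrative assumptions. It also contrasts `expand.grid`, which builds every combination of values, with `unique` on the selected columns, which keeps only combinations that appear in the data.

```r
# Made-up data: the combination (sex = "M", age_group = "b") never occurs.
toy <- data.frame(
  sex = c("F", "M", "F"),
  age_group = c("a", "a", "b"),
  delay = c(2.1, 3.4, 1.7)
)
vars <- c("sex", "age_group")

# As in all_strata_newdata: every combination of the unique values.
var_values <- lapply(vars, function(var) unique(toy[[var]]))
names(var_values) <- vars
all_combinations <- expand.grid(var_values, stringsAsFactors = FALSE)

# Alternative: only the strata actually observed in the data.
observed_only <- unique(toy[vars])

nrow(all_combinations)  # 4
nrow(observed_only)     # 3
```

Which of the two is preferable probably depends on whether predictions for unobserved strata are wanted.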

@seabbs
Contributor

seabbs commented Nov 22, 2024

Still thinking about this but...

I think we want something like the first part in order to filter and aggregate data on the fly in the marginal model. I also think we may want a model-specific function to add the required model data (i.e. the second part
