Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add functionality for generating individual or strata level delay distribution samples #188

Closed
athowes opened this issue Jul 22, 2024 · 6 comments · Fixed by #210
Closed
Labels
enhancement New feature or request

Comments

@athowes
Copy link
Collaborator

athowes commented Jul 22, 2024

In #170 we added functionality to create mean and sd columns to data.tables which contain parameters from either the lognormal or gamma distributional parameters (with the possibility to implement any other required distributions).

Now we want to add functionality to allow users to easily generate the delay distributions they most care about. A first example of this would be the delay distribution estimate for each individual, or strata, in the model. This object should then naturally be able to be passed into add_natural_scale_ mean_sd (and indeed we will likely call it within the added function). I think plotting functionality can also be built around the output.

One question to answer here is whether this should be

  1. Individual-level: i.e. as many outputs as there are data points
  2. Strata-level: i.e. as many outputs as there are strata, e.g. 1 if intercept only, or say the number of areas if it's a spatial model

In the current unit test:

test_that("add_natural_scale_mean_sd.lognormal_samples works with posterior samples from the latent lognormal model", { # nolint: line_length_linter.
  skip_on_cran()
  set.seed(1)
  prep_obs <- as_latent_individual(sim_obs)
  fit <- epidist(data = prep_obs, seed = 1)
  ld_mu <- brms::posterior_linpred(fit, transform = TRUE, dpar = "mu")
  ld_sigma <- brms::posterior_linpred(fit, transform = TRUE, dpar = "sigma") 

if found a way to do the individual one. Perhaps we should provide both, but broadly I think the strata is preferred. Also, we should really be enabling people to predit whatever they want if we can. Need to think about how to do this / play with available brms functionality.

Notes

# Below contains implementation notes:

# Need to modify this (i.e. "latent_lognormal")
fit$family

# (Which includes these names for the link functions needed)
fit$family$link
fit$family$link_sigma

# To work with these existing brms prediction functions
brms::posterior_epred(fit)
brms::posterior_predict(fit)

pp <- brms::prepare_predictions(fit)
pp$dpars$mu$fe$b
pp$dpars$sigma$fe$b

fit$family$family <- "lognormal"
fit$family$name <- "lognormal"

str(fit$family)
brms::posterior_predict(fit)
  • Is brms::prepare_predictions(fit) used inside posterior_linpred?
  • Check whether order of classes matter
@athowes athowes added the enhancement New feature or request label Jul 22, 2024
@athowes
Copy link
Collaborator Author

athowes commented Jul 24, 2024

  • Through investigation of brms prediction functionality / and how much we can plug into it
    • newdata option
    • Create every possible strata then plug that into newdata
  • Implementation of indiviudal level prediction (as in test)
  • Implementation of strate level prediction (not yet done anywhere) -- could be in the same function

@seabbs
Copy link
Contributor

seabbs commented Jul 25, 2024

yes I agree we should use the newdata approach. We can also provide some functionality to either help users extract unique data points from there data (the simple version of all strata) and potentially to grid expand this to all combinations (i.e observed/unobserved) but I think the first pass should just be creating a nice interface to the predictive abilities of brms that works with new data and with our custom probability family distributions.

@athowes
Copy link
Collaborator Author

athowes commented Jul 29, 2024

  • Possibility of adding a summary function here? I suggest this is out of scope and a new issue using the output of this function
  • We do have existing summary functionality

we are using this inside a function called lognormal and so we could just dispatch on the specific version i.e add_mean_sd.lognormal_samples vs the approach taken here. As this extraction function isn't really log normal specific apart for this I guess it doesn't matter but its not ideal

Originally posted by @seabbs in #170 (comment)

Given that we are now moving towards being more family agnostic we probably want to refactor this extract_lognormal_draws function to be more generic e.g. extract_draws (and it works for any family).

@seabbs
Copy link
Contributor

seabbs commented Jul 30, 2024

yes agree

@athowes
Copy link
Collaborator Author

athowes commented Jul 31, 2024

Use functionality from https://paul-buerkner.github.io/brms/articles/brms_customfamilies.html#post-processing-custom-family-models as much as possible.

@athowes
Copy link
Collaborator Author

athowes commented Jul 31, 2024

The code prep <- brms::prepare_predictions(fit) works.

To fit into the way brms handles things what we would need to do is write custom functions like:

log_lik_latent_lognormal <- function(i, prep) {
}
posterior_predict_latent_lognormal <- function(i, prep, ...) {
}
posterior_epred_latent_lognormal <- function(prep) {

}

With these three things defined then all the standard brms post-processing should work.

Downside: have to write these functions by family class.

Suggestion: write this strata prediction function still (in a way which is 100% class agnostic) and then also write these above functions for the most common families (could just start with lognormal).

I'd say that these functions are more likely to be used by expert users of the package, and then the delay_samples function would be more likely to be used for "off the shelf" analysis by most users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants