Issue #394: Add information about predictions to FAQ (#395)

* Add package links * Add documentation on how to do prediction
epinowcast · Oct 21, 2024 · 4d84ee0 · 4d84ee0
1 parent 8b53356
commit 4d84ee0
Showing 1 changed file with 36 additions and 6 deletions.
diff --git a/vignettes/faq.Rmd b/vignettes/faq.Rmd
@@ -29,6 +29,7 @@ library(dplyr)
 library(ggplot2)
 library(scales)
 library(tidyr)
+library(tidybayes)
 
 set.seed(1)
 
@@ -57,7 +58,7 @@ fit <- epidist(
 ## I would like to work with the samples output
 
 The output of a call to `epidist` is compatible with typical Stan workflows.
-We recommend use of the `posterior` package for working with samples from MCMC or other sampling algorithms.
+We recommend use of the [`posterior`](https://mc-stan.org/posterior/) package for working with samples from MCMC or other sampling algorithms.
 For example, the function `posterior::as_draws_df()` may be used to obtain a dataframe of MCMC draws for specified parameters.
 
 ```{r message = FALSE}
@@ -69,7 +70,7 @@ head(draws)
 ## How can I assess if sampling has converged?
 
 The output of a call to `epidist` is compatible with typical Stan workflows.
-We recommend use of the `bayesplot` package for sampling diagnostic plots.
+We recommend use of the [`bayesplot`](http://mc-stan.org/bayesplot/) package for sampling diagnostic plots.
 For example, the function `bayesplot::mcmc_trace()` can be used to produce traceplots for specified parameters.
 
 ```{r message = FALSE}
@@ -85,7 +86,7 @@ epidist_diagnostics(fit)
 
 ## I'd like to run a simulation study
 
-We recommend use of the `purrr` package for running many `epidist` models, for example as a part of a simulation study.
+We recommend use of the [`purrr`](https://purrr.tidyverse.org/) package for running many `epidist` models, for example as a part of a simulation study.
 We particularly highlight two functions which might be useful:
 
 1. `purrr::map()` (and other similar functions) for iterating over a list of inputs.
@@ -96,7 +97,7 @@ For an example use of these functions, have a look at the [`epidist-paper`](http
 
 ## How did you choose the default priors for `epidist`?
 
-`brms` provides default priors for all parameters.
+[`brms`](http://paulbuerkner.com/brms/) provides default priors for all parameters.
 However, some of those priors do not make sense in the context of our application.
 Instead, we used [prior predictive checking](https://mc-stan.org/docs/stan-users-guide/posterior-predictive-checks.html) to set `epidist`-specific default priors which produce epidemiological delay distribution mean and standard deviation parameters in a reasonable range.
 
@@ -160,7 +161,7 @@ quantile(pred$sd, c(0.01, 0.1, 0.25, 0.5, 0.75, 0.9, 0.99))
 
 # How can I assess how sensitive the fitted posterior distribution is to the prior distribution used?
 
-We recommend use of the [`priorsense`](https://github.com/n-kall/priorsense) R package [@kallioinen2024detecting] to check how sensitive the posterior distribution is to perturbations of the prior distribution and likelihood using power-scaling analysis:
+We recommend use of the [`priorsense`](https://github.com/n-kall/priorsense) package [@kallioinen2024detecting] to check how sensitive the posterior distribution is to perturbations of the prior distribution and likelihood using power-scaling analysis:
 
 ```{r}
 library(priorsense)
@@ -170,7 +171,7 @@ powerscale_plot_dens(fit, variable = c("Intercept", "Intercept_sigma")) +
 
 # What do the parameters in my model output correspond to?
 
-The `epidist` package uses `brms` to fit models.
+The `epidist` package uses [`brms`](http://paulbuerkner.com/brms/) to fit models.
 This means that the model output will include `brms`-style names for parameters.
 Here, we provide a table giving the correspondence between the distributional parameter names used in `brms` and those used in standard R functions for some common likelihood families.
 
@@ -184,4 +185,33 @@ Here, we provide a table giving the correspondence between the distributional pa
 Note that all families in `brms` are parameterised with some measure of centrality `mu` as their first parameter.
 This parameter does not necessarily correspond to the mean: hence the provision of a function `add_mean_sd()` within `epidist` to add columns containing the natural scale mean and standard deviation to a `data.frame` of draws.
 
+# How can I generate predictions with my fitted `epidist` model?
+
+It is possible to generate predictions manually by working with [samples from the model output](https://epidist.epinowcast.org/articles/faq.html#i-would-like-to-work-with-the-samples-output).
+However this is tricky to do, and so where possible we recommend using the [`tidybayes`](http://mjskay.github.io/tidybayes/) package.
+In particular, following functions may be useful:
+
+1. `tidybayes::add_epred_draws()` for predictions of the expected value of a delay.
+2. `tidybayes::add_linpred_draws()` for predictions of the delay distributional parameter linear predictors.
+3. `tidybayes::add_predicted_draws()` for predictions of the observed delay.
+
+To see these functions demonstrated in a vignette, see ["Advanced features with Ebola data"](https://epidist.epinowcast.org/articles/ebola.html).
+As a short example, to generate 4000 predictions (equal to the number of draws) of the delay that would be observed with a double censored observation process (in which the primary and secondary censoring windows are both one) then:
+
+```{r}
+draws_pmf <- data.frame(relative_obs_time = 1000, pwindow = 1, swindow = 1) |>
+  add_predicted_draws(fit, ndraws = 4000)
+
+ggplot(draws_pmf, aes(x = .prediction)) +
+  geom_bar(aes(y = after_stat(count / sum(count)))) +
+  labs(x = "Delay", y = "PMF") +
+  scale_x_continuous(limits = c(0, 30)) +
+  theme_minimal()
+```
+
+Importantly, this functionality is only available for `epidist` models using custom `brms` families that have `posterior_predict` and `posterior_epred` methods implemented.
+For example, for the `latent_individual` model, currently methods are implemented for the [lognormal](https://github.com/epinowcast/epidist/blob/main/R/latent_lognormal.R) and [gamma](https://github.com/epinowcast/epidist/blob/main/R/latent_gamma.R) families.
+If you are using another family, consider [submitting a pull request](https://github.com/epinowcast/epidist/pulls) to implement these methods!
+In doing so, you may find it useful to use the [`primarycensored`](https://primarycensored.epinowcast.org/) package.
+
 ## Bibliography