Add vignette on benchmarking model options #695

jamesmbaazam · 2024-06-13T19:07:51Z

Description

This PR closes #629.

Initial submission checklist

My PR is based on a package issue and I have explicitly linked it.
I have tested my changes locally (using devtools::test() and devtools::check()).
I have added or updated unit tests where necessary.
I have updated the documentation if required and rebuilt docs if yes (using devtools::document()).
I have followed the established coding standards (and checked using lintr::lint_package()).
I have added a news item linked to this PR.

After the initial Pull Request

I have reviewed Checks for this PR and addressed any issues as far as I am able.

jamesmbaazam · 2024-06-17T14:42:32Z

As an update, the {rstan} models are working fine but the {cmdstanr} models are giving various errors that I will report on.

Error when using the laplace() method from {cmdstanr}:

library(EpiNow2)
library(cmdstanr)
# Set the number of cores to use
options(mc.cores = 4)

# Generation time
generation_time <- Gamma(
  shape = Normal(1.3, 0.3),
  rate = Normal(0.37, 0.09),
  max = 14
)

# Incubation period
incubation_period <- LogNormal(
  meanlog = Normal(1.6, 0.05),
  sdlog = Normal(0.5, 0.05),
  max = 14
)

# Reporting delay
reporting_delay <- LogNormal(
  meanlog = 0.5,
  sdlog = 0.5,
  max = 10
)

# Combine the incubation period and reporting delay into one delay
delay <- incubation_period + reporting_delay

# Observation model options
obs <- obs_opts(
  scale = list(mean = 0.1, sd = 0.025),
  return_likelihood = TRUE
)

# Run model
epinow(
  data = example_confirmed,
  generation_time = generation_time_opts(generation_time),
  delays = delay_opts(delay),
  obs = obs,
  horizon = 0,
  rt = NULL,
  stan = stan_opts(
    method = "laplace",
    backend = "cmdstanr"
  )
)

Error in fit_model_approximate(args, id = id) : 
  Approximate inference failed due to: Error: 'jacobian' argument to optimize and laplace must match!
laplace was called with jacobian=TRUE
optimize was run with jacobian=TRUE

Not an informative error message from {cmdstanr}, I should say: it reports a mismatch even though both calls show jacobian=TRUE.

After trying the example above with different combinations of mode (set to NULL or left unspecified) and jacobian, and also the example from the laplace() documentation, there seems to be an erroneous interaction between the two arguments. The option to set mode = NULL may not be implemented (I have yet to confirm this).

Moreover, according to the documentation of laplace(), setting mode = NULL and jacobian = TRUE/FALSE (through stan_opts() in EpiNow2) should work, but I get the same error. It seems we may have to run optimize() first and pass its output to laplace().

Am I missing anything? Thoughts? @sbfnk @seabbs

sbfnk · 2024-07-12T09:40:13Z

I can't reproduce this - do you need to update cmdstanr or cmdstan (using cmdstanr::install_cmdstan()) possibly?

My versions are:

devtools::package_info("cmdstanr") |>
  dplyr::filter(package == "cmdstanr")
#>  package  * version    date (UTC) lib source
#>  cmdstanr   0.8.1.9000 2024-06-23 [1] Github (stan-dev/cmdstanr@9878dda)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
cmdstanr::cmdstan_version()
#> [1] "2.35.0"

Created on 2024-07-12 with reprex v2.1.0

jamesmbaazam · 2024-07-12T15:15:43Z

Thanks. It's fixed now after updating.

jamesmbaazam · 2024-07-15T12:30:38Z

I have now pushed the vignette with the run of all the models using MCMC (a472439).

To do:

Add a model using "vb" from cmdstanr.
- UPDATE: This now runs and I've added a custom function to extract the samples for downstream analyses.
Run the {cmdstanr} approximate methods ("pathfinder" and "laplace").
- Blockers: the two methods are not even initialising. (cc: @sbfnk @seabbs)
- UPDATE: I have now found the root of one of the issues and opened a separate issue for it: "pathfinder fails for large case reports" (#728).

Settings and errors so far

Pathfinder errors

`num_paths = 1`

stan = stan_opts(
  method = "pathfinder",
  backend = "cmdstanr",
  num_paths = 1
)
# Error in fit_model_approximate(args, id = id) : 
#   Approximate inference failed due to: Optimization terminated with error: Line search failed to achieve a sufficient decrease, no more progress can be made Optimization failed to start, pathfinder cannot be run.

`num_paths > 1`


stan = stan_opts(
  method = "pathfinder",
  backend = "cmdstanr",
  num_paths = 2
)

# Error in fit_model_approximate(args, id = id) : 
# Approximate inference failed due to: No pathfinders ran successfully

Increase `trials` from default of 10 to 50 (with multipathfinder default)

stan = stan_opts(
  method = "pathfinder",
  backend = "cmdstanr",
  trials = 50
)

# Error in fit_model_approximate(args, id = id) : 
# Approximate inference failed due to: No pathfinders ran successfully

jamesmbaazam · 2024-07-24T17:11:14Z

I noticed that pathfinder struggles to fit large case reports and have opened a separate issue for it: #728.

@sbfnk The only thing in the way of this vignette getting merged is the struggle to get pathfinder and laplace working. If we don't want to delay this any further, we could finalise this version, mention that a future enhancement will include those two methods, and get this merged, then add them as an enhancement when I've figured them out. All the models including vb from {rstan} and {cmdstanr} are working currently. I will precompile them all and push.

jamesmbaazam · 2024-07-25T13:04:27Z

From meeting with Seb today:

Finalise current vignette by running all models, leaving out Pathfinder and Laplace for future enhancement.
Add more metrics alongside run time:
- total CRPS for estimates on complete data, partial data, and forecasting.
- CRPS at the last time point (real-time performance)
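
To make the two CRPS metrics above concrete, here is a minimal base-R sketch of a sample-based CRPS (continuous ranked probability score), summed over all time points and read off at the last one. It is only an illustration: `crps_sample()`, `draws`, and `observed` are hypothetical names with simulated toy data, not objects from the vignette, and the vignette may well use a dedicated scoring package instead.

```r
# Sample-based CRPS for a single observation (hypothetical helper):
# CRPS ~ mean(|x_i - y|) - 0.5 * mean(|x_i - x_j|) over all pairs of draws.
crps_sample <- function(samples, observed) {
  term_obs <- mean(abs(samples - observed))
  term_spread <- mean(abs(outer(samples, samples, "-")))
  term_obs - 0.5 * term_spread
}

# Toy data (assumption): rows are posterior draws, columns are time points.
set.seed(1)
n_draws <- 200
observed <- c(100, 120, 150, 170, 160)
draws <- sapply(observed, function(mu) rpois(n_draws, lambda = mu))

# CRPS at each time point
crps_by_time <- vapply(
  seq_along(observed),
  function(t) crps_sample(draws[, t], observed[t]),
  numeric(1)
)

# Total CRPS over the period, and CRPS at the last time point
# (the real-time performance measure mentioned above).
total_crps <- sum(crps_by_time)
last_crps <- crps_by_time[length(crps_by_time)]
```

Lower values are better for both quantities; the same two summaries could be computed separately for the complete-data, partial-data, and forecasting periods.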

jamesmbaazam · 2024-12-19T16:40:33Z

I am away until Jan 3 and will reboot this vignette on my return.

jamesmbaazam · 2024-12-19T17:56:42Z

vignettes/benchmarks.Rmd.orig

+  lapply(
+    results_by_snapshot,
+    function(model_results) {
+      lapply(model_results, extract_results, variable)
+    }
+  )


Simplify these nested lapply() calls with purrr::map_depth().
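
A minimal sketch of what that simplification could look like, assuming `results_by_snapshot` is a list of snapshots that each hold a list of model fits. The `extract_results()` body and the toy data below are hypothetical stand-ins, not the vignette's actual objects.

```r
library(purrr)

# Hypothetical stand-in for the vignette's helper: pull one element from a fit.
extract_results <- function(fit, variable) fit[[variable]]

# Toy two-level structure (assumption): snapshots -> models -> results.
results_by_snapshot <- list(
  snapshot_1 = list(
    model_a = list(R = 1.1, infections = 10),
    model_b = list(R = 0.9, infections = 8)
  ),
  snapshot_2 = list(
    model_a = list(R = 1.3, infections = 14),
    model_b = list(R = 1.0, infections = 11)
  )
)
variable <- "R"

# Nested lapply(), as in the diff above
nested <- lapply(
  results_by_snapshot,
  function(model_results) {
    lapply(model_results, extract_results, variable)
  }
)

# Same traversal with map_depth(): apply the function at depth 2
flat <- map_depth(results_by_snapshot, 2, extract_results, variable)
```

`map_depth()` makes the nesting level an explicit argument, so the intent (apply at the model level inside each snapshot) is visible without reading the inner anonymous function.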

jamesmbaazam mentioned this pull request Jul 12, 2024

Issue 44: Approximate inference vignette epinowcast/epidist#69
