Add vignette on benchmarking model options #695

Draft
wants to merge 66 commits into
base: main
from

Conversation

jamesmbaazam
Contributor

@jamesmbaazam jamesmbaazam commented Jun 13, 2024

Description

This PR closes #629.

Initial submission checklist

  • My PR is based on a package issue and I have explicitly linked it.
  • I have tested my changes locally (using devtools::test() and devtools::check()).
  • I have added or updated unit tests where necessary.
  • I have updated the documentation if required and rebuilt docs if yes (using devtools::document()).
  • I have followed the established coding standards (and checked using lintr::lint_package()).
  • I have added a news item linked to this PR.

After the initial Pull Request

  • I have reviewed Checks for this PR and addressed any issues as far as I am able.

@jamesmbaazam
Contributor Author

jamesmbaazam commented Jun 17, 2024

As an update, the {rstan} models are working fine but the {cmdstanr} models are giving various errors that I will report on.

Error when using the laplace() method from {cmdstanr}:

library(EpiNow2)
library(cmdstanr)
# Set the number of cores to use
options(mc.cores = 4)

# Generation time
generation_time <- Gamma(
  shape = Normal(1.3, 0.3),
  rate = Normal(0.37, 0.09),
  max = 14
)

# Incubation period
incubation_period <- LogNormal(
  meanlog = Normal(1.6, 0.05),
  sdlog = Normal(0.5, 0.05),
  max = 14
)

# Reporting delay
reporting_delay <- LogNormal(
  meanlog = 0.5,
  sdlog = 0.5,
  max = 10
)

# Combine the incubation period and reporting delay into one delay
delay <- incubation_period + reporting_delay

# Observation model options
obs <- obs_opts(
  scale = list(mean = 0.1, sd = 0.025),
  return_likelihood = TRUE
)

# Run model
epinow(
  data = example_confirmed,
  generation_time = generation_time_opts(generation_time),
  delays = delay_opts(delay),
  obs = obs,
  horizon = 0,
  rt = NULL,
  stan = stan_opts(
    method = "laplace",
    backend = "cmdstanr"
  )
)

Error in fit_model_approximate(args, id = id) : 
  Approximate inference failed due to: Error: 'jacobian' argument to optimize and laplace must match!
laplace was called with jacobian=TRUE
optimize was run with jacobian=TRUE

This error message from {cmdstanr} is not informative: it reports a mismatch even though both calls show jacobian=TRUE.

After experimenting with the example above and with the example in the laplace() documentation, using different combinations of mode (NULL or unspecified) and jacobian, there seems to be an erroneous interaction between the two arguments. The option to set mode = NULL may not be implemented (I have yet to confirm this).

Moreover, according to the laplace() documentation, setting mode = NULL with jacobian = TRUE or FALSE (passed through stan_opts() in EpiNow2) should work, but I get the same error. It seems we may have to run optimize() first and pass its output to laplace().
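For reference, a minimal sketch of that two-step workaround, calling {cmdstanr} directly rather than through EpiNow2. It assumes CmdStan is installed and uses the Bernoulli example model that ships with CmdStan as a stand-in. The object names (model, fit_mode, fit_laplace) are only illustrative, and this is not how EpiNow2 wires things up internally.

```r
library(cmdstanr)

# Assumption: CmdStan is installed; the Bernoulli example ships with it
stan_file <- file.path(
  cmdstan_path(), "examples", "bernoulli", "bernoulli.stan"
)
model <- cmdstan_model(stan_file)
data_list <- list(N = 10, y = c(0, 1, 0, 0, 0, 0, 0, 0, 0, 1))

# Step 1: find the mode, with the Jacobian adjustment switched on
fit_mode <- model$optimize(data = data_list, jacobian = TRUE)

# Step 2: pass the optimisation fit as `mode`, with the same `jacobian` value
fit_laplace <- model$laplace(
  data = data_list,
  mode = fit_mode,
  jacobian = TRUE,
  draws = 1000
)

fit_laplace$summary()
```

The point of the sketch is that jacobian is set explicitly and identically in both calls, so there is no ambiguity about which value each step used.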

Am I missing anything? Thoughts? @sbfnk @seabbs

@sbfnk
Contributor

sbfnk commented Jul 12, 2024

I can't reproduce this - do you need to update cmdstanr or cmdstan (using cmdstanr::install_cmdstan()) possibly?

My versions are:

devtools::package_info("cmdstanr") |>
  dplyr::filter(package == "cmdstanr")
#>  package  * version    date (UTC) lib source
#>  cmdstanr   0.8.1.9000 2024-06-23 [1] Github (stan-dev/cmdstanr@9878dda)
#> 
#>  [1] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
cmdstanr::cmdstan_version()
#> [1] "2.35.0"

Created on 2024-07-12 with reprex v2.1.0

@jamesmbaazam
Contributor Author

Thanks. It's fixed now after updating.

@jamesmbaazam
Contributor Author

jamesmbaazam commented Jul 15, 2024

I have now pushed the vignette with the run of all the models using MCMC (a472439).

To do:

  • Add a model using "vb" from cmdstanr.
    - UPDATE: This now runs and I've added a custom function to extract the samples for downstream analyses.
  • Run the {cmdstanr} approximate methods ("pathfinder" and "laplace").
    - Blockers: the two methods are not even initialising. (cc: @sbfnk @seabbs)
    - UPDATE: I have now gotten to the root of one of the issues and created an issue for it: "pathfinder fails for large case reports" (#728).

Settings and errors so far

Pathfinder errors

num_paths = 1

stan = stan_opts(
  method = "pathfinder",
  backend = "cmdstanr",
  num_paths = 1
)
# Error in fit_model_approximate(args, id = id) : 
#   Approximate inference failed due to: Optimization terminated with error: Line search failed to achieve a sufficient decrease, no more progress can be made Optimization failed to start, pathfinder cannot be run.

num_paths > 1


stan = stan_opts(
  method = "pathfinder",
  backend = "cmdstanr",
  num_paths = 2
)

# Error in fit_model_approximate(args, id = id) : 
# Approximate inference failed due to: No pathfinders ran successfully

Increase trials from the default of 10 to 50 (keeping the multi-path Pathfinder default)

stan = stan_opts(
  method = "pathfinder",
  backend = "cmdstanr",
  trials = 50
)

# Error in fit_model_approximate(args, id = id) : 
# Approximate inference failed due to: No pathfinders ran successfully

@jamesmbaazam
Contributor Author

I noticed that pathfinder struggles to fit large case reports and have created a separate issue for it (#728).

@sbfnk The only thing in the way of this vignette getting merged is the struggle to get pathfinder and laplace working. If we don't want to delay this any further, we could finalise this version, mention that a future enhancement will include those two methods, and get this merged, then add them as an enhancement when I've figured them out. All the models including vb from {rstan} and {cmdstanr} are working currently. I will precompile them all and push.

@jamesmbaazam
Contributor Author

jamesmbaazam commented Jul 25, 2024

From meeting with Seb today:

  • Finalise the current vignette by running all models, leaving out Pathfinder and Laplace as a future enhancement.
  • Add more metrics alongside run time:
    - Total CRPS for estimates on complete data, partial data, and forecasts.
    - CRPS at the last time point (real-time performance).
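To make the two CRPS metrics concrete, here is a minimal base R sketch using the standard sample-based estimator, CRPS ≈ mean|X − y| − 0.5 · mean|X − X′|. Everything in it is hypothetical: the function name crps_from_samples(), the simulated posterior samples, and the observed counts are illustrations, and the vignette may well use a dedicated scoring package instead.

```r
# Sample-based CRPS estimate for one observation:
# mean |X - y| minus half the mean absolute difference between sample pairs
crps_from_samples <- function(samples, observed) {
  stopifnot(is.numeric(samples), length(observed) == 1)
  accuracy <- mean(abs(samples - observed))
  spread <- 0.5 * mean(abs(outer(samples, samples, "-")))
  accuracy - spread
}

set.seed(1)
# Hypothetical posterior samples of reported cases (rows = dates, cols = draws)
posterior <- matrix(
  rpois(3 * 500, lambda = c(100, 120, 150)),
  nrow = 3
)
# Hypothetical observed counts on the same three dates
observed <- c(98, 125, 160)

# CRPS per date
crps_by_date <- vapply(
  seq_along(observed),
  function(i) crps_from_samples(posterior[i, ], observed[i]),
  numeric(1)
)

# Total CRPS over the period, and CRPS at the last time point
total_crps <- sum(crps_by_date)
last_crps <- crps_by_date[length(crps_by_date)]
```

Lower values are better for both summaries. The last-time-point score isolates real-time performance, where the data are most affected by truncation.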

@jamesmbaazam jamesmbaazam force-pushed the vig-speedup-options branch from a6f38ba to fef8991 Compare July 25, 2024 13:23
@jamesmbaazam
Contributor Author

I am away until Jan 3 and will reboot this vignette on my return.

Comment on lines +453 to +458
lapply(
  results_by_snapshot,
  function(model_results) {
    lapply(model_results, extract_results, variable)
  }
)
Contributor Author

Simplify these nested lapply() calls with purrr::map_depth().
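A rough sketch of what that simplification could look like. The toy results_by_snapshot list and the stand-in extract_results() below are hypothetical placeholders for the vignette's real objects; only purrr::map_depth() itself is the actual API.

```r
library(purrr)

# Hypothetical stand-ins for the vignette's objects
extract_results <- function(fit, variable) fit[[variable]]
variable <- "R"
results_by_snapshot <- list(
  snapshot_1 = list(default = list(R = 1.2), vb = list(R = 1.1)),
  snapshot_2 = list(default = list(R = 0.9), vb = list(R = 1.0))
)

# Current approach: nested lapply() calls
nested <- lapply(
  results_by_snapshot,
  function(model_results) {
    lapply(model_results, extract_results, variable)
  }
)

# Proposed approach: apply the function to every element at depth 2
flat <- map_depth(
  results_by_snapshot,
  2,
  function(fit) extract_results(fit, variable)
)

# Check that both approaches give the same structure and values
identical(nested, flat)
```

map_depth() keeps the snapshot and model names, so downstream code that indexes by name should not need to change.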

Development

Successfully merging this pull request may close these issues.

  • Change Getting started documentation to point users to faster models
  • Vignette on approximations and speedups
3 participants