Bayesian data analysis usually incurs long runtimes and cumbersome
custom code, and the process of prototyping and deploying custom
Stan models can become a daunting software
engineering challenge. To ease this burden, the stantargets R package
creates Stan pipelines that are concise, efficient, scalable, and
tailored to the needs of Bayesian statisticians. Leveraging targets,
stantargets pipelines automatically parallelize the computation and
skip expensive steps when the results are already up to date. Minimal
custom user-side code is required, and there is no need to manually
configure branching, so stantargets is easier to use than targets and
CmdStanR directly. stantargets can access all of cmdstanr’s major
algorithms (MCMC, variational Bayes, and optimization), and it supports
both single-fit workflows and multi-rep simulation studies.
- The prerequisites of the targets R package.
- Basic familiarity with targets: watch minutes 7 through 40 of this video, then read this chapter of the user manual.
- Familiarity with Bayesian statistics and Stan. Prior knowledge of cmdstanr helps.
Read the stantargets introduction and simulation vignettes, and use
https://docs.ropensci.org/stantargets/ as a reference while
constructing your own workflows. Visit
https://github.com/wlandau/stantargets-example-validation for an
example project based on the simulation vignette. The example has an
RStudio Cloud workspace that lets you run the project in a web browser.
Description | Link |
---|---|
Validating a minimal Stan model | https://github.com/wlandau/targets-stan |
Using Target Markdown and stantargets to validate a Bayesian longitudinal model for clinical trial data analysis | https://github.com/wlandau/rmedicine2021-pipeline |
Install the GitHub development version to access the latest features and patches.
remotes::install_github("ropensci/stantargets")
The CmdStan command line interface is also required.
cmdstanr::install_cmdstan()
If you have problems installing CmdStan, please consult the
installation guide of cmdstanr and the installation guide of CmdStan.
Alternatively, the Stan discourse is a friendly place to ask Stan
experts for help.
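After installation, a quick check along the following lines can confirm that cmdstanr is able to find a working toolchain and a CmdStan installation. This is a minimal sketch, assuming cmdstanr is already installed; the variable names `path` and `version` are illustrative only.

```r
# Confirm that a C++ toolchain suitable for building CmdStan is available.
cmdstanr::check_cmdstan_toolchain()

# Report where CmdStan lives and which version cmdstanr will use.
# Both calls raise an error if no CmdStan installation has been found.
path <- cmdstanr::cmdstan_path()
version <- cmdstanr::cmdstan_version()
print(path)
print(version)
```

If either call fails, the installation guides mentioned above are the next place to look.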
First, write a _targets.R file that loads your packages, defines a
function to generate Stan data, and lists a pipeline of targets. The
target list can call target factories like tar_stan_mcmc() as well as
ordinary targets with tar_target().
The following minimal example is simple enough to contain entirely
within the _targets.R file, but for larger projects, you may wish to
store functions in separate files as in the targets-stan example.
# _targets.R
library(targets)
library(stantargets)

# Simulate a dataset for the model in x.stan.
# n is the number of observations.
generate_data <- function(n = 10) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

# The pipeline: one MCMC target factory for the model in x.stan.
list(
  tar_stan_mcmc(
    name = example,
    stan_files = "x.stan",
    data = generate_data()
  )
)
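The pipeline refers to a model file x.stan whose contents are not shown here. As a minimal sketch, the following R snippet writes one plausible version of that file: a simple linear model whose data block matches the list returned by generate_data(). The model itself is an assumption made for illustration; consult the introductory vignette for the actual example.

```r
# Hypothetical contents for x.stan (an assumption, not taken from this README).
# The data block declares every element of the list that generate_data() returns.
stan_code <- c(
  "data {",
  "  int<lower=1> n;",
  "  vector[n] x;",
  "  vector[n] y;",
  "  real true_beta;",
  "}",
  "parameters {",
  "  real beta;",
  "}",
  "model {",
  "  y ~ normal(x * beta, 1);",
  "  beta ~ normal(0, 1);",
  "}"
)

# Write the model next to _targets.R so that stan_files = "x.stan" resolves.
writeLines(stan_code, "x.stan")
```

The true_beta entry is declared as data but not used in the model block; it is carried along only so that the data list and the data block agree.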
Run tar_visnetwork() to check _targets.R for correctness, then call
tar_make() to run the pipeline. Access the results using tar_read(),
e.g. tar_read(example_summary_x). Visit the introductory vignette to
read more about this example.
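Put together, an interactive session for the steps above might look like the following. This is a sketch that assumes the working directory contains the _targets.R file and the x.stan model, and that targets, stantargets, visNetwork, and CmdStan are installed; the variable name `summary_x` is illustrative only.

```r
library(targets)

# Draw the dependency graph to confirm the targets and their connections.
tar_visnetwork()

# Run the pipeline. Targets that are already up to date are skipped.
tar_make()

# Read the posterior summary produced for the model in x.stan.
summary_x <- tar_read(example_summary_x)
print(summary_x)
```

Calling tar_make() a second time without changing the code, data, or model should skip the expensive steps, as described in the introduction.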
stantargets supports specialized target factories that create
ensembles of target objects for cmdstanr workflows. These target
factories abstract away the details of targets and cmdstanr and make
both packages easier to use. For details, please read the introductory
vignette.
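As an illustration of a factory for multi-rep simulation studies, the sketch below swaps tar_stan_mcmc() for tar_stan_mcmc_rep_summary(), which fits the model to many independently simulated datasets and returns summaries. It assumes the same x.stan model and the generate_data() function from the earlier example; the values of `batches` and `reps` are arbitrary choices for illustration, and the simulation vignette remains the authoritative guide.

```r
# _targets.R (simulation-study variant, a sketch rather than the official example)
library(targets)
library(stantargets)

# Simulate one dataset. The default n = 10 is an illustrative assumption.
generate_data <- function(n = 10) {
  true_beta <- stats::rnorm(n = 1, mean = 0, sd = 1)
  x <- seq(from = -1, to = 1, length.out = n)
  y <- stats::rnorm(n, x * true_beta, 1)
  list(n = n, x = x, y = y, true_beta = true_beta)
}

list(
  tar_stan_mcmc_rep_summary(
    name = example,
    stan_files = "x.stan",
    data = generate_data(), # Re-evaluated for each simulated dataset.
    batches = 5,            # Number of branch targets.
    reps = 4                # Simulated datasets per branch target.
  )
)
```

With these settings the pipeline would fit the model to 5 x 4 = 20 simulated datasets, and no manual branching code is needed.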
Please first read the help guide to learn how best to ask for help.
If you have trouble using stantargets, you can ask for help in the
GitHub discussions forum. Because the purpose of stantargets is to
combine targets and cmdstanr, your issue may have something to do with
one of the latter two packages, a dependency of targets, or Stan
itself. When you troubleshoot, peel back as many layers as possible to
isolate the problem. For example, if the issue comes from cmdstanr,
create a reproducible example that directly invokes cmdstanr without
invoking stantargets. The GitHub discussion and issue forums of those
packages, as well as the Stan discourse, are great resources.
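A reproducible example that bypasses stantargets and targets could look roughly like the following. It is a sketch that assumes a model file x.stan expecting data fields n, x, y, and true_beta; the fixed true_beta value, the seed, and the number of chains are arbitrary choices for illustration.

```r
library(cmdstanr)

# Hypothetical stand-in data with the same fields as generate_data().
n <- 10
x <- seq(from = -1, to = 1, length.out = n)
true_beta <- 0.5
data <- list(
  n = n,
  x = x,
  y = stats::rnorm(n, x * true_beta, 1),
  true_beta = true_beta
)

# Compile and fit the model with cmdstanr alone.
model <- cmdstan_model("x.stan")
fit <- model$sample(data = data, seed = 1, chains = 2)

# Inspect the posterior summary.
fit_summary <- fit$summary()
print(fit_summary)
```

If the problem appears here as well, it most likely belongs to cmdstanr or Stan rather than stantargets, and the example can be posted to the corresponding forum as is.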
Development is a community effort, and we welcome discussion and contribution. Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
citation("stantargets")
#>
#> To cite stantargets in publications use:
#>
#> Landau, W. M., (2021). The stantargets R package: a workflow
#> framework for efficient reproducible Stan-powered Bayesian data
#> analysis pipelines. Journal of Open Source Software, 6(60), 3193,
#> https://doi.org/10.21105/joss.03193
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> title = {The stantargets {R} package: a workflow framework for efficient reproducible {S}tan-powered {B}ayesian data analysis pipelines},
#> author = {William Michael Landau},
#> journal = {Journal of Open Source Software},
#> year = {2021},
#> volume = {6},
#> number = {60},
#> pages = {3193},
#> url = {https://doi.org/10.21105/joss.03193},
#> }