diff --git a/.github/workflows/pkgdown.yaml b/.github/workflows/pkgdown.yaml index 5853ace6..087f0b05 100644 --- a/.github/workflows/pkgdown.yaml +++ b/.github/workflows/pkgdown.yaml @@ -43,4 +43,4 @@ jobs: with: clean: false branch: gh-pages - folder: ./ + folder: docs diff --git a/.nojekyll b/.nojekyll deleted file mode 100644 index 8b137891..00000000 --- a/.nojekyll +++ /dev/null @@ -1 +0,0 @@ - diff --git a/404.html b/404.html deleted file mode 100644 index 15b2d67b..00000000 --- a/404.html +++ /dev/null @@ -1,90 +0,0 @@ - - -
- - - - -This outlines how to propose a change to serofoi.
-If you want to make a change, it’s a good idea to first file an issue and make sure someone from the team agrees that it’s needed. If you’ve found a bug, please file an issue that illustrates the bug with a minimal reprex (this will also help you write a unit test, if needed). See bug report template. If you have a feature request see feature request.
-Fork the package and clone onto your computer. If you haven’t done this before, we recommend using usethis::create_from_github("epiverse-trace/serofoi", fork = TRUE)
.
Install all development dependencies with devtools::install_dev_deps()
, and then make sure the package passes R CMD check by running devtools::check()
. If R CMD check doesn’t pass cleanly, it’s a good idea to ask for help before continuing.
Create a Git branch for your pull request (PR). We recommend using usethis::pr_init("brief-description-of-change")
.
Make your changes, commit to git, and then create a PR by running usethis::pr_push()
, and following the prompts in your browser. The title of your PR should briefly describe the change. The body of your PR should contain Fixes #issue-number
.
For user-facing changes, add a bullet to the top of NEWS.md
(i.e. just below the first header). Follow the style described in https://style.tidyverse.org/news.html.
New code should follow the tidyverse style guide. You can use the styler package to apply these styles, but please don’t restyle code that has nothing to do with your PR.
We use roxygen2, with Markdown syntax, for documentation.
We use testthat for unit tests. Contributions with test cases included are easier to accept.
Please note that the serofoi project is released with a Contributor Code of Conduct. By contributing to this project you agree to abide by its terms.
-MIT License - -Copyright (c) 2023 TRACE-LAC - -Permission is hereby granted, free of charge, to any person obtaining a copy -of this software and associated documentation files (the "Software"), to deal -in the Software without restriction, including without limitation the rights -to use, copy, modify, merge, publish, distribute, sublicense, and/or sell -copies of the Software, and to permit persons to whom the Software is -furnished to do so, subject to the following conditions: - -The above copyright notice and this permission notice shall be included in all -copies or substantial portions of the Software. - -THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR -IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, -FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE -AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER -LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, -OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE -SOFTWARE. -- -
foi_models.Rmd
The current version of serofoi supports -three different models for estimating the Force-of-Infection -(FoI), including constant and time-varying trajectories. To -fit the models to the seroprevalence data we use a suite of Bayesian -models that include prior and upper prior distributions.
-The force of infection, also known as the hazard rate or the -infection pressure, is a key concept in mathematical modelling of -infectious diseases. It represents the rate at which susceptible -individuals become infected, given their exposure to a pathogen. In -simple terms, the force of infection quantifies the risk of a -susceptible individual becoming infected over a period of time. It is -usually expressed as a rate per unit of time (e.g., per day or per -year).
-The FoI is one of the most important parameters in -epidemiology, but it is often incorrectly assumed to be constant over -time. Identifying whether the FoI follows a constant or a -time-varying trend can be important in the identification and -characterization of the spread of disease. In Table 1 there is a summary -of the models currently supported by serofoi.
Model Option | Description and usage
---|---
constant | Constant FoI
tv_normal | Time-varying normal FoI: slow change in FoI
tv_normal_log | Time-varying normal-log FoI: fast epidemic change in FoI
Table 1. Model options and descriptions.
-The endemic constant model is a simple mathematical model -used in epidemiology to describe the seroprevalence of an infectious -disease within a population, as a product of a long-term -transmission.
-For a constant FoI endemic model, the rate of infection acquisition
-\(\lambda\) is constant over time for
-each trajectory, and the seroprevalence \(P\) behaves as a cumulative process
-increasing monotonically with age. For the seroprevalence at age \(a\) and time \(t\), we have: \[
-P(a,t) = 1-\exp\left(-\lambda a\right)
-\] The number of positive cases follows a binomial distribution,
-where \(n\) is the number of trials
-(size of the age group) and \(P\) is
-the probability of successes (seroprevalence) for a certain age group:
-\[
-p(a,t) \sim binom(n(a,t), P(a,t))
-\] In serofoi, for the constant model,
-the FoI (\(\lambda\)) is
-modelled within a Bayesian framework using a uniform prior distribution
-\(\sim U(0,2)\). Future versions of the
-package may allow users to choose different default distributions. This model
-can be implemented for the previously prepared dataset
-data_test
by means of the run_seromodel
-function specifying foi_model = "constant"
.
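The constant model above can be illustrated with a minimal base R sketch. It does not use serofoi itself: the helper `seroprev_constant()`, the value `lambda = 0.2`, the age range and the group size `n = 5` are assumptions chosen only to mirror the description in this section.

```r
# Seroprevalence under a constant force of infection (catalytic model):
# P(a) = 1 - exp(-lambda * a). The helper name is hypothetical.
seroprev_constant <- function(age, lambda) {
  1 - exp(-lambda * age)
}

ages <- 2:47     # illustrative age range
lambda <- 0.2    # illustrative constant FoI
prev <- seroprev_constant(ages, lambda)

# Simulate the number of seropositive individuals per age group,
# assuming n = 5 individuals are tested in each group.
set.seed(1)
n_tested <- rep(5, length(ages))
n_positive <- rbinom(length(ages), size = n_tested, prob = prev)

head(data.frame(age = ages, prevalence = round(prev, 3), n_positive))
```

Because the FoI is constant, the expected seroprevalence depends only on age and increases monotonically towards 1.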
The object simdata_constant
contains a minimal simulated
-dataset that emulates a hypothetical endemic situation where the
-FoI is constant with value 0.2 and includes data for 250
-samples of individuals between 2 and 47 years old with a number of
-trials \(n=5\). The following code
-shows how to implement the constant model to this simulated
-serosurvey:
-data("simdata_constant")
-serodata_constant <- prepare_serodata(simdata_constant)
-model_1 <- run_seromodel(serodata = serodata_constant,
- foi_model = "constant",
- n_iters = 800)
-plot_seromodel(model_1, size_text = 6)
-Figure 1. Constant serofoi model plot. Simulated (red) vs modelled -(blue) FoI.
-In this case, 800 iterations are enough to ensure convergence. The
-plot_seromodel
method provides a visualisation of the
-results, including a summary where the expected log pointwise predictive
-density (elpd
) and its standard error (se
) are
-shown. We say that a model converges if all the R-hat estimates are
-below 1.1.
For the time-varying FoI models, the probability for a case -to be positive at age \(a\) at time \(t\) -also follows a binomial distribution, as described above. However, the -seroprevalence is obtained by accumulating the yearly-varying -values of the FoI over time: \[ -P(a,t) = 1 - \exp\left(-\sum_{i=t-a+1}^{t}\lambda_i\right) -\] The corresponding serosurvey completed at time \(t_{sur}\) is informative for the interval -\([t_{sur}-a_{max}, t_{sur}]\).
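The cumulative expression above can be sketched in base R as follows. This is an illustration only, not serofoi code: the helper `seroprev_time_varying()` and the stepwise FoI values are assumptions, and the last element of `foi` is taken to be the survey year.

```r
# Seroprevalence at survey time for a yearly-varying force of infection.
# foi[i] is the FoI in year i of the exposure window, oldest year first;
# the last element corresponds to the survey year. Helper name is hypothetical.
seroprev_time_varying <- function(age, foi) {
  n_years <- length(foi)
  stopifnot(age >= 1, age <= n_years)
  # An individual of age a was exposed during the last a years.
  exposure <- foi[(n_years - age + 1):n_years]
  1 - exp(-sum(exposure))
}

# Illustrative stepwise-decreasing FoI over 45 years (values are assumptions).
foi <- c(rep(0.2, 15), rep(0.1, 15), rep(0.05, 15))
ages <- 1:45
prev <- vapply(ages, seroprev_time_varying, numeric(1), foi = foi)

round(prev[c(1, 15, 30, 45)], 3)
```

With a constant `foi` vector this reduces to the constant-model expression, which is a quick way to check the indexing.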
-The time-varying slow normal model relies on the following
-prior distributions for the FoI to describe the spread of a
-given infectious disease within a population over time: \[
-\lambda(t)\sim normal(\lambda(t-1), \sigma) \\
-\lambda(t=1) \sim normal(0, 1)
-\] The object simdata_sw_dec
contains a minimal
-simulated dataset that emulates a situation where the FoI
-follows a stepwise decreasing tendency (FoI panel in Fig. 2).
-The simulated dataset contains information about 250 samples of
-individuals between 2 and 47 years old with a number of trials \(n=5\). The following code shows how to
-implement the slow time-varying normal model to this simulated
-serosurvey:
-data("simdata_sw_dec")
-serodata_sw_dec <- prepare_serodata(simdata_sw_dec)
-model_2 <- run_seromodel(serodata = serodata_sw_dec,
- foi_model = "tv_normal",
- n_iters = 1500)
-plot_seromodel(model_2, size_text = 6)
-Figure 2. Slow time-varying serofoi model plot. Simulated (red) vs -modelled (blue) FoI.
-The number of iterations required may depend on the number of years, -reflected by the difference between the year of the serosurvey and the -maximum age-class sampled.
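To make the dependence on the number of years concrete, the following base R sketch counts the yearly FoI values implied by a serosurvey. The helper `exposure_years()` and the values of `survey_year` and `age_max` are hypothetical, and the window follows the summation limits used for the time-varying seroprevalence.

```r
# Exposure window covered by a serosurvey: one FoI value per calendar year
# from the first year lived by the oldest age class up to the survey year.
# survey_year and age_max are hypothetical values for illustration.
exposure_years <- function(survey_year, age_max) {
  seq(survey_year - age_max + 1, survey_year)
}

years <- exposure_years(survey_year = 2050, age_max = 47)
length(years)  # number of yearly FoI values the model has to estimate
range(years)   # first and last year of the exposure window
```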
-The time-varying fast epidemic model relies on normal prior -distributions for the FoI in the logarithmic scale, i.e.: \[ -\lambda(t)\sim normal(\log(\lambda(t-1)), \sigma) \\ -\lambda(t=1) \sim normal(-6, 4) -\] This is done in order to capture fast changes in the -FoI trend. Importantly, the standard deviation parameter of -this normal distribution of the FoI \(\lambda(t)\) is set using an upper prior -that follows a Cauchy distribution.
-In order to test this model we use the minimal simulated dataset
-contained in the simdata_large_epi
object. This dataset
-emulates a hypothetical situation where a three-year epidemic occurs
-between 2032 and 2035. The simulated serosurvey tests 250 individuals
-from 0 to 50 years of age in the year 2050. The implementation of the
-fast epidemic model can be obtained running the following lines of
-code:
-data("simdata_large_epi")
-serodata_large_epi <- prepare_serodata(simdata_large_epi)
-model_3 <- run_seromodel(serodata = serodata_large_epi,
- foi_model = "tv_normal_log",
- n_iters = 1500)
-model_3_plot <- plot_seromodel(model_3, size_text = 6)
-plot(model_3_plot)
-Figure 3. Time-varying fast epidemic serofoi model plot. -Simulated (red) vs modelled (blue) FoI.
-In Fig 3 we can see that the fast epidemic serofoi model is
-able to identify the large epidemic simulated on the
-simdata_large_epi
dataset.
The statistical details of the three models are described in Table -2.
Model Option | Probability of positive case at age \(a\) | Prior distribution | Upper priors
---|---|---|---
constant | \(\sim binom(n(a,t), P(a,t))\) | \(\lambda\sim uniform(0,2)\) | -
tv_normal | \(\sim binom(n(a,t), P(a,t))\) | \(\lambda\sim normal(\lambda(t-1),\sigma)\\ \lambda(t=1)\sim normal(0,1)\) | \(\sigma\sim Cauchy(0,1)\)
tv_normal_log | \(\sim binom(n(a,t), P(a,t))\) | \(\lambda\sim normal(\log(\lambda(t-1)),\sigma)\\ \lambda(t=1)\sim normal(-6,4)\) | \(\sigma\sim Cauchy(0,1)\)
Table 2. Statistical characteristics of -serofoi’s currently supported models for the -FoI (\(\lambda\)). Here \(n\) is the size of an age group \(a\) at time-step \(t\) and \(P\) is its corresponding -seroprevalence.
-Above we showed that the fast epidemic model
-(tv_normal_log
) is able to identify the large epidemic
-outbreak described by the simdata_large_epi
dataset, which
-was simulated with a three-year epidemic peak in the FoI (red line
-in Fig 3).
Now, we would like to know whether this model actually fits this
-dataset better than the other available models in
-serofoi. For this, we also implement both the
-endemic model (constant
) and the slow time-varying normal
-model (tv_normal
):
Using the function cowplot::plot_grid
we can visualise
-the results of the three models simultaneously:
-cowplot::plot_grid(model_1_plot, model_2_plot, model_3_plot,
- nrow = 1, ncol = 3, labels = "AUTO")
-Figure 4. Model comparison between the three serofoi models for a -large-epidemic simulated dataset.
A common criterion to decide which model fits the data best is to
-choose the one with the largest elpd
. According to this
-criterion, in this case the best model is the fast epidemic model, which
-is the only one that manages to identify the large epidemic (see the
-second row of panel C in Figure 4).
NOTE: Running the serofoi models for the -first time on your local computer may take a few minutes for the rstan -code to compile locally. However, once the initial compilation is -complete, there is no further need for local compilation.
-serofoi.Rmd
When informing the public health response we want to know how many -individuals have been infected up to a certain point in time, which -relates to the level of immunity for a given pathogen in a population. -Based on this information, it is possible to estimate the speed at which -susceptible individuals have been infected over time in that population. -We call this parameter the Force-of-Infection (FoI). To -estimate this parameter, serofoi uses a suite of catalytic models.
-serofoi is a package designed to be used -for any infectious disease for which population -immunity can be measured using IgG antibodies, such as arboviruses (dengue, Zika, -chikungunya), Chagas, and alphaviruses, among many others. However, -serofoi with its current features may not be applicable to all diseases. Please -check the model assumptions below for each case.
-A serosurvey is an epidemiological study that involves the collection -and analysis of blood samples from a representative population to -determine the prevalence of antibodies against a specific pathogen. -These antibodies are typically produced by the immune system in response -to an infection, and their presence in the blood can serve as an -indicator of previous exposure to the pathogen. Serosurveys are valuable -tools for public health researchers and policymakers, as they provide -insights into the spread of infections within communities, the -proportion of individuals with immunity, and the effectiveness of -vaccination programs. Serosurveys can help guide targeted interventions, -inform disease control strategies, and evaluate the success of public -health measures.
-This package is designed to be used for serosurveys that follow these -inclusion criteria:
-The current version of serofoi includes the -following assumptions on the underlying biological process:
-To assess how well the mathematical representation of the different -FoI models describes the seroprevalence data, we implement a -fitting process that relies on a Bayesian framework. The priors and -upper priors for each model indicate the assumed trajectory of the -FoI (either constant or time-varying). serofoi provides a set -of Bayesian comparison methods for the user to choose between various -possible FoI models (trajectories) that best fit the seroprevalence -data. For details see FoI -Models.
-It is common to assume that the FoI is constant in time. -However, this is not always the case; serofoi -addresses this by allowing users to implement different -time-varying models for the FoI and offering tools to compare -them to the constant case in order to decide which model best fits a -serological survey.
-The integrated dataset serodata
provides a minimal
-example of the input of the package. The following code prepares this
-dataset for modelling and fits the basic
-constant model:
-library(serofoi)
-# Loading and preparing data for modelling
-data("serodata")
-serodata_test <- prepare_serodata(serodata)
-# Model implementation
-model_constant <- run_seromodel(serodata = serodata_test,
- foi_model = "constant")
-# Visualisation
-plot_seromodel(model_constant, size_text = 6)
-For details on the implementation of time-varying models and model -comparison see FoI -Models.
-use_cases.Rmd
The serofoi package is a tool for estimating the -Force-of-Infection (FoI) from population-based serosurvey data. -In this article, we present three real-life epidemiological scenarios -from Latin America to demonstrate the utility of -serofoi and time-varying models in describing -the trajectory of the FoI. For inclusion criteria about -serosurvey data and model assumptions, please check Get -Started.
-The scenarios were chosen to showcase the versatility of the serofoi -package in different epidemiological contexts:
-Chikungunya is a viral disease that was first described during an -outbreak in Tanzania in 1952. For several decades, it was primarily -found in Africa and Asia. However, in 2004, the first Chikungunya -outbreak outside of these regions occurred on the island of Réunion in -the Indian Ocean. Since then, Chikungunya has spread rapidly throughout -the world, including to the Americas, Europe, and the Pacific region. In -2013, the first cases of Chikungunya were reported in the Americas, and -the virus has since become endemic in several countries in Latin -America. The transmission of Chikungunya is primarily through the bites -of infected Aedes mosquitoes, with humans serving as the primary -amplifying host. The symptoms of Chikungunya include fever, joint pain, -headache, muscle pain, and rash, and the disease can range from mild to -severe. Although Chikungunya is not typically fatal, it can cause -significant morbidity and has the potential to cause large-scale -outbreaks, making it an important public health concern. The -methodological challenge is how best to estimate the disease burden -untangling the endemic and epidemic patterns in several locations around -the world. Here serofoi can assist with these -estimates.
-To gain insights into the transmission dynamics of Chikungunya in the -Americas, we used a dataset from a population-based study conducted in -Bahia, Brazil in October-December 2015. This study, conducted by Dias et -al. (2018), involved household interviews and age-disaggregated -serologic surveys to measure IgG antibodies against the Chikungunya -virus. The survey was conducted immediately after a large Chikungunya -epidemic in the area.
-serofoi was used to compare three potential scenarios of Chikungunya -transmission: constant endemic, epidemic slow, and epidemic fast. Figure -1 displays the comparison between the three serofoi models. The results -reveal strong statistical support for model 3 (fast epidemic model), -suggesting a sudden increase in the transmission of Chikungunya close to -the year of the serosurvey (2015). The exact year is difficult to -estimate due to the large level of aggregation of the data, which is -divided into 20-year age groups. Nevertheless, these results are -consistent with the empirical evidence from Dias et al. (2018), who used -both interviews and IgM testing to show a similar increase in -transmission during this period.
-
-# Load and prepare data
-data("chik2015")
-chik2015p <- prepare_serodata(chik2015)
-
-# Implementation of the models
-m1_chik <- run_seromodel(serodata = chik2015p,
- foi_model = "constant",
- n_iters = 1000,
- n_thin = 2)
-
-m2_chik <- run_seromodel(serodata = chik2015p,
- foi_model = "tv_normal",
- n_iters = 1500,
- n_thin = 2)
-
-m3_chik <- run_seromodel(serodata = chik2015p,
- foi_model = "tv_normal_log",
- n_iters = 1500,
- n_thin = 2)
-
-# Visualisation of the results
-p1_chik <- plot_seromodel(m1_chik, size_text = 6)
-p2_chik <- plot_seromodel(m2_chik, size_text = 6)
-p3_chik <- plot_seromodel(m3_chik, size_text = 6)
-
-cowplot::plot_grid(p1_chik, p2_chik, p3_chik, ncol=3)
-Figure 1. Serofoi models for FoI estimates of Chikungunya virus -transmission in an urban remote area of Brazil.
-Emerging alphaviruses, including Venezuelan Equine Encephalitis -Virus (VEEV), are RNA viruses that can cause disease in both humans -and animals. They are primarily transmitted by mosquitoes and have a -complex transmission cycle that involves human and non-human hosts, -including birds and mammals. Alphaviruses can cause significant -morbidity and mortality. Hidden epidemics and endemic transmission of -alphaviruses have been occurring in small and remote communities of -Eastern Panama for decades without major notice (carrera2020?). The main -concern with alphaviruses is their potential to spill over into human -populations and reach highly populated cities and urban areas where -humans are more susceptible. The Darien province in Eastern Panama, -bordering the north of Colombia to the south and the Pacific Ocean, is -home to several indigenous communities who live in traditional and -remote villages. Notably, the area is also a critical crossing point for -illegal immigration from Africa and South America to the north of the -Americas. Estimating the temporal trends of the incidence of -alphaviruses in this region is a methodological challenge but critical -to inform control strategies. serofoi can -assist with these estimations.
-From (carrera2020?), we use a
-dataset measuring IgG antibodies against VEEV in a rural
-village in Panamá in 2012. VEEV is primarily transmitted by
-mosquitoes and can cause disease in horses and humans. This dataset,
-veev2012
is included in
-serofoi.
serofoi was used to compare three potential
-scenarios of VEEV transmission: constant endemic,
-epidemic slow, and epidemic fast. The results showed a
-significant increase in the estimated Force-of-Infection (FoI)
-in the region, indicating a rise in VEEV transmission. The
-study found that there was much higher statistical support for a
-time-varying rather than a constant scenario based on higher elpd and
-lower se values of the two time-varying models compared to the constant
-one (Figure 2). The results also suggest slightly (yet relevant) better
-support for model 3 (tv_normal_log
), compared to model 2
-(tv_normal
), suggesting a recent increase in transmission
-in the study area.
-# Load and prepare data
-data("veev2012")
-veev2012p <- prepare_serodata(veev2012)
-
-# Implementation of the models
-m1_veev <- run_seromodel(serodata = veev2012p,
- foi_model = "constant",
- n_iters = 500,
- n_thin = 2)
-
-m2_veev <- run_seromodel(serodata = veev2012p,
- foi_model = "tv_normal",
- n_iters = 500,
- n_thin = 2)
-
-m3_veev <- run_seromodel(serodata = veev2012p,
- foi_model = "tv_normal_log",
- n_iters = 500,
- n_thin = 2)
-
-# Visualisation of the results
-p1_veev <- plot_seromodel(m1_veev, size_text = 6)
-p2_veev <- plot_seromodel(m2_veev, size_text = 6)
-p3_veev <- plot_seromodel(m3_veev, size_text = 6)
-
-cowplot::plot_grid(p1_veev, p2_veev, p3_veev, ncol=3)
-Figure 2. serofoi models for FoI -estimates of Venezuelan Equine Encephalitis Virus (VEEV) -transmission in a rural remote area of Panama.
-Chagas disease is a parasitic infection caused by the protozoan -Trypanosoma cruzi. It is only endemic to Latin America, where -it is transmitted to humans through the bite of infected triatomine -bugs, which have been present in the Americas for thousands of years. -Triatomine bugs have established domiciliary habits, living inside -houses and biting humans. Insecticide spraying is the primary control -strategy for Chagas disease, as it effectively reduces the population of -triatomine bugs, the main vector of the disease, in domestic -environments. According to (cucunubá2017?), -interventions for Chagas disease control have been ongoing in Colombia -since the 1980s, with a heterogeneous impact depending on the type of -setting, environment, and population. There is a methodological -challenge in how best to estimate the historical effectiveness -of these control strategies across endemic areas. Here -serofoi can assist with these estimations.
-Based on the data and analysis shown in (cucunubá2017?), we use
-one of the datasets that measure the seroprevalence of IgG antibodies
-against Trypanosoma cruzi infection in rural areas of Colombia.
-The dataset is part of the serofoi package as
-chagas2012
. This dataset corresponds to a serosurvey
-conducted in 2012 for a rural indigenous community known to have
-long-term endemic transmission, where some control interventions have
-taken place over the years.
-### The result
Because Chagas is an endemic disease, we should use only the
-serofoi endemic models (1.
-constant
, 2. tv_normal
) on the
-chagas2012
dataset and compare which model is better
-supported. The results are shown in Figure 3. We found that for this
-serosurvey, both serofoi models converged
-(based on R-hat values not crossing 1.1), but the comparison of the two
-models shows a relevant slow decreasing trend, which was consistent with
-model 2 - tv_normal
. This model was statistically better
-supported based on the highest elpd
and lowest
-se
values compared to the constant model. These results
-suggest a slow, still relevant decrease in the FoI values over
-the last decades which may have been a consequence of some of the
-interventions or local environmental changes that have occurred in the
-study area over the years, up to the point (2012) when the serosurvey
-was conducted.
-# Load and prepare data
-data("chagas2012")
-chagas2012p <- prepare_serodata(chagas2012)
-
-# Implementation of the models
-m1_cha <- run_seromodel(serodata = chagas2012p,
- foi_model = "constant",
- n_iters = 800)
-m2_cha <- run_seromodel(serodata = chagas2012p,
- foi_model = "tv_normal",
- n_iters = 800)
-
-# Visualisation of the results
-p1_cha <- plot_seromodel(m1_cha, size_text = 6)
-p2_cha <- plot_seromodel(m2_cha, size_text = 6)
-cowplot::plot_grid(p1_cha, p2_cha, ncol=2)
-Figure 3. serofoi endemic models for -FoI estimates of Trypanosoma cruzi in a rural area of -Colombia.
-## References
-