-
Notifications
You must be signed in to change notification settings - Fork 15
/
NewYorkInfluenza.Rmd
73 lines (56 loc) · 2.93 KB
/
NewYorkInfluenza.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
---
title: "New York Influenza"
author: "Michael Karcher"
date: "October 12, 2015"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{New York Influenza}
%\VignetteEngine{knitr::rmarkdown}
\usepackage[utf8]{inputenc}
---
This vignette shows a case study of New York influenza data, using `phylodyn`.
We start by loading the `phylodyn` package.
```{r message=FALSE}
library(phylodyn)
```
In preparation, we aligned the sequences using the software MUSCLE,
and inferred a maximum clade credibility genealogy using the software BEAST[^1].
We packaged this genealogy in a `phylo` object `NY_flu`, and we load it now.
[^1]: We infer the genealogy branch lengths in units of years using a strict molecular clock, a constant effective population size prior, and an HKY substitution model with the first two nucleotides of a codon sharing the same estimated transition matrix, while the third nucleotide's transition matrix is estimated separately.
```{r}
data(NY_flu)
```
We use BNPR and BNPR_PS to calculate approximate marginals (without and with a sampling model).
```{r}
NY_cond = BNPR(data = NY_flu, lengthout = 100)
NY_pref = BNPR_PS(data = NY_flu, lengthout = 100)
```
We plot the results (we use a scaling factor because we stored the object with units of weeks, and we standardize on units of years).
```{r, fig.width=8.5, fig.height=3.5}
axlabs = list(x = seq(0, 12*52, by=52) + 10, labs = seq(2005, 1993, by=-1))
par(mfrow=c(1,3), cex=0.9, cex.lab=1.5, cex.main=1.7, oma=c(2.5, 2, 0, 0)+0.1,
mar=c(2,1.5,2,1), mgp = c(2.5,1,0), xpd=NA,
fig=c(0, 0.32, 0, 1))
plot_BNPR(NY_cond, main="BNPR", ylim = c(10, 500)/52, yscale = 1/52,
col = rgb(0.829, 0.680, 0.306), axlabs = axlabs, heatmap_labels_side = "left")
par(fig=c(0.32, 0.64, 0, 1), new=TRUE)
plot_BNPR(NY_pref, main="BNPR-PS", ylim = c(10, 500)/52, yscale = 1/52, ylab = "",
col = rgb(0.330, 0.484, 0.828), axlabs = axlabs, heatmap_labels = FALSE)
par(mar=c(2,3.5,2,1), fig=c(0.64, 1.0, 0, 1), new=TRUE)
plot_mrw(list(NY_cond, NY_pref), axlabs = axlabs, ylab = "Mean Relative Width",
cols = c(rgb(0.829, 0.680, 0.306), rgb(0.330, 0.484, 0.828)),
legends = c("BNPR", "BNPR-PS"), legend_place = "topright", legend_cex = 0.8)
```
## References
1. R. C. Edgar.
MUSCLE: Multiple sequence alignment with high accuracy and high through- put.
*Nucleic Acids Research*, 32:1792–1797, 2004.
2. A. J. Drummond, M. A. Suchard, D. Xie, and A. Rambaut.
Bayesian phylogenetics with BEAUti and the BEAST 1.7.
*Molecular Biology and Evolution*, 29:1969–1973, 2012.
3. A. Rambaut, O. G. Pybus, M. I. Nelson, C. Viboud, J. K. Taubenberger, and E. C. Holmes.
The genomic and epidemiological dynamics of human influenza A virus.
*Nature*, 453 (7195):615–619, 2008.
4. M. D. Karcher, J. A. Palacios, T. Bedford, M. A. Suchard, and V. N. Minin.
Quantifying and mitigating the effect of preferential sampling on phylodynamic inference.
*arXiv preprint arXiv*:1510.00775, 2015.