Bethany's Roadmap #16

ha0ye · 2017-04-18T23:23:30Z

Objectives

1. Replicate Figure 5A

"Fig. 5
Environmental drivers of surface microbial community composition.

(A) Principal coordinate (PC) analysis of surface samples shows that samples are not clearly grouped by their regional origin (top), but rather separated by the local temperatures as shown by the strong correlation (R²: 0.76) between the first PC and temperature (bottom)."

2. Replicate Figure 5B
"(B) Pairwise comparisons of environmental factors are shown, with a color gradient denoting Spearman’s correlation coefficients. Taxonomic [based on two independent methods: mitags (12) and mOTUs (13)] and functional (based on biochemical KEGG modules) community composition was related to each environmental factor by partial (geographic distance–corrected) Mantel tests. Edge width corresponds to the Mantel’s r statistic for the corresponding distance correlations, and edge color denotes the statistical significance based on 9,999 permutations."

3. Replicate Figure 6

"Fig. 6
Temperature as main environmental driver for microbial community composition in the epipelagic layer.
(A) The strength of association between (meta)genomic and environmental data was tested by statistical models that were first generated with a subset of data for training and then validated on the remaining data. The prediction accuracy was used as a measure for the strength of association. Models that were trained on subsets of taxonomic data from surface water (SRF) samples could predict with high accuracy temperature and dissolved oxygen of samples used for validation (left). Models trained with subsets of taxonomic data from deep chlorophyll maximum (DCM) samples could predict temperature with high accuracy, but could predict dissolved oxygen with only moderate accuracy (middle). To demonstrate across-depth conservation of associations, we show that models trained on data from SRF samples could highly predict temperature, but failed to predict dissolved oxygen in DCM samples. (B) To illustrate prediction accuracy, and thus, strength of association between taxonomic composition (using 16S mitag abundances) and temperature, we show that in situ measured temperature could be predicted with 86% explained variance. The red diagonal shows the theoretical curve for perfect predictions. Sanger sequencing reads from the GOS project were used to calculate relative genus abundance tables. Using temperature prediction models trained at genus level using Tara Oceans data, we show (inset) that the results could be validated at relatively high accuracy given the large differences in sampling and sequencing methods between these two studies."
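For objective 2, the Mantel statistic behind the edges in Figure 5B can be sketched as below. This is only a rough illustration under stated assumptions: it is a plain Mantel test with a Pearson correlation, not the partial (geographic distance-corrected) version the caption describes, and the two small vectors used to build the distance matrices are made-up numbers, not Tara Oceans data. The function names are hypothetical.

```python
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel(dist_a, dist_b, permutations=9999, seed=0):
    """Plain (non-partial) Mantel test; returns (r, one-sided p-value)."""
    n = len(dist_a)
    flat_b = upper_triangle(dist_b)
    observed = pearson(upper_triangle(dist_a), flat_b)
    rng = random.Random(seed)
    order = list(range(n))
    at_least = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        # Permute rows and columns of dist_a together (relabel the samples).
        shuffled = [[dist_a[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(upper_triangle(shuffled), flat_b) >= observed:
            at_least += 1
    return observed, at_least / (permutations + 1)


def abs_diff_matrix(values):
    """Distance matrix of absolute differences between scalar values."""
    return [[abs(a - b) for b in values] for a in values]


# Made-up illustration data: one environmental factor and one community score.
temperature = [5, 12, 18, 24, 28]
community = [0.1, 0.3, 0.45, 0.5, 0.9]

mantel_r, mantel_p = mantel(abs_diff_matrix(community),
                            abs_diff_matrix(temperature))
print(f"Mantel r = {mantel_r:.3f}, p = {mantel_p:.4f}")
```

With only five samples there are just 120 distinct relabelings, so the p-value here is coarse; the 9,999 permutations quoted in the caption only become meaningful with the full sample set.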

Tasks

Focus on Figure 5A to start

Get environmental metadata (associated with paper)
Get OTU data (not easily accessible)
We need mOTU relative abundances
Revisit PCA notes from stats class
Write an R script to replicate the PCA methods described in Sunagawa et al.
Make R script to correlate PC1 with temperature
Compare my results to published results
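The Figure 5A steps in the list above (ordinate the samples, then correlate the first axis with temperature) can be sketched roughly as follows. This is a dependency-free Python illustration of the logic, not the planned R script and not the paper's exact method. Assumptions: Bray-Curtis is used as the dissimilarity (check the paper's methods for the actual choice), the abundance table and temperatures are made-up numbers, and the function names are hypothetical. The first axis is found by power iteration, which assumes the leading eigenvalue of the centered matrix is positive and well separated.

```python
import math
import random


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def first_pcoa_axis(dist, iterations=500):
    """Sample scores on the first principal coordinate of a distance matrix."""
    n = len(dist)
    # Gower double-centering of -0.5 * D^2.
    sq = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand = sum(row_mean) / n
    b = [[sq[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
         for i in range(n)]

    def times_b(vec):
        return [sum(b[i][j] * vec[j] for j in range(n)) for i in range(n)]

    # Power iteration for the leading eigenvector.
    rng = random.Random(0)
    vec = [rng.random() - 0.5 for _ in range(n)]
    for _ in range(iterations):
        nxt = times_b(vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector.
    eigenvalue = sum(v * w for v, w in zip(vec, times_b(vec)))
    return [x * math.sqrt(max(eigenvalue, 0.0)) for x in vec]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Made-up illustration data: rows are samples, columns are taxa.
abundances = [[10, 5, 1], [8, 6, 2], [6, 6, 4], [3, 7, 7], [1, 5, 10]]
temperature = [28, 24, 18, 12, 5]

dist = [[bray_curtis(u, v) for v in abundances] for u in abundances]
pc1 = first_pcoa_axis(dist)
r2 = r_squared(pc1, temperature)
print(f"R^2 between PC1 and temperature: {r2:.2f}")
```

The sign of an ordination axis is arbitrary, which is why the comparison uses R² rather than the signed correlation. For the real analysis in R, `cmdscale` on a dissimilarity matrix (for example one from `vegan::vegdist`) would replace the hand-rolled centering and power iteration.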

ha0ye closed this as completed Apr 19, 2017

ha0ye reopened this Apr 19, 2017
