Leo Lahti edited this page Aug 12, 2024 · 67 revisions

Roadmap

External resources

  • Thomaz Bastiaanssen's microbiome tutorials & packages provide good references for standard methods.

Comparison points

Additional functionality

  • microsud packages

  • specificity

  • OGU support

  • Optimized Faith's index: Efficient computation of Faith's phylogenetic diversity with applications in characterizing microbiomes. "Our BSD-licensed implementation of this algorithm is available in the unifrac package (via PyPI and bioconda (Grüning et al. 2018)), which has 50,714 total conda downloads and 34,141 conda downloads since the introduction of SFPhD, as of the time of writing (May 13, 2021). The package produces a C/C++ shared library with Python bindings and is additionally linkable by any programming language (https://github.com/biocore/unifrac)."
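
For orientation, Faith's phylogenetic diversity of a sample is the total length of the branches that connect the observed taxa to the root. The following is a minimal, naive sketch of that definition in Python. It is not the optimized algorithm from the paper quoted above, and the tree encoding (`child -> (parent, branch length)`), the function name `faith_pd` and the toy tree are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical toy encoding: child -> (parent, length of the branch to the parent).
# The root has no entry because it has no parent branch.
Tree = Dict[str, Tuple[str, float]]


def faith_pd(tree: Tree, present_tips: Iterable[str]) -> float:
    """Sum the lengths of all branches on the paths from observed tips to the root."""
    visited = set()
    total = 0.0
    for tip in present_tips:
        node = tip
        # Walk towards the root; stop at the root or at a branch already counted.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


# Toy tree: ((A:1.0, B:2.0)n1:0.5, C:3.0)root
toy_tree: Tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}

pd_ab = faith_pd(toy_tree, ["A", "B"])   # branches A, B and n1
pd_ac = faith_pd(toy_tree, ["A", "C"])   # branches A, n1 and C
```

Each branch is counted once per sample, so shared ancestry between observed taxa does not inflate the index. Tips that are missing from the tree are silently ignored in this sketch; a real implementation would likely validate the input.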

Sample size

A sample size calculator for microbiome data would be helpful for many users.

  • mPower by Lu Yang is now in the making

  • Ref 1

  • Ref 2

  • Multiple testing post-hoc correction for PERMANOVA, examples (CR): mctoolsr::calc_pairwise_permanovas
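
To illustrate the post-hoc correction step, the sketch below adjusts a set of pairwise PERMANOVA p-values with the Benjamini-Hochberg procedure. This is one common choice and is shown only as an assumption; it is not claimed to reproduce what mctoolsr::calc_pairwise_permanovas reports. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise group comparisons.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
```

The adjusted values are never smaller than the raw ones and are capped at 1, so a comparison that is significant after adjustment was also significant before it.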

Distances and clustering

GUniFrac: Generalized UniFrac Distances, Distance-Based Multivariate Methods and Feature-Based Univariate Methods for Microbiome Data Analysis. A CRAN package with a suite of methods for powerful and robust microbiome data analysis, including data normalization, data simulation, community-level association testing and differential abundance analysis. It implements generalized UniFrac distances, Geometric Mean of Pairwise Ratios (GMPR) normalization, a semiparametric data simulator, distance-based statistical methods, and feature-based statistical methods. The distance-based statistical methods include three extensions of PERMANOVA: (1) PERMANOVA using the Freedman-Lane permutation scheme, (2) a PERMANOVA omnibus test using multiple matrices, and (3) an analytical approach to approximating the PERMANOVA p-value. Feature-based statistical methods include linear model-based methods for differential abundance analysis of zero-inflated high-dimensional compositional data.
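
As a rough illustration of the generalized UniFrac distance, the sketch below computes it for two samples from per-branch quantities. It assumes the commonly cited form, in which each branch is weighted by its length times the summed branch proportions raised to a power alpha. The function name, argument layout and example numbers are hypothetical, and this is not the GUniFrac package's API.

```python
from typing import Sequence


def generalized_unifrac(
    branch_lengths: Sequence[float],
    prop_a: Sequence[float],
    prop_b: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Generalized UniFrac distance between two samples.

    prop_a[i] and prop_b[i] are the proportions of each sample's reads that
    descend from branch i (assumed to be precomputed from the tree).
    """
    numerator = 0.0
    denominator = 0.0
    for length, pa, pb in zip(branch_lengths, prop_a, prop_b):
        total = pa + pb
        if total == 0:
            continue  # a branch absent from both samples contributes nothing
        weight = length * total ** alpha
        numerator += weight * abs(pa - pb) / total
        denominator += weight
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical two-branch example.
d_disjoint = generalized_unifrac([1.0, 1.0], [1.0, 0.0], [0.0, 1.0])
d_same = generalized_unifrac([1.0, 2.0], [0.3, 0.7], [0.3, 0.7])
d_mixed = generalized_unifrac([1.0, 2.0], [0.5, 0.5], [1.0, 0.0], alpha=1.0)
```

With alpha = 1 the weights are proportional to abundance, which emphasizes dominant lineages; with alpha = 0 all observed branches are weighted by length only, which gives rare lineages more influence. Intermediate values such as 0.5 sit between the two.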

POSTm: Phylogeny-Guided OTU-Specific Association Test for Microbiome Data. Implements the Phylogeny-Guided Microbiome OTU-Specific Association Test method, which boosts testing power by adaptively borrowing information from OTUs (operational taxonomic units) that are phylogenetically close to the target OTU. The method is built on a kernel machine regression framework; it allows flexible modeling of complex microbiome effects and adjustment for covariates, and accommodates both continuous and binary outcomes.

Embeddings

GMEmbeddings: An R Package to Apply Embedding Techniques to Microbiome Data

Probabilistic workflows

  • alto nested topic models
  • fido (stray) for Bayesian analysis of balances
  • Probabilistic topic models / LDA and differential topic analysis from Holmes & Jeganathan
  • Probabilistic PCA implementation and examples for miaverse
  • Multiple testing correction vs. hierarchical probabilistic testing
  • Power calculations

Differential abundance

  • LinDA
  • songbird
  • More examples on ratio-based tests
  • ANCOM-II is not mentioned in the OMA examples (ANCOM-BC is, but it is not in this review; which one is better?)

Balances

Tasks

Phylogenetic tree support

  • OGU support
  • phylofactor; support for phylogenetic factorization based on PhILR
  • Use tidytree? Trees could be processed and visualized via tidytree, treeio, ggtree and ggtreeExtra

Heatmaps

Add a link to iheatmapr; consider if an example would be useful (or not) in addition to pheatmap examples.

Generic points on heatmaps:

  • meltAssay is readily useful to prepare data for ggplot-based heatmaps; is there a need for other tse data converters for pheatmap or some other heatmap packages?

  • Check if there are good existing resources for heatmaps on the single cell experiment side (OSCA book?) or other

  • Consider heatmap example with philr balances, in addition to typical abundance-based heatmaps. First finalize the philr PR, however.

  • Add neat & neatsort in miaViz and then also add heatmap examples to OMA using these?

Data resources

  • qiitr; add support for data retrieval and analysis from QIITA

Multi-assay association & integration

Sankaran & Holmes (2019) Multitable Methods for Microbiome Data Integration

POMS: Integrating phylogenetic and functional data in microbiome studies. Available as an R package.

Prospective analysis

Interactivity

  • iSEE

  • iSEEde for DA analysis

  • Interactive tables

  • 3D visualizations

  • Interactive heatmap example with heatmaply after we have this on a server?

Review

Time series

BiomeHorizon: Visualizing Microbiome Time Series Data in R

Turnover indices

Visualization

Additional documentation

  • Walkthrough of TreeSE from a microbiome perspective? First review existing resources, including SE, TreeSE, SCE, OSCA, scater, Seurat etc.
  • Rank abundance curve
  • Forest plot
  • UpSet plots
  • Power calculations
  • RDA & other supervised ordination techniques
  • Table examples
  • Density plot
  • Volcano plot
  • Horseshoe effects
  • Visualization artefacts (enterotype w/o colors; forced clustering; biased groups etc.) for educational purposes
  • Cross-sectional
  • Case-control
  • Intervention
  • Individual
  • Longitudinal time series analysis & simulation (seqtime/microsimR integration)?
  • Spatial
  • Networks
  • Tipping elements & bimodality
  • Forensics / source tracking ref1; ref2

Request support to external packages

  • file2meco
  • maaslin2 (Mallick H et al., 2020)
  • specificity

Manuscript

Write the manuscript collaboratively using Manubot

Training material and additional content

Containers

  • Add comparison between TreeSE vs. phyloseq systematically; what are the main differences, what are key similarities, what are the pros & cons of each, how to convert? MAE to bind.

  • Speed comparisons?

Beta diversity

Explain the difference between PLS-DA, (dbRDA), ...; a good short explanation is needed.

Power calculations

Tree-based methods

  • Examples of tree-based alpha, beta, diffab (hierarchical tests?)

Summary slides

  • microbiome workflow
  • analysis types

Packages with useful tools

Go through for useful functionality (supporting SE/SCE):

Alpha indices