Pierre Barbera edited this page Feb 16, 2018 · 9 revisions

Before using epa-ng, it is important to know where it sits in the bigger picture. On this page you will find an example of what a full placement pipeline might look like.

There are two major components to phylogenetic placement:

  • a set of sequences to place (called query sequences)
  • a set of sequences that represent the context within which we want to place (called reference sequences)
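
To make the two inputs concrete, here is a minimal sketch of reading both sets from FASTA-formatted text, which is the usual format for such data. The `read_fasta` helper and the toy sequences and names (`ref_A`, `read_1`, and so on) are invented for illustration and are not part of epa-ng.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text into a dict mapping sequence name to sequence."""
    records = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: everything after '>' is used as the name.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence line: append to the current record.
            records[name] += line
    return records


# Toy data, made up for this example. In practice these would be files.
reference_fasta = StringIO(">ref_A\nACGTACGT\n>ref_B\nACGTTCGT\n")
query_fasta = StringIO(">read_1\nACGTAC\n>read_2\nCGTTCG\n")

references = read_fasta(reference_fasta)  # the context to place into
queries = read_fasta(query_fasta)         # the sequences to place

print(f"{len(references)} reference sequences, {len(queries)} query sequences")
```

With real data, the `StringIO` objects would be replaced by opened files, for example `open("reference.fasta")`, where the file name is again only a placeholder.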

If you've landed on this page, you probably have your query sequences already. Typically these come from metagenomic or metabarcoding sequencing, and have already been filtered to a specific genetic region (such as 16S, 18S, or other common barcodes).

The most common question is then a taxonomic one: where do my query sequences belong, and what is the taxonomic composition of my environmental sample?

epa-ng can answer this question, but it relies on a handful of other programs to prepare the data and to perform any in-depth post-analysis your research might require.

Step 1: Selecting the reference sequences

This is the most biologically involved step (apart from the wet-lab work), as

Step 2: Building a reference alignment and tree

Step 3: Aligning the query sequences

Step 4: Placing the query sequences

Step 5 and onward: Visualization, post-analysis
