Python tool to reduce size and redundancy of phylogenetic datasets
This repository contains the software, the data and scripts used in the manuscript (Influenza_data, MTB_data and stochastic_component), and a short tutorial on fine-tuning the behaviour of Treemmer with the pruning options.
From version 0.3, Treemmer is compatible with both Python 2 and Python 3.
You need to install the following dependencies:
ETE3
Joblib
Numpy
Matplotlib
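Before running Treemmer, it can help to confirm that the four dependencies are importable. The snippet below is a minimal sketch, not part of Treemmer: the function name `missing_modules` is hypothetical, the import names (`ete3`, `joblib`, `numpy`, `matplotlib`) are assumed to match the packages listed above, and it needs Python 3.

```python
import importlib.util

# Import names assumed for the four dependencies listed above.
REQUIRED_MODULES = ["ete3", "joblib", "numpy", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All Treemmer dependencies were found.")
```

Any module reported as missing has to be installed with your usual package manager before Treemmer will run.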
Treemmer can also be run with Singularity (instructions for installation here).
For example:
- download the image of the container:
singularity pull --arch amd64 library://fmenardo/treemmer/treemmer:0.3
- build the container:
singularity build --sandbox treemmer_sb/ treemmer_0.3.sif
- run Treemmer:
singularity exec treemmer_sb/ python3 /Treemmer_v0.3.py -h
Plot relative tree length decay.
python3 Treemmer_v0.3.py tree_file.nwk
Prune the tree until the relative tree length is 0.9, i.e. until the tree length is 90% of that of the starting tree.
python3 Treemmer_v0.3.py tree_file.nwk -RTL 0.9
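To make the stop criterion concrete, the sketch below computes a relative tree length by hand. It assumes that tree length means the sum of all branch lengths and that the relative tree length is the pruned tree's length divided by the starting tree's length. It is an illustration only, not Treemmer's own code: the helper names are hypothetical, and the regular expression only handles plain Newick strings with non-negative branch lengths and no quoted labels or comments.

```python
import re

# Matches a branch length after a colon, e.g. ":0.25" or ":1e-05".
BRANCH_LENGTH = re.compile(r":\s*([0-9]*\.?[0-9]+(?:[eE][+-]?[0-9]+)?)")


def tree_length(newick):
    """Sum all branch lengths found in a simple Newick string."""
    return sum(float(value) for value in BRANCH_LENGTH.findall(newick))


def relative_tree_length(pruned_newick, original_newick):
    """Length of the pruned tree divided by the length of the starting tree."""
    return tree_length(pruned_newick) / tree_length(original_newick)


if __name__ == "__main__":
    original = "((A:0.1,B:0.2):0.3,C:0.4);"
    # Toy tree after removing tip B: A inherits the internal branch (0.1 + 0.3).
    pruned = "(A:0.4,C:0.4);"
    print(round(relative_tree_length(pruned, original), 3))
```

In this toy case removing tip B only loses its own branch of length 0.2, so the pruned tree keeps about 80% of the original tree length.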
Prune tree down to 100 tips.
python3 Treemmer_v0.3.py tree_file.nwk -X 100
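If you want to try several stop values, the commands above can be assembled from a small Python script. This is a sketch under assumptions: `treemmer_command` and `run_treemmer` are hypothetical helpers, only the `-X` and `-RTL` options shown above are used, and the script path and interpreter name are taken from the examples and may differ on your system. By default it only prints the commands (dry run).

```python
import subprocess


def treemmer_command(tree_file, stop_flag, stop_value,
                     script="Treemmer_v0.3.py", python="python3"):
    """Build the argument list for one Treemmer run."""
    if stop_flag not in ("-X", "-RTL"):
        raise ValueError("stop_flag must be '-X' or '-RTL'")
    return [python, script, tree_file, stop_flag, str(stop_value)]


def run_treemmer(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(command))
    if not dry_run:
        # check=True raises CalledProcessError if Treemmer exits with an error.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for n_tips in (100, 200, 500):
        run_treemmer(treemmer_command("tree_file.nwk", "-X", n_tips))
```

Passing `dry_run=False` executes each command in turn; check how Treemmer names its output files before launching several runs on the same input tree.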
If you use Treemmer, please cite:
Menardo et al. (2018). Treemmer: a tool to reduce large phylogenetic datasets with minimal loss of diversity. BMC Bioinformatics 19:164.
https://doi.org/10.1186/s12859-018-2164-8
See also treemmer-animate, a script by Thomas Hackl to visualize the "treemming" process: https://github.com/thackl/treemmer-animate