Gotree

build Anaconda-Server Badge Docker hub downloads DOI:10.1093/nargab/lqab075

Gotree is a set of command line tools to manipulate phylogenetic trees. It is implemented in Go.

Gotree handles phylogenetic trees in Newick, Nexus, PhyloXML and Nextstrain/Augur v2 formats, through several basic commands. Each command may print its result (a tree, for example) to standard output, so it can be piped to the standard input of the next gotree command.

Input files may be local or remote files:

  • If file name is of the form http://<URL>, the file is downloaded from the given URL.
  • If file name is of the form itol://<ID>, the tree having the given ID is downloaded from iTOL using the iTOL api.
  • If file name is of the form treebase://<ID>, the tree having the given ID is downloaded from TreeBase.
  • Otherwise, the file is considered local.
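
The rules above amount to a dispatch on the prefix of the file name. The following is a minimal sketch of that dispatch, not gotree's actual implementation: the function name `inputKind`, the returned labels and the example IDs are all hypothetical and only serve the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// inputKind classifies an input name following the rules listed above.
// Hypothetical helper for illustration only; it downloads nothing and is
// not part of the gotree API.
func inputKind(name string) (kind, rest string) {
	switch {
	case strings.HasPrefix(name, "http://"):
		return "url", name
	case strings.HasPrefix(name, "itol://"):
		return "itol", strings.TrimPrefix(name, "itol://")
	case strings.HasPrefix(name, "treebase://"):
		return "treebase", strings.TrimPrefix(name, "treebase://")
	default:
		return "local", name
	}
}

func main() {
	// Placeholder names and IDs, chosen only to exercise each branch.
	names := []string{"http://example.org/t.nw", "itol://12345", "treebase://678", "trees/t.nw.gz"}
	for _, n := range names {
		kind, rest := inputKind(n)
		fmt.Printf("%s -> %s (%s)\n", n, kind, rest)
	}
}
```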

Gzipped input files (.gz extension) are supported.

Note:

To manipulate multiple alignments, see also Goalign.

Examples:

$ echo "(1,(2,(3,4,5,6)polytomy)internal)root;" | gotree draw text --with-node-labels -w 50
+--------------- 1                                          
|                                                           
root            +---------------- 2                         
|               |                                           
+---------------|internal        +--------------- 3         
                |                |                          
                |                |--------------- 4         
                +----------------|polytomy                  
                                 |--------------- 5         
                                 |                          
                                 +--------------- 6         

$ echo "(1,(2,(3,4,5,6)polytomy)internal)root;" | gotree labels --internal --tips
root
1
internal
2
polytomy
3
4
5
6
$ gotree generate uniformtree -l 100 -n 10 | gotree stats

|tree  |  nodes  |  tips  |  edges  |  meanbrlen   |  sumbrlen     |  meansupport  |  mediansupport  |  rooted    |
|------|---------|--------|---------|--------------|---------------|---------------|-----------------|------------|
|0     |  198    |  100   |  197    |  0.09029828  |  17.78876078  |  NaN          |  NaN            |  unrooted  |
|1     |  198    |  100   |  197    |  0.08391711  |  16.53167037  |  NaN          |  NaN            |  unrooted  |
|2     |  198    |  100   |  197    |  0.08369861  |  16.48862662  |  NaN          |  NaN            |  unrooted  |
|3     |  198    |  100   |  197    |  0.08652623  |  17.04566698  |  NaN          |  NaN            |  unrooted  |
|4     |  198    |  100   |  197    |  0.07970206  |  15.70130625  |  NaN          |  NaN            |  unrooted  |
|5     |  198    |  100   |  197    |  0.09145831  |  18.01728772  |  NaN          |  NaN            |  unrooted  |
|6     |  198    |  100   |  197    |  0.08482117  |  16.70977068  |  NaN          |  NaN            |  unrooted  |
|7     |  198    |  100   |  197    |  0.08470308  |  16.68650662  |  NaN          |  NaN            |  unrooted  |
|8     |  198    |  100   |  197    |  0.08646811  |  17.03421732  |  NaN          |  NaN            |  unrooted  |
|9     |  198    |  100   |  197    |  0.07088132  |  13.96362091  |  NaN          |  NaN            |  unrooted  |

This will generate 10 random unrooted uniform binary trees, each having 100 tips, and print statistics about them.
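
The nodes and edges columns follow directly from the number of tips: a fully resolved unrooted tree with n tips has 2n-2 nodes and 2n-3 edges, and a rooted one has 2n-1 nodes and 2n-2 edges. The short sketch below checks this arithmetic for the 100-tip trees in the table; `binaryTreeCounts` is a hypothetical helper, not a gotree function.

```go
package main

import "fmt"

// binaryTreeCounts returns the number of nodes and edges of a fully
// resolved (binary) tree with the given number of tips.
// The formulas hold for tips >= 3 (unrooted) and tips >= 2 (rooted).
func binaryTreeCounts(tips int, rooted bool) (nodes, edges int) {
	if rooted {
		return 2*tips - 1, 2*tips - 2
	}
	return 2*tips - 2, 2*tips - 3
}

func main() {
	tips := 100
	nodes, edges := binaryTreeCounts(tips, false)
	// Internal edges are all edges except the one leading to each tip.
	fmt.Printf("tips=%d nodes=%d edges=%d internal edges=%d\n", tips, nodes, edges, edges-tips)
	// prints: tips=100 nodes=198 edges=197 internal edges=97
}
```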

Reference

If you use Gotree or Goalign, please cite:

Frédéric Lemoine, Olivier Gascuel

Gotree/Goalign: toolkit and Go API to facilitate the development of phylogenetic workflows,

NAR Genomics and Bioinformatics, Volume 3, Issue 3, September 2021, lqab075, doi:10.1093/nargab/lqab075

Installation

Easy way: Binaries

You can download ready-to-run binaries for the latest release in the release section. Binaries are available for macOS, Linux, and Windows (32 and 64 bits).

Once downloaded, you can just run the executable without any other downloads.

Docker

The Gotree Docker image is available from Docker Hub. You may use it as follows:

# Display gotree help
docker run -v $PWD:$PWD -w $PWD -i -t evolbioinfo/gotree:v0.2.8b -h

Singularity

The Gotree Docker image can also be used with Singularity. You may use it as follows:

# Pull image from docker hub
singularity pull docker://evolbioinfo/gotree:v0.2.8b
# Display gotree help
./gotree-v0.2.8b.simg -h

Conda

Gotree is also available on bioconda. Just type:

conda install -c bioconda gotree

From sources

To build gotree, you must first download and install Go (1.21.6) on your system.

Then type:

git clone git@github.com:evolbioinfo/gotree.git
cd gotree
make && make install
# or go get . && go build .
# or go get . && go install .

The gotree executable should be located in the current folder (or in $GOPATH/bin).

To test the executable:

./test.sh

Auto completion

gotree uses cobra, and therefore provides a command to generate auto completion scripts:

gotree completion -h

Usage

gotree implements several tree manipulation commands.

See the doc for more detailed documentation of the commands.

List of commands

  • annotate: Annotate internal nodes of a tree with given data
  • brlen: Modify branch lengths
    • clear: Clear lengths from input trees
    • cut: Cut branches whose length is greater than or equal to the given length
    • round: Round branch lengths from input trees with a given precision
    • scale: Scale lengths from input trees by a given factor
    • setmin: Set a min branch length to all branches with length < cutoff
    • setrand: Assign a random length to edges of input trees
    • set: Assign a given length to edges of input trees
  • collapse: Collapse branches or clades of input trees
    • clades
    • depth
    • length
    • support
  • comment: Modify branch/node comments
    • clear: Remove node/tip comments
  • compare: Compare full trees, edges, or tips
    • edges: Individually compare edges of the reference tree to a compared tree
    • tips: Compare the set of tips of the reference tree to a compared tree
    • trees: Compare 2 trees in terms of common and specific branches
  • compute: Computations such as consensus and supports
    • bipartitiontree: Builds one tree with only one given bipartition
    • consensus: Compute the consensus from a set of input trees
    • edgetrees: Write one output tree per branch of the input tree, with only one branch
    • support: Compute bootstrap supports
  • divide: Divide an input tree file into several tree files
  • download: Download a tree image from a server
    • itol: Download a tree image from iTOL, with given image options
    • ncbitax: Download the full ncbi taxonomy in newick format
    • panther: Download a tree from Panther database (http://pantherdb.org/)
  • draw: Draw tree(s) with different layouts
    • text: Display tree(s) in ASCII text format
    • png: Draw tree(s) in png format, with normal, radial/unrooted or circular layout
    • svg: Draw tree(s) in svg format, with normal, radial/unrooted or circular layout
    • cyjs: Draw tree(s) in a html file, using cytoscape js
  • generate: Generate random trees; branch lengths are simply drawn from an exponential(1) law
    • balancedtree
    • caterpillartree
    • startree
    • topologies: all possible topologies
    • uniformtree
    • yuletree
  • graft: Graft a tree on an input tree in place of a given tip
  • labels: Lists labels (names) of all tips
  • ltt: Compute lineage through time data/plot
  • matrix: Print (patristic) distance matrix associated to the input tree
  • merge: Merges two rooted trees
  • nni: Generate all NNI neighbors from a given tree
  • prune: Remove tips of the input tree that are not in the compared tree, or that are given on the command line
  • reformat: Convert input file between nexus and newick formats
    • newick
    • nexus
  • rename: Rename tips of the input tree, given a map file, or a regexp, or automatically
  • repopulate: Re populate the tree with identical tips (having the exact same sequence)
  • reroot: Reroot trees using an outgroup or at midpoint
    • midpoint
    • outgroup
  • rotate: Reorders neighbors of internal nodes. Does not change the topology, but just traversal order
    • rand: Randomly reorders neighbors of internal nodes
    • sort: Sort neighbors of internal nodes by ascending number of tips
  • resolve: Resolve multifurcations by adding 0 length branches
  • sample: Takes a sample (with or without replacement) from the set of input trees
  • shuffletips: Shuffle tip names of an input tree
  • subtree: Extract a subtree
  • support: Modify branch supports
    • clear: Clear supports from input trees
    • round: Round branch supports from input trees with a given precision
    • setrand: Assign a random support to edges of input trees
    • scale: Scale branch supports from input trees by a given factor
  • stats: Print statistics about the tree, its edges, its nodes, if it is rooted, and its tips
    • edges
    • monophyletic: Print whether input tips form a monophyletic group in each of the input trees
    • nodes
    • rooted
    • tips
    • splits
  • unroot: Unroot input tree
  • upload: Upload a tree to a given server
    • itol: Upload a tree to iTOL, with given annotations
  • version: Display version of gotree

Gotree commandline examples

  • Generate 10 random unrooted uniform binary trees
$ gotree generate uniformtree -l 100 -n 10 | gotree stats
  • Generate 1 Yule-Harding tree with 50 tips, and display it on the terminal (width 100)
$ gotree generate yuletree -l 50 | gotree draw text -w 100
  • Generate 1 tree with 50 tips, and draw it on a SVG image
$ gotree generate yuletree -l 50 | gotree draw svg -w 1000 -H 1000 -o tree.svg
$ gotree generate yuletree -l 50 | gotree draw svg -w 1000 -H 1000 -r -o tree_radial.svg
  • Reformatting 4 input random trees into Nexus format:
$ gotree generate yuletree -n 4 -l 8 --seed 10 | gotree brlen clear | gotree reformat nexus

Will output:

#NEXUS
BEGIN TAXA;
 TAXLABELS Tip4 Tip7 Tip2 Tip0 Tip3 Tip6 Tip5 Tip1;
END;
BEGIN TREES;
  TREE tree0 = ((Tip4,(Tip7,Tip2)),Tip0,(Tip3,((Tip6,Tip5),Tip1)));
  TREE tree1 = (Tip5,Tip0,((Tip6,Tip4),((Tip3,Tip2),(Tip7,Tip1))));
  TREE tree2 = (((Tip7,Tip3),(Tip4,Tip2)),Tip0,((Tip6,Tip5),Tip1));
  TREE tree3 = (Tip4,Tip0,((Tip5,Tip2),(Tip3,(Tip6,(Tip7,Tip1)))));
END;
  • Unrooting a tree
$ gotree unroot -i tree.tre -o unrooted.tre
  • Collapsing short branches
$ gotree collapse length -i tree.tre -l 0.001 -o collapsed.tre
  • Collapsing lowly supported branches
$ gotree collapse support -i tree.tre -s 0.8 -o collapsed.tre
  • Removing length information
$ gotree brlen clear -i tree.nw -o nolength.nw
  • Removing support information
$ gotree support clear -i tree.nw -o nosupport.nw

Note that you can pipe the two previous commands:

$ gotree support clear -i tree.nw | gotree brlen clear -o nosupport.nw
  • Printing tree statistics
$ gotree stats -i tree.tre
  • Printing edge statistics
$ gotree stats edges -i tree.tre

Example of result:

tree brid length support terminal depth topodepth rightname
0 0 0.107614 N/A false 1 6
0 1 0.149560 N/A true 0 1 Tip51
0 2 0.051126 N/A false 1 5
0 3 0.003992 N/A false 1 4
0 4 0.030974 N/A false 1 3
0 5 0.270017 N/A true 0 1 Tip84
0 6 0.029931 N/A false 1 2
0 7 0.001136 N/A true 0 1 Tip70
0 8 0.011658 N/A true 0 1 Tip45
0 9 0.104188 N/A true 0 1 Tip34
0 10 0.003361 N/A true 0 1 Tip16
0 11 0.021988 N/A true 0 1 Node0
  • Printing tips
$ gotree stats tips -i tree.tre

Example of result:

tree id nneigh name
0 1 1 Tip8
0 2 1 Node0
0 5 1 Tip4
0 8 1 Tip9
0 9 1 Tip7
0 11 1 Tip6
0 13 1 Tip5
0 14 1 Tip3
0 16 1 Tip2
0 17 1 Tip1
  • Comparing tips of two trees
$ gotree compare tips -i tree.tre -c tree2.tre

This will compare the two sets of tips.

Example:

$ gotree compare tips -i <(gotree generate uniformtree -l 10 -n 1) \
                      -c <(gotree generate uniformtree -l 11 -n 1)
(Tree 0) > Tip10
(Tree 0) = 10

10 tips are equal, and "Tip10" is present only in the second tree.
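
Conceptually this is a comparison of two sets of names. The sketch below reproduces the counting on plain string slices, assuming tip names are unique within each tree; `compareTips` is a hypothetical helper and the printed wording is not gotree's output format.

```go
package main

import (
	"fmt"
	"sort"
)

// compareTips returns the tips found only in ref, the tips found only in
// comp, and the number of tips shared by both. Tip names are assumed to be
// unique within each list. Hypothetical helper for illustration.
func compareTips(ref, comp []string) (onlyRef, onlyComp []string, common int) {
	inRef := make(map[string]bool, len(ref))
	for _, n := range ref {
		inRef[n] = true
	}
	inComp := make(map[string]bool, len(comp))
	for _, n := range comp {
		inComp[n] = true
		if inRef[n] {
			common++
		} else {
			onlyComp = append(onlyComp, n)
		}
	}
	for _, n := range ref {
		if !inComp[n] {
			onlyRef = append(onlyRef, n)
		}
	}
	sort.Strings(onlyRef)
	sort.Strings(onlyComp)
	return onlyRef, onlyComp, common
}

// tipNames builds Tip0 ... Tip(n-1), mimicking the names seen above.
func tipNames(n int) []string {
	names := make([]string, n)
	for i := range names {
		names[i] = fmt.Sprintf("Tip%d", i)
	}
	return names
}

func main() {
	onlyRef, onlyComp, common := compareTips(tipNames(10), tipNames(11))
	fmt.Println("only in reference:", onlyRef)
	fmt.Println("only in compared:", onlyComp)
	fmt.Println("common:", common)
}
```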

  • Removing tips that are absent from another tree
$ gotree prune -i tree.tre -c other.tre -o pruned.tre

You can test with

$ gotree prune -i <(gotree generate uniformtree -l 1000 -n 1) \
               -c <(gotree generate uniformtree -l 100 -n 1) \
               | gotree stats

It should print 100 tips.

  • Comparing bipartitions: count the number of common/specific bipartitions between two trees.
$ gotree compare trees -i tree.tre -c other.tre

You can test with random trees (there should be very few common bipartitions)

$ gotree compare trees -i <(gotree generate uniformtree -l 100 -n 1) \
                       -c <(gotree generate uniformtree -l 100 -n 1)
Tree reference common compared
0 97 0 97
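
The value 97 is the number of internal branches of an unrooted binary tree with 100 tips (100 - 3). To make the comparison itself concrete, the following sketch counts common and specific bipartitions for tree0 and tree2 of the Nexus example earlier in this README. The clades are transcribed by hand from those Newick strings, and the helpers `canonical` and `compareBipartitions` are hypothetical; this is one simple way to do the comparison, not necessarily the algorithm gotree uses.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// canonical returns a unique key for the bipartition defined by one of its
// sides. The side that does NOT contain the reference tip ref is kept, so a
// clade and its complement produce the same key.
func canonical(side, all []string, ref string) string {
	in := make(map[string]bool, len(side))
	for _, t := range side {
		in[t] = true
	}
	var kept []string
	for _, t := range all {
		if in[t] != in[ref] {
			kept = append(kept, t)
		}
	}
	sort.Strings(kept)
	return strings.Join(kept, ",")
}

// compareBipartitions counts bipartitions specific to ref, common to both,
// and specific to comp. Each tree is assumed to list every internal
// bipartition exactly once.
func compareBipartitions(ref, comp [][]string, all []string) (refOnly, common, compOnly int) {
	seen := make(map[string]bool, len(ref))
	for _, s := range ref {
		seen[canonical(s, all, all[0])] = true
	}
	for _, s := range comp {
		if seen[canonical(s, all, all[0])] {
			common++
		} else {
			compOnly++
		}
	}
	return len(seen) - common, common, compOnly
}

func main() {
	all := []string{"Tip0", "Tip1", "Tip2", "Tip3", "Tip4", "Tip5", "Tip6", "Tip7"}
	// Internal clades of tree0 = ((Tip4,(Tip7,Tip2)),Tip0,(Tip3,((Tip6,Tip5),Tip1)));
	tree0 := [][]string{
		{"Tip7", "Tip2"}, {"Tip4", "Tip7", "Tip2"},
		{"Tip6", "Tip5"}, {"Tip6", "Tip5", "Tip1"}, {"Tip3", "Tip6", "Tip5", "Tip1"},
	}
	// Internal clades of tree2 = (((Tip7,Tip3),(Tip4,Tip2)),Tip0,((Tip6,Tip5),Tip1));
	tree2 := [][]string{
		{"Tip7", "Tip3"}, {"Tip4", "Tip2"}, {"Tip7", "Tip3", "Tip4", "Tip2"},
		{"Tip6", "Tip5"}, {"Tip6", "Tip5", "Tip1"},
	}
	refOnly, common, compOnly := compareBipartitions(tree0, tree2, all)
	fmt.Printf("reference=%d common=%d compared=%d\n", refOnly, common, compOnly)
	// prints: reference=3 common=2 compared=3
}
```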
  • Renaming tips of the tree: if you have a file containing the mapping between current names and new names of the tips, you can rename the tips:
$ gotree rename -i tree.tre -m mapfile.txt -o newtree.tre

You can try by doing:

$ gotree generate uniformtree -l 100 -n 1 -o tree.tre
$ gotree stats tips -i tree.tre | awk '{if(NR>1){print $4 "\tNEWNAME" $4}}' > mapfile.txt
$ gotree rename -i tree.tre -m mapfile.txt | gotree stats tips

Gotree API usage examples

  • Parsing a newick string
package main

import (
	"fmt"
	"strings"

	"github.com/evolbioinfo/gotree/io/newick"
	"github.com/evolbioinfo/gotree/tree"
)

func main() {
	var treeString string
	var t *tree.Tree
	var err error
	treeString = "(Tip2,Tip0,(Tip3,(Tip4,Tip1)));"
	t, err = newick.NewParser(strings.NewReader(treeString)).Parse()
	if err != nil {
		panic(err)
	}
	fmt.Println(t.Newick())
}
  • Parsing a newick file
package main

import (
	"fmt"
	"os"

	"github.com/evolbioinfo/gotree/io/newick"
	"github.com/evolbioinfo/gotree/tree"
)

func main() {

	var t *tree.Tree
	var err error
	var f *os.File
	if f, err = os.Open("t.nw"); err != nil {
		panic(err)
	}
	defer f.Close() // release the file handle when main returns
	t, err = newick.NewParser(f).Parse()
	if err != nil {
		panic(err)
	}
	fmt.Println(t.Newick())
}
  • Helper functions to parse multi tree newick file
package main

import (
	"bufio"
	"fmt"
	"io"

	"github.com/evolbioinfo/gotree/io/utils"
	"github.com/evolbioinfo/gotree/tree"
)

func main() {
	var t tree.Trees
	var err error
	var ntrees int = 0
	var trees <-chan tree.Trees
	var treefile io.Closer
	var treereader *bufio.Reader

	/* File reader (plain text or gzip) */
	if treefile, treereader, err = utils.GetReader("trees.nw"); err != nil {
		panic(err)
	}
	defer treefile.Close()
	// format may be FORMAT_NEWICK, FORMAT_NEXUS, FORMAT_PHYLOXML
	trees = utils.ReadMultiTrees(treereader, utils.FORMAT_NEWICK)
	for t = range trees {
		if t.Err != nil {
			panic(t.Err)
		}
		ntrees++
		fmt.Println(t.Tree.Newick())
	}
	fmt.Printf("Number of trees: %d\n", ntrees)
}
  • Tree functions
// Getting edges
var edges []*tree.Edge = t.Edges()
// Internal edges only
var iedges []*tree.Edge = t.InternalEdges()
// Tip edges only
var tedges []*tree.Edge = t.TipEdges()
// Getting Nodes
var nodes []*tree.Node = t.Nodes()
// Tips only
var tips []*tree.Node = t.Tips()
// Getting tip names
var tipnames []string = t.AllTipNames()
// Root/Pseudoroot node
var root *tree.Node = t.Root()
// If the tree is rooted or not
var rooted bool = t.Rooted()
  • Branch functions
// Branch length
var l float64 = e.Length()
// Branch support
var s float64 = e.Support()
// Node on the "right"
var n1 *tree.Node = e.Right()
// Node on the "left"
var n2 *tree.Node = e.Left()
// Number of leaves under this edge
var nt uint = e.NumTips()
  • Node functions
// Node name
var name string = n.Name()
// Number of neighbors
var nn int = n.Nneigh()
// List of neighbors (including "parent")
var neighb []*tree.Node = n.Neigh()
// If a node is a tip or not
var tip bool = n.Tip()
// List of branches going from this node (including "parent")
var edges []*tree.Edge = n.Edges()
  • Removing tips
if err = t.RemoveTips(false, "Tip0"); err != nil {
	panic(err)
}
fmt.Println(t.Newick())
  • Knowing if a tip exists in the tree
var exists bool
var err error
exists,err = t.ExistsTip("Tip0")
  • Shuffling tip names of the tree
t.ShuffleTips()
  • Removing branches
// Short branches
t.CollapseShortBranches(0.01)
// Lowly supported branches
t.CollapseLowSupport(0.7)
// Branches with "depth" <=10 && >= 1
t.CollapseTopoDepth(1,10)
  • Randomly resolving multifurcations
t.Resolve()
  • Removing branch information
// Branch lengths
t.ClearLengths()
// Branch supports
t.ClearSupports()
  • Unrooting the tree
t.Unroot()
  • Cloning the tree
t.Clone()
  • Rerooting at midpoint
t.RerootMidPoint()
  • Generating random trees
var ntips int = 100
var rooted bool = true
// Uniform tree
t,err = tree.RandomUniformBinaryTree(ntips, rooted)
// Balanced tree
t,err = tree.RandomBalancedBinaryTree(ntips, rooted)
// Yule-Harding tree
t,err = tree.RandomYuleBinaryTree(ntips, rooted)
  • SVG Tree drawing
import "github.com/evolbioinfo/gotree/draw"
...
var d draw.TreeDrawer
var l draw.TreeLayout
f, err := os.Create("image.svg")
if err != nil {
	panic(err)
}
d = draw.NewSvgTreeDrawer(f, 800, 800, 30, 30, 30, 30)
l = draw.NewRadialLayout(d, false, false, false, false)
// or l = draw.NewCircularLayout(d, false, false, false, false)
// or l = draw.NewNormalLayout(d, false, false, false, false)
l.DrawTree(t)
f.Close()
  • PNG Tree drawing
import "github.com/evolbioinfo/gotree/draw"
...
var d draw.TreeDrawer
var l draw.TreeLayout
// PNG output: use a .png file name
f, err := os.Create("image.png")
if err != nil {
	panic(err)
}
d = draw.NewPngTreeDrawer(f, 800, 800, 30, 30, 30, 30)
l = draw.NewRadialLayout(d, false, false, false, false)
// or l = draw.NewCircularLayout(d, false, false, false, false)
// or l = draw.NewNormalLayout(d, false, false, false, false)
l.DrawTree(t)
f.Close()