Cookbook

Niema Moshiri edited this page Jan 4, 2023 · 53 revisions

This page contains many examples of TreeSwift usage. In all of the examples, I will assume that the tree is in the Newick format in a file called example.tre and the loaded Tree object is stored in a variable called tree (unless stated otherwise).

Loading and Exporting a Tree

A tree can be loaded from a Newick string:

from treeswift import read_tree_newick
tree_string = "((A,B),(C,D));"
tree = read_tree_newick(tree_string)

A tree can be loaded from a plain-text Newick file:

from treeswift import read_tree_newick
tree_file = "example.tre"
tree = read_tree_newick(tree_file)

A tree can be loaded from a gzipped Newick file:

from treeswift import read_tree_newick
tree_file_gzipped = "example.tre.gz" # the filename is passed directly; no gzip import is needed
tree = read_tree_newick(tree_file_gzipped)

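TreeSwift also supports loading trees in the Nexus format. Because a Nexus file can hold several named trees, read_tree_nexus returns a dictionary rather than a single Tree:

from treeswift import read_tree_nexus
tree_file = "example.nex"
trees1 = read_tree_nexus(tree_file) # dictionary mapping tree names to Tree objects
tree_file_gzipped = "example.nex.gz"
trees2 = read_tree_nexus(tree_file_gzipped)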

TreeSwift also supports loading trees in the NeXML format, which likewise yields a dictionary of trees:

from treeswift import read_tree_nexml
tree_file = "example.nexml"
trees1 = read_tree_nexml(tree_file) # dictionary mapping tree names to Tree objects
tree_file_gzipped = "example.nexml.gz"
trees2 = read_tree_nexml(tree_file_gzipped)

TreeSwift also supports manually creating a tree from scratch:

from treeswift import Tree, Node
tree = Tree()
tree.root.add_child(Node("A"))
tree.root.add_child(Node("B"))

Trees can be exported as Newick strings using the newick method or Python's built-in str function:

tree_string = tree.newick()
tree_string == str(tree) # this is True
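
As a rough illustration of what an exported Newick string encodes, the sketch below serializes a tiny tree by hand. It does not use TreeSwift: the nested-tuple representation and the to_newick helper are hypothetical, and real exports can also carry branch lengths and internal labels.

```python
def to_newick(node):
    """Serialize a nested-tuple tree (hypothetical toy representation).

    A leaf is a string label; an internal node is a tuple of children.
    """
    if isinstance(node, str):
        return node
    return "(" + ",".join(to_newick(child) for child in node) + ")"

toy_tree = (("A", "B"), ("C", "D"))
newick_string = to_newick(toy_tree) + ";"  # a Newick string ends with a semicolon
print(newick_string)
```

Each pair of parentheses corresponds to one internal node, and the comma-separated items inside it are that node's children.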

Trees can also be written to a Newick file using the write_tree_newick function:

tree.write_tree_newick('output.tre') # output as plain-text file
tree.write_tree_newick('output.tre.gz') # output as gzipped file

For more human-readable output, trees can also be exported as indented Newick strings, as in Newick Utilities, using the indent function:

print(tree.indent())

To iterate over the labels of the Tree, you can use the labels function:

for label in tree.labels():               # include all nodes
    print(label)
for label in tree.labels(leaves=False):   # exclude leaves
    print(label)
for label in tree.labels(internal=False): # exclude internal nodes
    print(label)

For easy searching and manipulation given a label, TreeSwift can create a dictionary mapping labels (strings) to Node objects in the Tree. By default, the function label_to_node only selects leaves:

map_leaves = tree.label_to_node()

Alternatively, you can select all nodes, only internal nodes, or the nodes specified by a given set of strings (labels):

map_all = tree.label_to_node(selection='all')
map_internal = tree.label_to_node(selection='internal')
map_specific = tree.label_to_node(selection={'A','B','C'})

Performing a Tree Traversal

  • Pre-Order (documentation)
    for node in tree.traverse_preorder():
        print(node)
  • Post-Order (documentation)
    for node in tree.traverse_postorder():
        print(node)
  • In-Order (only for fully-bifurcating trees, i.e., all nodes have 0 or 2 children) (documentation)
    for node in tree.traverse_inorder():
        print(node)
  • Level-Order (documentation)
    for node in tree.traverse_levelorder():
        print(node)
  • Root-Distance-Order (in ascending or descending order of distance from the root) (documentation)
    for node in tree.traverse_rootdistorder():
        print(node)
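
To make the difference between the traversal orders above concrete, here is a minimal sketch on a four-leaf toy tree. It does not use TreeSwift: ToyNode and the four generator functions are hypothetical, and the order in which TreeSwift visits siblings may differ from this sketch.

```python
from collections import deque

class ToyNode:
    """Hypothetical minimal node, not TreeSwift's Node class."""
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)

def preorder(node):
    # visit a node before its descendants
    yield node.label
    for child in node.children:
        yield from preorder(child)

def postorder(node):
    # visit a node after its descendants
    for child in node.children:
        yield from postorder(child)
    yield node.label

def inorder(node):
    # only defined when every node has 0 or 2 children
    if node.children:
        left, right = node.children
        yield from inorder(left)
        yield node.label
        yield from inorder(right)
    else:
        yield node.label

def levelorder(root):
    # visit nodes one level at a time, starting at the root
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node.label
        queue.extend(node.children)

# toy tree ((A,B)X,(C,D)Y)R;
root = ToyNode("R", [ToyNode("X", [ToyNode("A"), ToyNode("B")]),
                     ToyNode("Y", [ToyNode("C"), ToyNode("D")])])
orders = {
    "pre": list(preorder(root)),
    "post": list(postorder(root)),
    "in": list(inorder(root)),
    "level": list(levelorder(root)),
}
for name, labels in orders.items():
    print(name, labels)
```

Post-order is the usual choice when a value at a node depends on its children, and pre-order when it depends on its parent.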

Tree Properties

  • Average Branch Length (documentation)
    all_branches  = tree.avg_branch_length()               # include all branches
    only_internal = tree.avg_branch_length(terminal=False) # exclude terminal branches
    only_terminal = tree.avg_branch_length(internal=False) # exclude internal branches
  • Closest Leaf to Root (documentation)
    leaf,distance = tree.closest_leaf_to_root() # returns a (leaf,distance) tuple
  • Diameter (i.e., maximum leaf-to-leaf distance) (documentation)
    d = tree.diameter()
  • Distance Between Pair of Leaves (documentation)
    d = tree.distance_between(u,v) # u and v are Node objects
  • Distance Matrix (patristic distances between leaves) (documentation)
    M = tree.distance_matrix()
  • Edge Length Sum (i.e., Tree Length) (documentation)
    all_edges = tree.edge_length_sum()                   # include all branches
    only_internal = tree.edge_length_sum(terminal=False) # exclude terminal branches
    only_terminal = tree.edge_length_sum(internal=False) # exclude internal branches
  • Furthest Node from Root (documentation)
    node,distance = tree.furthest_from_root() # returns a (node,distance) tuple
  • Gamma Statistic (Pybus and Harvey, 2000) (documentation)
    gamma = tree.gamma_statistic()
  • Height (i.e., maximum distance from root) (documentation)
    h = tree.height()
  • Most Recent Common Ancestor (documentation)
    leaves_of_interest = {'A','B','C'}
    m = tree.mrca(leaves_of_interest) # returns MRCA of A, B, and C
  • Number of Lineages at Given Distance from Root (documentation)
    num = tree.num_lineages_at(0.5)
  • Number of Nodes (documentation)
    all_nodes = tree.num_nodes()                 # include all nodes
    only_internal = tree.num_nodes(leaves=False) # exclude terminal nodes
    only_leaves = tree.num_nodes(internal=False) # exclude internal nodes
  • Sackin Index (Sackin 1972) (documentation)
    sackin_by_leaves = tree.sackin()               # default normalizes by number of leaves
    sackin_by_yule = tree.sackin(normalize='yule') # normalize to Yule model
    sackin_by_pda = tree.sackin(normalize='pda')   # normalize to the PDA model
    sackin_raw = tree.sackin(normalize=None)       # don't normalize
  • Treeness (sum of internal branch lengths / sum of all branch lengths) (documentation)
    t = tree.treeness()
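
Two of the statistics above are simple enough to compute by hand, which can help when sanity-checking results. The sketch below does not use TreeSwift. The parent_of dictionary is a hypothetical toy representation, and the Sackin index is assumed here to be the sum of leaf depths counted in edges; how TreeSwift counts depth and how it applies the Yule and PDA normalizations are not shown.

```python
# Toy tree ((A:1,B:1)X:2,(C:1,D:1)Y:2)R; stored as child -> (parent, branch length)
parent_of = {
    "X": ("R", 2.0), "Y": ("R", 2.0),
    "A": ("X", 1.0), "B": ("X", 1.0),
    "C": ("Y", 1.0), "D": ("Y", 1.0),
}
internal_nodes = {parent for parent, _ in parent_of.values()}
leaves = [node for node in parent_of if node not in internal_nodes]

# Treeness: sum of internal branch lengths / sum of all branch lengths
total_length = sum(length for _, length in parent_of.values())
internal_length = sum(length for node, (_, length) in parent_of.items()
                      if node in internal_nodes)
treeness = internal_length / total_length

def depth(node):
    """Number of edges between a node and the root."""
    steps = 0
    while node in parent_of:
        node = parent_of[node][0]
        steps += 1
    return steps

# Sackin index (assumed definition): sum of leaf depths, optionally per leaf
sackin_raw = sum(depth(leaf) for leaf in leaves)
sackin_per_leaf = sackin_raw / len(leaves)

print(treeness, sackin_raw, sackin_per_leaf)
```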

Iterating Over Lengths

  • Branch Lengths (documentation)
    for branch_length in tree.branch_lengths():               # include all branches
        print(branch_length)
    for branch_length in tree.branch_lengths(terminal=False): # exclude terminal branches
        print(branch_length)
    for branch_length in tree.branch_lengths(internal=False): # exclude internal branches
        print(branch_length)
  • Coalescence Times (documentation)
    for time in tree.coalescence_times():               # backward in time (leaves to root)
        print(time)
    for time in tree.coalescence_times(backward=False): # forward in time (root to leaves)
        print(time)
  • Coalescence Waiting Times (documentation)
    for delta in tree.coalescence_waiting_times():               # backward in time (leaves to root)
        print(delta)
    for delta in tree.coalescence_waiting_times(backward=False): # forward in time (root to leaves)
        print(delta)
  • Distances from Parent (documentation)
    for node,d in tree.distances_from_parent():               # yields (node,distance) tuples; labeled nodes only by default
        print(d)
    for node,d in tree.distances_from_parent(leaves=False):   # exclude leaves
        print(d)
    for node,d in tree.distances_from_parent(internal=False): # exclude internal nodes
        print(d)
    for node,d in tree.distances_from_parent(unlabeled=True): # include unlabeled nodes
        print(d)
  • Distances from Root (documentation)
    for node,d in tree.distances_from_root():               # yields (node,distance) tuples; labeled nodes only by default
        print(d)
    for node,d in tree.distances_from_root(leaves=False):   # exclude leaves
        print(d)
    for node,d in tree.distances_from_root(internal=False): # exclude internal nodes
        print(d)
    for node,d in tree.distances_from_root(unlabeled=True): # include unlabeled nodes
        print(d)
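
The relationship between the quantities above is that a node's distance from the root is the sum of the distances from parent along its path to the root. The sketch below shows this without TreeSwift. The children_of dictionary and the helper function are hypothetical, and the tree height is taken as the maximum root distance.

```python
# Toy tree ((A:1,B:0.5)X:2,C:3.5)R; stored as parent -> [(child, branch length)]
children_of = {
    "R": [("X", 2.0), ("C", 3.5)],
    "X": [("A", 1.0), ("B", 0.5)],
}

def distances_from_root(root):
    """Return {node: distance from root}, accumulating branch lengths top-down."""
    distances = {root: 0.0}
    stack = [root]
    while stack:
        node = stack.pop()
        for child, length in children_of.get(node, []):
            distances[child] = distances[node] + length
            stack.append(child)
    return distances

root_distances = distances_from_root("R")
# leaves are the nodes that never appear as a parent
leaf_distances = {n: d for n, d in root_distances.items() if n not in children_of}
height = max(root_distances.values())

print(root_distances)
print(leaf_distances, height)
```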

Tree Modifications

  • Collapse Short Branches (documentation)
    tree.collapse_short_branches(0.05) # collapse internal (not terminal) branches <= 0.05 in length
  • Condense Identically-Labeled Nodes (documentation)
    tree.condense()
  • Contract Low-Support Nodes (documentation)
    tree.contract_low_support(0.7) # contract internal nodes whose numeric label (support) is below 0.7
  • Extracting Subtree With Labels (documentation)
    labels_to_keep = {'A','B','C'}
    tree2 = tree.extract_tree_with(labels_to_keep)
  • Extracting Subtree Without Labels (documentation)
    labels_to_remove = {'A','B','C'}
    tree2 = tree.extract_tree_without(labels_to_remove)
  • Rename Nodes (documentation)
    tree.rename_nodes({'foo':'bar'})
  • Reroot (documentation)
    tree.reroot(subtending_node,length) # reroot at distance length up the branch above subtending_node (a Node object)
  • Resolve Polytomies Arbitrarily with 0-length Branches (documentation)
    tree.resolve_polytomies()
  • Scale Edges by Multiplier (documentation)
    tree.scale_edges(100) # multiplies every branch length by 100, modifying the tree in place
  • Suppress Unifurcations (documentation)
    tree.suppress_unifurcations()
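
As an illustration of what resolving polytomies with 0-length branches means, the sketch below repeatedly groups two children of an over-full node under a new unlabeled node on a 0-length branch. It does not use TreeSwift: ToyNode and both helpers are hypothetical, and the order in which TreeSwift pairs the children may differ.

```python
class ToyNode:
    """Hypothetical minimal node, not TreeSwift's Node class."""
    def __init__(self, label=None, edge_length=0.0, children=()):
        self.label = label
        self.edge_length = edge_length
        self.children = list(children)

def resolve_polytomies(node):
    """Give every node at most 2 children by inserting 0-length internal branches."""
    while len(node.children) > 2:
        # group the last two children under a new unlabeled node
        right = node.children.pop()
        left = node.children.pop()
        node.children.append(ToyNode(edge_length=0.0, children=[left, right]))
    for child in node.children:
        resolve_polytomies(child)

def walk(node):
    """Yield every node in the subtree, parent first."""
    yield node
    for child in node.children:
        yield from walk(child)

# star tree (A:1,B:1,C:1,D:1)R; with a single four-way polytomy at the root
root = ToyNode("R", children=[ToyNode(label, 1.0) for label in "ABCD"])
length_before = sum(n.edge_length for n in walk(root))
resolve_polytomies(root)
length_after = sum(n.edge_length for n in walk(root))
max_children = max(len(n.children) for n in walk(root))
leaf_labels = sorted(n.label for n in walk(root) if not n.children)

print(max_children, leaf_labels, length_before, length_after)
```

Because the new branches have length 0, the leaf set, the total tree length, and all leaf-to-leaf distances are unchanged; only the topology becomes binary.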

Visualizations

  • Drawing the Tree (documentation)
    tree.draw()
  • Lineages Through Time (LTT) Plot (documentation)
    tree.lineages_through_time()
    tree.ltt() # equivalent shorthand
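
A lineages-through-time plot is a step function of the number of lineages present at each distance from the root. The sketch below computes a few such points by hand without TreeSwift. The branches list and the helper are hypothetical, and counting a branch as present on the half-open interval (start, end] is an assumption about the boundary convention.

```python
# Branches of the toy tree ((A:1,B:1):1,C:2); as (start, end) distances from the root
branches = [
    (0.0, 1.0),  # root -> internal node
    (0.0, 2.0),  # root -> C
    (1.0, 2.0),  # internal node -> A
    (1.0, 2.0),  # internal node -> B
]

def num_lineages_at(distance):
    """Count the branches that span the given distance from the root."""
    return sum(1 for start, end in branches if start < distance <= end)

# sample the step function at a few distances
ltt_points = [(t, num_lineages_at(t)) for t in (0.5, 1.5, 2.0)]
print(ltt_points)
```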