elkebir-group/MACH2

MACH2

A mathematical framework for inferring migration histories of metastatic cancer from clone phylogeny and the location of extant clones.

Table of contents

  1. Installation
    - Prerequisites
    - Install using pip
  2. Usage instruction
    - From JupyterLab
    - From Command Line

1. Installation

Prerequisites

  • Python - MACH2 requires Python 3.7 or newer.
  • ILP solver - MACH2 requires an ILP solver to solve PMH-TR. Currently MACH2 supports only the Gurobi optimizer; support for more ILP solvers is planned. MACH2 requires a valid Gurobi installation and license key. The location of Gurobi should be present in LD_LIBRARY_PATH (Linux) or DYLD_LIBRARY_PATH (macOS), and the license key should be saved in the environment variable GRB_LICENSE_KEY.
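
The prerequisites above can be checked before installation with a short script. This is a minimal sketch and not part of MACH2: the helper name `check_mach2_prerequisites` is hypothetical, and it only verifies that the variables named above are set, not that the Gurobi installation or license is actually valid.

```python
import os
import sys


def check_mach2_prerequisites(env=None, version=None, platform=None):
    """Return a list of problems found; an empty list means all checks passed.

    Hypothetical helper, not part of MACH2. The arguments default to the
    current process and exist only so the check can be exercised in isolation.
    """
    env = os.environ if env is None else env
    version = sys.version_info[:2] if version is None else version
    platform = sys.platform if platform is None else platform

    problems = []
    if tuple(version) < (3, 7):
        problems.append("Python 3.7 or newer is required")

    # macOS uses DYLD_LIBRARY_PATH; Linux uses LD_LIBRARY_PATH.
    lib_var = "DYLD_LIBRARY_PATH" if platform == "darwin" else "LD_LIBRARY_PATH"
    if not env.get(lib_var):
        problems.append(f"{lib_var} is not set")
    if not env.get("GRB_LICENSE_KEY"):
        problems.append("GRB_LICENSE_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_mach2_prerequisites():
        print("warning:", problem)
```

An empty result only means the variables exist; whether the library path really contains Gurobi still has to be confirmed separately.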

Install using pip

MACH2 can easily be installed using pip, the package installer for Python. Open a terminal or command prompt and run the following command:

            $ pip install mach2

If you want to use MACH2 with JupyterLab, you'll need additional dependencies. To install these optional dependencies, you can run the following command:

            $ pip install "mach2[jupyter]"

2. Usage instruction

I/O formats

We describe various formats used by MACH2.

  1. Tree file : The tree file contains a list of edges that define the structure of a tree. Each line in the file represents an edge, and the edges should be in the format: node1 node2. For example:

     1   2
     2   3
     2   4 
     3   5
    
  2. Tree file with timing/comigrations : A tree file with timestamps. Edges with the same timestamp belong to the same comigration, and a timestamp of -1 marks an edge that is not a migration. Each line corresponds to an edge in the format: node1 node2 timestamp. For example:

     1   2   -1
     2   3   1
     2   4   1
     3   5   2
    
  3. Observed labeling file : The observed labeling file contains zero or more location labels assigned to each node of the input clonal tree. Each line in the file corresponds to a node and the labels assigned to it in the format: node label1 label2 .... If a node is not observed anywhere, it may be skipped. For example:

     1   A   B
     3   B
     4   A   C
     5   C
    
  4. Location labeling file : The location labeling file contains the unique location label of origin assigned to each node. Each line in the file corresponds to a node and its location label of origin, in the format: node label. For example:

     1   A
     2   B
     3   C
    
  5. Node of origin file : The node of origin file maps the nodes of the refined tree to the nodes of the input tree. Each line in the file corresponds to a vertex and its label, in the format: leaf label. For example:

     1   A
     2   B
     3   C
    

Additionally, MACH2 can output files in Graphviz DOT format or JSON format.
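
The plain-text formats listed above are simple enough to read without MACH2 itself. The following is a minimal sketch, assuming whitespace-separated fields as in the examples; the function names `parse_tree` and `parse_observed_labeling` are hypothetical and are not part of the MACH2 API.

```python
from collections import defaultdict


def parse_tree(lines):
    """Parse 'node1 node2' lines into a list of (node1, node2) edges."""
    edges = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        edges.append((fields[0], fields[1]))
    return edges


def parse_observed_labeling(lines):
    """Parse 'node label1 label2 ...' lines into {node: set of labels}."""
    labels = defaultdict(set)
    for line in lines:
        fields = line.split()
        if fields:
            labels[fields[0]].update(fields[1:])
    return dict(labels)


# The example files from the format descriptions above.
tree_text = """\
1   2
2   3
2   4
3   5
"""
labeling_text = """\
1   A   B
3   B
4   A   C
5   C
"""

edges = parse_tree(tree_text.splitlines())
observed = parse_observed_labeling(labeling_text.splitlines())

# Nodes that do not appear in the labeling file are not observed anywhere.
nodes = {node for edge in edges for node in edge}
unobserved = sorted(nodes - set(observed))
print(edges)
print(unobserved)
```

In this example node 2 appears in the tree but not in the observed labeling, which the format allows for nodes that are not observed anywhere.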

Usage

MACH2 takes two files as input:

  1. Tree file : Tree file describing the input clone tree.
  2. Observed labeling file : Labeling file describing the observed labeling of input clone tree.

MACH2 can be run from the command line, or accessed directly from JupyterLab.

From JupyterLab

The following code snippet imports MACH2, runs it for input tree file input.tree and input observed labeling file input.observed.labeling, and saves the solutions to a variable solutions.

            import mach2
            tree = mach2.MultiLabeledTree.from_files('input.tree', 'input.observed.labeling')
            solutions = mach2.MACH2(tree, primary_location='primary', criteria_ordering='UMC').solve()
            print(len(solutions))
            solutions.summary()

solutions is a SolutionSet object that behaves as a set. print(len(solutions)) prints the number of retrieved solutions. The last line draws the summary graph for solutions. It is also possible to inspect individual solutions.

            solution1 = next(iter(solutions))
            solution1.draw()
            solution1.migration_graph().draw()

The second line draws the tree with node labeling, and the third line draws the corresponding migration graph. For more details, check the documentation for each function.

From Command Line

For each solution, MACH2 can output three types of files.

  1. Tree file with timing/comigrations : Refined tree file with timestamps/comigrations.
  2. Location labeling file : Location labeling file describing the location labeling of the refined tree.
  3. Node of origin file : Node of origin file mapping refined tree nodes to input tree nodes.
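
The timestamp column of the tree file with timing/comigrations can be summarized in a few lines. This is a minimal sketch, assuming whitespace-separated fields as in the I/O formats example; the function name `group_comigrations` is hypothetical and not part of MACH2. It treats every edge with a timestamp other than -1 as a migration and every distinct timestamp as one comigration, following the format description, so the counts are derived from the file alone and are not guaranteed to match what MACH2 reports.

```python
from collections import defaultdict


def group_comigrations(lines):
    """Group the migration edges of a timed tree file by timestamp."""
    comigrations = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        node1, node2, timestamp = fields[0], fields[1], int(fields[2])
        if timestamp == -1:
            continue  # -1 marks an edge that is not a migration
        comigrations[timestamp].append((node1, node2))
    return dict(comigrations)


# The example timed tree file from the I/O formats section.
timed_tree = """\
1   2   -1
2   3   1
2   4   1
3   5   2
"""

groups = group_comigrations(timed_tree.splitlines())
n_migrations = sum(len(edges) for edges in groups.values())
n_comigrations = len(groups)
print(n_migrations, n_comigrations)
```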

Additionally, MACH2 can return a JSON file encoding all the solutions. The JSON file can be passed directly to MACH2-viz. The exact format of the JSON file is described here.

MACH2 also prints <primary location> <number of migrations> <number of comigrations> Optimal <running time (in seconds)> to the console.

MACH2 can be run using Python.

            usage: mach2 [-h] [-p PRIMARY] [-c COLORMAP] [--log] [-o OUTPUT] [-N NSOLUTIONS] [-C] [-t THREADS] [-s] [-S] clone_tree leaf_labeling

            MACH2

            positional arguments:
            clone_tree            Input clone tree
            leaf_labeling         Input leaf labeling

            options:
            -h, --help            show this help message and exit
            -p PRIMARY, --primary PRIMARY
                                    Primary anatomical site
            -c COLORMAP, --colormap COLORMAP
                                    Color map file
            --log                 Outputs Gurobi logging
            -o OUTPUT, --output OUTPUT
                                    Output folder
            -N NSOLUTIONS, --nsolutions NSOLUTIONS
                                    Maximum number of solutions retained
            -C, --count_solutions
                                    Only prints the number of solutions (default=False)
            -t THREADS, --threads THREADS
                                    Number of threads
            -s, --suboptimal      Returns suboptimal solutions without duplicates, may be slow (default=False)
            -S, --seeding_sites   Minimizes the number of seeding sites too (default=False)

An example execution

    $ mach2 data/mcpherson_2016/patient1.tree data/mcpherson_2016/patient1.labeling -c data/mcpherson_2016/coloring.txt
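
When MACH2 is driven from a script, the same command can be assembled programmatically. The sketch below uses only flags that appear in the usage message above; the helper names `build_mach2_command` and `run_mach2` are hypothetical, and the thread count is an arbitrary illustration.

```python
import shlex
import subprocess


def build_mach2_command(clone_tree, leaf_labeling, primary=None, colormap=None,
                        output=None, nsolutions=None, threads=None):
    """Assemble an argument list using only flags from the usage message."""
    cmd = ["mach2", clone_tree, leaf_labeling]
    if primary is not None:
        cmd += ["-p", primary]
    if colormap is not None:
        cmd += ["-c", colormap]
    if output is not None:
        cmd += ["-o", output]
    if nsolutions is not None:
        cmd += ["-N", str(nsolutions)]
    if threads is not None:
        cmd += ["-t", str(threads)]
    return cmd


def run_mach2(cmd):
    """Run the command; requires mach2 and Gurobi to be installed."""
    return subprocess.run(cmd, check=True)


cmd = build_mach2_command(
    "data/mcpherson_2016/patient1.tree",
    "data/mcpherson_2016/patient1.labeling",
    colormap="data/mcpherson_2016/coloring.txt",
    threads=4,  # arbitrary value for illustration
)

# Print the shell-quoted command instead of running it.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces; call `run_mach2(cmd)` to execute it.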
