Sapling is a method that infers a small set of backbone trees on a smaller subset of mutations that collectively summarize the entire set of possible phylogenies. Sapling can also grow a given backbone tree into a full phylogeny.
Sapling requires the following Python packages:
- numpy
- cvxopt or gurobipy (at least one is required; cvxopt is used by default, while progressive LP (-L pLP) is faster but requires gurobipy)
Sapling takes a TSV (tab-separated values) file as input. The first line contains the column names. The following columns are required unless marked optional:
- sample_index: (0, 1, ..., m-1) for m samples
- mutation_index: (0, 1, ..., n-1) for n mutations
- var: variant reads
- depth: total reads
- cluster_index: (optional) (0, 1, ..., k-1) for k clusters
Here is an example of input.
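As a rough illustration of the column layout described above, the following sketch writes read counts to a TSV and checks that a file has the required columns. The helper names `write_sapling_tsv` and `check_sapling_tsv` are hypothetical and not part of Sapling. The column order, the one-row-per-(sample, mutation) layout, and the toy read counts are assumptions made for the example.

```python
import csv
import io

# Required column names come from the README; the column order used here is an assumption.
REQUIRED = ["sample_index", "mutation_index", "var", "depth"]


def write_sapling_tsv(handle, var, depth, clusters=None):
    """Write read counts as TSV, one row per (sample, mutation) pair.

    var[i][j] and depth[i][j] are the variant and total reads of mutation j in sample i.
    clusters[j], if given, is the cluster index of mutation j.
    """
    header = REQUIRED + (["cluster_index"] if clusters is not None else [])
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for i, (var_row, depth_row) in enumerate(zip(var, depth)):
        for j, (v, d) in enumerate(zip(var_row, depth_row)):
            row = [i, j, v, d]
            if clusters is not None:
                row.append(clusters[j])
            writer.writerow(row)


def check_sapling_tsv(handle):
    """Return (m, n) after checking required columns and 0-based contiguous indices."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    samples, mutations = set(), set()
    for row in reader:
        if int(row["var"]) > int(row["depth"]):
            raise ValueError("variant reads exceed total reads")
        samples.add(int(row["sample_index"]))
        mutations.add(int(row["mutation_index"]))
    if samples != set(range(len(samples))) or mutations != set(range(len(mutations))):
        raise ValueError("indices must be 0, 1, ..., count-1")
    return len(samples), len(mutations)


# Toy counts (made up for illustration): 2 samples, 3 mutations.
buf = io.StringIO()
write_sapling_tsv(buf, var=[[5, 0, 3], [2, 7, 1]], depth=[[10, 9, 8], [12, 14, 6]])
buf.seek(0)
print(check_sapling_tsv(buf))  # number of samples and mutations found
```

To produce a file on disk, pass an open file handle (for example `open("my_input.tsv", "w", newline="")`) instead of the in-memory buffer.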
The output is a text file.
The first line gives the number of backbone trees, in the format of "#
Usage:
python main.py [-h] -f str -o str -a float [-l int] [-t int] [-m] [-L str]
Where:
-h
Print a short help message
-f str
Input file
-o str
Output file
-a float
rho: lower bound factor of likelihood
-l int
ell: size of the mutation set; overrides -t if provided (default: None)
-t int
tau: upper bound on the number of backbone trees (default: 5)
-m
allow multiple evolutions from germline (GL)
-L str
method to solve LLH: cvxopt/pLP (default: cvxopt)
Example command:
python main.py -f example.tsv -o example.txt -a 0.9
The output of the above command on the example input.
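When Sapling is driven from another Python script, the flags listed above can be assembled programmatically. The sketch below is one possible wrapper, not part of Sapling: `main_command` and `run` are hypothetical helpers, and the sketch assumes it is run from the directory that contains main.py.

```python
import subprocess
import sys


def main_command(input_tsv, output_txt, rho, ell=None, tau=None,
                 multiple_gl=False, llh_solver=None):
    """Build the argument list for main.py from the documented flags."""
    cmd = [sys.executable, "main.py", "-f", input_tsv, "-o", output_txt, "-a", str(rho)]
    if ell is not None:
        # The README states that -l overrides -t, so -t is left out when ell is set.
        cmd += ["-l", str(ell)]
    elif tau is not None:
        cmd += ["-t", str(tau)]
    if multiple_gl:
        cmd.append("-m")
    if llh_solver is not None:
        if llh_solver not in ("cvxopt", "pLP"):
            raise ValueError("llh_solver must be 'cvxopt' or 'pLP'")
        cmd += ["-L", llh_solver]
    return cmd


def run(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


cmd = main_command("example.tsv", "example.txt", rho=0.9)
print(" ".join(cmd[1:]))  # main.py -f example.tsv -o example.txt -a 0.9
```

Calling `run(cmd)` would then launch the same command as the example above. Options that are left as `None` are omitted, so the defaults of main.py apply.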
Usage:
python greedy_expand.py [-h] -f str -o str -r str -a float [-m] [-L str]
Where:
-h
Print a short help message
-f str
Input file
-o str
Output file
-r str
result file: output of the main program
-a float
rho: lower bound factor of likelihood
-m
allow multiple evolutions from germline (GL)
-L str
method to solve LLH: cvxopt/pLP (default: cvxopt)
Example command:
python greedy_expand.py -f example.tsv -r example.txt -o example.full.txt -a 0.9
The output of the above command on the example input using the output generated above.
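The two steps can also be chained from Python, feeding the output of main.py to greedy_expand.py through -r. This is a minimal sketch under a few assumptions: `sapling_pipeline` is a hypothetical helper, both scripts are in the current directory, and the same rho is passed to both steps, mirroring the example commands.

```python
import subprocess
import sys


def sapling_pipeline(input_tsv, backbone_txt, full_txt, rho, runner=subprocess.run):
    """Infer backbone trees, then grow them into full phylogenies."""
    steps = [
        [sys.executable, "main.py", "-f", input_tsv, "-o", backbone_txt, "-a", str(rho)],
        # -r takes the output file written by main.py in the previous step.
        [sys.executable, "greedy_expand.py", "-f", input_tsv, "-r", backbone_txt,
         "-o", full_txt, "-a", str(rho)],
    ]
    for cmd in steps:
        runner(cmd, check=True)  # check=True stops the pipeline if a step fails
    return steps


# Example call, equivalent to running the two example commands in order:
# sapling_pipeline("example.tsv", "example.txt", "example.full.txt", rho=0.9)
```

The `runner` parameter exists only so that the command lists can be inspected or logged without launching the scripts.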