Phylo2Vec: a vector representation for binary trees


Phylo2Vec

This repository contains an implementation of Phylo2Vec. It is distributed under the GNU Lesser General Public License v3.0 (LGPL).

Link to the paper: https://doi.org/10.1093/sysbio/syae030

Installation

Dependencies

  • python>=3.9
  • numba==0.56.4
  • numpy==1.23.5
  • biopython==1.80.0
  • joblib>=1.2.0
  • ete3==3.1.3

User installation

Pip

pip install phylo2vec

Manual installation

  • We recommend setting up an isolated environment using conda, mamba, or virtualenv.
  • Clone the repository, enter it, and install with pip:
git clone https://github.com/Neclow/phylo2vec_dev.git
cd phylo2vec_dev
pip install -e .

Development

Additional test dependencies

  • pytest==7.4.2
  • six==1.16.0

Testing

After installation, you can launch the test suite from outside the source directory:

pytest phylo2vec

Warning! You might need to clear your __pycache__ folders beforehand:

rm -rf phylo2vec/__pycache__/
rm -rf phylo2vec/base/__pycache__/
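
The two commands above only cover the listed folders. If caches exist in other subpackages, a recursive cleanup may be more convenient. The following is a minimal sketch using only the standard library; clear_pycache is a hypothetical helper and is not part of phylo2vec.

```python
import shutil
from pathlib import Path


def clear_pycache(root):
    """Delete every __pycache__ directory under root and return the removed paths."""
    removed = []
    # Collect the matches first so nothing is deleted while the tree is still being walked
    for cache_dir in list(Path(root).rglob("__pycache__")):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed
```

Calling clear_pycache("phylo2vec") from the repository root should then have the same effect as the rm commands, extended to all subdirectories.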

Basic usage

Conversions

  • The base module provides functions to convert a Newick string to a Phylo2Vec vector (to_vector) and a Phylo2Vec vector back to a Newick string (to_newick).

Example:

import numpy as np
from phylo2vec.base import to_newick, to_vector

v = np.array([0, 1, 2, 3, 4])

newick = to_newick(v) # '(0,(1,(2,(3,(4,5)6)7)8)9)10;'

v_converted = to_vector(newick) # array([0, 1, 2, 3, 4], dtype=int16)
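
Before converting, it can help to check that a vector is well-formed. The sketch below assumes the constraint described in the paper, namely that entry v[i] (0-indexed) is an integer in {0, ..., 2i}, and that a vector of length k encodes a tree with k + 1 leaves, as in the example above. The helpers is_valid_vector and n_leaves are illustrative names, not part of the phylo2vec API, and use plain Python so they run without the package installed.

```python
def is_valid_vector(v):
    """Return True if every entry v[i] is an integer in {0, ..., 2*i}."""
    return all(int(x) == x and 0 <= x <= 2 * i for i, x in enumerate(v))


def n_leaves(v):
    """Number of leaves encoded by a vector of length k (assumed to be k + 1)."""
    return len(v) + 1


print(is_valid_vector([0, 1, 2, 3, 4]))  # True
print(is_valid_vector([0, 3]))           # False: v[1] must be at most 2
print(n_leaves([0, 1, 2, 3, 4]))         # 6, matching leaves 0 to 5 in the Newick string above
```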

Optimization

Example:

from phylo2vec.opt import HillClimbingOptimizer

hc = HillClimbingOptimizer(raxml_cmd="/path/to/raxml-ng_v1.2.0_linux_x86_64/raxml-ng", verbose=True)
v_opt, taxa_dict, losses = hc.fit("/path/to/your_fasta_file.fa")
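
Since the optimizer relies on an external raxml-ng binary, it may be worth confirming that the path is executable before starting a run. This is a minimal sketch using only the standard library; find_executable is a hypothetical helper, and how HillClimbingOptimizer itself reports a missing binary is not covered here.

```python
import shutil


def find_executable(cmd):
    """Return the resolved path of cmd, or raise if it is missing or not executable."""
    resolved = shutil.which(cmd)
    if resolved is None:
        raise FileNotFoundError(f"Executable not found or not executable: {cmd}")
    return resolved


# Hypothetical usage with the optimizer shown above:
# raxml_cmd = find_executable("/path/to/raxml-ng_v1.2.0_linux_x86_64/raxml-ng")
# hc = HillClimbingOptimizer(raxml_cmd=raxml_cmd, verbose=True)
```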

Citation and other work

@article{phylo2vec,
  title={Phylo2Vec: a vector representation for binary trees},
  author={Penn, Matthew J and Scheidwasser, Neil and Khurana, Mark P and Duch{\^e}ne, David A and Donnelly, Christl A and Bhatt, Samir},
  journal={arXiv preprint arXiv:2304.12693},
  year={2023}
}
