Godon is codon models software written in Go.
Godon development was supported by the Swiss National Science Foundation (grant numbers CR32I3_143768, IZLRZ3_163872).
- Godon supports rate variation (see the manuscript). Three models support it: branch-site (model `BSG`), M8 (model `M8`) and M0 (model `M0G`). You need to specify the number of discrete categories; otherwise there will be no rate variation in the model. Use `--ncat-site-rate` or `--ncat-codon-rate` for site rate variation and codon rate variation, respectively.
- Godon supports state aggregation (option `--aggregate`). See the paper for the details. For the paper, we used v0.5 (39bf774). The likelihood computation code has changed substantially since then.
- A heuristic to avoid overestimation of the LRT statistic, which often causes false positives in PAML. It also corrects for LRT underestimation. Use `godon test` to enable it.
- A heuristic for fast branch-length estimation via M0 (`--m0-tree`).
- Multiple optimizers are available: L-BFGS-B, downhill simplex, simulated annealing, SQP, and others via NLopt.
- Markov chain Monte Carlo support (Metropolis-Hastings algorithm).
- Export to machine-readable JSON format.
- Multithreading support (unlike PAML).
- Starting point specification (only some parameters in PAML) and randomization (disabled in PAML).
- Testing multiple branches in one run for the branch-site model.
- Wide range of models: M0, M1a, M2a, M7, M8, and branch-site.
- Support for various genetic codes.
- Checkpoints: if a long computation is interrupted, it can be resumed. You need to specify a checkpoint file to use this (`--checkpoint`). Warning: this might affect reproducibility with respect to the random number generator.
You can ask questions at the bioinformatics stackexchange site. Do not forget to use the [godon] tag. Use issues to report bugs.
The software was tested on GNU/Linux and Mac OS X.
You can fetch the latest statically compiled binary for GNU/Linux from the downloads section; do not forget to make it executable before running it (`chmod +x godon-master-linux-gnu-x86_64`).
Requirements:
- Go (preferably v1.7 or later)
- Git
- C and Fortran compilers
- NLopt
- BLAS (e.g. OpenBLAS)
- Gonum BLAS C-bindings
Once you have got all of that you can run:
$ bin/install.sh
- Install Go v1.7 or later. You can start by installing Go v1.6 and then updating it using godeb.
- Install dependencies: `sudo apt-get install git libnlopt-dev libopenblas-dev build-essential gfortran`
- Install Gonum BLAS: `CGO_LDFLAGS="-lopenblas" go install github.com/gonum/blas/cgo`
- (Optional) If your Go is older than v1.7, install go-lbfgsb.
- Install godon: `bin/install.sh`
- Install Homebrew.
- Install dependencies: `brew install go gcc nlopt` (may take more than an hour).
- If you don't have git, install it as well: `brew install git`.
- Install godon: `curl -L https://bitbucket.org/Davydov/godon/raw/master/bin/install.sh | CC=gcc-7 bash`. You need to use gcc from Homebrew, in this case `gcc-7`.
- (Optional) Add the binary directory of Go to the `PATH` variable. E.g., put `export PATH=$PATH:$HOME/go/bin` into your `~/.bash_profile`.
- Make sure you have a C compiler, build tools and gfortran.
- Install Go (1.7 or later).
- Install NLopt.
- Get the Godon source code with `go get -d bitbucket.org/Davydov/godon/godon`.
- Install godon. Depending on the installation, you may need to specify paths to the NLopt library and include files and to the Fortran library `libgfortran` (on the test system it was `/usr/local/Cellar/gcc/6.2.0/lib/gcc/6`). Run: `CGO_CFLAGS="-I/path/to/nlopt/include" CGO_LDFLAGS="-L/path/to/libgfortran -L/path/to/nlopt/lib" $GOPATH/src/bitbucket.org/Davydov/godon/bin/install.sh`
Don't forget to check out the tutorial.
You can find sample datasets in `godon/cmodel/testdata`.
You can tell Godon to run a pair of models (M8 vs. M8a or branch-site H1 vs. H0). In this case, if the foreground branch for the branch-site model is not labeled with `#1`, Godon will test all the branches. To force this behavior even in the presence of a `#1`-labeled branch, use `--all-branches`. You can exclude terminal branches with `--no-leaves`. You can use branch lengths estimated with M0 using `--m0-tree`.
#!bash
$ godon test BS --m0-tree --all-branches EMGT00050000008747.Drosophila.002.fst EMGT00050000008747.Drosophila.002.nwk
Perform likelihood maximization using L-BFGS-B optimizer for the Branch-Site model without optimizing the branch lengths (use only a single CPU).
#!bash
$ godon -p 1 -n BS EMGT00050000000025.Drosophila.001.fst EMGT00050000000025.Drosophila.001.nwk
Run MCMC using M0 model with the downhill simplex optimization.
#!bash
$ godon -m mh M0 EMGT00050000000025.Drosophila.001.fst EMGT00050000000025.Drosophila.001.nwk
- `bin`: installation script
- `bio`: reads FASTA and translates the genetic code
- `cmodel`: codon models
- `codon`: working with codons and transition matrices
- `godon`: MCMC sampler/maximum likelihood for M0 and branch-site models
- `misc`: various utilities
- `optimize`: implementation of MCMC, downhill simplex and other algorithms
- `dist`: functions related to discrete distributions, initially ported from PAML
- `tree`: tree manipulation library
- `codon_frequency.go`: F0, F3X4
- `codon_sequences.go`: codon alignment class
- `ematrix.go`: matrix class which remembers its eigen decomposition
- `matrix.go`: transition matrix routines
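The F3X4 codon frequency model mentioned for `codon_frequency.go` derives codon frequencies from nucleotide frequencies at the three codon positions. A minimal Go sketch of that calculation follows, assuming the standard genetic code; the function name, data layout and nucleotide ordering are hypothetical, and the actual code in the repository may be organized differently.

```go
package main

import "fmt"

// nucleotides fixes the index order used for the frequency table.
const nucleotides = "TCAG"

// stopCodons lists the stop codons of the standard genetic code
// (an assumption; other genetic codes have different sets).
var stopCodons = map[string]bool{"TAA": true, "TAG": true, "TGA": true}

// f3x4 takes pos[p][n], the frequency of nucleotide n (in "TCAG" order)
// at codon position p, and returns codon frequencies: the product of the
// three position-specific frequencies, renormalized over sense codons.
func f3x4(pos [3][4]float64) map[string]float64 {
	freq := make(map[string]float64, 61)
	total := 0.0
	for i := 0; i < 4; i++ {
		for j := 0; j < 4; j++ {
			for k := 0; k < 4; k++ {
				codon := string([]byte{nucleotides[i], nucleotides[j], nucleotides[k]})
				if stopCodons[codon] {
					continue // stop codons get no frequency
				}
				f := pos[0][i] * pos[1][j] * pos[2][k]
				freq[codon] = f
				total += f
			}
		}
	}
	for c := range freq {
		freq[c] /= total
	}
	return freq
}

func main() {
	// Uniform nucleotide frequencies at every position (illustrative input).
	var pos [3][4]float64
	for p := range pos {
		for n := range pos[p] {
			pos[p][n] = 0.25
		}
	}
	freq := f3x4(pos)
	fmt.Println(len(freq), freq["ATG"])
}
```

With uniform input every sense codon receives the same frequency, which makes the renormalization over 61 codons easy to check by hand.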
- `aggregation.go`: codon aggregation code
- `branch_site.go`: branch-site model
- `M0.go`: M0 model
- `model.go`: tree + alignment model base class
- `tools.go`: misc helper functions
- `likelihood_test.go`: likelihood test (compare with codeml)
- `mcmc_test.go`: MCMC benchmark
- `mcmcpar_test.go`: test that likelihood is consistent during chain evaluation
- `adaptive.go`: adaptive parameter class
- `lbfgsb.go`: L-BFGS-B optimizer
- `mh.go`: Metropolis-Hastings and simulated annealing implementations
- `nlopt_callback.go`: NLopt callback wrapper
- `nlopt.go`: NLopt wrapper
- `optimizer.go`: Optimizer and Optimizable interfaces
- `parameter.go`: float64 parameter class
- `prior.go`: prior functions
- `proposal.go`: proposal functions
- `simplex.go`: simplex method
- `utils.go`: helper functions
- `brexp`: exports branch lengths and node labels in various formats
- `brmatch`: matches branch labels between two trees
- `norm`: sampler for a multiple normal distributions model