Copyright (c) 2019-Present, Ryan L. Collins and the Talkowski Laboratory.
Distributed under terms of the MIT License (see LICENSE).
The recommended way to run Athena is from its dedicated Docker container hosted on Google Container Registry. This will handle all dependencies and installation for you, and ensure you are running the latest version.
$ docker pull us.gcr.io/broad-dsmap/athena
$ docker run --rm -it us.gcr.io/broad-dsmap/athena
If you would prefer to install Athena on your own system, you can do so with pip.
$ git clone https://github.com/talkowski-lab/athena.git
$ cd athena
$ pip install -e .
Athena is called from the command line:
$ athena --help
Usage: athena [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  annotate-bins          Annotate bins
  annotate-pairs         Annotate pairs
  breakpoint-confidence  Annotate breakpoint uncertainty
  count-sv               Intersect SV and 1D bins or 2D bin-pairs
  eigen-bins             Eigendecomposition of annotations
  feature-hists          Plot bin annotation distributions
  feature-stats          Compute feature distributions
  make-bins              Create sequential bins
  mu-predict             Predict mutation rates with a trained model
  mu-query               Query a mutation rate matrix
  mu-train               Train mutation rate model
  pair-bins              Create pairs of bins
  slice-remote           Localize slices of remote genomic data
  transform              Transform one or more annotations
  vcf-filter             Filter an input VCF
  vcf-stats              Get SV size & spacing
Athena has numerous subcommands. Specify --help with any subcommand to see a list of options available.
This package was designed with canonical CNVs from the gnomAD-SV callset in mind.
As a result, it assumes that input data follow gnomAD-SV formatting standards. This may cause issues with alternative styles of SV representation, SV types other than canonical CNVs, or different metadata labels.
If using non-gnomAD data with Athena, please compare your VCF formatting against the gnomAD-SV standards, paying particular attention to the INFO field.
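One way to begin that comparison is to list the INFO field IDs declared in each VCF header and compare the lists by eye. The following is a minimal sketch, not part of Athena: list_info_ids is a hypothetical helper name, and it assumes standard VCF header lines of the form ##INFO=<ID=...,...> and a plain-text or gzip-compressed input file.

```shell
# Hypothetical helper (not an Athena command): print the sorted, unique
# INFO field IDs declared in the header of a plain or gzipped VCF.
list_info_ids() {
  vcf="$1"
  # Decompress on the fly if the file name ends in .gz
  case "$vcf" in
    *.gz) gzip -dc "$vcf" ;;
    *)    cat "$vcf" ;;
  esac \
    | awk '
        # Keep only the ID from each ##INFO=<ID=...,...> header line
        /^##INFO=<ID=/ { sub(/^##INFO=<ID=/, ""); sub(/,.*/, ""); print }
        # Stop at the column header line so variant records are never scanned
        /^#CHROM/ { exit }
      ' \
    | sort -u
}

# Example usage (the file name is a placeholder):
#   list_info_ids my_callset.vcf.gz
```

Running the helper on both your own VCF and a gnomAD-SV VCF shows which INFO labels are present in one but missing from the other. It only checks that the labels exist, not that their values are encoded the same way.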
You can read more about the gnomAD-SV dataset in the corresponding preprint.
This package is named after Athena, the Greek goddess of wisdom, strategy, tactics, and mathematics. She was selected as the namesake for this package given that it relies on understanding the features that influence structural variation mutation rates (wisdom), incorporating those features into a statistical model (mathematics), and using these models to infer which components of the genome are vulnerable to changes in copy number (a kind of genomic tactics/strategy).