Subcommand: krd
Calculate the pairwise Kantorovich-Rubinstein (KR) distance matrix between samples.
Usage: gappa analyze krd [options]
Input | |
---|---|
--jplace-path | Required. TEXT:PATH(existing)=[] ... List of jplace files or directories to process. For directories, only files with the extension .jplace[.gz] are processed. |
Settings | |
--exponent | FLOAT=1 Exponent for KR integration. |
--normalize | FLAG Divide the KR distance by the tree length to get normalized values. |
--point-mass | FLAG Treat every pquery as a point mass concentrated on the highest-weight placement. In other words, ignore all but the most likely placement location (the one with the highest likelihood weight ratio, LWR), and set its LWR to 1.0. |
--ignore-multiplicities | FLAG Set the multiplicity of each pquery to 1.0. The multiplicity is the equivalent of abundance for placements, so abundances are ignored with this flag. |
Matrix Output | |
--out-dir | TEXT=. Directory to write output files to. |
--file-prefix | TEXT File prefix for output files. Most gappa commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data. |
--file-suffix | TEXT File suffix for output files. Most gappa commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data. |
--compress | FLAG If set, compress the output files using gzip. Output file extensions are automatically extended by .gz. |
--matrix-format | TEXT:{list,matrix,triangular}=matrix Format of the output matrix file. |
--omit-matrix-labels | FLAG If set, the output matrix is written without column and row labels. |
Global Options | |
--allow-file-overwriting | FLAG Allow overwriting existing output files instead of aborting the command. |
--verbose | FLAG Produce more verbose output. |
--threads | UINT Number of threads to use for calculations. |
--log-file | TEXT Write all output to a log file, in addition to standard output to the terminal. |
Calculates the Kantorovich-Rubinstein distance between a collection of `jplace` samples. The command is a re-implementation of `guppy kr`; see its documentation for more details.
The command reads in the `jplace` samples and calculates their pairwise KR distances. By default, the result is written as a symmetric matrix, but it can also be written as a list or an upper triangular matrix (see --matrix-format).
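To give a feel for what the distance measures, here is a minimal Python sketch of a KR-style computation on a toy tree. It is not gappa's implementation. The function `kr_distance`, the `edges` dictionary layout, and the sample data are all hypothetical. The sketch assumes that all mass sits on nodes rather than along branches, that each sample is scaled to unit total mass, and that the outer exponent is min(1/p, 1) as in the Evans and Matsen formulation. The `exponent` and `normalize` arguments are meant to mirror the idea behind --exponent and --normalize.

```python
def kr_distance(edges, masses_a, masses_b, exponent=1.0, normalize=False):
    """Toy KR distance between two samples on a rooted tree.

    edges:    {child: (parent, branch_length)}; the root has no entry.
    masses_*: {node: mass}; all mass is assumed to sit on nodes (a simplification).
    """
    # Signed difference of the mass lying below each edge (sample A minus sample B).
    diff = {child: 0.0 for child in edges}
    for sign, masses in ((1.0, masses_a), (-1.0, masses_b)):
        total = sum(masses.values())  # assumption: scale each sample to unit mass
        for node, mass in masses.items():
            # A mass at `node` lies below every edge on the path from node to root.
            while node in edges:
                diff[node] += sign * mass / total
                node = edges[node][0]

    # Integrate |difference|^p along the branches, then apply the outer exponent.
    integral = sum(
        length * abs(diff[child]) ** exponent
        for child, (_, length) in edges.items()
    )
    distance = integral ** min(1.0 / exponent, 1.0)

    if normalize:
        # Divide by the total tree length, as described for --normalize.
        distance /= sum(length for _, length in edges.values())
    return distance


# Hypothetical tree: root R with leaves A (branch length 1.0) and B (branch length 2.0).
edges = {"A": ("R", 1.0), "B": ("R", 2.0)}
sample_1 = {"A": 1.0}            # all mass on leaf A
sample_2 = {"B": 1.0}            # all mass on leaf B
sample_3 = {"A": 1.0, "B": 1.0}  # mass split evenly between A and B

print(kr_distance(edges, sample_1, sample_2))                  # 3.0
print(kr_distance(edges, sample_1, sample_2, normalize=True))  # 1.0
print(kr_distance(edges, sample_3, sample_2))                  # 1.5
```

With the default exponent of 1, the value can be read as the amount of mass times the branch length it has to travel to turn one sample into the other: moving all mass from A to B crosses the full path of length 3.0, while moving only half of it costs 1.5.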
When using this method, please do not forget to cite:
- Lucas Czech, Pierre Barbera, Alexandros Stamatakis. Genesis and Gappa: Processing, Analyzing and Visualizing Phylogenetic (Placement) Data. Bioinformatics, 2020. doi:10.1093/bioinformatics/btaa070
- Steven Evans, Frederick Matsen. The phylogenetic Kantorovich-Rubinstein metric for environmental sequence samples. Journal of the Royal Statistical Society, 2012. doi:10.1111/j.1467-9868.2011.01018.x