Subcommand: imbalance kmeans
Lucas Czech edited this page Jun 8, 2018 · 10 revisions
Run Imbalance k-means clustering on a set of samples.
Usage: gappa analyze imbalance-kmeans [options]
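As a rough illustration of the usage line above, the following Python sketch assembles one possible invocation from options listed in the table on this page. The helper name `build_imbalance_kmeans_command` and the paths `samples/` and `results` are hypothetical; the script only builds and prints the command and does not run gappa.

```python
import shlex


def build_imbalance_kmeans_command(
    jplace_paths,
    k_spec,
    out_dir=".",
    file_prefix="ikmeans_",
    write_overview_file=False,
    write_svg_tree=False,
):
    """Assemble an argument list for `gappa analyze imbalance-kmeans`.

    Only options documented on this page are used. The helper itself is
    a hypothetical convenience wrapper, not part of gappa.
    """
    command = ["gappa", "analyze", "imbalance-kmeans"]
    # --jplace-path accepts one or more files or directories.
    command += ["--jplace-path", *jplace_paths]
    # --k accepts a single value or a list such as "1-5,8,10,12".
    command += ["--k", k_spec]
    if write_overview_file:
        command.append("--write-overview-file")
    if write_svg_tree:
        command.append("--write-svg-tree")
    command += ["--out-dir", out_dir, "--file-prefix", file_prefix]
    return command


# Hypothetical input directory and output directory.
command = build_imbalance_kmeans_command(
    ["samples/"],
    "1-5,8,10,12",
    out_dir="results",
    write_overview_file=True,
)
print(shlex.join(command))
```

Passing the list to `subprocess.run(command, check=True)` would execute it, assuming `gappa` is installed and on the `PATH`.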
Input | |
---|---|
--jplace-path | Required. TEXT ... List of jplace files or directories to process. For directories, only files with the extension .jplace are processed. |
Settings | |
--k | Required. TEXT Number of clusters to find. Can be a comma-separated list of multiple values or ranges for k: 1-5,8,10,12 |
--write-overview-file | If provided, a table file is written that summarizes the average distance and variance of the clusters for each k. Useful for elbow plots. |
--point-mass | Treat every pquery as a point mass concentrated on the highest-weight placement. |
--ignore-multiplicities | Set the multiplicity of each pquery to 1. |
Color | |
--color-list | TEXT=BuPuBk List of colors to use for the palette. Can either be the name of a color list, a file containing one color per line, or an actual list of colors. |
--reverse-color-list | If set, the --color-list is reversed. |
--log-scaling | If set, the sequential color list is scaled logarithmically instead of linearly. |
Tree Output | |
--write-newick-tree | If set, the tree is written to a Newick file. |
--write-nexus-tree | If set, the tree is written to a Nexus file. |
--write-phyloxml-tree | If set, the tree is written to a PhyloXML file. |
--write-svg-tree | If set, the tree is written to an SVG file. |
Svg Tree Output | |
--svg-tree-shape | TEXT in {circular,rectangular}=circular Shape of the tree. |
--svg-tree-type | TEXT in {cladogram,phylogram}=cladogram Type of the tree. |
--svg-tree-stroke-width | FLOAT=5 SVG stroke width for the branches of the tree. |
--svg-tree-ladderize | If set, the tree is ladderized. |
Output | |
--out-dir | TEXT=. Directory to write files to. |
--file-prefix | TEXT=ikmeans_ File prefix for output files. |
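
The `--k` option accepts a mix of single values and ranges, such as `1-5,8,10,12`. The following Python sketch shows one plausible way to read such a specification as a list of cluster counts. The function `expand_k_spec` is hypothetical and is not gappa's own parser, which may treat whitespace, duplicates, or malformed input differently.

```python
def expand_k_spec(spec):
    """Expand a specification like "1-5,8,10,12" into a sorted list of ints.

    Assumption: ranges are inclusive on both ends and duplicates collapse.
    """
    values = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            values.update(range(start, end + 1))
        else:
            values.add(int(part))
    return sorted(values)


print(expand_k_spec("1-5,8,10,12"))  # [1, 2, 3, 4, 5, 8, 10, 12]
```

Under this reading, the example value from the table requests eight separate clusterings, one for each k.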
Imbalance k-means is used in almost the same way as Phylogenetic k-means; see that page for details. The difference is the distance measure: Imbalance k-means uses the simple Euclidean distance between the edge imbalances of the samples, instead of the more involved Phylogenetic KR distance between samples.
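
To make the distance measure concrete, here is a minimal Python sketch of the Euclidean distance between per-sample edge imbalance vectors and of the nearest-centroid assignment step of k-means. The sample names, imbalance values, and centroids are made up for illustration, and the sketch does not reproduce how gappa computes imbalances or runs the clustering internally.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equally long imbalance vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have one entry per edge")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_to_nearest(vectors, centroids):
    """Return, for each vector, the index of its closest centroid."""
    assignments = []
    for vec in vectors:
        distances = [euclidean_distance(vec, c) for c in centroids]
        assignments.append(distances.index(min(distances)))
    return assignments


# Made-up data: one imbalance value per edge of the reference tree.
samples = {
    "sample_a": [0.9, -0.2, 0.1],
    "sample_b": [0.8, -0.1, 0.0],
    "sample_c": [-0.7, 0.5, 0.3],
}
# Made-up centroids for k = 2.
centroids = [[1.0, 0.0, 0.0], [-1.0, 0.5, 0.5]]

assignments = assign_to_nearest(list(samples.values()), centroids)
for name, cluster in zip(samples, assignments):
    print(name, "-> cluster", cluster)
```

A full k-means run would alternate this assignment step with recomputing each centroid as the mean of its assigned vectors until the assignments stop changing.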