Subcommand: graft

Make a tree with each of the query sequences represented as a pendant edge.

Usage: gappa examine graft [options]

Options

Input
`--jplace-path`	Required. `TEXT:PATH(existing)=[] ...` List of jplace files or directories to process. For directories, only files with the extension `.jplace[.gz]` are processed.
Settings
`--fully-resolve`	`FLAG` If set, branches that contain multiple pqueries are resolved by creating a new branch for each of the pqueries individually, placed according to their distal/proximal lengths. If not set (default), all pqueries at one branch are collected in a subtree that branches off from the branch.
`--name-prefix`	`TEXT` Specify a prefix to be added to all new leaf nodes, i.e., to the query sequence names.
Output
`--out-dir`	`TEXT=.` Directory to write output files to.
`--file-prefix`	`TEXT` File prefix for output files. Most gappa commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
`--file-suffix`	`TEXT` File suffix for output files. Most gappa commands use the command name as the base name for file output. This option amends the base name, to distinguish runs with different data.
Newick Tree Output
`--newick-tree-quote-invalid-chars`	`FLAG` If set, node labels that contain characters that are invalid in the Newick format (i.e., spaces and `:;()[],{}`) are put into quotation marks. If not set (default), these characters are instead replaced by underscores, which changes the names, but works better with most downstream tools.
Global Options
`--allow-file-overwriting`	`FLAG` Allow overwriting existing output files instead of aborting the command.
`--verbose`	`FLAG` Produce more verbose output.
`--threads`	`UINT` Number of threads to use for calculations.
`--log-file`	`TEXT` Write all output to a log file, in addition to standard output to the terminal.
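
The two behaviours of `--newick-tree-quote-invalid-chars` can be illustrated with a small Python sketch. It is not gappa's code: the function name `newick_label` and the choice of a double quote as the quotation mark are assumptions made for illustration, and only the set of invalid characters is taken from the option description above.

```python
# Characters that the option description lists as invalid in Newick labels.
INVALID_CHARS = set(" :;()[],{}")


def newick_label(name: str, quote_invalid_chars: bool = False, quote: str = '"') -> str:
    """Return a label that can be written into a Newick file.

    The `quote` character is an assumption of this sketch; this page does
    not state which quotation mark gappa writes.
    """
    if not any(ch in INVALID_CHARS for ch in name):
        return name  # clean labels are left untouched
    if quote_invalid_chars:
        # Keep the name as is, but wrap it in quotation marks.
        return f"{quote}{name}{quote}"
    # Default: replace every invalid character by an underscore.
    return "".join("_" if ch in INVALID_CHARS else ch for ch in name)


print(newick_label("Query 1"))
print(newick_label("Query 1", quote_invalid_chars=True))
```

The default changes the label, so names in the tree may no longer match the names in the jplace file exactly, while quoting preserves them at the cost of requiring downstream tools that understand quoted labels.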

Description

The command takes the reference tree of the provided jplace file(s) and, for each pquery, attaches a new leaf node to the tree, positioned according to the proximal length and pendant length of its most likely placement. The resulting tree is useful for getting an overview of the distribution of placements. It is mainly intended for viewing a small number of placements; for large samples, the tree can become cluttered.

Similar trees are produced by RAxML-EPA, where the file is called RAxML_labelledTree, and by the guppy tog command. The two programs differ in exactly how the placements are added as edges. To control this behaviour, use the --fully-resolve parameter.

Details

The provided jplace files are processed individually, producing one Newick tree for each of them. The output files are named like the input files, with the file extension replaced by .newick.
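
As a rough illustration of this naming rule, the following Python sketch derives the output name from an input path. The helper `newick_name` is hypothetical, and treating `.jplace.gz` as a single extension is an assumption; the effect of `--file-prefix` and `--file-suffix` is not modelled here.

```python
from pathlib import PurePath


def newick_name(jplace_path: str) -> str:
    """Map an input jplace path to the name of the resulting tree file."""
    name = PurePath(jplace_path).name
    # Assumption: a compressed input loses both extensions.
    for ext in (".jplace.gz", ".jplace"):
        if name.endswith(ext):
            return name[: -len(ext)] + ".newick"
    # Fallback for any other extension: replace the last suffix only.
    return PurePath(name).stem + ".newick"


print(newick_name("data/sample_a.jplace"))
print(newick_name("sample_b.jplace.gz"))
```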

Important remark: The grafting simply attaches the pqueries to the tree at their most likely placement position. The phylogeny among the pqueries themselves is not resolved at all.

Without `--fully-resolve`

If --fully-resolve is not provided (default), all placements at one edge are collected as children of one central base edge:

Figure: Multifurcating grafted tree.

This method is similar to the way RAxML-EPA produces a grafted tree, which is there called "labelled tree".

The base edge is positioned on the original edge at the average proximal_length of the placements. The base edge has a multifurcation if there are more than two placements on the edge.

The pendant length of the placements is used to calculate the branch length of the new placement edges. This calculation subtracts the shortest pendant length of the placements on the edge, so that the base edge is maximally "moved" towards the placement edges. This also implies that at least one of the placement edges has branch length == 0.0. Furthermore, the placements are sorted by their pendant length.
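
The calculation described above can be sketched in Python as follows. This is an illustration rather than gappa's implementation: the `Placement` class and the function name are hypothetical, and the sketch assumes that the base edge receives the shortest pendant length as its own branch length, which the text implies but does not state outright.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Placement:
    name: str
    proximal_length: float
    pendant_length: float


def graft_multifurcating(placements):
    """Return (base_position, base_length, [(name, leaf_length), ...])."""
    # The base edge sits at the average proximal length on the original edge.
    base_position = mean(p.proximal_length for p in placements)
    # Assumption: the shortest pendant length becomes the base edge length.
    shortest = min(p.pendant_length for p in placements)
    # Placement edges are sorted by pendant length and shortened accordingly,
    # so at least one of them ends up with length 0.0.
    ordered = sorted(placements, key=lambda p: p.pendant_length)
    leaves = [(p.name, p.pendant_length - shortest) for p in ordered]
    return base_position, shortest, leaves


position, base_length, leaves = graft_multifurcating([
    Placement("Query 1", proximal_length=0.25, pendant_length=0.5),
    Placement("Query 2", proximal_length=0.75, pendant_length=0.125),
])
print(position, base_length, leaves)
```

In this example both queries hang off a single base edge at position 0.5, so their individual proximal lengths (0.25 and 0.75) are no longer visible in the tree.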

Using this method, the new nodes of the resulting tree are easier to distinguish and collapse, as all placements are collected as children branching off from the base edge. However, this comes at the cost of losing the detailed information about the proximal length of each placement. If you want to keep this information, use --fully-resolve instead.

With `--fully-resolve`

If --fully-resolve is provided, all placements per branch are turned into individual single leaf nodes:

Figure: Fully resolved grafted tree.

This method is similar to the way guppy tog produces a grafted tree.

The original edge is split into separate parts where each placement edge is attached. The branch lengths between those parts are calculated using the proximal length of the placements, while the branch lengths of the placement edges use their pendant length.

This method gives the most detailed information, but results in a more crowded tree. The new placement edges are "sorted" along the original edge by their proximal length. For this reason, in the example image above, "Query 2" is closer to "Node A" than "Query 1": it has a higher proximal length. This information was lost in the multifurcating tree shown before (without --fully-resolve).
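
A minimal Python sketch of this splitting, under stated assumptions: the function name and the tuple layout are hypothetical, proximal lengths are taken to be measured from the start of the original edge, and the last segment is assumed to be whatever remains of the original branch length. Corner cases such as proximal lengths exceeding the branch length are not handled.

```python
def graft_fully_resolved(branch_length, placements):
    """Split one edge at each placement.

    placements: list of (name, proximal_length, pendant_length).
    Returns (segments, leaves): the branch lengths of the pieces of the
    original edge, and (name, pendant_length) for each new leaf, in order.
    """
    ordered = sorted(placements, key=lambda p: p[1])  # sort by proximal length
    segments, leaves = [], []
    previous = 0.0
    for name, proximal, pendant in ordered:
        segments.append(proximal - previous)  # piece leading up to this leaf
        leaves.append((name, pendant))        # leaf edge keeps its pendant length
        previous = proximal
    segments.append(branch_length - previous)  # remainder of the original edge
    return segments, leaves


segments, leaves = graft_fully_resolved(
    1.0,
    [("Query 2", 0.75, 0.125), ("Query 1", 0.25, 0.5)],
)
print(segments, leaves)
```

Here the original edge of length 1.0 is cut into three pieces, and each query keeps both its position along the edge and its full pendant length.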

Further Details

For edges that contain only a single placement (or none at all), both versions (with and without --fully-resolve) behave the same. In this case, the placement is simply attached using its proximal length and pendant length.

Pqueries with multiple names are treated as if each name were a separate placement, i.e., for each of them, a new (identical) edge is added to the tree. If using --fully-resolve, this results in a branch length of 0.0 between the nodes of those placements.
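
The handling of multiple names can be pictured with the short Python sketch below. The function `expand_names` and the tuple layout are hypothetical; the sketch only shows why identical positions lead to zero-length branches between the resulting nodes.

```python
def expand_names(pqueries):
    """Turn each (names, proximal_length, pendant_length) into one entry per name."""
    return [
        (name, proximal, pendant)
        for names, proximal, pendant in pqueries
        for name in names
    ]


entries = expand_names([(["read_1", "read_2"], 0.25, 0.5)])
# Distance along the original edge between consecutive attachment points.
gaps = [b[1] - a[1] for a, b in zip(entries, entries[1:])]
print(entries, gaps)
```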

`--name-prefix`

Specify a prefix to be added to all new leaf nodes (the ones that represent placements). This is useful if a pquery name also occurs as a name in the original tree. By default, the prefix is empty. To get the same naming as in grafted trees produced by RAxML, use --name-prefix "QUERY___".
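
To illustrate why a prefix helps, the following Python sketch checks for name clashes before and after prefixing. The function and the example names are made up for illustration; only the prefix `QUERY___` comes from the text above.

```python
def prefixed_leaf_names(query_names, name_prefix=""):
    """Prepend the prefix to every query name; reference names stay unchanged."""
    return [name_prefix + name for name in query_names]


reference_leaves = {"Node A", "seq_17"}  # hypothetical reference tree leaves
queries = ["seq_17", "read_1"]           # hypothetical pquery names

clashes = reference_leaves.intersection(queries)
renamed = prefixed_leaf_names(queries, "QUERY___")
print(sorted(clashes), renamed)
```

Without a prefix, `seq_17` would appear twice in the grafted tree; with the prefix, every query leaf is distinguishable from the reference leaves.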

Citation

When using this method, please do not forget to cite

Lucas Czech, Pierre Barbera, Alexandros Stamatakis. Genesis and Gappa: Processing, Analyzing and Visualizing Phylogenetic (Placement) Data. Bioinformatics, 2020. doi:10.1093/bioinformatics/btaa070
