-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Multitree #1442
Multitree #1442
Conversation
Variable names changed to better convey that these values represent node order - in rectangular layouts these are the same as y positions, but not for other layouts. Storing these on the <phyloNode> is consistent with other layout (position) variables. We now have: <phyloNode>.displayOrder .displayOrderRange .y: y position in domain. Depends on layout. .py: y position of parent node. .yTip: y position in pixels (i.e. the range) .yBase: yTip of parent node. Note that the untangling code is not currently used, but has been tested here by turning on `globals.attemptUntangle`.
This is so cool, @jameshadfield! I don't know if I'm using this new feature correctly yet, but I noticed a couple of odd behaviors in an H3N2 tree: |
This is all really cool, @jameshadfield! Just a heads up that the review app does not seem to be working in any of the test links provided. |
@jameshadfield I couldn't find an example dataset file with multiple trees in the links above (I think most are currently inactive/dead?), but I'm curious if this would mean a new major version of the dataset schema, i.e. to dataset v3? |
This introduces the ability to define multiple trees by allowing <json>.tree to be an array (of trees). Internally we add an extra root node (not displayed) whose children are the subtrees; this allows us to reuse all of our machinery which expects to traverse a single tree.
dc399ed
to
bd2e4a6
Compare
bd2e4a6
to
efcf523
Compare
Yeah, the heroku review app died after some amount of time (a week?) and when it gets recreated the URL changes so those links were dead. I've updated them in the above message.
I don't think so, however the schema will have to be extended to allow both a single tree and an array of trees, as well as allowing some extra |
07e8c42
to
0a868bd
Compare
We represent tree nodes by two different objects, a "PhyloNode" and a "Node", although these names are not formalised. The former holds information necessary to render the phylogenetic tree and is solely used by `PhyloTree` while the latter holds more general information. They are linked: `phyloNode = node.n` and `node = phyloNode.shell`. This commit tackles deeply embedded technical debt where each object held a copy of the same data. Specifically, * `children` is now only stored on a Node * `parent` is similarly stored on a Node * `terminal` is removed from PhyloNode (we can refer to node.hasChildren) There is (at least) one remaining duplicated property - `inView` - which is set on <Node> when the dataset loads due to the corresponding <PhyloNode> not yet existing. This is left for future work as the concept of `inView` may change as we develop further methods for tree zoom / pan.
Introduces the ability to cut trees at various branches and display the resulting trees as subtrees. These cut points are here defined to be a change in a (user-selected) state trait, which are simply the non-continuous colourings (to start with). Briefly, given a parent node (A) with children (B,C), and a change in trait moving from (A->B), we add B to the children of the overall tree root, and update the children of A to be only C. We store the original children (B,C) so that we can return to the starting state. This implementation, specifically the ability for PhyloTree to update the node ordering, should allow for branch rotations in the future. There are a number of future improvements to be made in subsequent commits: * The subtrees should be ordered on trait (or we may wish to make this a toggle) * the stem length of subtrees is inconsistent * exploded trait cannot be stored in URL state * we should limit the explode options to those colourings which are defined on internal nodes (currently a warning is shown) * genotype should be available to explode on * remove console logging introduced here * consider undefined (internal) traits, e.g. currently nodes (X -> undefined -> Y) would not be exploded * multiple trees are not considered here Prior art: I was first made aware of this style of tree rendering by Gytis Dudas. An implementation was attempted by @frogsquire<https://github.com/frogsquire> in #1105 and this commit uses ideas introduced there.
This introduces the concept of defining additional parents of the node, in addition to the parent used to link all nodes together into one tree. By defining the original parent via `<Node>.parentInfo.original → <Node>` the roots of exploded (sub-)trees keep track of their original parent. This information is used here for branch (stem) length calculations; future work may use visual cues such as dashed lines to link subtrees together. https://cse512-21s.github.io/FP-Pathogen-Phylogenies/ (Lewinsohn, Paredes, Russell and Wagner) is one example of this. This structure will allow definition of recombination events (e.g. donor nodes), original parents of individual trees (in a multi-tree-dataset) etc. All of this is "future work".
We default to zooming out completely whenever we explode the tree. There are nicer behaviours here, such as re-calculating the MRCA of visible nodes, but this comes at the cost of increased complexity. Recalculating tip counts is needed for (a) branch thicknesses and (b) forthcoming work which will hide subtrees with no tips.
Our previous behavior zoomed into the parent node, which made for a nicer UI. However when the zoom node is a subtree with a single tip, zooming into the parent node means zooming into the entire tree, which is not desirable.
Previous behavior would only split trees when the traits of a parent-child link were different _and_ defined. This let to problems where three connected nodes with traits "X -> undefined -> Y" would not be split.
Tanglegrams would be useful for exploded trees but are added complexity for an already complex feature. For the time being, it's acceptable to force users to choose one or the other.
The chosen value is based off testing a few datasets and we may need to tweak this in the future.
Note that in radial view separation is expressed by increasing the angle. If the root of a subtree is near the start (div or time) then changing the angle doesn't convey a sense of separation. One way to improve this would be to add a inner circle which the innermost nodes can then be spaced around, however this wouldn't be appropriate for non-exploded trees. Worth revisiting in the future, perhaps.
This separates out the subtrees in an unrooted subtree using a similar conceptual approach as for radial trees. Code was somewhat cleaned up during this work.
0a868bd
to
eb9eceb
Compare
This is now (finally) ready for release, however I've added an "experimental" label to the dropdown as these changes push auspice in new directions and this exposes some pre-existing limitations. There may also be some bugs / papercuts as this PR involves a lot of the different parts of the code.
Testing URLs
P.S. The final commit is to be dropped before merge. |
eb9eceb
to
06dfb6b
Compare
Multiple trees ("subtrees") have been available in Auspice since late 2021¹ and part of the associated schema since early 2022². Despite this there was no way to produce such datasets within Augur itself, and despite the schema changes the associated `augur validate` command was never updated to allow them. This commit adds multi-tree inputs to `augur export v2` as well as allowing them to validate with our associated validation commands. ¹ <nextstrain/auspice#1442> ² <#851>
Multiple trees ("subtrees") have been available in Auspice since late 2021¹ and part of the associated schema since early 2022². Despite this there was no way to produce such datasets within Augur itself, and despite the schema changes the associated `augur validate` command was never updated to allow them. This commit adds multi-tree inputs to `augur export v2` as well as allowing them to validate with our associated validation commands. ¹ <nextstrain/auspice#1442> ² <#851>
Multiple trees ("subtrees") have been available in Auspice since late 2021¹ and part of the associated schema since early 2022². Despite this there was no way to produce such datasets within Augur itself, and despite the schema changes the associated `augur validate` command was never updated to allow them. This commit adds multi-tree inputs to `augur export v2` as well as allowing them to validate with our associated validation commands. ¹ <nextstrain/auspice#1442> ² <#851>
Multiple trees ("subtrees") have been available in Auspice since late 2021¹ and part of the associated schema since early 2022². Despite this there was no way to produce such datasets within Augur itself, and despite the schema changes the associated `augur validate` command was never updated to allow them. This commit adds multi-tree inputs to `augur export v2` as well as allowing them to validate with our associated validation commands. ¹ <nextstrain/auspice#1442> ² <#851>
Description of proposed changes
This PR implements two major concepts:
dataset.tree -> Array<Tree>
In addition it includes the removal of a lot of technical debt within Auspice's code; as such extensive testing will be required. See commit messages for more information, however I'll use this PR description to foster discussion.
explode.mov
To-dos in this PR (Updated 2022-01-30)
(Not all of these need to be done in this PR, mind.)
undefined
; a subtree with no tips (e.g. one internal node with an undefined trait) is no longer displayed.__ROOT
and thus we can't zoom to selected! (as noticed by @huddlej, see below)To-dos in subsequent PRs (Updated 2022-01-30)
Testing (updated 2022-01-30)
I've tested this on a bunch of (local) datasets, although I've only created two multi-tree JSONs.
Descriptions of major/minor parents
Commit 1d5f725 introduces the concept of defining major / minor parents of nodes. This may include where in another tree this (sub-)tree originated (referred to here as “major”) or a recombination donor (“minor”). In the future, we may use solid/dashed lines to represent the connection to these parent nodes, using @mlewinsohn, @miparedes, Russell and @cassiawag's recombination project as inspiration. This commit only implements this concept for exploded trees (as we need to know how long the root stem should be). I'd like to make this user definable via a
branch_attrs.parent
annotation, the details of which haven't been fully worked out. The following situations are possible:(i) Single tree, not exploded: Minor parents (via
branch_attrs.parent
) may be used to visualise recombination events (see above). Major parents are not applicable.(ii) Multi tree, not exploded: Major parents (via
branch_attrs.parent
) can indicate the placement of the subtree (in another tree). Minor parents may also be applicable here.(iii) Single tree, exploded:
branch_attrs.parent
is ignored. We set the major parent of each subtree to be the branch it originated from.(iv) Multi tree, exploded: Same as (iii).
It'd be great to discuss this further before implementing it in a user-configurable fashion.
cc @evogytis
🙏 many thanks to @frogsquire who's work in #1105 was really helpful here