Add reversions/back-mutations as within-Auspice-computed branch label #1444
Comments
Thanks @corneliusroemer, I completely agree and think this feature will immensely help with interpreting trees, especially Omicron. I'm going to expand this issue slightly to encompass changes we've discussed regarding display of mutations more generally.
Current situation for branch labels
Proposal for branch labels
Simplest (and most realistic short-term) would be a small augur script within nCoV. The better long-term solution would be to compute this within
Current situation for mutation display
"branch_attrs": {
  "mutations": {
    "nuc": [ "T1N", "T2N", ...],
    "S": ["T716I"]
  }
}
The tooltips used in auspice behave as follows:
¹ No grouping is performed, e.g. if we have deletions of pos a, a+1, a+2 then we report three "mutations" (to
Proposed Display of Mutations
(Detecting runs of deletions/insertions which are homoplasic isn't trivial, but it is if we consider them as a series of individual events, as we currently do.) These could be computed within auspice itself, unless there is some reason to leverage nextclade for this? Relatedly, we should definitely move towards the aesthetics employed by nextclade for displaying mutations with badges!
What about Insertions?
It'd be wise to consider how insertions could be provided here, but this may be worthy of a separate issue (and shouldn't hold up implementing the previous sections). VCF-like style would be |
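To illustrate the footnote about deletions at positions a, a+1, a+2 being reported as three separate "mutations", here is a minimal sketch of how consecutive deletions on a single branch could be collapsed into runs for display. It assumes deletions are written as a substitution to the gap character (e.g. "A10-"); the function name `group_deletions` is hypothetical and this is not how auspice currently behaves.

```python
def group_deletions(mutations):
    """Collapse single-position deletions into (start, end) runs.

    Assumes each mutation is a string such as "A10-", where "-" marks a
    deleted base. Non-deletion mutations are ignored.
    """
    # Positions of all deletions on this branch, in genome order.
    positions = sorted(int(m[1:-1]) for m in mutations if m.endswith("-"))

    runs = []
    for pos in positions:
        if runs and pos == runs[-1][1] + 1:
            # Extends the current run by one base.
            runs[-1] = (runs[-1][0], pos)
        else:
            # Starts a new run.
            runs.append((pos, pos))
    return runs
```

For a branch carrying "A10-", "T11-", "G12-" and "C20T", this yields a single run covering positions 10 to 12. Grouping like this is purely cosmetic: as noted above, homoplasy detection stays simple only if each position is still treated as an individual event.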
My 2c: please don't follow VCF off that cliff. :) I really, really wish VCF didn't include the base to the left of indels. It's distracting to include a base that does not change, it necessitated an additional special rule for insertions at the beginning of the sequence (the unchanging base to the right must be appended on the right, further ugh), and it complicates code that has to translate between VCF and other formats (for example requiring reference sequence input to convert to VCF when it would otherwise be unnecessary). The empty string is a perfectly valid
The rest of it sounds great! :) |
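As a concrete illustration of the anchor-base point, the sketch below converts a VCF-style indel (position, REF, ALT) into an anchor-free form where the unchanged base is dropped and the empty string stands for "nothing". The function name `strip_vcf_anchor` and the returned tuple layout are assumptions made for this example, not part of any existing tool.

```python
def strip_vcf_anchor(pos, ref, alt):
    """Drop the unchanged anchor base from a VCF-style indel record.

    Returns (pos, ref, alt) where ref or alt may be the empty string.
    For an insertion (empty ref), pos is the position the new bases are
    placed in front of.
    """
    if len(ref) == len(alt):
        # Same length: a plain substitution, nothing to strip.
        return pos, ref, alt
    if ref[0] == alt[0]:
        # Usual case: anchor base on the left of the event.
        return pos + 1, ref[1:], alt[1:]
    if ref[-1] == alt[-1]:
        # Event at the start of the sequence: anchor base is on the right.
        return pos, ref[:-1], alt[:-1]
    # No shared anchor base; leave the record unchanged.
    return pos, ref, alt
```

Going in this direction needs no reference sequence. The reverse conversion does, because the anchor base has to be looked up, which is the translation cost described above.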
I only worked with VCFs for a short while a few years ago but I'd second Angie here, it drove me nuts! |
in nextclade, we have use |
Update: I have this working for the on-click info panel, just need to extend it to the on-hover panel as well. I think subsequent PRs can then
|
This looks super awesome and incredibly useful James! |
These changes were motivated by issue #1444 [1] where separating mutations into categories can aid both QC and biological interpretation. I chose to use "mutations" to refer to mutations observed on a branch and "changes" to refer to the collection of mutations between a tip and the root. The categories are not necessarily disjoint, as a mutation back to the root will also be a homoplasy or a unique mutation. Note that changes between a tip sequence and the root aren't grouped into homoplasies, as a single change (A→C) may be the result of multiple mutations (e.g. A→B→C) and thus we would need to check the tip state of each position which is difficult with the current code. On-hover panels are left unchanged in this commit. [1] #1444
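The categories described in this commit message can be sketched roughly as follows. This is an illustrative approximation in Python, not the code from the commit: it assumes a tree of nested dicts with "children" and "branch_attrs" → "mutations" keys, treats a mutation as a homoplasy when the identical string appears on more than one branch, and takes the root state as a caller-supplied `root_state` mapping (a hypothetical parameter).

```python
from collections import Counter


def iter_nodes(root):
    """Yield every node in the tree using an explicit stack."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.get("children", []))


def count_mutations(root, gene="nuc"):
    """Count how many branches each mutation string occurs on."""
    counts = Counter()
    for node in iter_nodes(root):
        muts = node.get("branch_attrs", {}).get("mutations", {}).get(gene, [])
        counts.update(muts)
    return counts


def categorise(branch_muts, counts, root_state):
    """Sort one branch's mutations into (non-disjoint) categories.

    root_state maps position -> base at the root (assumed to be known).
    """
    out = {"unique": [], "homoplasies": [], "reversions_to_root": []}
    for mut in branch_muts:
        pos, to = int(mut[1:-1]), mut[-1]
        # Every mutation is either unique or a homoplasy...
        key = "homoplasies" if counts[mut] > 1 else "unique"
        out[key].append(mut)
        # ...and may additionally be a reversion to the root state.
        if root_state.get(pos) == to:
            out["reversions_to_root"].append(mut)
    return out
```

A mutation such as T100C on a lineage that earlier acquired C100T shows up under both "unique" (or "homoplasies") and "reversions_to_root", which matches the note that the categories are not disjoint.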
I think this has been part of Auspice for a while now @jameshadfield It's a cool feature that's been super useful. But just wanted to check with you this is actually done before closing. |
Closed by #1449 |
Context
Reference backfilling is a big problem in SARS-CoV-2 sequences. All the information one needs to identify reversions back to reference is included in the auspice.json. This would for example allow me to quickly check that a Nextclade reference tree doesn't contain any reversions.
Description
As a user, I would like to be able to see nucleotide reversions (either only to reference, or to any previous state) be highlightable on the tree. For example as a branch label, like we do with clades or sometimes Spike mutations.
Examples
Usher already implements this feature, they must do it in the backend, so there's clearly some interest in this feature beyond me.
Possible solution
I could write a custom Python script that post-processes an auspice.json to add this as a branch annotation. But it's silly to do this with a script when it could be implemented within Auspice itself for all trees, for all users.
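For reference, the post-processing approach could look roughly like the sketch below. It makes several assumptions: the tree is a nest of dicts with "children" and "branch_attrs" → "mutations" → "nuc" keys, branch labels live under "branch_attrs" → "labels", the root state stands in for the reference, and the label key "reversions" is a made-up name. The root state at a position is inferred from the "from" base of the first mutation seen at that position on the path down from the root.

```python
import re

# Matches strings like "C100T": from-base, 1-based position, to-base.
MUTATION = re.compile(r"^([A-Z*-])(\d+)([A-Z*-])$")


def label_reversions(root):
    """Add a 'reversions' branch label wherever a branch mutates a
    position back to the state it had at the root. Modifies the tree
    in place."""
    # Each stack entry pairs a node with the root states known so far
    # on the path leading to it.
    stack = [(root, {})]
    while stack:
        node, parent_state = stack.pop()
        state = dict(parent_state)  # copy so siblings do not interfere
        reversions = []

        branch_attrs = node.get("branch_attrs", {})
        for mut in branch_attrs.get("mutations", {}).get("nuc", []):
            match = MUTATION.match(mut)
            if not match:
                continue  # skip anything not in the expected format
            frm, pos, to = match.group(1), int(match.group(2)), match.group(3)
            if pos in state and state[pos] == to:
                reversions.append(mut)
            # First mutation seen at this position reveals the root state.
            state.setdefault(pos, frm)

        if reversions:
            labels = node.setdefault("branch_attrs", {}).setdefault("labels", {})
            labels["reversions"] = ", ".join(reversions)

        for child in node.get("children", []):
            stack.append((child, state))
```

In use, one would load the dataset with `json.load`, call `label_reversions` on what is assumed to be its "tree" entry, and write the result back out with `json.dump`. An explicit stack is used instead of recursion so that very deep trees do not hit Python's recursion limit.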