ENH: Split up deletions and substitutions in tooltip as deletions are overwhelming and uninformative #1537

Closed
corneliusroemer opened this issue Jul 22, 2022 · 1 comment · Fixed by #1542
Labels
enhancement New feature or request

Comments

@corneliusroemer
Member

Context

When inspecting trees, I often find that the way deletions are presented overwhelms the display.

Description

Deletions are often very noisy on trees because we don't account for them properly when building a tree.

Yet they take up the most space because they come in stretches that aren't compressed.

Examples

So I end up with a result like this:

[image: tooltip screenshot]

Here the needle (the two really important nuc substitutions) is really hard to find in the haystack (a long list of deletions).

It would be really great, and quite high priority from my perspective as a user, to make deletions less overwhelming.

This has actually impeded me for a while but I never had the idea of writing it up as an issue.

Possible solution

@victorlin do you think this is something you could have a look at? There are a few things we could try here:

  1. Simply split out substitutions and deletions (maybe the easiest and quickest option, and maybe a stopgap until we have a fuller solution)
  2. Compress deletion stretches (and maybe separate them out, too)
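To make the two options above concrete, here is a minimal sketch of what splitting and compressing could look like. It is written in Python purely for illustration and is not Auspice's code. It assumes mutations arrive as strings such as `C100T` or `G42-` (the notation used in this thread); the function name `split_and_compress` and the `del 42-44` label format are hypothetical.

```python
import re

# Assumed mutation notation: <from><1-based position><to>, with "-" as the gap.
MUTATION = re.compile(r"^([A-Z-])(\d+)([A-Z-])$")


def split_and_compress(mutations):
    """Separate deletions from everything else and merge adjacent deletions.

    Returns (other_mutations, deletion_labels). Hypothetical helper.
    """
    others = []
    deleted_positions = []
    for mut in mutations:
        match = MUTATION.match(mut)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mut!r}")
        from_base, pos, to_base = match.group(1), int(match.group(2)), match.group(3)
        if to_base == "-" and from_base != "-":
            deleted_positions.append(pos)
        else:
            others.append(mut)  # keep original order for non-deletions

    # Merge consecutive positions into [start, end] runs.
    runs = []
    for pos in sorted(deleted_positions):
        if runs and pos == runs[-1][1] + 1:
            runs[-1][1] = pos
        else:
            runs.append([pos, pos])

    labels = [f"del {s}" if s == e else f"del {s}-{e}" for s, e in runs]
    return others, labels


others, deletions = split_and_compress(
    ["C100T", "G42-", "A43-", "T44-", "A200G", "C50-"]
)
print(others)     # ['C100T', 'A200G']
print(deletions)  # ['del 42-44', 'del 50']
```

With this kind of grouping, the two substitutions would be listed on their own and a long stretch of deletions would collapse to a single entry.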
@jameshadfield
Member

Related (private) Slack thread here. A summary of this:

Auspice already separates out deletions nicely (e.g. G42-) and groups them into runs. This problem is due to "undeletions", which are almost certainly a bioinformatics problem, albeit a hard one to fix. @corneliusroemer suggested calling these a "reversion of deletion to reference", which seems like a good solution. Some considerations:

  1. We should group these together into runs, like we do with deletions.
  2. They should not be listed in the "unique mutations", "Homoplasies" and "Reversions to root" categories
  3. I don't think these should be restricted to reversion of deletion to reference - they should cover any change from a deletion to a base. Although this would obscure the interesting case where the new base differs from the one at the ancestral node...
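The considerations above could be sketched roughly as follows. This is a Python illustration only, not how Auspice implements it. It assumes undeletions are written like `-10A` (gap to base), and that a `reference` mapping from position to base is available. Both the mapping and the name `group_undeletions` are hypothetical. Every gap-to-base change is grouped into runs and kept out of the other categories, and positions where the new base differs from the reference are recorded on the run so the interesting case is not lost.

```python
def group_undeletions(mutations, reference):
    """Group gap-to-base changes into runs of adjacent positions.

    mutations: strings like "-10A" or "C100T" (assumed notation).
    reference: dict of position -> reference base (hypothetical input).
    Returns (runs, other_mutations).
    """
    undeleted = {}
    others = []
    for mut in mutations:
        from_base, pos, to_base = mut[0], int(mut[1:-1]), mut[-1]
        if from_base == "-" and to_base != "-":
            undeleted[pos] = to_base
        else:
            others.append(mut)  # stays eligible for the other categories

    runs = []
    for pos in sorted(undeleted):
        if runs and pos == runs[-1]["end"] + 1:
            runs[-1]["end"] = pos
        else:
            runs.append({"start": pos, "end": pos, "non_reference": []})
        # A position missing from `reference` is treated as differing.
        if reference.get(pos) != undeleted[pos]:
            runs[-1]["non_reference"].append(f"-{pos}{undeleted[pos]}")
    return runs, others


runs, others = group_undeletions(
    ["-10A", "-11C", "-12G", "C100T", "-20T"],
    {10: "A", 11: "C", 12: "T", 20: "T"},
)
print(runs)
print(others)  # ['C100T']
```

In this example positions 10 to 12 form one run, with `-12G` flagged because the reference base there is `T`, and position 20 forms a second run with nothing flagged.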

Repository owner moved this from Backlog to Done in Nextstrain planning (archived) Sep 15, 2022