Suggestion to designate XBB.1.16+T9991C+C16332T (111 seqs, 14 countries) #1983

Closed
xz-keg opened this issue May 5, 2023 · 11 comments
Labels
designated, UShER not clean (Lineage currently not clean in UShER tree)

@xz-keg
Contributor

xz-keg commented May 5, 2023

I'm proposing this because it is hiding under a flip-flop reversion sub-branch on UShER and might be missed when designating systematic lineages.

Mutations on top of XBB.1.16: T9991C, C16332T
GISAID query: T9991C,C16332T,T28297C,A28447G,T12730A
No. of seqs: 111 (Australia 2, Austria 3, Canada 2, UK 1, Japan 2, Myanmar 3, Nepal 1, New Zealand 1, Singapore 9 (1 from Japan), Switzerland 1, USA 8 (1 imported from UAE), China 1, India 77)
First seq: EPI_ISL_17073062 India, 2023-2-22
Latest seq: EPI_ISL_17596319 Singapore from Japan, 2023-4-25
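
As an illustration of what the GISAID query above selects, here is a minimal Python sketch that checks whether a sample's nucleotide substitutions include every mutation in the query. The helper names (`parse_mutation`, `carries_all`) and the example sample are hypothetical; this is not how GISAID or UShER implement matching.

```python
import re

# Nucleotide substitution written as <ref><position><alt>, e.g. "T9991C".
MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(text):
    """Split a substitution string into (ref, position, alt)."""
    match = MUTATION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {text!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def carries_all(sample_mutations, query):
    """Return True if the sample has every substitution listed in the query."""
    have = {parse_mutation(m) for m in sample_mutations}
    return all(parse_mutation(q) in have for q in query)


# The query string from this proposal.
GISAID_QUERY = ["T9991C", "C16332T", "T28297C", "A28447G", "T12730A"]

# Hypothetical sample: the query mutations plus one unrelated substitution.
example_sample = GISAID_QUERY + ["C241T"]
print(carries_all(example_sample, GISAID_QUERY))
```

A sample missing any one of the five substitutions would not match, which is why the query combines the two proposed mutations with three further substitutions.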

usher

Screen Shot 2023-05-05 at 13 39 32

@corneliusroemer
Contributor

I'm not quite sure why you propose this as a lineage - wouldn't this be better flagged to @AngieHinrichs so she can prune and regraft the sequences to try to fix the artefactual flip flop? The whole branch looks like it attracts bad QC samples, with 417 reverted, the other branches off that polytomy are then just more reversions. It looks like what you highlight is decent QC samples that were attracted by bad QC sequences that had 417 and 368 reverted but had the mutations 9991, 16332.

Thinking about how this could happen mechanistically: if a reverted sample of a particular lineage is uploaded first, it will be placed on a bad branch, future good QC samples then attach there in the bad place.

This could be partially helped by periodically removing such reversion branches. If they are real, they will just go in the same place. However if they are artefacts, the many more good QC samples will attach in the right place to start with, and the bad QC samples will descend from the right place.

@corneliusroemer added the "UShER not clean" label (Lineage currently not clean in UShER tree) May 5, 2023
@xz-keg
Contributor Author

xz-keg commented May 5, 2023

I'm not quite sure why you propose this as a lineage

It has a similar number of seqs to XBB.1.16.3, so it qualifies to be designated as a systematic lineage, especially since some of its seqs come from heavily under-sampled areas like Nepal and Myanmar.

But as it is hiding under a flip-flop branch, it may be easily missed when looking for systematic lineages.

@xz-keg
Contributor Author

xz-keg commented May 5, 2023

Thinking about how this could happen mechanistically: if a reverted sample of a particular lineage is uploaded first, it will be placed on a bad branch, future good QC samples then attach there in the bad place.

This could be partially helped by periodically removing such reversion branches. If they are real, they will just go in the same place. However if they are artefacts, the many more good QC samples will attach in the right place to start with, and the bad QC samples will descend from the right place.

According to recent research, the S:417 reversion may offer some immune escape against an immune background of repeated Omicron infection, so it may sometimes be real. On the other hand, the S:417 reversion is a known, common bad-QC artefact.

The worst thing here is that it is not an "all artefact/all real" choice, so simply accepting everything/removing everything won't help. It is hard to distinguish even for individual samples.

The same goes for S:440/505 reversions, which the research predicts to confer strong immune evasion against a background of repeated Omicron infection.

However, in this case most of the seqs in this lineage don't have either the 368 or the 417 reversion; they're just being placed on a flip-flop branch. They're simply XBB.1.16+T9991C+C16332T, which qualifies for a systematic designation under XBB.1.16.

@xz-keg
Contributor Author

xz-keg commented May 5, 2023

I'm not quite sure why you propose this as a lineage - wouldn't this be better flagged to @AngieHinrichs so she can prune and regraft the sequences to try to fix the artefactual flip flop?

Of course the artefactual flip-flop should also be fixed. Without that fix, the designation won't work.

@corneliusroemer
Contributor

Designation is theoretically independent of the Usher tree, Nextclade won't be affected by flip flop branches as it doesn't use the Usher tree - but yeah, flip flops pose challenges for designation as Usher is the main way we pick sequences to designate and it's the default method of pangolin.

Reversions are so commonly artefacts that they should be assumed to be artefacts unless there's clear evidence they aren't. This isn't so hard: if a lineage really has a reversion, then the whole cluster should have the reversion, and the cluster can be distinguished by mutations other than the reversion to verify. We've been able to designate lineages with insertions and deletions, even though they don't appear in the Usher tree at all, because they co-occurred with nucleotide mutations.
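
The check described above can be sketched roughly in Python. This is only an illustrative heuristic, not a tool used for designation: the function name `reversion_support` and the four-sample cluster are hypothetical, and each sample is modelled as a plain set of substitution strings.

```python
def reversion_support(cluster, reversion):
    """Summarise how well a cluster supports a reversion being real.

    cluster: list of sets of substitution strings, one set per sample.
    Returns (share of samples carrying the reversion,
             mutations shared by every sample other than the reversion).
    """
    if not cluster:
        return 0.0, set()
    share = sum(reversion in sample for sample in cluster) / len(cluster)
    shared_other = set.intersection(*map(set, cluster)) - {reversion}
    return share, shared_other


# Hypothetical cluster: every sample has T9991C and C16332T,
# but only one of the four carries the 417 reversion T22813G.
cluster = [
    {"T9991C", "C16332T"},
    {"T9991C", "C16332T", "T22813G"},
    {"T9991C", "C16332T", "C241T"},
    {"T9991C", "C16332T"},
]
share, shared_other = reversion_support(cluster, "T22813G")
print(share, sorted(shared_other))
```

A low share together with a non-empty set of shared mutations is the pattern discussed in this thread: the cluster is defined by its other mutations, and the reversion is confined to a minority of its samples.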

@xz-keg
Contributor Author

xz-keg commented May 5, 2023

Reversions are so commonly artefacts that they should be assumed to be artefacts unless there's clear evidence they aren't.

Yeah. And now we have clear evidence that 417, 440 or 505 reversions may produce strong immune escape (for people with multiple Omicron exposures, which means most people on earth). There may be convergent reversions at those sites, which means many of them may not have co-occurring nuc mutations, due to different selection pressures.

Screen Shot 2023-05-05 at 17 11 52

Combined with the fact that such reversions are also common artefacts, this becomes very challenging.

@silcn

silcn commented May 5, 2023

I'm not quite sure why you propose this as a lineage

It has a similar number of seqs to XBB.1.16.3, so it qualifies to be designated as a systematic lineage, especially since some of its seqs come from heavily under-sampled areas like Nepal and Myanmar.

@corneliusroemer why are so many branches distinguished only by one or two silent mutations getting designated these days? Apologies if I've missed some announcement. In my opinion this makes things more confusing because aliases appear more quickly.

@FedeGueli
Contributor

@silcn I think the rules for proposals are the same as before (no nuc-only lineages), but for months Cornelius has been systematically designating sub-branches of the main/dominant lineages to help clarify the tree. I truly appreciate that kind of back-end work because it lets us quickly identify where a subclade of potential interest sits.
The downside is that with aliases it is harder to quickly identify the root when someone proposes something. So I think it is very good while proposing, and so-so when it produces a new alias.

@oobb45729

I think that most cases of N417K in Omicron we've seen so far are not real. Both T22813A and T22813G result in N417K, but T22813A is very rare in Omicron so far.

Both G22882C and G22882T result in K440N. G->C is less likely to happen than G->T. However, G22882C is too rare in Omicron for now, so I don't think there are many cases of true G22882T now. Plus, K440 seems to only like to further mutate to R440 so far, suggesting Omicron may like to retain a positively charged residue there.

For H505Y in Omicron, I think the reversion probably would mess up the inter-protomer interaction between H505 and the 371-375 part, especially P373.

The reversions at 417, 440, and 505 may provide immune escape in certain circumstances. However, there are other mutations that can also do those jobs, like N417Y, non-K440N and non-K440R K440Xs, and G504D, which in my opinion, may be more viable choices for Omicron since they may escape both WT-targeting and Omicron-targeting antibodies. Up till now, we haven't seen surges of those mutations yet.
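
The codon arithmetic behind the 417 and 440 points above can be checked with a short Python sketch. It assumes the spike ORF starts at nucleotide 21563 of the Wuhan-Hu-1 reference, and that the Omicron codons are AAT (N417) and AAG (K440), as implied by the substitutions named in this comment. The helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

S_START = 21563  # assumed first nucleotide of the spike ORF (Wuhan-Hu-1 numbering)


def codon_position(nuc_pos):
    """Map a genome position to (spike codon number, 0-based index in codon)."""
    offset = nuc_pos - S_START
    return offset // 3 + 1, offset % 3


def substitute(codon, nuc_pos, alt):
    """Apply a single-base change to a codon and return the new amino acid."""
    _, idx = codon_position(nuc_pos)
    return CODON_TABLE[codon[:idx] + alt + codon[idx + 1:]]


# Omicron codons at the two sites (assumed from K417N = G22813T, N440K = T22882G).
OMICRON_CODONS = {417: "AAT", 440: "AAG"}

for pos, alt in [(22813, "A"), (22813, "G"), (22882, "C"), (22882, "T")]:
    site, _ = codon_position(pos)
    before = OMICRON_CODONS[site]
    print(pos, alt, CODON_TABLE[before], "->", substitute(before, pos, alt))
```

Both changes at 22813 give lysine and both changes at 22882 give asparagine, so at each site two different nucleotide changes produce the same amino acid reversion.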

@xz-keg
Contributor Author

xz-keg commented May 6, 2023

The 3 Myanmar seqs do seem to have S:417K (22813G). The sequence quality is good and the base at position 22813 is a known G.
I guess these 3 direct the placement of this branch.

Screen Shot 2023-05-06 at 13 56 40

@corneliusroemer
Contributor

Thanks for finding this @aviczhl2 - this is indeed a clean lineage that is just misplaced on a flip-flop branch by Usher (@AngieHinrichs)

The 2 mutations co-occur cleanly, and the lineage makes up a sizeable share of XBB.1.16, so it is clearly worthy of designation as XBB.1.16.5.
