BQ.1.25 Sublineage with S:L858I, ORF1a:A54V (128 seq) #1451

Closed
ryhisner opened this issue Dec 21, 2022 · 14 comments

ryhisner commented Dec 21, 2022

Description

Sub-lineage of: BQ.1.25
Earliest sequence: 2022-10-21, USA, Colorado — EPI_ISL_15638493
Most recent sequence: 2022-12-09, USA, Missouri — EPI_ISL_16207139
Countries circulating: USA (125), Australia (2), Canada (1)
Number of Sequences: 128
GISAID Query: Spike_L858I, Spike_R346T, NS3_A54V
CovSpectrum Query: Nextcladepangolineage:BQ.1.25* & S:L858I
Substitutions on top of BQ.1.25:
Spike: L858I
ORF3a: A54V
Nucleotide: C24134A, C25553T
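
The pairing of nucleotide and amino-acid substitutions above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the 1-based gene start coordinates of the Wuhan-Hu-1 reference (NC_045512.2); the names `GENE_START` and `codon_of` are illustrative and not part of any pango tooling.

```python
# Gene start coordinates (1-based) on the Wuhan-Hu-1 reference, NC_045512.2.
# Only the two genes needed for this proposal are listed.
GENE_START = {"S": 21563, "ORF3a": 25393}


def codon_of(nuc_pos, gene):
    """Return (codon_number, position_in_codon), both 1-based.

    No check is made against the gene's end coordinate.
    """
    offset = nuc_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"position {nuc_pos} lies before the start of {gene}")
    return offset // 3 + 1, offset % 3 + 1


print(codon_of(24134, "S"))      # C24134A, reported as S:L858I
print(codon_of(25553, "ORF3a"))  # C25553T, reported as ORF3a:A54V
```

Both positions land in the expected codons (858 in spike, 54 in ORF3a), which is consistent with the substitutions as listed.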

USHER Tree
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/BQ.1.25%20%2B%20L858I%2C%20ORF3aA54V%20-%20subtreeAuspice1_genome_12fa6_242a50.json
image

Evidence
This lineage has 128 total sequences, and 103 of those have been uploaded in the past 15 days (Dec 5 or later), with 78/128 uploaded Dec 12 or later. It seems to be growing very fast, quite a bit faster than the rest of BQ.1.25*. Whether any of this apparent growth advantage is due to it being located primarily in California, which has a high level of sequencing, isn’t entirely clear.

The S:852-859 range has been something of a mutational hot spot lately, particularly in highly mutated sequences.

S:A852K, a 2-nucleotide mutation that has recurred repeatedly, is particularly intriguing, and S:A852V and S:A852S pop up regularly as well. S:Q853X mutations have turned up in several chronic-infection sequences. S:K854N, S:K854R, and even a few S:K854T sequences have been popping up more frequently as well.

S:N856S has been cropping up independently in several lineages, including BQ.1.1 (#1419) and CK.1, and the same can be said for S:T859S, which had almost never been seen before October 2022 but which now is growing rapidly in BE.1.1.1 (#1413) and in a BA.5.2 lineage in New Zealand (#1414).

I’m not sure what the significance of all this is. Is it related to immune evasion, or do mutations in this S2 region provide some other advantage in recent variants that they didn’t in previous ones?

Enigmatic A->K Mutations in S2

Since I mentioned S:A852K, I also want to bring up an enigmatic pattern I noticed recently. There are at least three different S2 alanine residues that have undergone 2-nucleotide mutations to lysine: S:A783K, S:A852K, and S:A942K. The S:A942K mutation was particularly fascinating because it also included a synonymous nucleotide substitution immediately before and adjacent to the other two.
image

Even more remarkable, this S:A942K mutation appeared in a hypermutated BA.2.12.1 chronic-infection sequence in which every nucleotide mutation was nonsynonymous, except for the one immediately preceding and adjacent to the S:A942K nucleotide mutations. This synonymous mutation (A24385T, the third position of spike codon 941) created a TAAA nucleotide sequence. Both of the other 2-nucleotide A->K S2 mutations are also preceded by a T nucleotide that creates a TAAA sequence. I have no idea why this should be, but it seems too unlikely to be a coincidence. If anyone has any idea what this all means, if anything, I'd love to know.
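
The TAAA observation can be illustrated with a short sketch. The two codons used here are reconstructed from the mutations described in this comment (A24385T is synonymous at the third position of a threonine codon, so ACA; the adjacent alanine becomes lysine through two changes that yield AAA, so GCA), not read from a reference genome. The function name `new_motif_sites` is hypothetical.

```python
def new_motif_sites(ref, mut, motif="TAAA"):
    """Return 0-based offsets where `motif` is present in `mut` but not in `ref`."""
    k = len(motif)
    return [
        i
        for i in range(len(mut) - k + 1)
        if mut[i:i + k] == motif and ref[i:i + k] != motif
    ]


# Spike codons 941-942, reconstructed from the described mutations (assumption).
REF = "ACA" + "GCA"        # Thr941 + Ala942
MUT = "ACT" + "AAA"        # synonymous A24385T + 2-nucleotide A942K
MUT_NO_SYN = "ACA" + "AAA" # A942K alone, without the synonymous change

print(new_motif_sites(REF, MUT))         # TAAA appears across the codon boundary
print(new_motif_sites(REF, MUT_NO_SYN))  # no TAAA without the synonymous change
```

Under these assumptions, the TAAA motif only appears when the synonymous change accompanies the two lysine-forming changes.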

Genomes

Genomes EPI_ISL_15638493, EPI_ISL_15818682, EPI_ISL_15821819, EPI_ISL_15840809, EPI_ISL_15842320, EPI_ISL_15847398, EPI_ISL_15847427, EPI_ISL_15878761, EPI_ISL_15881022, EPI_ISL_15888535, EPI_ISL_15890574, EPI_ISL_15944185, EPI_ISL_15944774, EPI_ISL_15945073, EPI_ISL_15945201, EPI_ISL_15945965, EPI_ISL_15946043, EPI_ISL_15946535, EPI_ISL_15946647, EPI_ISL_15951722, EPI_ISL_15951770, EPI_ISL_15976517, EPI_ISL_15976621, EPI_ISL_15982445, EPI_ISL_16002532, EPI_ISL_16009478, EPI_ISL_16009636, EPI_ISL_16010016, EPI_ISL_16010824, EPI_ISL_16011363, EPI_ISL_16011823, EPI_ISL_16011829, EPI_ISL_16011867, EPI_ISL_16011984, EPI_ISL_16012137, EPI_ISL_16012213, EPI_ISL_16016535, EPI_ISL_16026965, EPI_ISL_16027001, EPI_ISL_16027031, EPI_ISL_16027087, EPI_ISL_16027096, EPI_ISL_16047351, EPI_ISL_16047369, EPI_ISL_16049105, EPI_ISL_16058492, EPI_ISL_16058625, EPI_ISL_16058655, EPI_ISL_16066825, EPI_ISL_16066847, EPI_ISL_16080745, EPI_ISL_16080773, EPI_ISL_16080845, EPI_ISL_16081356, EPI_ISL_16081404, EPI_ISL_16081519, EPI_ISL_16081527, EPI_ISL_16081609, EPI_ISL_16081835, EPI_ISL_16081886, EPI_ISL_16082151, EPI_ISL_16082233, EPI_ISL_16082958, EPI_ISL_16089497, EPI_ISL_16089533, EPI_ISL_16089545, EPI_ISL_16089550, EPI_ISL_16091088, EPI_ISL_16092175, EPI_ISL_16105499, EPI_ISL_16105719, EPI_ISL_16106529, EPI_ISL_16106586, EPI_ISL_16106608, EPI_ISL_16113453, EPI_ISL_16114814, EPI_ISL_16115132, EPI_ISL_16115921, EPI_ISL_16116993, EPI_ISL_16116994, EPI_ISL_16120062, EPI_ISL_16120261, EPI_ISL_16130701, EPI_ISL_16130841, EPI_ISL_16130913, EPI_ISL_16130930, EPI_ISL_16130989, EPI_ISL_16131081, EPI_ISL_16131279, EPI_ISL_16131347, EPI_ISL_16131417, EPI_ISL_16131518, EPI_ISL_16131781, EPI_ISL_16159134, EPI_ISL_16159166, EPI_ISL_16161836, EPI_ISL_16161851, EPI_ISL_16165477, EPI_ISL_16166221, EPI_ISL_16166227, EPI_ISL_16169134, EPI_ISL_16169409, EPI_ISL_16171370, EPI_ISL_16171371, EPI_ISL_16171372, EPI_ISL_16184061, EPI_ISL_16184094, EPI_ISL_16184134, EPI_ISL_16184209, EPI_ISL_16184498, 
EPI_ISL_16186963, EPI_ISL_16188753, EPI_ISL_16188969, EPI_ISL_16189080, EPI_ISL_16189114, EPI_ISL_16189206, EPI_ISL_16203277, EPI_ISL_16203667, EPI_ISL_16204430, EPI_ISL_16205163, EPI_ISL_16205749, EPI_ISL_16205827, EPI_ISL_16206035, EPI_ISL_16206522, EPI_ISL_16206528, EPI_ISL_16207013, EPI_ISL_16207017, EPI_ISL_16207139
@ryhisner ryhisner changed the title from "BQ.1.25 Sublineage with S:L858I, ORF1a:A54V" to "BQ.1.25 Sublineage with S:L858I, ORF1a:A54V (128 seq)" Dec 21, 2022
@InfrPopGen InfrPopGen self-assigned this Dec 21, 2022
InfrPopGen added a commit that referenced this issue Dec 21, 2022
Added new lineage BQ.1.25.1 from #1451 with 87 new sequence designations, and 1 updated from BQ.1.25
@InfrPopGen InfrPopGen added this to the BQ.1.25.1 milestone Dec 21, 2022
@InfrPopGen
Contributor

Thanks for submitting. We've added lineage BQ.1.25.1 with 87 newly designated sequences, and 1 updated. Defining mutation C24134A (S:L858I) (following C25553T (ORF3a:A54V)).

@FedeGueli
Contributor

A very fast one, already at 194 sequences. I want to highlight that the same mutation was defining in the Maldivian lineage BA.2.39, which was spotted and proposed by @silcn and then designated from #575.

@oobb45729

For the TAAA sequences, maybe they are about sgmRNA. After mutations, they all become more like CTCTAAACGAAC.
I feel that the sequence is not emphasized enough. Mutations that are related to it are very common. Recently I think I found what Orf6:D61L is really about. It is about the TRS sequence before Orf7a. The change GAT->CTC would change GATTAAACGAAC to CTCTAAACGAAC. BA.5 has a mutation in the TRS sequence before Orf8, changing AAACGAAC to AAATGAAC. B.1.429 and XBC have a mutation in the TRS sequence before Orf8 as well, AAACGAAC to AAACTAAC.
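
The two ORF8 TRS changes mentioned here can be reproduced mechanically. This is a minimal sketch, assuming the 8-nucleotide window AAACGAAC starts at reference position 27886 (inferred from the mutation names C27889T and G27890T, which are given elsewhere in this thread); `apply_substitution` is an illustrative helper, not an existing library function.

```python
import re


def apply_substitution(window, window_start, mutation):
    """Apply a substitution such as 'C27889T' to a short reference window.

    `window_start` is the 1-based genome coordinate of window[0].
    Raises ValueError if the mutation falls outside the window or if the
    stated reference base does not match the window.
    """
    m = re.fullmatch(r"([ACGT])(\d+)([ACGT])", mutation)
    if m is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    ref, pos, alt = m.group(1), int(m.group(2)), m.group(3)
    i = pos - window_start
    if not 0 <= i < len(window):
        raise ValueError(f"{mutation} lies outside the window")
    if window[i] != ref:
        raise ValueError(f"expected {ref} at {pos}, found {window[i]}")
    return window[:i] + alt + window[i + 1:]


ORF8_TRS = "AAACGAAC"   # sequence immediately upstream of the ORF8 start codon
ORF8_TRS_START = 27886  # assumed coordinate of the first A

print(apply_substitution(ORF8_TRS, ORF8_TRS_START, "C27889T"))  # BA.5
print(apply_substitution(ORF8_TRS, ORF8_TRS_START, "G27890T"))  # B.1.429, XBC
```

The reference-base check is there so that a wrong window coordinate fails loudly instead of silently producing a plausible-looking sequence.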

@ryhisner
Author

Whoa, if you're right, that's huge! I always wondered if ORF6:D61L was just some weird mistake that happened in the chronic patient that birthed BA.2/4/5, but this would explain it! I think it may have been advantageous within host (maybe even just one host) but deleterious for between-host fitness. I think it's still a bit of a mystery why BA.5 has been so much more successful than BA.4, and ORF6:D61L seems like the top candidate to me. I'd love to see more detailed studies of what exactly it's doing. Maybe there are studies about it that I'm not aware of. If anyone knows of any, please post links.

The weird, multi-nucleotide mutations in ORF8 from residues 118-121 really confused me for a long time, but as @thomasppeacock recently pointed out, those mutations created extended homology for the sgmRNA leader—if this is the right term to use—for N. But I suspect something similar might be happening with ORF8 as what's happened with ORF6; it's a multi-nucleotide mutation that occurs independently again and again and again, but which does not seem to lead to any growth advantage at the population level. I have to wonder if it's another multi-nuc mutation that alters gene expression in a favorable way within certain hosts but is either neutral or deleterious outside of those specific contexts.
image

I'd really like to know how far back the extended homology can extend and what the ideal sequence for extended homology consists of. I'm pretty sure the three best nucleotides before AAACGAAC are TCT. Before that, I'm unsure. The ORF8 mutation suggests the most homologous sequence is GTTCTCT, but I'm not sure if that's right or not.

@oobb45729

GTTCTCTAAACGAAC is exactly SL3 described here: https://www.nature.com/articles/s41467-022-28603-2/figures/2

It looks like the whole SL3 could be involved, which means not only ACGAAC but also TAAAC or CTAAA can be potential TRS sequences.
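
One rough way to quantify "extended homology" is to count how far a body sequence matches the SL3 sequence when both are aligned at their 3' ends. The sketch below is only a string comparison under that assumption; it is not a model of template switching, and `three_prime_homology` is a hypothetical helper name.

```python
SL3 = "GTTCTCTAAACGAAC"  # leader sequence quoted in this comment


def three_prime_homology(body, leader=SL3):
    """Count contiguous matching nucleotides, walking upstream from the 3' end."""
    n = 0
    for b, l in zip(reversed(body), reversed(leader)):
        if b != l:
            break
        n += 1
    return n


# Sequence upstream of ORF7a before and after the GAT->CTC change (ORF6:D61L).
print(three_prime_homology("GATTAAACGAAC"))
print(three_prime_homology("CTCTAAACGAAC"))
# ORF8 TRS in BA.5 after C27889T: the mismatch sits inside the core itself.
print(three_prime_homology("AAATGAAC"))
```

With these inputs the GAT->CTC change extends the contiguous match from 9 to 12 nucleotides, while C27889T cuts the ORF8 match down to 4.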

@FedeGueli
Contributor

thx very informative

oobb45729 commented Jan 4, 2023

I'm also curious about where the advantage of BA.5 over BA.4 comes from, so I checked the differences between them. The results leave me even more confused:
Orf6:D61L is supposed to impair Orf6's functions (https://www.biorxiv.org/content/10.1101/2022.10.18.512708v2.full) but it would probably boost Orf7a's translation. I noticed that many BA.5s gained T27384C (GAT->GAC, so Orf6:D61 is intact) and BA.5*+T27384C+Orf6:D61D does have an advantage over BA.5*. B.1.1.7*(B.1.617.2*)+T27384C+Orf6:D61D also has an advantage over B.1.1.7*(B.1.617.2*), but B.1.1.7*(B.1.617.2*)+T27384C+Orf6:D61L may not.
Orf1a:141-143del has also occurred repeatedly, but lineages with it seem to have neither a significant advantage nor a significant disadvantage.
B.1.617.2*+Orf7b:L11F may have an advantage over B.1.617.2*.
BE.3 (BA.5.3.1+N:P151S) did not do very well compared to other BA.5, but had an advantage over BA.4 without S:R346T.
BA.4* or BA.2.75* +M:D3X does not happen very often, and when it does happen, the lineage does not have a significant advantage.
I think the TRS mutation C27889T is the reason why BA.5 does not do well in MHC-I reduction compared to BA.4 (https://www.biorxiv.org/content/biorxiv/early/2022/12/23/2022.05.04.490614/F5.large.jpg) (notably B.1.429 has G27890T).
I also noticed that many BA.5 sequences have the T27889C reversion.
C26858T and A27259C are supposed to be silent, although if the ORF-Mh described in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8317007/ is translated, C26858T may play a role.

@FedeGueli
Contributor

FedeGueli commented Jan 4, 2023

27889C was in the recombinant lineage XAZ. I dug a bit into this reversion when BA.5 started to spread, but I think most of the sequences with it are due to reference backfilling. Only XAZ was clearly identified as a recombinant with BA.2.5, and that was possible because the recombination happened at the start of the wave in Portugal and because of the lucky presence of 3 nucleotide mutations in a row toward the 5' end of the genome.
#797

cc @oobb45729

@oobb45729

Thanks.

@oobb45729

I narrowed down the possibilities. I think the advantage of BA.5 over BA.4 is due to Orf1a:141-143del or Orf6:D61L or C27889T or a combination of them.

Various naturally occurring mutations have been described in Nsp1 throughout the pandemic, including a 3-aa deletion in the Nsp1 linker region (Nsp1ΔKSF) detected in North America and Europe (29, 30). Given the importance of the Nsp1 linker length in regulating viral-to-host translation selectivity, we tested the function of this variant using our reporter assay. Although Nsp1ΔKSF induced a small significant decrease in CoV-2 reporter activity, it more than doubled control reporter activity compared to WT Nsp1, and significantly reduced the CoV-2/control translation ratio (P < 0.001) (SI Appendix, Fig. S3). Thus, while lengthening the linker alters regulation of both viral and host translation, the shortened linker in this variant mainly compromised suppression of host translation. Together, these results suggest that the Nsp1 linker length is optimized to coordinate host translational suppression and bypass by SARS-CoV-2 5′ UTR, and that the Nsp1ΔKSF mutant could be less virulent.

from https://www.pnas.org/doi/10.1073/pnas.2117198119

ORF6:D61L reduces the binding of ORF6 to Rae1-Nup98. https://www.nature.com/articles/s41467-022-32489-5/figures/3
This shows D61 is important for the binding.

Cell-surface Spike and syncytia formation are evident in COVID-19 patients [14] and may allow viral spread in a manner obviating the full viral replication cycle. However, syncytia formation in SARS-CoV-2 infection induces innate immune responses through the cGAS-STING pathway [25]. Our finding that ORF8 limits syncytia formation suggests that ORF8 limits the syncytia-mediated viral spread, but prevents syncytia-dependent induction of innate immune responses. That is consistent with our model that ORF8 creates a more secured viral replication environment at the expense of infectivity.

from https://www.biorxiv.org/content/10.1101/2022.11.09.515752v1.full
So the lower level of ORF8 induced by C27889T may actually make the virus more infectious. I suspect that this is what ORF8:G8* does in XBB.1.5 too. I also noticed that, compared to other Omicron lineages, ORF8:Q18* and ORF8:Q27* are less likely to occur in BA.5. Those mutations probably trade some innate immune evasion capability for more infectiousness.

@ryhisner

@oobb45729

So ORF8 reduces cell-surface spike levels, limiting the reactivity of anti-SARS-CoV-2 human sera towards spike-producing cells at the expense of infectivity.

Could it mean that mutations like ORF8:G8* or C27889T work better when the spike is so highly antibody-evasive that it can tolerate the ORF8 loss? Maybe this is the reason why ORF8:G8* is doing well on XBB. Could it also mean that if the spike is no longer so antibody-evasive, the ORF8 loss becomes detrimental to the virus? Could this be the reason that the vaccines did a much better job limiting the spread of Alpha (with ORF8:Q27*) than Delta?

@FedeGueli
Contributor

Hi @oobb45729 could you contact @thomaspeacock in DM please? here: https://twitter.com/PeacockFlu

@oobb45729

I don't use twitter. I don't understand. @FedeGueli

@FedeGueli
Contributor

@oobb45729, if you can, please try to get in touch with @thomaspeacock; unfortunately GitHub has no direct messaging. "We" would like to talk a bit about ORF8. Have a good time.
