XBB.2.3 Sublineage with S:G184V, ORF1a:R2159W #1776

Closed
ryhisner opened this issue Mar 18, 2023 · 9 comments
Labels: designated, XBB (proposed sublineage of XBB)

@ryhisner

Description

Sub-lineage of: XBB.2.3
Earliest sequence: 2023-1-18, India, Gujarat — EPI_ISL_16743861
Most recent sequence: 2023-3-6, Singapore — EPI_ISL_17207252, EPI_ISL_17207254, EPI_ISL_17207255, EPI_ISL_17207257, EPI_ISL_17207258, EPI_ISL_17207261 (four local cases, two with no travel/local information)
Countries circulating: India (13), Singapore (13), USA (8)
Number of Sequences: 34
GISAID Query: C6740T, G22113T, T23018C
CovSpectrum Query: C6740T, G22113T, T23018C
Substitutions on top of XBB.2.3:
Spike: G184V
ORF1a: R2159W (NSP3_R1341W)
Nucleotide: C6740T, G22113T
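
As a quick sanity check on the numbering above, here is a minimal sketch that maps the two nucleotide positions to codon numbers. It assumes Wuhan-Hu-1 (NC_045512.2) coordinates, with ORF1a starting at nucleotide 266, spike at 21563, and NSP3 beginning at ORF1a residue 819; the function and constant names are just illustrative.

```python
# Reference coordinates on Wuhan-Hu-1 (NC_045512.2), 1-based (assumed here).
ORF1A_START = 266          # first nucleotide of the ORF1a start codon
SPIKE_START = 21563        # first nucleotide of the spike start codon
NSP3_FIRST_RESIDUE = 819   # ORF1a residue at which NSP3 begins


def codon_of(nt_pos, gene_start):
    """Return (1-based codon number, 1-based position within that codon)."""
    offset = nt_pos - gene_start
    return offset // 3 + 1, offset % 3 + 1


orf1a_codon, orf1a_phase = codon_of(6740, ORF1A_START)
spike_codon, spike_phase = codon_of(22113, SPIKE_START)
nsp3_residue = orf1a_codon - NSP3_FIRST_RESIDUE + 1

print(orf1a_codon, orf1a_phase)  # ORF1a codon 2159, first codon position
print(nsp3_residue)              # NSP3 residue 1341
print(spike_codon, spike_phase)  # spike codon 184, second codon position
```

So C6740T hits the first base of ORF1a codon 2159 (NSP3 1341) and G22113T hits the second base of spike codon 184, consistent with ORF1a:R2159W and S:G184V.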

USHER Tree
As usual, the long branches in the Usher tree below are due to artifactual reversions & should be disregarded.
https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/XBB.2.3_G184V_ORF1aR2159W_subtreeAuspice1_genome_1ec30_5de800.json
image

Evidence
This lineage only very recently appeared and appears to be growing quickly. It seems to have originated in India and subsequently spread to Singapore and the US. Singapore uploads sequences faster than anyone, so I think this likely has spread elsewhere in Asia and will start to show up in sequences in the coming weeks.
Mutations from S:180 to S:186 have been ubiquitous in recent months, so it seems likely they confer a very modest increase in antibody evasion.

ORF1a:R2159W is more interesting. Overall, it has been a rare mutation throughout the pandemic. It was in an undesignated B.1 saltation lineage that rose to about 5% prevalence in Canada in late 2020/early 2021, and it was also in DN.1 (BQ.1.1.5 + S:K147N) and DN.1.1 (DN.1 + S:Y453F, ORF1a:V721I, ORF1a:N2752S). It involves a C->T mutation, which is easily the most common type. However, C->T mutation frequency is mostly driven by APOBEC, and the nucleotide context at C6740 (CA upstream and GG downstream) is unfavorable for APOBEC, which prefers A or T in the two closest upstream & downstream nucleotides.
image
Image below from "Evidence for host-dependent RNA editing in the transcriptome of SARS-CoV-2," by Di Giorgio, et al.
https://www.science.org/doi/10.1126/sciadv.abb5813
image
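
For anyone who wants to apply the same rule of thumb to other sites, here is a rough sketch. It only encodes the heuristic stated above (A or T preferred at the two nearest upstream and two nearest downstream positions) and is not a validated model of APOBEC editing; the function name is made up for illustration.

```python
def at_flank_count(upstream2, downstream2):
    """Count how many of the four flanking bases are A or T (U treated as T)."""
    flanks = (upstream2 + downstream2).upper().replace("U", "T")
    return sum(base in "AT" for base in flanks)


# Context at C6740 as described above: CA upstream, GG downstream.
count = at_flank_count("CA", "GG")
print(f"{count} of 4 flanking bases are A/T")
```

Under this heuristic only one of the four flanking bases at C6740 is A/T, which is why the context looks unfavorable.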

Genomes

Genomes EPI_ISL_16743861, EPI_ISL_16743865, EPI_ISL_16975274, EPI_ISL_17078585-17078586, EPI_ISL_17094219, EPI_ISL_17094293, EPI_ISL_17146442, EPI_ISL_17180839, EPI_ISL_17181019, EPI_ISL_17182937, EPI_ISL_17190102, EPI_ISL_17191687, EPI_ISL_17198378, EPI_ISL_17206421, EPI_ISL_17207252-17207258, EPI_ISL_17207261, EPI_ISL_17207269, EPI_ISL_17229372, EPI_ISL_17229380, EPI_ISL_17236292, EPI_ISL_17236472, EPI_ISL_17236495, EPI_ISL_17236498, EPI_ISL_17237297, EPI_ISL_17238401, EPI_ISL_17238638, EPI_ISL_17240910
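
If you want to paste the list above into a script, here is a small sketch that expands the range shorthand into individual accession IDs. It assumes a range like EPI_ISL_17207252-17207258 is inclusive at both ends; the function name is just illustrative.

```python
import re


def expand_epi_ids(text):
    """Expand 'EPI_ISL_X' and 'EPI_ISL_X-Y' tokens into a flat list of IDs."""
    ids = []
    for first, last in re.findall(r"EPI_ISL_(\d+)(?:-(\d+))?", text):
        start = int(first)
        end = int(last) if last else start  # single ID when no range is given
        ids.extend(f"EPI_ISL_{n}" for n in range(start, end + 1))
    return ids


sample = "EPI_ISL_17078585-17078586, EPI_ISL_17094219, EPI_ISL_17207252-17207258"
expanded = expand_epi_ids(sample)
print(len(expanded))
print(expanded[:3])
```

Run on the full list above, this gives 34 IDs, matching the sequence count in the description.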
@FedeGueli
Contributor

I was thinking of proposing it too this morning along with #1775! Great that you found it and proposed it.
Definitely worth monitoring XBB.2.3.

@ryhisner
Author

Just took a closer look at the collection dates, and 18/34 sequences in this lineage have collection dates of March 1 or later, which is pretty remarkable and may indicate rapid growth.

@thomasppeacock added the XBB label (proposed sublineage of XBB) on Mar 19, 2023
@oobb45729

Interestingly, the result from this paper suggests that C>U mutations prefer G at the +1 position.
https://www.mdpi.com/1999-4915/13/3/394

@ryhisner
Author

Huh, that's interesting. I wasn't aware of that paper. Very different from the other one. @oobb45729, do you know of any other papers that measure the mutational context of C->T (and others)? I'd like to look at as many results as possible to see which of these two papers is more accurate as the results are very different.

Bloom Lab showed that the rate of G->T mutations was roughly halved in Omicron compared to pre-Omicron lineages. I wonder if there is any difference in favored mutational contexts for different mutations.

@oobb45729

One paper counts from RNA sequencing datasets from bronchoalveolar lavage fluids obtained from patients diagnosed with COVID-19 while the other paper counts from a phylogenetic tree created from genome sequences from the GISAID.
Maybe this is why there's a difference?

There's another paper, "Mutation rates and selection on synonymous mutations in SARS-CoV-2."

@FedeGueli
Contributor

56 as of today: big uploads from Singapore and Gujarat, plus one sequence from Guangdong.
@thomasppeacock I think this should be prioritized.

@ryhisner
Author

34 sequences of this lineage were uploaded today, 27 of them from Singapore.

Also, could we get a milestone attached to this issue? Thanks.

@oobb45729

I did my own calculation based on the data from https://github.com/jbloomlab/SARS2-mut-fitness.
The result is similar to the picture above. @ryhisner
A>G:
highly favor U at +1, G at +1; favor G at -2, C at +2; highly disfavor A at +1; disfavor C at +1, C at -2.
A>U:
highly favor C at -1; favor G at +1; highly disfavor C at +1; disfavor U at -1, A at -1, A at -2, G at -1.
A>C:
highly favor G at -1; favor G at +1, U at +1, G at +2; highly disfavor U at -1; disfavor C at +1, C at -1, A at +1.
C>U:
highly favor G at +1; favor A at -1; disfavor G at -1, C at -1.
C>A:
highly favor C at -1; favor U at +1, C at +2; highly disfavor G at -1; disfavor A at +1, U at -2, U at +2.
C>G:
favor A at +1, U at -1; disfavor G at -1, C at +1, G at -2.
G>U:
highly favor C at -1; favor A at -1, G at -2; highly disfavor G at -1; disfavor U at -2.
G>A:
highly favor C at -1, G at +1; favor C at +2; disfavor A at +1, U at -1, G at +2.
G>C:
favor A at +1, C at -1, C at -2, A at +2; highly disfavor C at +1; disfavor G at -1, G at -2, G at +1, G at +2.
U>C:
highly favor A at -1; favor C at +1, G at -2, C at +2, G at +1, A at +2; highly disfavor G at -1; disfavor U at -1, U at +1, G at +2, U at -2, U at +2.
U>A:
highly favor G at +1; favor C at +2, C at -1, G at +2; highly disfavor U at +1; disfavor G at -1, A at +1, U at +2, C at +1, A at -1, U at -2.
U>G:
highly favor C at +1; favor C at -1; highly disfavor A at +1; disfavor G at +1, G at -1.
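
To make the C>U row easier to apply to a specific site, here is a minimal sketch that looks up each flanking position against the C>U entries listed above. The dictionary just transcribes that row; the "not listed" label for anything absent from it is my own placeholder, and the example context is the one given for C6740 in the opening post (CA upstream, GG downstream).

```python
# C>U row from the list above: (offset, base) -> reported tendency.
C_TO_U_RULES = {
    (+1, "G"): "highly favored",
    (-1, "A"): "favored",
    (-1, "G"): "disfavored",
    (-1, "C"): "disfavored",
}


def classify_context(context):
    """Map each offset (-2, -1, +1, +2) to the listed tendency for its base."""
    result = {}
    for offset, base in context.items():
        base = base.upper().replace("T", "U")  # accept DNA or RNA letters
        result[offset] = C_TO_U_RULES.get((offset, base), "not listed")
    return result


# C6740 context per the opening post: C at -2, A at -1, G at +1, G at +2.
c6740 = classify_context({-2: "C", -1: "A", +1: "G", +2: "G"})
for offset in sorted(c6740):
    print(f"{offset:+d}: {c6740[offset]}")
```

By this table, the C6740 context has a favored base at -1 and a highly favored base at +1, which is the opposite of what the A/T rule of thumb in the opening post suggests.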

@ryhisner
Author

ryhisner commented Apr 5, 2023

Awesome, thanks for that analysis, @oobb45729! It's very interesting to contrast those findings with what we found in our preprint on molnupiravir sequences for C->T.

• At +1: A favored, G very slightly favored, C disfavored, T highly disfavored
• At -1: C disfavored, T disfavored, G highly favored

Biggest contrast is G at -1, which has the largest positive effect of any preceding or following nucleotide in molnupiravir sequences but is disfavored in all other analyses.

image
