XBB.1.5 + S:S494P from multiple states in USA (12 sequences) #1610

Closed
rquiroga7 opened this issue Feb 2, 2023 · 8 comments
Labels
XBB (proposed sublineage of XBB)

Comments

@rquiroga7
Contributor

Sub-lineage of: XBB.1.5
Earliest sequence: 2022-12-19
Most recent sequence: 2023-01-22
Countries circulating: USA (states: New York, Rhode Island, Connecticut, North Carolina, Massachusetts, Maryland, Florida)

The proposed lineage has ORF8:G8* and S:S494P. I am including the UShER tree, although for some reason it shows F486L even though GISAID has F486P for all these sequences. It also gets the 417 mutations wrong, since all the sequences I am proposing have S:K417N.

Genomes:

USA/MA-CDCBI-CRSP_IKXMYH37FEBGGBBQ/2023|OQ346588.1|2023-01-14
USA/NY-CDC-LC0990938/2023|OQ307674.1|2023-01-14
USA/NY-CDC-LC0978047/2022|OQ205374.1|2022-12-25
USA/RI-CDC-LC0992381/2023|OQ345257.1|2023-01-15
USA/NY-CDC-LC0992366/2023|OQ345214.1|2023-01-15
USA/NY-PRL-221221_00F15/2022|EPI_ISL_16343447|2022-12-19
USA/NY-ASC-210962515/2023|EPI_ISL_16576398|2023-01-03
USA/FL-CDC-LC0977122/2022|OQ206188.1|2022-12-29
USA/MD-HP43388-PIDIGBSHND/2023|EPI_ISL_16674195|2023-01-07
USA/MD-HP43230-PIDRNALJWU/2022|EPI_ISL_16674093|2022-12-29

Also, GISAID finds two additional, newer genomes for a total of 12 sequences:
EPI_ISL_16343447
EPI_ISL_16572620
EPI_ISL_16573542
EPI_ISL_16674093
EPI_ISL_16576398
EPI_ISL_16674195
EPI_ISL_16668777
EPI_ISL_16713029
EPI_ISL_16710218
EPI_ISL_16710261
EPI_ISL_16742093
EPI_ISL_16763385

UShER tree:
[image]

@AngieHinrichs
Member

> I am including the USHER tree although for some reason it shows F486L, although GISAID has F486P for all these sequences.

Sorry, there is a bug in the UShER web interface's generation of JSON files to display in nextstrain: mutations (like T23018C) are considered one at a time, even when there is more than one mutation in a codon (like both T23018C and T23019C).
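
To illustrate the effect described above, here is a minimal Python sketch (not UShER's actual code). It assumes Wuhan-Hu-1 coordinates with the spike CDS starting at nucleotide 21563 and a reference codon of TTT for S:486; the helper names `codon_start` and `aa_change` are made up for this example. Translating each substitution on its own gives F486L and F486S, while applying both to the same codon gives F486P.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

SPIKE_START = 21563  # assumption: first nucleotide of the spike CDS (Wuhan-Hu-1)


def codon_start(pos):
    """Genome position of the first base of the spike codon containing pos."""
    return pos - (pos - SPIKE_START) % 3


def aa_change(ref_codon, mutations):
    """Apply substitutions such as 'T23018C' that all fall in one spike codon
    and return the resulting amino-acid change, e.g. 'F486P'."""
    positions = [int(m[1:-1]) for m in mutations]
    start = codon_start(positions[0])
    if any(codon_start(p) != start for p in positions):
        raise ValueError("mutations are not in the same codon")
    codon = list(ref_codon)
    for m, pos in zip(mutations, positions):
        ref_base, alt_base = m[0], m[-1]
        if ref_codon[pos - start] != ref_base:
            raise ValueError(f"{m} does not match reference codon {ref_codon}")
        codon[pos - start] = alt_base
    residue = (start - SPIKE_START) // 3 + 1
    return f"{CODON_TABLE[ref_codon]}{residue}{CODON_TABLE[''.join(codon)]}"


REF_CODON_486 = "TTT"  # assumption: reference codon for S:F486
muts = ["T23018C", "T23019C"]

# Each mutation translated in isolation, as in the buggy display.
one_at_a_time = [aa_change(REF_CODON_486, [m]) for m in muts]
# Both mutations applied to the same codon before translating.
together = aa_change(REF_CODON_486, muts)
print(one_at_a_time, together)
```

Grouping substitutions by codon before translating is what distinguishes the two results; the F486L label in the tree corresponds to the first per-mutation translation.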

@rquiroga7
Contributor Author

Thank you @AngieHinrichs, that explains a lot!

@corneliusroemer
Contributor

ORF8:G8* is always part of XBB.1.5, and S:417N is also normal (if it's K that's most likely an artefact)

It's good to include as many context samples as possible in an UShER tree; yours is a bit sparse, making it hard to see whether the lineage is clean and where exactly it embeds.

UShER tells me your 12 sequences are part of 3 separate lineages:
https://next.nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_1b5ef_ca1910.json?f_userOrOld=uploaded%20sample&label=id:node_7948064
https://next.nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice2_genome_test_1b5ef_ca1910.json
https://next.nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice3_genome_test_1b5ef_ca1910.json?f_userOrOld=uploaded%20sample

The biggest one has 9 sequences at the moment. It's probably best to wait a bit longer to see which one grows, as otherwise we end up with hundreds of lineages.

@thomasppeacock added the XBB (proposed sublineage of XBB) label Feb 3, 2023
@rquiroga7 changed the title from "XBB.1.5 + ORF8:G8* + S:S494P from multiple states in USA (12 sequences)" to "XBB.1.5 + S:S494P from multiple states in USA (12 sequences)" Feb 3, 2023
@rquiroga7
Contributor Author

rquiroga7 commented Feb 3, 2023

I think the 3 separate UShER subtrees might be wrong, since the splits seem to come from what I believe are an incorrect 486L node and an incorrect 417N+417K node.

But I agree that waiting a bit is reasonable. Thank you, Cornelius.

@AngieHinrichs
Member

Looking at the full tree in taxonium, I think it's likely that the 3 branches should be separate. There are 144 sequences that have C22945T and A1230G like USA/MA-CDCBI-CRSP_IKXMYH37FEBGGBBQ/2023|OQ346588.1 (and then it adds T23042C/S:494P). Then there are more than 7,000 sequences that have T10204C, of which 189 sequences also have A24730T like USA/RI-CDC-LC0993565/2023|OQ353425.1, USA/NY-CDC-LC0978047/2022|OQ205374.1 and USA/NY-CDC-LC0990938/2023|OQ307674.1.

There's even a fourth XBB.1.5 > T17124C > ... > T23042C/S:494P branch in the 2023-02-02 tree: 26 sequences have C8655T, then one sequence has C8655T and G23628T (S:S689I), and then USA/PA-CDC-QDX45433024/2023|OQ329235.1|2023-01-09 adds T23042C/S:494P.

As @corneliusroemer said, it's helpful to use a larger subtree size. The UShER web interface's default number of samples per subtree (50) was set with small local outbreaks in mind. For analyzing potential lineages, it's better to use a much larger size like 1000 or 5000 in order to get as much context as possible.
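
The reasoning about separate branches can be sketched in a few lines of Python. This is only an illustration: the sample names and the exact ordering of the paths are hypothetical, the paths borrow mutation names from this comment but are simplified (intermediate steps are left out), and `independent_origins` is a made-up helper, not part of UShER or taxonium. Samples are grouped by the mutations that precede T23042C on their path from the XBB.1.5 root; each distinct prefix is treated as a separate acquisition of S:494P.

```python
from collections import defaultdict


def independent_origins(target, paths):
    """Group samples carrying `target` by the mutations that precede it
    on their root-to-sample path; each distinct prefix counts as one origin."""
    origins = defaultdict(list)
    for sample, path in paths.items():
        if target in path:
            prefix = tuple(path[: path.index(target)])
            origins[prefix].append(sample)
    return dict(origins)


# Hypothetical, simplified paths below XBB.1.5 (not real UShER output).
paths = {
    "sample_A": ["C22945T", "A1230G", "T23042C"],
    "sample_B": ["T10204C", "A24730T", "T23042C"],
    "sample_C": ["T10204C", "A24730T", "T23042C"],
    "sample_D": ["C8655T", "G23628T", "T23042C"],
    "sample_E": ["T10204C"],  # no S:494P, so it is ignored
}

origins = independent_origins("T23042C", paths)
for prefix, samples in origins.items():
    print(" > ".join(prefix), "->", samples)
print(len(origins), "independent origins of T23042C")
```

A mutation that shows up under several unrelated prefixes, as here, is a sign of homoplasy rather than a single clean lineage. Identical prefixes are only a proxy for a shared tree node, so a real analysis should rely on the tree itself.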

@rquiroga7
Contributor Author

Thank you for the input! Seems like S494P is actually appearing on multiple different lineages. Thank you also for the feedback regarding subtree sample number. Will keep that in mind.

@rquiroga7
Contributor Author

There are now 37 XBB.1.5* sequences with S494P in GISAID, but as @AngieHinrichs suggested, they appear to be from many separate branches. S494P is now also appearing in England and Denmark.

@corneliusroemer
Contributor

I'll close this due to homoplasy. If a single large S:S494P branch emerges, please open an issue for it specifically.

@corneliusroemer closed this as not planned Mar 19, 2023