XCC - CH.1.1.1/XBB.1.9.1 recombinant, 32 samples from Pakistan, Australia, England and China. #1876
Comments
Also @DoropFan discovered this a couple days after your great spot! Always good when we are more than one on a new thing. Kudos Josette
Thank you for the catch! Please note that in NSW we sequence a mix of random community and targeted samples... so that 4% should not be interpreted as community frequency.
Yes, I know. Thank you.
An extra one from Pakistan/Karachi. EPI_ISL_17474691
6 new ones from NSW. I have changed the query to catch them all: -18583A, C1545T, C5183T, T28297C. I made a graph for NSW. Keep in mind that week 14 has only around a quarter of the number of samples sequenced in earlier weeks. This means its percentages will still change, and a cluster would bias them more than it would in other weeks. That said, I do think it would be a good idea to name this one, also because it is currently mistaken for XBL, XBB.1.9.1, XBB.1 or XBB.1.22 (latest Nextclade vercel app version). Here is the new list of now 25 samples:
The apparent leap in frequency of this lineage in NSW data is due to specific targeted sampling (and not of overseas travellers). It does not represent community frequency and should not be interpreted as such; the plot of growth in NSW is therefore quite misleading due to sampling bias. Within the next few weeks, the numbers should more accurately reflect community cases.
Thank you, @learithe, for this comment. I do not know if you are the right person to ask, but it would be very useful if the "Sampling strategy" field of the samples in GISAID were filled with comments like "local cases" or "community surveillance", or on the other hand "travelling from:", "hospitalized", "targeted" or something else that is appropriate. Several countries do that, and besides giving insight into what's going on in countries with a limited amount of sequencing through airport surveillance, it makes it much easier to establish reliable growth figures. The latter is also because GISAID offers the option to select certain samples by searching for a specific word.
@JosetteSchoenma, not all sequence submitters have ethics/organisational/legal approval to release that kind of information on a per-sample basis... and yes, that is unfortunate from a global surveillance standpoint, since so many places have complex and frequently changing sampling strategies/biases now. I'm doing what I can by giving you the heads up here, because it's an interesting lineage. :) Best to contact the submitting labs if you require more specific epidemiological information!
@learithe Yes, thank you very much for your information.
@JosetteSchoenma ICYMI, it seems samples from China are no longer being shared via GISAID, but rather by a new site: GenBase
Thank you for the info and link @Mike-Honey.
@JosetteSchoenma, yes, I personally am one of the bioinformaticians at one of the NSW Health Pathology laboratories (ICPMR) and I occasionally submit sequences to GISAID for NSW/ICPMR. That is why I know that these are not random community samples, and I wanted to warn about that so the data isn't misinterpreted. It is also why I am replying to you the way I am. ;) If you need more information about these sequences or want to make a case for the NSW government to publicly release more epidemiological information on a per-sample basis, feel free to contact the submitter (which is not actually me personally for the majority of NSW sequences, including for this lineage) through the appropriate channel (the "contact submitter" link in GISAID), with identifying information about you and your work and how you can be contacted (your GitHub and Twitter profiles provide no information about you beyond your name), which can then be forwarded to people who could approve releasing/sharing the details that you are asking for (which is not my purview). Again, it's an interesting lineage, thank you for noting and tracking it here and we hope it gets a designation. We're curious how it will truly compete with everything else here in NSW, since its parents have both made a strong showing! And I'll keep commenting here if the lack of metadata for this lineage continues to be an issue for interpretation. :)
@learithe Thank you for your response. I will send you a message through GISAID. If it does not reach you, my email address is [email protected]. 6 new ones from NSW (Australia):
@learithe Thanks! The concerns re detailed de-anonymizing information are totally understandable. Something as simple as a binary "random" vs "targeted" (could be traveller or whatever else) would already be very informative without revealing much that is potentially identifying.
One found in Denmark. EPI_ISL_17541339
XCC
Description: CH.1.1.1/XBB.1.9.1 recombinant
Private mutations: C1545T, C3857A
Breakpoint: between nucleotides 12444 and 12789, near the end of ORF1a, so Spike comes from XBB.1.9.1.
Earliest sequence: 2023-02-12 from Karachi/Pakistan
Most recent sequence: 2023-04-01 from New South Wales/Australia
Countries circulating: Pakistan (Karachi 6, Gilgit 1), England 1, Australia (New South Wales 9)
GISAID query: -18583A, C1545T, C5183T, T28297C
CovSpectrum query: https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?variantQuery=%2118583A+%26+C1545T+%26+C5183T+%26+T28297C&
Please look through the open data as well!
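For anyone who wants to re-run or adapt the search, here is a minimal Python sketch of how the CovSpectrum link above can be assembled from the mutation list. The helper names (`build_query`, `covspectrum_url`) are hypothetical, and the URL pattern is simply copied from the link in this issue rather than taken from any CovSpectrum API documentation, so treat it as an assumption that may change.

```python
from urllib.parse import quote_plus

# Base path copied from the link in this issue (assumption: the pattern stays stable).
BASE = "https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants"


def build_query(required, excluded=()):
    """Join mutations into a query string; excluded mutations get a '!' prefix."""
    terms = [f"!{mutation}" for mutation in excluded] + list(required)
    return " & ".join(terms)


def covspectrum_url(required, excluded=()):
    """URL-encode the query and append it as the variantQuery parameter."""
    return f"{BASE}?variantQuery={quote_plus(build_query(required, excluded))}&"


# Mutations from the GISAID/CovSpectrum query in this issue.
url = covspectrum_url(["C1545T", "C5183T", "T28297C"], excluded=["18583A"])
print(url)
```

Note that the GISAID query writes the exclusion as `-18583A`, while the CovSpectrum link writes it as `!18583A`; the sketch follows the CovSpectrum form.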
Please note that, even though absolute numbers are small, most samples have been uploaded very recently. In the latest week with sequenced samples from Karachi (week 11), 4 out of 20 sequenced samples (20%) were this recombinant. In the latest week in NSW (week 13), 7 out of 157 (about 4%) were.
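Because the weekly counts are small, it may help to look at the uncertainty around these percentages. The sketch below recomputes the two shares and adds a 95% Wilson score interval; the function name `wilson_interval` is my own, and the interval assumes the samples are a random draw from community cases, which (as pointed out in the comments for NSW) is not guaranteed.

```python
import math


def wilson_interval(hits, total, z=1.96):
    """Return the (low, high) Wilson score interval for hits out of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = hits / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


# Counts quoted in this issue: (recombinant samples, all sequenced samples).
weekly_counts = {
    "Karachi, week 11": (4, 20),
    "NSW, week 13": (7, 157),
}

for label, (hits, total) in weekly_counts.items():
    low, high = wilson_interval(hits, total)
    print(f"{label}: {hits}/{total} = {hits / total:.1%} "
          f"(95% interval {low:.1%} to {high:.1%})")
```

With only 20 sequences in the Karachi week the interval is wide, so the 20% figure is best read as a rough indication rather than a precise frequency.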
Evidence:
Nextclade for comparison with CH.1.1.1 and XBB.1.9.1:
Usher tree:
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_d59e_5af640.json?branchLabel=nuc%20mutations&c=Nextstrain_clade&label=id:node_5871347
Genomes:
EPI_ISL_17441155
EPI_ISL_17441163
EPI_ISL_17441166
EPI_ISL_17408880
EPI_ISL_17409015
EPI_ISL_17408852
EPI_ISL_17408846
EPI_ISL_17408851
EPI_ISL_17408969
EPI_ISL_17408840
EPI_ISL_17409014
EPI_ISL_17375487
EPI_ISL_17358926
EPI_ISL_17268855
EPI_ISL_17205126
EPI_ISL_17099006
EPI_ISL_17030078