New Lineages Update: 20 new lineages active through 2023-02-15 #193

jmcbroome · 2023-02-16T23:53:44Z

Lineage Name	Parent Lineage	Size	Exponential Growth Coefficient CI	Earliest Appearance	Latest Appearance	Regions	Nucleotide Changes	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Open Sequence FASTA	EPI ISLs	Amino Acid Changes	Nucleotide Reversions
XBB.1.9.2.1	XBB.1.9.2	111	[0.46792743 0.7430019 ]	2023-01-02	2023-02-08	Austria, Australia, and Germany	C28928T,G23401T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	S:Q613H,N:L219F	No Reversions
XBB.1.5.2.1	XBB.1.5.2	59	[0.36211221 0.73847185]	2022-12-28	2023-02-01	USA	A22002T,C5221T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	S:K147I	No Reversions
XBB.3.1.1	XBB.3.1	17	[0.35314583 0.91739771]	2022-12-04	2023-02-04	Denmark and Netherlands	C10263T,C9042T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:S2926F,ORF1ab:A3333V	No Reversions
XBM.1	XBM	219	[0.35292579 0.46126096]	2022-11-20	2023-02-07	Canada	T18429C,C12534T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:T4090I	No Reversions
XBK.1	XBK	180	[0.28522462 0.38813739]	2022-07-27	2023-02-08	Slovenia, Germany, and Italy	C25046T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	S:P1162S	No Reversions
XAY.2.3	XAY.2	82	[0.28382549 0.43708748]	2022-11-25	2023-02-08	Denmark	A22034G,C657T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	ORF1ab:A131V,S:R158G	No Reversions
XBF.3.1	XBF.3	90	[0.28305407 0.42545791]	2022-11-02	2023-02-04	Netherlands, Australia, and England	C842T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	ORF1ab:P193S	No Reversions
XBB.1.4.2	XBB.1.4	40	[0.23443399 0.55073977]	2022-10-16	2023-02-06	Italy, Thailand, and Austria	G1148T,T26160C	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:G295C	No Reversions
XBB.1.5.11	XBB.1.5	860	[0.23090608 0.27395253]	2022-11-09	2023-02-07	USA, England, and Canada	T17124C	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	S:V252G	T22317G
XBB.1.9.3	XBB.1.9	62	[0.19420051 0.7738821 ]	2022-12-07	2023-02-04	Netherlands, Spain, and England	G18169T,G19480A	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:G5969C,ORF1ab:G6406S	No Reversions
XBB.6.1.1	XBB.6.1	48	[0.18384719 0.40605432]	2022-11-27	2023-02-03	USA	C19895T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:A6544V	No Reversions
XBB.1.15	XBB.1	3385	[0.16053001 0.17860341]	2022-02-16	2023-02-08	USA, Guatemala, and Mexico	G27915T,C1884T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:A540V,ORF8:G8*	No Reversions
XBB.1.5.7.1	XBB.1.5.7	114	[0.12235608 0.27905442]	2022-12-01	2023-02-04	USA, Germany, and Mexico		View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	ORF1ab:F4649V,S:N417K	T14209G,T22813G
XBC.1.3.1	XBC.1.3	60	[0.10071042 0.21145987]	2022-10-05	2023-02-07	Australia	C6145T,G15743A	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	ORF1ab:S5160N	No Reversions
XBB.1.4.1.1	XBB.1.4.1	275	[0.08893083 0.16888427]	2022-10-30	2023-02-03	Sweden and Denmark	C12741T,C14922T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:T4159I	No Reversions
XBB.2.2.1	XBB.2.2	93	[0.04227181 0.27881254]	2022-10-18	2023-02-04	Spain, Germany, and England	G27870T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF7b:E39*	No Reversions
XBF.4	XBF	80	[0.03033164 0.20510676]	2022-12-01	2023-02-07	England, Luxembourg, and Iceland	G625T,C1514T,C4252T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:H417Y,ORF1ab:K120N	No Reversions
XBB.2.5	XBB.2	201	[-0.03259708 0.05205796]	2022-10-31	2023-02-03	USA, England, and India	G23401T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	S:Q613H	No Reversions
miscBA.5.2CJ.1.1	miscBA.5.2CJ.1	114	[-0.03782871 0.07828514]	2022-10-27	2023-02-07	Japan and England	G14829T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	No Data Available	Get EPI ISLs	ORF1ab:M4855I	No Reversions
XBB.1.9.1.1	XBB.1.9.1	73	[-0.04567397 0.36771319]	2023-01-02	2023-02-08	Germany, USA, and France	G16741T	View On Cov-Spectrum	View On Taxonium (Public Samples Only)	Download Example Sequence FASTA (LAPIS)	Get EPI ISLs	ORF1ab:V5493F	No Reversions

jmcbroome · 2023-02-17T00:27:21Z

Immediate note: a lot of these look fairly good, though XBB.1.5.7.1 is probably spurious: it's defined by two reversion mutations.

AngieHinrichs · 2023-02-17T02:36:07Z

> though XBB.1.5.7.1 is probably spurious: it's defined by two reversion mutations.

Definitely -- the first reversion (T22813G) is a common dropout-reversion and the second reversion (T14209G) reverts the defining mutation of XBB.1.5.7!

AngieHinrichs · 2023-02-17T02:38:18Z

I see its cell is empty in the "Nucleotide changes" column, which I assume means "no nucleotide changes that aren't reversions" -- that might be a good reason to not propose a lineage.

AngieHinrichs · 2023-02-17T02:42:49Z

I get empty text from the link Get EPI ISLs for XBB.1.9.2.1. The headers look OK (aside from content-length: 0):

HTTP/2 200 
server: nginx/1.18.0 (Ubuntu)
date: Fri, 17 Feb 2023 02:40:41 GMT
content-type: text/plain
content-length: 0
vary: Origin
vary: Access-Control-Request-Method
vary: Access-Control-Request-Headers
lapis-data-version: 1676538737
content-disposition: inline
cache-control: no-store

-- just no EPI_ISLs.

AngieHinrichs · 2023-02-17T02:49:24Z

Oh, maybe cov-spectrum doesn't yet have XBB.1.9.2. It was designated on Feb. 3, and added to the tree as of the 2023-02-25 build, but there hasn't yet been a pangolin data release that would include it (I'm working on that...). Or @corneliusroemer does CoV-Spectrum use nextclade and does that have XBB.1.9.2?

[Edit: yep, if I alter the URL to have XBB.1.9 instead of XBB.1.9.2 then it returns plenty of IDs. I'm not suggesting that you do that! Just pointing it out as a workaround that humans looking into these can do.]
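For people checking these links by hand, the workaround described above (retrying the query with a shorter lineage name) can be sketched as follows. This is only an illustration: `fetch_ids` is a hypothetical callable standing in for whatever returns the IDs for a lineage name, and plain name truncation does not follow aliases.

```python
from typing import Callable, Optional


def ancestor_names(lineage: str) -> list[str]:
    """Return the lineage name followed by each shorter dotted prefix."""
    parts = lineage.split(".")
    return [".".join(parts[:i]) for i in range(len(parts), 0, -1)]


def first_name_with_data(
    lineage: str, fetch_ids: Callable[[str], list[str]]
) -> Optional[tuple[str, list[str]]]:
    """Try the lineage itself, then each ancestor, until one returns IDs.

    fetch_ids is a hypothetical lookup (assumption, not a real API).
    """
    for name in ancestor_names(lineage):
        ids = fetch_ids(name)
        if ids:
            return name, ids
    return None


# Stub lookup with placeholder IDs, only to show the control flow.
fake_database = {"XBB.1.9": ["ID_A", "ID_B"]}
result = first_name_with_data("XBB.1.9.2", lambda name: fake_database.get(name, []))
print(result)
```

A result found under an ancestor name covers more than the requested lineage, so it still needs to be narrowed down by mutation afterwards.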

AngieHinrichs · 2023-02-17T02:56:13Z

Anyway, the XBB.1.9.2.1 proposal is good and it already has a pango-designation issue (as of 9 hours ago): cov-lineages#1664 -- except that doesn't include S:Q613H as written, and Fede Gueli pointed out the S:Q613H so good call there. 🙂

aineniamh · 2023-02-17T13:44:38Z

This looks really great. If there is a set of commonly seen reversion mutations, should we maintain a list and perhaps rule out auto-lineage suggestions based on these mutations?

corneliusroemer · 2023-02-17T14:00:17Z

I'll review these carefully over the weekend. @FedeGueli, @Sinickle, @ryhisner if you'd like to have a look at the proposed lineages, it'd be great to have some extra eyes!

jmcbroome · 2023-02-17T17:33:57Z

After seeing XBB.1.5.7.1 in this test, I added a filter to block proposals that have empty mutation sets with respect to their parents, i.e. that are defined by reversions only.
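A minimal sketch of such a filter, assuming each proposal is available as a row with the "Nucleotide Changes" and "Nucleotide Reversions" cells from the table above. The `Proposal` class and function names are hypothetical and not taken from the actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    name: str
    nucleotide_changes: str  # comma-separated forward changes, "" if none
    reversions: str          # comma-separated, or "No Reversions"


def parse_mutations(field: str) -> list[str]:
    """Split a table cell into mutation strings; placeholders count as empty."""
    field = field.strip()
    if not field or field == "No Reversions":
        return []
    return [m.strip() for m in field.split(",") if m.strip()]


def is_reversion_only(proposal: Proposal) -> bool:
    """True when the proposal has no forward changes relative to its parent."""
    return len(parse_mutations(proposal.nucleotide_changes)) == 0


def split_proposals(proposals: list[Proposal]) -> tuple[list[Proposal], list[Proposal]]:
    """Return (kept, blocked), preserving input order."""
    kept = [p for p in proposals if not is_reversion_only(p)]
    blocked = [p for p in proposals if is_reversion_only(p)]
    return kept, blocked


rows = [
    Proposal("XBB.1.9.2.1", "C28928T,G23401T", "No Reversions"),
    Proposal("XBB.1.5.7.1", "", "T14209G,T22813G"),
    Proposal("XBB.1.5.11", "T17124C", "T22317G"),
]
kept, blocked = split_proposals(rows)
print([p.name for p in blocked])
```

Note that a rule of this shape would still let XBB.1.5.11 through, because that proposal carries one forward change alongside its reversion.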

I'd appreciate additional eyes on this, of course, and if you want to designate any of the proposals here, feel free, but this is still intended as a test PR. I'm planning on opening one directly to the pango-designation repo this weekend, so that with a quick bit of review it can be merged and the update will be complete in a single button press!

AngieHinrichs · 2023-02-17T17:45:44Z

> After seeing XBB.1.5.7.1 in this test, I added a filter to block proposals that have empty mutation sets with respect to their parents, i.e. that are defined by reversions only.

Awesome! I realized after I asked for a filter that it would also be useful to be alerted that there's a faulty-looking branch like that... sometimes by removing a few problematic sequences and re-optimizing, I can get the sequences placed on a better branch so at least the reversion on the parent lineage-defining mutation is not necessary. Sorry to keep asking for things, but would it be possible to call out branches that were filtered for that reason?

jmcbroome · 2023-02-17T17:51:35Z

I could add a logging file to the pipeline that does it with minimal effort, though I don't think it really fits with the actual pull request. Tracking that kind of branch is a good idea in general; it could be a small project to set up an automated system that scans daily builds for reversion-heavy paths and emails or posts issues when they're detected. This idea is related to some thoughts I've had around tracking saltations that might be due to Molnupiravir treatment and similar, actually.
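One possible shape for such a log, sketched with the standard `csv` module. The column names and the "reversion-only" reason label are assumptions made for illustration; the actual pipeline may record something different.

```python
import csv
import io

# Hypothetical column layout for the log of filtered proposals.
LOG_FIELDS = ["proposed_lineage", "parent_lineage", "reason", "reversions"]


def write_filtered_log(entries, handle) -> None:
    """Write one tab-separated row per filtered proposal to an open text handle."""
    writer = csv.DictWriter(
        handle, fieldnames=LOG_FIELDS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(entry)


entries = [
    {
        "proposed_lineage": "XBB.1.5.7.1",
        "parent_lineage": "XBB.1.5.7",
        "reason": "reversion-only",
        "reversions": "T14209G,T22813G",
    }
]

# An in-memory buffer stands in for the log file here.
buffer = io.StringIO()
write_filtered_log(entries, buffer)
print(buffer.getvalue())
```

Keeping the log separate from the pull request means the PR stays limited to proposals meant for review, while the filtered branches remain available for anyone who wants to re-optimize the tree around them.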

AngieHinrichs · 2023-02-17T17:59:30Z

> I could add a logging file to the pipeline that does it with minimal effort

That would be great, thanks!

FedeGueli · 2023-02-17T23:47:21Z

> I'll review these carefully over the weekend. @FedeGueli, @Sinickle, @ryhisner if you'd like to have a look at the proposed lineages, it'd be great to have some extra eyes!

Very willing to. How should I do it? By commenting directly down here?

First, here are my thoughts on the ones I already "looked at" in the last few weeks:
1. XBB.1.9.2.1: already proposed, OK.
2. I noticed XBB.1.5 + S:K147I rising steeply too, so the second one is clearly OK as well. BUT there are two defining mutations, S:T284I and S:K147I (I think it is already bigger today than shown here). Please note that it has further gained S:E619K, very similar to the S:E619Q designated in one of the first sublineages of BQ.1.
EDITED: it is already XBB.1.5.2. I think it may be better to start that lineage from S:K147I than to designate a new sublineage.
3. XBM.1: I already looked at it and commented a bit, I think in one issue about XBM + 455F. It is clearly the fittest branch of XBM, so designating it could make sense. The tree is clear.
4. miscBA.5.2CJ.1.1 was the one I proposed, named "third recombinant with CJ.1" or something similar. I closed it after we concluded there was no way to be sure it was not just XBK, even though no hypothetical ancestor without the defining mutations of both was ever sampled. To me it is OK to designate.

Now I'll finish the work on the XBB.1.5 spike issue and then come back here.

FedeGueli · 2023-02-18T09:19:37Z

XBB.3.1.1: I think it is just the sequencing intensity of Denmark versus the other countries that makes it appear to grow faster. Or does your model already account for that? Note that this designated lineage additionally has the ORF6:61 mutation reverted; maybe its advantage comes from there.
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3716a_967d0.json?c=country&label=id:node_8000678

FedeGueli · 2023-02-18T09:49:57Z

XBK.1, defined by S:P1162S: I compared it to the main branches and sub-branches of XBK on CoV-Spectrum. While it is likely less transmissible than the big branch with C2701T, which should be designated (it is not UK or DK), it has a relevant advantage over the other branches. But in my estimation its advantage arises deeper in the tree, starting after C14694A (ORF1b:D409E), and not immediately after S:P1162S.

https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?nucMutations=C2701T&nextcladePangoLineage=XBK*&aaMutations1=S%3A1162S&nucMutations1=C14694A&nextcladePangoLineage1=XBK*&analysisMode=CompareToBaseline&

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3b372_9afe0.json?c=country&label=id:node_8300141

FedeGueli · 2023-02-18T10:26:33Z

XAY.2.3, with ORF1a:A131V, is faster than the recently designated XAY.2.1 and XAY.2.2 in the upper branch of the tree. Designation deserved.
https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?aaMutations=Orf1a%3An447S&nextcladePangoLineage=XAY*&aaMutations1=ORF1a%3AA131V%2CS%3AR158G&nextcladePangoLineage1=XAY*&analysisMode=CompareToBaseline&
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3f9bf_a0400.json?label=id:node_8649634

FedeGueli · 2023-02-18T10:36:40Z

XBF.3.1 is slower than XBF.3. I would not designate it. Although it is a significant branch of XBF.3, it is mainly a Netherlands sublineage, with visible clustering after C1204T as well, which could explain why your model flagged it.

https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?aaMutations=orf1a%3Av274I&nextcladePangoLineage=XBF*&aaMutations1=ORF1a%3AP193S&nextcladePangoLineage1=xbf*&analysisMode=CompareToBaseline&
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_12ec0_a8a20.json?label=id:node_8294552

FedeGueli · 2023-02-18T10:45:06Z

XBB.1.4.2: the UShER tree is clean, but I think the interesting part is just the Austrian branch with S:G75S, which is recent and very fast versus the parent.

I tested the S:E1188D part, but it doesn't show any sign of advantage:
https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?nucMutations=G1148T%2CT26160C&nextcladePangoLineage=XBB.1.4*&aaMutations1=S%3Ag75S&nucMutations1=G1148T%2CT26160C&analysisMode=CompareToBaseline&
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3716a_967d0.json?c=country&label=id:node_8000678

FedeGueli · 2023-02-18T11:22:49Z

XBB.1.5.11 has as its defining mutation the reversion of the defining mutation (S:G252V) of XBB.1. Very unlikely to me. @AngieHinrichs

FedeGueli · 2023-02-18T11:34:36Z

XBB.1.9.3: although slower than its top-growing sibling lineages .1 and .2, it seems clearly faster than the other XBB.1.9 branches.
Interestingly, it gained the further spike mutation S:L212S, but surprisingly that seems to slow it down (the opposite of what we saw in BA.2 with @corneliusroemer in the first quarter of 2022).
https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?nucMutations=C11758T&nextcladePangoLineage=XBB.1.9*&nucMutations1=G18169T%2CG19480A&nextcladePangoLineage1=XBB.1.9*&analysisMode=CompareToBaseline&

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_1a253_b5d60.json?c=userOrOld&label=id:node_8017574

FedeGueli · 2023-02-18T11:37:37Z

I'll try to finish the second half of the lineages later.
@AngieHinrichs @corneliusroemer My personal view, as a basic variant seeker (so not speaking to any bioinformatics aspect): the overall quality of the lineages picked up is good. I suggest excluding reversions from the game, maybe just for the first phase.

FedeGueli · 2023-02-18T14:25:41Z

XBB.6.1.1: I don't see a valid reason to designate it beyond the fact that it is the only big branch of XBB.6.1.
No growth advantage.
https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?nextcladePangoLineage=XBB.6.1*&nucMutations1=C19895T&nextcladePangoLineage1=XBB.6.1*&analysisMode=CompareToBaseline&
The tree is clear but not very interesting:
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_30cc7_d3620.json?c=userOrOld&label=id:node_7998784

FedeGueli · 2023-02-18T14:31:38Z

XBB.1.15: OK, 3000+ sequences, but why designate it? It could stay XBB.1 without much thought.
(The only note is that it has very good prevalence in Ecuador.)
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3c9b0_e09e0.json?label=id:node_8013121
https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?nextcladePangoLineage=XBB.1&nucMutations1=G27915T%2CC1884T&nextcladePangoLineage1=XBB.1*&analysisMode=CompareToBaseline&

FedeGueli · 2023-02-18T14:44:44Z

XBC.1.3.1: it doesn't seem to be competitive with XBC.1, but it does seem to have some advantage over XBC.1.3 (which starts after 25614T).

To me, its potential fitness becomes more evident only after it acquires T6447C (ORF1a:V2061A), and even then it is a very slight advantage over the grandparent lineage XBC.1. If I had proposed it, I would not advocate designating it at this point.
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_351b9_e22b0.json?label=id:node_8651060
https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants?nextcladePangoLineage=XBC.1*&nucMutations1=C6145T%2CG15743A%2C6447C&analysisMode=CompareToBaseline&

FedeGueli · 2023-02-18T15:09:40Z

XBB.1.4.1.1
Here too I have doubts. OK, it is the biggest branch of XBB.1.4, but as usual that does not mean much in a lineage that is not growing fast.
Compared with the other branches it has no clear advantage over them, and often it has a disadvantage.
Also taking into account a Danish effect that artificially boosts some little branches, I don't think this deserves a .1 or that designating it is helpful.
https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants?nucMutations=C16260T&nextcladePangoLineage=XBB.1.4.1*&aaMutations1=ORF1a%3AT4159I&nextcladePangoLineage1=XBB.1.4.1*&analysisMode=CompareToBaseline&
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_377df_e51c0.json?label=id:node_8060493

corneliusroemer · 2023-02-19T16:56:46Z

Thanks @FedeGueli! I independently reviewed the proposed lineages with comments (and Usher links) in this sheet: https://docs.google.com/spreadsheets/d/1vEKbYtX7HDfFtm20t-bGJ6Qog-l_mPVHF4Q3ZIkRNk8/edit?usp=sharing

I've got similar thoughts to yours - the type of lineages proposed here are very different than the ones from #191. I preferred the ones from the previous PR. Not sure what the difference in settings was, maybe the inclusion of some sort of growth estimate in the new PR?

Many of the proposed lineages are very small (<100) and encompass a large portion (>50%) of the parent lineage. I didn't designate these for now as they wouldn't be designated under the ordinary procedure through issues or systematic designation. Maybe Autolin is onto something - we can review again in a month and see if these branches exploded.
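The two review criteria mentioned here (fewer than 100 sequences, and more than 50% of the parent lineage) can be expressed as a small pre-review check. This is a sketch under assumptions: the function name and flag labels are hypothetical, and the parent sizes used in the example are made-up inputs, since the table above only lists the size of each proposal.

```python
def review_flags(
    size: int,
    parent_size: int,
    min_size: int = 100,
    max_parent_fraction: float = 0.5,
) -> list[str]:
    """Return the reasons a proposal might be deferred rather than designated."""
    flags = []
    if size < min_size:
        flags.append("small")
    if parent_size > 0 and size / parent_size > max_parent_fraction:
        flags.append("most-of-parent")
    return flags


# Parent sizes below are illustrative placeholders, not real counts.
print(review_flags(size=59, parent_size=100))
print(review_flags(size=3385, parent_size=100000))
```

Flagged proposals need not be discarded; they could simply be queued for another look after a few weeks, as suggested above.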

A few specific points:

As @FedeGueli noted, the proposed XBB.1.5.11 has as its defining mutation the reversion of the defining mutation (S:G252V) of XBB.1. Likely an artefact.
The proposed XAY.2.3 relies on S:158G, but that mutation is apparently defining of XAY.2 generally, just often not correctly sequenced in Denmark. It's very homoplasic on the XAY.2 subtree and hence should not be used as a defining mutation. If you leave out the Spike mutation, it's not clear why it should be designated over the other branches of XAY.2.
Two proposed lineages have issues attached to them: the ones designated as XBQ and XBK.1.

For various reasons, the PR can't be used/merged as is:

Strain names are not in the required form (they contain dates, EPI_ISLs, and the | separator).
One big commit rather than one commit per lineage makes it hard or impossible to cherry-pick the lineages we want to designate.
Lineage note entries are not in the right place (below the parent lineage) but at the very end of the notes.
Lineages that already have 3 levels don't get an alias (which they should), and there's also no entry in alias_key.json creating that alias (e.g. XBB.1.9.2.1 should be called EG.1 and get an appropriate entry in alias_key.json).
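For the first point in the list above, a minimal sketch of the clean-up, assuming the tree's sample names put the strain name first and then append the other fields after | separators. That field order is an assumption, and the sample name in the example is a made-up placeholder.

```python
def strain_from_tree_name(sample_name: str) -> str:
    """Keep only the part of a sample name before the first | separator.

    Assumes the strain name is the first |-delimited field (not confirmed
    by this thread).
    """
    strain = sample_name.split("|", 1)[0].strip()
    if not strain:
        raise ValueError(f"no strain name found in {sample_name!r}")
    return strain


# Placeholder name in the assumed "strain|EPI_ISL|date" layout.
print(strain_from_tree_name("Country/SAMPLE-1/2023|EPI_ISL_0000000|2023-02-01"))
```

Names without a | separator pass through unchanged, so the function can be applied to every row regardless of its source.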

See further comments in #194

I've designated 7 lineages based on this PR, see https://github.com/cov-lineages/pango-designation/commits/master

Reviewing took quite some time: a similar amount (or more) compared to reviewing designations proposed by the community, since issues often contain reasoning and are less likely to be artefacts. From a designation perspective, these automated proposals are currently less efficient than manual systematic designations such as last weekend's various BA.5/BQs. A few thoughts on time savers are in #194. Prebuilt UShER trees viewable via a link would be a big time saver.

jmcbroome · 2023-02-19T20:52:15Z

Sorry to hear you're disappointed in the quality of these lineages and the time needed to review them. The primary difference between this and the last pull request is the series of designations made in the meantime: many of the higher-quality lineages it picked up were designated by you in advance of the release, were incorporated into the public builds in the days between the 9th and the 14th, and otherwise overlapped with the proposed designations. This PR represents more edge cases and leftovers after the release, and I had to relax some parameters to get more than a handful of designations to examine.

I appreciate the detailed feedback! I will go over it this week.

To briefly address your points about why it can't be used/merged:

I use the sample names as they appear in the UShER tree and associated metadata file; this is obviously an oversight. I can apply some processing to extract the appropriate subsection.
I can return to the issue output. I had originally moved away from issue output as part of the goal was to prevent the maintainers from having to select/update the notes and lineages.csv files manually. I could potentially open multiple parallel pull requests, but this could lead to merge conflicts. Your suggestion to split each into separate commits is also potentially viable, of course.
I wasn't aware the notes were sorted in this way.
I've been using your pango_aliasor tool to sort out lineage compression without worrying about the details of its implementation. On closer examination, it appears it can only handle compressions predefined in the JSON you mention. I can look into updating a local copy of the JSON, but it will take a bit of time to sort out the implementation. Do you programmatically identify what names are available when doing novel third-level compression? If so, do you have that code available publicly? Or perhaps pango_aliasor can do that, and I'm misapplying it somehow?

Sinickle · 2023-02-20T14:45:06Z

Question - if I read the paper correctly, then I believe the growth advantage of auto-lineages doesn't take into consideration the growth advantage of the parent lineage, right?

I think this will result in many unimportant mutations on the fastest growing variants to be given their own designation. This makes sense if you are just trying to describe the things that are currently growing, but less sense if you're trying to describe the ways that things are meaningfully changing.

For the piece on reversions --
At least to me, before I believe a reversion is real, I want to see it accompanied by some other unique mutations, and see that when those unique mutations are present the reversion is much more likely to be present, and that this is true in multiple countries.
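The check described above can be sketched as a per-country comparison: how often the reversion appears in samples that carry the accompanying unique mutations versus samples that do not. The data layout (a country plus a set of mutation strings per sample) and all names are assumptions for illustration, and the mutation labels in the example are placeholders.

```python
from collections import defaultdict
from typing import Optional


def reversion_rate_by_country(
    samples: list[tuple[str, set[str]]],
    reversion: str,
    markers: set[str],
) -> dict[str, tuple[Optional[float], Optional[float]]]:
    """Per country, return (rate with all markers, rate without them).

    A rate is None when no sample falls into that group.
    """
    # [with-marker total, with-marker hits, without total, without hits]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for country, mutations in samples:
        has_markers = markers <= mutations
        has_reversion = reversion in mutations
        row = counts[country]
        if has_markers:
            row[0] += 1
            row[1] += has_reversion
        else:
            row[2] += 1
            row[3] += has_reversion

    def rate(hits: int, total: int) -> Optional[float]:
        return hits / total if total else None

    return {c: (rate(r[1], r[0]), rate(r[3], r[2])) for c, r in counts.items()}


toy = [
    ("A", {"m1", "rev"}),
    ("A", {"m1", "rev"}),
    ("A", {"x"}),
    ("B", {"m1"}),
    ("B", {"rev"}),
]
print(reversion_rate_by_country(toy, reversion="rev", markers={"m1"}))
```

A reversion that looks real by this criterion would show a high first value and a low second value in several countries, not just one.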

jmcbroome · 2023-02-20T20:26:10Z

@Sinickle The growth modeling of the autolineages does not consider the parent or any contextual information, no- but it is only used to filter and prioritize the output, not to generate designations itself. The foundation of my approach is about the agnostic representation of genotypes through lineages, allowing researchers to communicate about the genetic diversity of SARS-CoV-2 without inherent assumptions as to what mutations may or may not be important. I do incorporate weighting schema optionally into the pipeline, but don't actually apply mutation-level weighting for this particular output- though I do apply additional weight to underrepresented countries to encourage the designation of international lineages.

Essentially, this method is attempting to describe the full breadth of diversity of active SARS-CoV-2 virions- the first of your two statements. Identifying what lineages are important a priori- "trying to describe the way things are meaningfully changing"- requires significant assumptions about the behavior of the viral genome that can easily be violated by epistatic effects or simply fitness effects more complex than reduced antibody binding.

Navigating the competing philosophies of lineage designation- between representation of what is there, and prediction of what is important- has been a serious challenge in developing this work, given the wide diversity of opinions among the community as to the viability and importance of each of these functions. My stance has generally been that creating new names is relatively cheap and that we can identify what lineages are important or different after they are given names. Even if a new lineage doesn't appear epidemiologically distinct on its face, the genetic distinction confers the possibility for altered fitness as environments and context changes.

I hope this serves as sufficient explanation to you and @FedeGueli and others who might question why this method would designate a sublineage that doesn't have immediate and obvious behavioral differences from the parent lineage or any mutations that we would consider interesting a priori.

FedeGueli · 2023-02-21T10:26:07Z

Hi @jmcbroome, I didn't ask any question. My only role here (and I couldn't do more than that) is to compare the old and new methods to help improve performance.
If I may say so from my experience, the UK and Denmark are big distortions of the real growth advantage. Removing them, or down-weighting them as much as possible, will help a lot.

jmcbroome · 2023-02-21T18:53:33Z

@FedeGueli I meant in reply to your comment "XBB.1.15: OK, 3000+ sequences, but why designate it? It could stay XBB.1 without much thought." above. You seemed implicitly confused about why I might be designating lineages that don't have an obvious growth advantage, and my last comment attempted to address that question.

I do appreciate your feedback, though. RE: countries with denser sequencing leading to more growth: I already control for this. I use growth stratified by country, and compute it as the percentage of all samples collected in each week. Being 5% of 100 samples from Japan is the same as being 5% of 10000 samples from England with respect to the model. There could still be some inherent biases resulting from sequencing strategy, of course (some countries intentionally sequence outbreaks, which are more likely to be closely related to one another, instead of doing unbiased population sequencing), but variation in overall sequencing volume shouldn't impact these estimates too much.
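The normalization described above can be illustrated with a short sketch that computes, for each country and week, the share of samples belonging to a lineage. This is not the actual model code: the record layout and function name are hypothetical, and matching is by exact lineage name only (no descendants), which is a simplification.

```python
from collections import Counter


def weekly_share(
    samples: list[tuple[str, str, str]], lineage: str
) -> dict[tuple[str, str], float]:
    """Map (country, week) to the fraction of that week's samples in the lineage.

    Each sample is a (country, week, lineage_name) tuple.
    """
    totals = Counter()
    hits = Counter()
    for country, week, name in samples:
        totals[(country, week)] += 1
        if name == lineage:
            hits[(country, week)] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Toy data: the same 5% share at very different sequencing volumes.
week = "2023-W05"
toy = (
    [("Japan", week, "X")] * 5
    + [("Japan", week, "other")] * 95
    + [("England", week, "X")] * 500
    + [("England", week, "other")] * 9500
)
shares = weekly_share(toy, "X")
print(shares[("Japan", week)] == shares[("England", week)])
```

Because each country-week is reduced to a proportion before any growth estimate, the number of samples affects only how noisy the proportion is, not its expected value.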

AngieHinrichs · 2023-02-21T20:42:32Z

> XBB.1.5.11 has as its defining mutation the reversion of the defining mutation (S:G252V) of XBB.1. Very unlikely to me. @AngieHinrichs

Yep. And when I look at a public-tree taxonium query, or look at the CoV-Spectrum query sequences in the full tree in taxonium, they are spread out all over XBB.1.5 because the false reversion is not limited to that one cluster.

> Lineages that already have 3 levels don't get an alias (which they should), and there's also no entry in alias_key.json creating that alias (e.g. XBB.1.9.2.1 should be called EG.1 and get an appropriate entry in alias_key.json).

There is some tension between this request and the request to easily cherry-pick the lineages. If you accept only a subset of the proposed lineages, and autolin starts picking as-yet-unassigned aliases, then search-and-replaces will still be necessary in order to get correct aliases and it might be even more confusing than four-number proposed names that obviously are wrong and need alias conversion.

AngieHinrichs · 2023-02-21T20:45:08Z

A script that does the search and replace of four-number accepted lineage to new alias, including adding the alias to alias_key.json and noting it in lineage_notes.txt, would be helpful to avoid error-prone human/editor search and replace.
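A rough sketch of what such a script could do, using the EG.1 example from this thread. It assumes that alias_key.json maps an alias prefix to the lineage it abbreviates and that a note line has the form "name, tab, Alias of full name, description"; both are assumptions about the file formats, and all function names are hypothetical.

```python
import json


def register_alias(alias_key: dict[str, str], prefix: str, parent: str) -> None:
    """Add prefix -> parent to the alias table, refusing to overwrite."""
    if prefix in alias_key:
        raise ValueError(f"alias prefix {prefix!r} is already taken")
    alias_key[prefix] = parent


def compress_name(lineage: str, prefix: str, parent: str) -> str:
    """Rewrite descendants of parent to use the alias prefix.

    Only names that start with parent + "." change, so the parent itself
    and look-alike names are left untouched.
    """
    stem = parent + "."
    if lineage.startswith(stem):
        return prefix + "." + lineage[len(stem):]
    return lineage


def note_line(alias_name: str, full_name: str, description: str) -> str:
    """Build a tab-separated note entry (assumed format)."""
    return f"{alias_name}\tAlias of {full_name}, {description}"


alias_key = {"BA": "B.1.1.529"}
register_alias(alias_key, "EG", "XBB.1.9.2")

# Sample names here are placeholders.
designations = [("sampleA", "XBB.1.9.2.1"), ("sampleB", "XBB.1.9.2")]
renamed = [(s, compress_name(l, "EG", "XBB.1.9.2")) for s, l in designations]

print(json.dumps(alias_key, indent=2))
print(renamed)
print(note_line("EG.1", "XBB.1.9.2.1", "defined by S:Q613H"))
```

Running the rename only for accepted proposals, after the alias prefix is chosen, avoids the situation described above where an automatically picked alias has to be changed again.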

jmcbroome added 2 commits February 16, 2023 15:53

Updating with new lineages.

df116da

Updating with new lineages.

0756f63

remove duplicate entries

70f80fb

corneliusroemer added a commit to cov-lineages/pango-designation that referenced this pull request Feb 19, 2023

Added lineage EG.1 (XBB.1.9.2 with S:Q613H) with 100 designations

18da557

from jmcbroome#193

corneliusroemer mentioned this pull request Feb 19, 2023

General Feedback [2023-02-19] #194

Open

corneliusroemer mentioned this pull request Feb 19, 2023

Potential XBB.1.9.2 sublineage with N:L219F first collected in Indonesia and Malayasia (137 good seqs as of 2023-02-16; Europe, North America, Asia, Australia) cov-lineages/pango-designation#1664

Closed
