One other use case that occurred to me recently would be a kind of transmission network view where we could plot time on the x-axis, regions on the y-axis, color by clades, and look for clades whose branches traverse region boundaries (sort of like a slope plot). For some subset of geographic locations, we could get a similar effect from the current functionality by plotting latitude or longitude on the y-axis.
Edit: Here is an example of what I was trying to describe above where I've plotted strains from a Washington-focused tree by sample date and latitude (of country as inferred by augur traits for internal nodes or missing data), colored by country:
Filter view to only strains from USA and identify transmissions into USA from other latitudes:
Zoom in to see possible transmission from Mexico to USA. The three red diagonal lines suggest three separate introductions and their slopes suggest different rates at which the introductions occurred (note that this tree is biased heavily toward Washington and North America, so it isn't a fair representation):
Switch back to tree view to see the phylogenetic context and confirm that there do appear to be three separate introductions into the US:
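The reading of the diagonal lines above can be sketched in code. This is only an illustration under stated assumptions: the `Branch` record, the `region_crossings` helper, and the sample dates are all hypothetical, and the regions on internal nodes would themselves be inferred (for example by augur traits), so any count of introductions inherits that uncertainty.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """Hypothetical record for one tree branch (not an Auspice data structure)."""
    parent_region: str   # region inferred for the parent node
    child_region: str    # region of the child node
    parent_date: float   # decimal year
    child_date: float    # decimal year


def region_crossings(branches, into):
    """Return branches that end in `into` but start in a different region.

    On a time-by-region plot these are the branches drawn as diagonal lines.
    """
    return [
        b for b in branches
        if b.child_region == into and b.parent_region != into
    ]


# Made-up example data, for illustration only.
branches = [
    Branch("Mexico", "USA", 2020.20, 2020.25),
    Branch("USA", "USA", 2020.25, 2020.30),
    Branch("Mexico", "USA", 2020.22, 2020.40),
    Branch("Mexico", "Mexico", 2020.10, 2020.20),
]

introductions = region_crossings(branches, into="USA")

# A shorter elapsed time corresponds to a steeper line on the plot.
durations = [b.child_date - b.parent_date for b in introductions]
```

Each element of `introductions` is a candidate introduction; confirming that they are separate events still needs the phylogenetic context from the tree view.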
The scatterplot functionality has proved very useful for viewing multiple aspects of the data in a single view. One chief example at the moment is comparing attributes of emerging lineages, where a couple of common views look like:
Color by emerging lineage, time on x axis, spike S1 mutations on y axis
https://nextstrain.org/ncov/global?branches=hide&c=emerging_lineage&l=scatter&scatterY=S1_mutations
Color by emerging lineage, time on x axis, logistic growth on y axis
https://nextstrain.org/ncov/global?c=emerging_lineage&l=scatter&scatterY=logistic_growth
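For reference, the two links above differ only in their query parameters. A minimal sketch of assembling them, assuming nothing beyond the parameters visible in those two URLs (`branches`, `c`, `l`, `scatterY`); the `scatter_url` helper is hypothetical and not part of any Nextstrain tooling:

```python
from urllib.parse import urlencode

BASE = "https://nextstrain.org/ncov/global"


def scatter_url(y, hide_branches=False):
    """Build a scatterplot link colored by emerging lineage.

    Only query parameters that appear in the links above are used.
    """
    params = {}
    if hide_branches:
        params["branches"] = "hide"
    params.update({"c": "emerging_lineage", "l": "scatter", "scatterY": y})
    return f"{BASE}?{urlencode(params)}"


s1_view = scatter_url("S1_mutations", hide_branches=True)
growth_view = scatter_url("logistic_growth")
```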
Both of these views help to surface which emerging lineages have the most S1 mutations (P.1) and which have the highest rates of logistic growth (largely B.1.1.7), but other lineages get lost in the mix because their tips occlude one another.
For the desired comparison of S1 mutations and logistic growth across emerging lineages, it would be preferable to have emerging lineage on the x axis and S1 mutations or logistic growth on the y axis.
In this case, rather than a regression line, I'd imagine a horizontal black line for each category marking its mean.
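A rough sketch of the layout this would need, assuming each tip is reduced to a `(category, value)` pair. The `categorical_scatter_layout` function, the jitter width, and the sample values are all hypothetical; the horizontal jitter is just one possible way to reduce occlusion and is not something the current scatterplot does.

```python
import random
from collections import defaultdict
from statistics import mean


def categorical_scatter_layout(tips, jitter=0.3, seed=0):
    """Place tips on a categorical x axis and compute one mean line per category.

    tips: list of (category, value) pairs, e.g. (emerging lineage, S1 mutations).
    Returns (points, mean_lines) where
      points     is a list of (x, y) positions, one per tip, and
      mean_lines maps category -> (x_start, x_end, y_mean).
    """
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible

    # Each category gets an integer slot on the x axis.
    categories = sorted({category for category, _ in tips})
    slot = {category: i for i, category in enumerate(categories)}

    # Spread tips horizontally within their slot so they overlap less.
    points = [
        (slot[category] + rng.uniform(-jitter, jitter), value)
        for category, value in tips
    ]

    grouped = defaultdict(list)
    for category, value in tips:
        grouped[category].append(value)

    # One horizontal segment per category, drawn at that category's mean.
    mean_lines = {
        category: (slot[category] - jitter, slot[category] + jitter, mean(values))
        for category, values in grouped.items()
    }
    return points, mean_lines


# Placeholder categories and values, for illustration only.
tips = [("A", 2.0), ("A", 4.0), ("B", 10.0), ("B", 12.0), ("C", 7.0)]
points, mean_lines = categorical_scatter_layout(tips)
```

The segments in `mean_lines` take the place of the regression line, one per category.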