Only use a random subset of genes for generating core aln #260

Merged
merged 3 commits into from
Dec 20, 2023

mgalardini
Contributor

Added a `--core_subset` argument to every command that lets the user generate a core genome alignment.

With large datasets, building the core genome alignment takes a substantial amount of time. A reasonably sized random sample of core genes might produce the same phylogenetic signal in a fraction of the time.

Tested briefly on a toy dataset, opening a PR in case this is a useful new argument!
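
For readers who want a concrete picture of the idea, here is a minimal sketch of drawing a random subset of core genes before building the alignment. It is not the code from this PR: the helper name `sample_core_genes`, its `subset_size` and `seed` parameters, and the example gene names are all hypothetical and chosen only for illustration.

```python
import random


def sample_core_genes(core_genes, subset_size=None, seed=None):
    """Return a sorted random subset of core gene names.

    Hypothetical helper for illustration only; not Panaroo's implementation.
    If subset_size is None or not smaller than the number of core genes,
    every gene is kept.
    """
    genes = sorted(core_genes)  # fixed order so a given seed is reproducible
    if subset_size is None or subset_size >= len(genes):
        return genes
    rng = random.Random(seed)  # local generator, leaves global state untouched
    return sorted(rng.sample(genes, subset_size))


# Example with made-up gene names: keep 100 of 1000 core genes.
core = [f"gene_{i}" for i in range(1000)]
subset = sample_core_genes(core, subset_size=100, seed=42)
print(len(subset))
```

Only the genes in the returned list would then be aligned and concatenated, so alignment time should scale roughly with the subset size rather than with the full core genome.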

gtonkinhill and others added 3 commits September 21, 2023 10:33
Added a `--core_subset` argument to every command with which a user
can choose to generate a core genome alignment.

When using large datasets the core genome alignment takes
a substantial amount of time, but perhaps a reasonable sample
would produce the same phylogenetic signal in a fraction of the
time
@gtonkinhill
Owner

Thanks very much for this! I've just got back from holiday and am planning to work on Panaroo this week so will try and incorporate this option into the next release.

We've also been working on identifying gene families that are more reliable for inferring phylogenies. At the moment we've incorporated an entropy-based approach. The filtered alignment is currently output as `core_gene_alignment_filtered.aln`, and this filtering has led to substantially improved phylogenies on simulated datasets.
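
The comment above does not say how the entropy-based filter is computed, so the following is only a rough sketch of one possible approach, not Panaroo's actual method. It assumes that per-column Shannon entropy is averaged over each gene alignment and that genes above a threshold are dropped; the function names, the `max_entropy` parameter, and the direction of the cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column, ignoring gaps and Ns."""
    counts = Counter(c for c in column.upper() if c not in "-N")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mean_entropy(sequences):
    """Mean per-column entropy of a list of equal-length aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must all have the same length")
    if length == 0:
        return 0.0
    columns = ("".join(s[i] for s in sequences) for i in range(length))
    return sum(column_entropy(col) for col in columns) / length


def filter_genes(alignments, max_entropy):
    """Keep genes whose mean column entropy is at most max_entropy.

    Assumption: high entropy marks an unreliable gene. The real filter
    may use a different statistic or cutoff.
    """
    return [g for g, seqs in alignments.items() if mean_entropy(seqs) <= max_entropy]


# Toy alignments with made-up gene names.
toy = {
    "geneA": ["ACGT", "ACGT", "ACGA"],
    "geneB": ["ACGT", "TGCA", "GATC"],
}
print(filter_genes(toy, max_entropy=0.5))
```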

@gtonkinhill gtonkinhill changed the base branch from master to devel December 20, 2023 01:09
@gtonkinhill gtonkinhill merged commit 647443a into gtonkinhill:devel Dec 20, 2023
3 checks passed