Add a new genome to a Panaroo graph #153

fabgenomics · 2022-04-08T11:34:52Z

Hi,
I use Panaroo a lot for a research project. I generated a gene_presence_absence matrix from 1500 bacterial genomes.
Then I used this matrix to train a supervised machine learning model. I got decent accuracy, so I wanted to dig deeper.
I would now like to use the generated graph to include a new genome, extract the data for that genome and run predictions on it.
The problem with the panaroo-integrate command is that all the groups are renamed, in an order that differs from the original matrix.
For instance, in the full matrix, group_5637 represents the gene hemB, but when I add a genome with panaroo-integrate the hemB gene ends up in group_695.
As I only keep some of the groups for training and prediction, I would like them to stay identical when adding a genome: group_5637 should still represent the same hemB gene.
Would it be difficult to implement this in the panaroo-integrate code?
Thanks for all your work,
Fabien

gtonkinhill · 2022-04-13T05:09:27Z

Hi,

This is a good point and should hopefully not be too difficult to implement. I will try and get to it as soon as I can and add it to the next release.

In the meantime you may be able to use the geneIDs to keep track of the same clusters between runs. The panaroo-integrate command should maintain the existing clusters which can be identified by the geneIDs within them.
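
A minimal sketch of that workaround, not part of Panaroo itself. It assumes both runs produced a gene_presence_absence.csv whose first three columns are cluster metadata (cluster name first) followed by one column per genome, with cells holding gene IDs separated by `;`. The function names, the `n_meta` parameter and the toy data are hypothetical. Each new cluster is matched to the old cluster that shares the most (genome, geneID) pairs with it, and ties are flagged as ambiguous.

```python
import csv
import io
from collections import Counter


def read_clusters(handle, n_meta=3):
    """Return {cluster_name: set of (genome, gene_id)} from a presence/absence CSV.

    n_meta is the assumed number of leading metadata columns.
    """
    reader = csv.reader(handle)
    header = next(reader)
    genomes = header[n_meta:]
    clusters = {}
    for row in reader:
        members = set()
        for genome, cell in zip(genomes, row[n_meta:]):
            # A cell may hold several gene IDs separated by ';'.
            for gene_id in cell.split(";"):
                gene_id = gene_id.strip()
                if gene_id:
                    members.add((genome, gene_id))
        clusters[row[0]] = members
    return clusters


def map_new_to_old(old, new):
    """Map each new cluster name to (best matching old name, tie flag)."""
    # Inverted index: which old cluster(s) contain each (genome, gene_id) pair.
    owner = {}
    for name, members in old.items():
        for member in members:
            owner.setdefault(member, set()).add(name)

    mapping = {}
    for name, members in new.items():
        votes = Counter()
        for member in members:
            for old_name in owner.get(member, ()):
                votes[old_name] += 1
        if not votes:
            # No shared genes: the cluster only exists in the new run.
            mapping[name] = (None, False)
            continue
        ranked = votes.most_common()
        best, count = ranked[0]
        tie = len(ranked) > 1 and ranked[1][1] == count
        mapping[name] = (best, tie)
    return mapping


# Toy data standing in for the two runs (hypothetical IDs).
OLD_CSV = """Gene,Non-unique Gene name,Annotation,g1,g2
group_5637,hemB,porphobilinogen synthase,g1_00010,g2_00020
group_12,,hypothetical protein,g1_00011,
"""
NEW_CSV = """Gene,Non-unique Gene name,Annotation,g1,g2,g3
group_695,hemB,porphobilinogen synthase,g1_00010,g2_00020,g3_00005
group_3,,hypothetical protein,g1_00011,,
group_900,,hypothetical protein,,,g3_00099
"""

old = read_clusters(io.StringIO(OLD_CSV))
new = read_clusters(io.StringIO(NEW_CSV))
mapping = map_new_to_old(old, new)
for new_name, (old_name, tie) in mapping.items():
    print(new_name, "->", old_name, "(tie)" if tie else "")
```

Keying on (genome, geneID) pairs rather than bare geneIDs is a design choice of this sketch: it avoids collisions when different genomes reuse the same locus tags. Clusters mapped to `None` contain only genes from the added genome.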

fabgenomics · 2022-04-13T14:11:17Z

Hi,
The problem with the geneIDs is that I can have 2 different groups with the same geneID. I used the default parameters for the cluster thresholds, plus --clean-mode strict --remove-invalid-genes --merge_paralogs.
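
To see how widespread such collisions are, one could list every geneID that occurs in more than one group. This is only an illustrative sketch: it assumes a gene_presence_absence.csv with three leading metadata columns followed by one column per genome and `;`-separated IDs within a cell. The function name `shared_gene_ids`, the `n_meta` parameter and the toy data are hypothetical, and the sketch does not explain why the IDs collide in a real run.

```python
import csv
import io
from collections import defaultdict


def shared_gene_ids(handle, n_meta=3):
    """Return {gene_id: sorted cluster names} for IDs seen in more than one cluster."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    seen = defaultdict(set)
    for row in reader:
        for cell in row[n_meta:]:
            # A cell may hold several gene IDs separated by ';'.
            for gene_id in cell.split(";"):
                gene_id = gene_id.strip()
                if gene_id:
                    seen[gene_id].add(row[0])
    return {gid: sorted(names) for gid, names in seen.items() if len(names) > 1}


# Toy matrix with one ID deliberately repeated across two groups.
TOY_CSV = """Gene,Non-unique Gene name,Annotation,genomeA,genomeB
group_1,,hypothetical protein,tag_00001,tag_00001
group_2,,hypothetical protein,tag_00002;tag_00001,
"""

duplicates = shared_gene_ids(io.StringIO(TOY_CSV))
for gene_id, groups in duplicates.items():
    print(gene_id, "appears in", ", ".join(groups))
```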

daisy238 · 2024-07-25T12:29:42Z

Hello, is there any update on this please? I would also like to use panaroo on a reference genome using the gene cluster ids from a previous panaroo run.

gtonkinhill added the enhancement (New feature or request) label Apr 13, 2022

gtonkinhill self-assigned this Apr 13, 2022
