Transfer annotations from similar genome #247
Comments
Ok, since more and more users ask for a feature like this, I've started a new issue #250 collecting ideas and requirements. Feedback and input are highly welcome!
Great, thank you very much @oschwengers!
OK, having taken a deeper look into this and merged #250, I think we can address some of the use cases you've described above:
Of note, both options (
However, of course, this setup is not ideal in batch situations, so for
Skipping the entire de novo gene prediction would require further command line parameters. Furthermore, since even closely related genomes will not have exact gene start/stop matches, one would have to search for these. As you already mentioned, this is a fairly complex task if you do not want to introduce any false positives. You can find some very brief thoughts on this here: #250 (comment)
Hence, I'll keep this issue open but put it into the backlog for now. Of course, any further comments, ideas, and thoughts are highly welcome!
Thank you very much @oschwengers, this is already very useful and I think it adds more flexibility and interoperability to bakta 👍 with a companion liftover tool and some scripting,
Is there any official guideline for auxiliary scripts? Like dependency management, coding style, testing needs...
Not yet, but in a nutshell:
Hi, thanks for developing this great tool!
I believe a great addition to the workflow would be the possibility to provide a pre-annotated genome as input, allowing to:
While this is partially covered by the --proteins parameter, there are cases like #216 and #245 where this is not sufficient.
Developing the liftover feature from scratch would be rather challenging, but there are several tools for that (Liftoff, Flo, nf-LO, TOGA) which theoretically should work on prokaryotic genomes. Bakta could then accept the partially annotated genome resulting from a liftover pipeline, and "finish" the annotation process. For example, with an additional input file containing the alignment to the reference genome (e.g. minimap2 intermediary output from Liftoff), it could extract unmapped regions, annotate them and "merge" the output.
Thanks again, let me know if something isn't clear. I totally understand if this is not something you'd want to support in bakta, but in any case I believe there's definitely an unmet need for it!
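To make the "extract unmapped regions" idea above a bit more concrete, here is a minimal sketch in plain Python. It is not part of Bakta or Liftoff; the function name `unmapped_regions`, the `min_length` parameter, and the 0-based half-open interval convention are all assumptions made for illustration. It takes the length of a contig plus the intervals that were covered by the liftover alignment and returns the gaps that would still need de novo annotation.

```python
from typing import List, Tuple

# Hypothetical convention for this sketch: 0-based, half-open [start, end).
Interval = Tuple[int, int]


def unmapped_regions(
    contig_length: int,
    mapped: List[Interval],
    min_length: int = 1,
) -> List[Interval]:
    """Return the parts of a contig not covered by any mapped interval.

    Overlapping or adjacent mapped intervals are merged implicitly by
    sweeping over them in sorted order. Gaps shorter than `min_length`
    are dropped, since they are unlikely to hold a complete gene.
    """
    gaps: List[Interval] = []
    cursor = 0  # end of the region covered so far
    for start, end in sorted(mapped):
        # Clamp to the contig so out-of-range input cannot create bogus gaps.
        start = min(max(start, 0), contig_length)
        end = min(max(end, 0), contig_length)
        if start - cursor >= min_length and start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Trailing gap after the last mapped interval.
    if contig_length - cursor >= min_length and contig_length > cursor:
        gaps.append((cursor, contig_length))
    return gaps


if __name__ == "__main__":
    covered = [(10, 30), (25, 50), (80, 100)]
    print(unmapped_regions(100, covered))
    print(unmapped_regions(100, covered, min_length=20))
```

The resulting regions could then be cut out of the assembly and annotated separately, with the coordinates shifted back before merging. How the merge should handle genes that span a region boundary is left open here, as it is in the discussion above.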