Is there a way to deactivate the overlap detection so bakta does not filter my input proteins? #295
Comments
Hi, thanks for reaching out. To make sure I correctly understand what you're ultimately trying to achieve: you would like to annotate a phage genome sequence with Bakta using a user-provided proteins file with functional annotations from Phanotate? Is this correct?
Hi! Yes, I want to run the Bakta annotation with a user-provided proteins file with functional annotations from Phanotate.
Hmm, in principle, you can do this. However, Bakta was designed to annotate bacterial genomes, hence the overlap filters. I could add an option to deactivate all overlap filters in the next release, but I cannot make any promises as to when that will be. Meanwhile, you could try pharokka?
Hey @Daniel-Tichy, I just added a new I hope this fits your needs in this case. I'll close this for now. If there are any further comments, ideas, or suggestions, please do not hesitate to re-open this (or open a new one). Thanks again and best regards!
Thank you!
The issue is related to the user-provided proteins feature and its associated issues.
I am trying to use Bakta to annotate a phage protein file predicted with Phanotate. I expected an annotation for every protein in my input file, but it seems that overlapping proteins are being filtered out by Bakta.
I would like to deactivate the overlap detection so that Bakta does not filter the previously predicted proteins that I am using as input.
Example: this is my input gbk for bakta.
I parse it and input it in the following format to bakta.
But I get this output; the protein for WARQSXNU_10 is missing, probably because of the overlap in the genome.
I am currently running Bakta with this command inside a Docker container.
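To check which input proteins overlap on the genome, and so might be affected by overlap filtering, a small script along these lines could help. This is only a sketch: it does not reproduce Bakta's own filter logic, and the `Cds` type, the `overlapping_pairs` function, and the example coordinates and locus tags are all hypothetical names made up for illustration.

```python
from typing import List, NamedTuple, Tuple


class Cds(NamedTuple):
    """A coding sequence with 1-based, inclusive coordinates (hypothetical type)."""
    locus_tag: str
    start: int
    end: int


def overlapping_pairs(features: List[Cds]) -> List[Tuple[str, str, int]]:
    """Return (tag_a, tag_b, overlap_length) for every pair of overlapping features.

    Note: this is a plain interval-overlap check, not Bakta's filter.
    """
    ordered = sorted(features, key=lambda f: (f.start, f.end))
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Features are sorted by start, so once b starts after a ends,
            # no later feature can overlap a either.
            if b.start > a.end:
                break
            overlap = min(a.end, b.end) - b.start + 1
            pairs.append((a.locus_tag, b.locus_tag, overlap))
    return pairs


# Made-up coordinates, for illustration only.
example = [
    Cds("CDS_1", 1, 300),
    Cds("CDS_2", 290, 600),
    Cds("CDS_3", 700, 900),
]
result = overlapping_pairs(example)
print(result)
```

Running this on the coordinates parsed from the input gbk would list the locus tags worth comparing against the Bakta output.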
bakta --db $bakta_db/ --protein $faa_input_bakta --skip-trna --skip-tmrna --skip-rrna --skip-ncrna --skip-ncrna-region --skip-crispr --skip-pseudo --skip-gap --skip-ori --skip-plot --output ${assembly_input_bakta.simpleName}_bakta/ --threads ${params.threads} $assembly_input_bakta