Error on GenBank file with CompoundLocation overlapping sequence origin #324
Comments
Hi @aekazakov and thanks for the detailed report!
OK, this should be fixed by #332. @aekazakov If you like, and feel comfortable with being a very-early-tester, you could check out the
I tested the fix-edge-user-proteins branch with two RefSeq genomes (NZ_CP012831.1 and NZ_CP015511.1) that have gene and CDS features overlapping the sequence origin on the - and + strands, respectively.
Thanks for testing and reporting back. With that, I'll close this. If there are any further things to discuss or issues arising from this, please do not hesitate to re-open this or a new one. Thanks.
Hi! Thank you for the great tool!
On a genome that has a coding gene overlapping the origin of a circular sequence, Bakta terminates with an error.
I am running Bakta version 1.9.4, installed with conda, with database version 5.1.
The command executed is:
bakta --debug --db /mnt/data/ref/Bakta/v5.1/db --output test_bakta --prefix NZ_CP0128315.bakta --threads 8 --regions sequence.gb sequence.fasta
The input files sequence.gb and sequence.fasta for the NCBI sequence NZ_CP012831.1 were downloaded from https://www.ncbi.nlm.nih.gov/nuccore/NZ_CP012831.1 and contain the full GenBank record and the FASTA-formatted nucleotide sequence, respectively.
This sequence has gene and CDS features that overlap the sequence origin:
Error message from NZ_CP0128315.bakta.log is:
This error occurs because the CDS sequence extracted by extract_feature_sequence (bakta/utils.py) is wrong. It is 7.1 Mbp long instead of 774 bp.
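To illustrate the kind of failure described above, here is a minimal sketch in plain Python. It is not Bakta's code: the function names `naive_extract` and `extract_wrapped_feature`, the 1-based inclusive coordinates, and the toy genome are all assumptions made for the example. It shows how slicing from the smallest to the largest coordinate returns almost the whole replicon for an origin-spanning feature, while joining the two parts returns the expected gene.

```python
# Hypothetical helpers for illustration only; not taken from bakta/utils.py.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def naive_extract(seq, start, stop):
    """Slice from the smallest to the largest coordinate (1-based, inclusive).

    For a feature that wraps around the origin this returns the region
    *between* the two parts, i.e. nearly the whole sequence.
    """
    return seq[min(start, stop) - 1:max(start, stop)]


def extract_wrapped_feature(seq, start, stop, strand="+"):
    """Extract a feature from a circular sequence (1-based, inclusive).

    If start > stop, the feature is assumed to span the origin and is
    assembled from the tail of the sequence followed by its head.
    """
    if start <= stop:
        sub = seq[start - 1:stop]
    else:
        sub = seq[start - 1:] + seq[:stop]
    if strand == "-":
        # Reverse complement for features on the minus strand.
        sub = sub.translate(COMPLEMENT)[::-1]
    return sub


# Toy circular genome: a 12 bp gene split across the origin,
# equivalent to the GenBank location join(37..42,1..6).
genome = "GGGTAA" + "T" * 30 + "ATGCCC"

wrong = naive_extract(genome, 37, 6)
right = extract_wrapped_feature(genome, 37, 6)
print(len(wrong), len(right))
print(right)
```

In this toy case the naive slice is 32 bp long while the joined feature is the expected 12 bp, which mirrors the 7.1 Mbp versus 774 bp discrepancy reported here.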