Broken error handling in multithread_codonalign_build #310

Open
revinici opened this issue Oct 1, 2024 · 1 comment

revinici commented Oct 1, 2024

For some reason, an IndexError was raised when running the method below. The except block caught it, but because the variable codon_alignment was never assigned, the return statement then raised an UnboundLocalError as well: local variable 'codon_alignment' referenced before assignment

I am not sure why the IndexError occurred, but can this method be updated so that it continues gracefully? I will have to run the code with the problematic inputs in the debugger to investigate the IndexError.

I am using panaroo version 1.5.

def multithread_codonalign_build(dna, protein, name):
    try:
        codon_alignment = codonalign.build(dna, protein)
    except RuntimeError as e:
        print(e)
        print(name)
        print(dna)
        print(protein)
    except IndexError as e:
        print(e)
        print(name)
        print(dna)
        print(protein)
    return(name, codon_alignment)
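
One possible way to make the method continue gracefully is sketched below. This is only an illustration, not panaroo's actual fix: the function name safe_codonalign_build and the build parameter are hypothetical, and the argument order passed to the build function is copied from the snippet above. The idea is to bind codon_alignment to None before the try block so the return statement can never hit an unassigned variable, and to let the caller skip entries whose alignment is None.

```python
import sys


def safe_codonalign_build(dna, protein, name, build=None):
    """Return (name, codon_alignment), or (name, None) if the build fails.

    `build` is a hypothetical hook so the sketch can be tried without
    Biopython; by default it falls back to Bio.codonalign.build.
    """
    if build is None:
        # Imported lazily so the sketch can be exercised without Biopython.
        from Bio import codonalign
        build = codonalign.build

    # Bound up front, so the return below cannot raise UnboundLocalError.
    codon_alignment = None
    try:
        # Argument order copied from the snippet in this issue.
        codon_alignment = build(dna, protein)
    except (RuntimeError, IndexError) as e:
        # Report the failing gene on stderr and carry on with the next one.
        print(f"codon alignment failed for {name}: {e!r}", file=sys.stderr)
    return name, codon_alignment
```

With this shape, the caller would need to check for None before writing the alignment out, for example by skipping or logging those genes.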


revinici commented Oct 2, 2024

Hello, I investigated the cause of the IndexError by passing the problematic aligned protein FASTA and unaligned DNA FASTA to Bio.codonalign.build, and I got the stack trace below. It appears that there are some mistranslation issues in addition to a mismatch between the refound DNA and protein sequences. Until this gets solved, I will turn off codon alignments.

/home/user/miniconda3/envs/panaroo/lib/python3.9/site-packages/Bio/codonalign/__init__.py:627: BiopythonWarning: GENOME_ID_1;1243_20_0(M 0) does not correspond to GENOME_ID_1;1243_20_0(GTG)
  warnings.warn(
/home/user/miniconda3/envs/panaroo/lib/python3.9/site-packages/Bio/codonalign/__init__.py:627: BiopythonWarning: GENOME_ID_2;1490_12_0(M 0) does not correspond to GENOME_ID_2;1490_12_0(GTG)
  warnings.warn(
/home/user/miniconda3/envs/panaroo/lib/python3.9/site-packages/Bio/codonalign/__init__.py:627: BiopythonWarning: GENOME_ID_3;1609_19_51(M 0) does not correspond to GENOME_ID_3;1609_19_51(TTG)
  warnings.warn(
/home/user/miniconda3/envs/panaroo/lib/python3.9/site-packages/Bio/codonalign/__init__.py:382: BiopythonWarning: middle frameshift detection failed for GENOME_ID_4;101_refound_2380
  warnings.warn(
Traceback (most recent call last):
  File "/local/home/user/.pycharm_helpers/pydev/pydevd.py", line 1551, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/local/home/user/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/tmp/pycharm_project_499/panaroo/bug.py", line 30, in <module>
    codon_alignment = codonalign.build(protein, dna)# load the final pangenome graph
  File "/home/user/miniconda3/envs/panaroo/lib/python3.9/site-packages/Bio/codonalign/__init__.py", line 169, in build
    corr_span = _check_corr(
  File "/home/user/miniconda3/envs/panaroo/lib/python3.9/site-packages/Bio/codonalign/__init__.py", line 435, in _check_corr
    raise RuntimeError(
RuntimeError: Protein SeqRecord (GENOME_ID_4;101_refound_2380) and Nucleotide SeqRecord (GENOME_ID_4;101_refound_2380) do not match!
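
The first three BiopythonWarning lines above pair a protein M at position 0 with a GTG or TTG codon. GTG and TTG are alternative start codons in the bacterial translation table (NCBI table 11), so one plausible reading is that the proteins were translated with M at the start while the codon check expects V or L. A minimal pre-check along those lines is sketched below. It assumes the sequences are available as plain strings keyed by ID; the function name, the dictionaries, and the example sequences are all hypothetical, and it only covers the two codons seen in the warnings.

```python
# Two of the alternative start codons in NCBI translation table 11,
# namely the ones that appear in the warnings above.
ALT_START_CODONS = {"GTG", "TTG"}


def find_alt_start_mismatches(dna_by_id, protein_by_id):
    """Return IDs whose protein starts with M while the DNA starts with GTG or TTG."""
    flagged = []
    for seq_id, dna in dna_by_id.items():
        protein = protein_by_id.get(seq_id, "")
        # Aligned proteins may begin with gap characters, so drop them first.
        ungapped = protein.replace("-", "")
        if ungapped[:1].upper() == "M" and dna[:3].upper() in ALT_START_CODONS:
            flagged.append(seq_id)
    return flagged


# Made-up sequences for illustration only.
dna = {"gene_a": "GTGAAATAA", "gene_b": "ATGAAATAA", "gene_c": "TTGAAATAA"}
protein = {"gene_a": "MK", "gene_b": "MK", "gene_c": "-MK"}
flagged_ids = find_alt_start_mismatches(dna, protein)
```

This would not explain the RuntimeError for the refound gene, which looks like a separate DNA/protein mismatch.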
