segfault in mashmap compare on empty input files #181

Closed
taylorreiter opened this issue May 28, 2021 · 5 comments

@taylorreiter
Member

[Fri May 28 13:58:25 2021]
rule mashmap_compare:
    input: /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz, genbank_genomes/GCA_008680295.1_genomic.fna.gz
    output: output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.align, output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out
    jobid: 11808
    wildcards: f=GCA_900112995.1_genomic.fna.gz, acc=GCA_008680295.1

Activating conda environment: /home/tereiter/github/charcoal/.snakemake/conda/841ac41a561672c1e47e4858f7419cbb
/bin/bash: line 1: 23840 Segmentation fault      mashmap -q /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz -r genbank_genomes/GCA_008680295.1_genomic.fna.gz -o output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.align --pi 95 > output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out
[Fri May 28 13:58:26 2021]
Error in rule mashmap_compare:
    jobid: 11808
    output: output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.align, output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out
    conda-env: /home/tereiter/github/charcoal/.snakemake/conda/841ac41a561672c1e47e4858f7419cbb
    shell:

        mashmap -q /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz -r genbank_genomes/GCA_008680295.1_genomic.fna.gz -o output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.align             --pi 95 > output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out

        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job mashmap_compare since they might be corrupted:
output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out
Job failed, going on with independent jobs.
@ctb
Member

ctb commented Jun 1, 2021

Can you reproduce this problem when you run the command at the command line, outside of snakemake? Not sure what the problem is, beyond mashmap barfing for no obvious reason ;).

        mashmap -q /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz -r genbank_genomes/GCA_008680295.1_genomic.fna.gz -o output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.align             --pi 95 > output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out

@taylorreiter
Member Author

I can:

(/home/tereiter/github/charcoal/.snakemake/conda/841ac41a561672c1e47e4858f7419cbb) tereiter@bm9:~/github/charcoal$ mashmap -q /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz -r genbank_genomes/GCA_008680295.1_genomic.fna.gz -o output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.align             --pi 95 > output.genbank_genomes/stage2/GCA_900112995.1_genomic.fna.gz.x.GCA_008680295.1.mashmap.out
Segmentation fault

It appears to be caused by an empty file passed to the -r (reference) option: the query file is 806K, but the reference file is 0 bytes.

$ ls -lh /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz
-rw-r--r-- 1 tereiter tereiter 806K May 15 14:20 /home/tereiter/github/2020-ibd/genbank_genomes/GCA_900112995.1_genomic.fna.gz
$ ls -lh genbank_genomes/GCA_008680295.1_genomic.fna.gz
-rw-r--r-- 1 tereiter tereiter 0 May 27 09:32 genbank_genomes/GCA_008680295.1_genomic.fna.gz

@ctb
Member

ctb commented Jun 2, 2021

ok - do you think we should tell the mashmap authors, and/or guard against it in charcoal? I'd be fine with leaving it in the issue tracker and only fixing it if it becomes a regular problem...

@taylorreiter
Member Author

I think just leaving it in the issue tracker so it's searchable is fine for now. I fixed this by deleting the empty files (e.g. rm genbank_genomes/GCA_008680295.1_genomic.fna.gz) and then re-running charcoal (python -m charcoal run genbank_genomes.conf --rerun-incomplete -j 1); it re-downloaded the files and went on its merry way. If it becomes a problem we can implement a check in charcoal as well. I'll also post an issue on the mashmap repo.
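For anyone who lands here from a search: below is a rough sketch of what such a check could look like, in case it helps. It is not part of charcoal; the function names `is_empty_genome` and `find_empty_genomes` and the `*.fna.gz` pattern are made up for illustration. It flags zero-byte files like the one above, and also gzip files that decompress to nothing, on the assumption that those would be just as unusable as input.

```python
import gzip
from pathlib import Path


def is_empty_genome(path):
    """Return True if the file is zero bytes, or a gzip file with no content.

    Hypothetical helper, not part of charcoal or mashmap.
    """
    path = Path(path)
    if path.stat().st_size == 0:
        return True
    if path.suffix == ".gz":
        try:
            with gzip.open(path, "rb") as fp:
                # An empty gzip stream yields no bytes at all.
                return fp.read(1) == b""
        except (OSError, EOFError):
            # Corrupt or truncated gzip data is a different problem;
            # this check only reports files that are actually empty.
            return False
    return False


def find_empty_genomes(genome_dir, pattern="*.fna.gz"):
    """List empty genome files under genome_dir, sorted by path."""
    return sorted(p for p in Path(genome_dir).glob(pattern) if is_empty_genome(p))


if __name__ == "__main__":
    # Directory name taken from the paths in this issue; adjust as needed.
    for empty in find_empty_genomes("genbank_genomes"):
        print(f"empty genome file: {empty}")
```

Any paths it prints are candidates for the same workaround: delete them and re-run charcoal with --rerun-incomplete.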

@ctb
Member

ctb commented Jun 2, 2021

thank you!

please link to the mashmap issue here and then close this issue :). I'll update the title, too.

@ctb changed the title from "segfault in mashmap compare" to "segfault in mashmap compare on empty input files" on Jun 2, 2021