Missing raw_mapped_ref.bai file when using --reference-filter flag #140

Closed
Ge0rges opened this issue Sep 11, 2023 · 7 comments
Ge0rges commented Sep 11, 2023

Hello again,

I'm running into an issue when running QC and assembly. The full log is below:

09/11/2023 06:29:54 PM INFO: Command - /Accounts/gkanaan/.conda/envs/binning/bin/aviary assemble -w flye_assembly --min-read-size 100 --min-mean-q 10 --coassemble --longreads fastqs/decontaminated/barcode01/barcode01.fastq fastqs/decontaminated/barcode02/barcode02.fastq fastqs/decontaminated/barcode03/barcode03.fastq fastqs/decontaminated/barcode05/barcode05.fastq fastqs/decontaminated/barcode06/barcode06.fastq fastqs/decontaminated/barcode07/barcode07.fastq --longread-type ont_hq --reference-filter ../databases/human_fasta/GCF_000001405.40_GRCh38.p14_genomic.fna -t 10 -n 20 -o aviary
09/11/2023 06:29:54 PM INFO: Version - 0.7.2                                                                                                                  
09/11/2023 06:29:54 PM INFO: Configuration file written to /localdata/researchdrive/gkanaan/seaice_methylation/aviary/config.yaml                             
09/11/2023 06:29:54 PM INFO: Executing: snakemake --snakefile /Accounts/gkanaan/.conda/envs/binning/lib/python3.10/site-packages/aviary/modules/Snakefile --directory /localdata/researchdrive/gkanaan/seaice_methylation/aviary --cores 20 --rerun-incomplete   --configfile /localdata/researchdrive/gkanaan/seaice_methylation/aviary/config.yaml --nolock  --conda-frontend mamba --resources mem_mb=256000   --use-conda --conda-prefix /Accounts/gkanaan/.conda/envs   flye_assembly
Building DAG of jobs...                                                                                                                                       
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /usr/bin/bash                                                                                                                                    
Provided cores: 20                                                                                                                                            
Rules claiming more threads will be scaled down.                                                                                                              
Provided resources: mem_mb=256000                                                                                                                             
Job stats:                                                                                                                                                    
job                      count    min threads    max threads                                                                                                  
---------------------  -------  -------------  -------------                                                                                                  
flye_assembly                1             10             10                                                                                                  
get_reads_list_ref           1             10             10                                                                                                  
get_umapped_reads_ref        1              1              1                                                                                                  
map_reads_ref                1             10             10                                                                                                  
total                        4              1             10                                                                                                  
                                                                                                                                                              
gtdbtk_folder does not point to a folder
Select jobs to execute...                                                                                                                                     
                                                                                                                                                              
[Mon Sep 11 18:30:04 2023]                                                                                                                                    
rule map_reads_ref:                                                                                                                                           
    input: /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode01/barcode01.fastq, /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode02/barcode02.fastq, /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode03/barcode03.fastq, /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode05/barcode05.fastq, /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode06/barcode06.fastq, /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode07/barcode07.fastq, /localdata/researchdrive/gkanaan/databases/human_fasta/GCF_000001405.40_GRCh38.p14_genomic.fna
    output: data/raw_mapped_ref.bam, data/raw_mapped_ref.bai                                                                                                  
    jobid: 3                                                                                                                                                  
    benchmark: benchmarks/map_reads_ref.benchmark.txt                                                                                                         
    reason: Missing output files: data/raw_mapped_ref.bam                                                                                                     
    threads: 10                                                                                                                                               
    resources: tmpdir=/tmp                                                                                                                                    

Activating conda environment: ../../../../../Accounts/gkanaan/.conda/envs/9448f1f7457bb9ac58b046a43edc6842_
[E::hts_open_format] Failed to open file "data/raw_mapped_ref.bai" : No such file or directory
samtools view: failed to open "data/raw_mapped_ref.bai" for reading: No such file or directory
[M::mm_idx_gen::89.858*1.52] collected minimizers
[M::mm_idx_gen::97.290*2.05] sorted minimizers
[M::main::97.290*2.05] loaded/built the index for 705 target sequence(s)
[M::mm_mapopt_update::102.283*2.00] mid_occ = 509
[M::mm_idx_stat] kmer size: 19; skip: 10; is_hpc: 1; #seq: 705
[M::mm_idx_stat::104.259*1.98] distinct minimizers: 86723810 (42.99% are singletons); average occurrences: 4.620; average spacing: 8.232; total length: 3298430636
[M::worker_pipeline::165.006*4.76] mapped 588727 sequences
[M::worker_pipeline::209.055*5.84] mapped 415657 sequences
[M::worker_pipeline::230.728*6.16] mapped 480957 sequences
[M::worker_pipeline::311.214*7.08] mapped 220564 sequences
[M::worker_pipeline::380.643*7.62] mapped 235529 sequences
[M::worker_pipeline::453.522*8.01] mapped 230323 sequences
[M::worker_pipeline::535.428*8.31] mapped 232307 sequences
[M::worker_pipeline::607.527*8.52] mapped 230645 sequences
[M::worker_pipeline::677.500*8.67] mapped 211058 sequences
[M::worker_pipeline::687.692*8.68] mapped 28518 sequences
[M::worker_pipeline::710.546*8.70] mapped 562875 sequences
[M::worker_pipeline::795.067*8.80] mapped 374432 sequences
[M::worker_pipeline::798.763*8.81] mapped 16472 sequences
[M::worker_pipeline::809.185*8.81] mapped 659641 sequences
[Mon Sep 11 18:43:34 2023]
Error in rule map_reads_ref:
    jobid: 3
    output: data/raw_mapped_ref.bam, data/raw_mapped_ref.bai
    conda-env: /Accounts/gkanaan/.conda/envs/9448f1f7457bb9ac58b046a43edc6842_
    shell:
        minimap2 -ax map-pb --split-prefix=tmp -t 10 /localdata/researchdrive/gkanaan/databases/human_fasta/GCF_000001405.40_GRCh38.p14_genomic.fna /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode01/barcode01.fastq /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode02/barcode02.fastq /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode03/barcode03.fastq /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode05/barcode05.fastq /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode06/barcode06.fastq /localdata/researchdrive/gkanaan/seaice_methylation/fastqs/decontaminated/barcode07/barcode07.fastq | samtools view -@ 10 -b > data/raw_mapped_ref.bam data/raw_mapped_ref.bai && samtools index data/raw_mapped_ref.bam data/raw_mapped_ref.bai                                                           
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)                                                     

Removing output files of failed job map_reads_ref since they might be corrupted:                                                                             
data/raw_mapped_ref.bam
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
An error occurred
Complete log: .snakemake/log/2023-09-11T182955.464551.snakemake.log
09/11/2023 06:43:35 PM CRITICAL: Command '['snakemake', '--snakefile', '/Accounts/gkanaan/.conda/envs/binning/lib/python3.10/site-packages/aviary/modules/Snakefile', '--directory', '/localdata/researchdrive/gkanaan/seaice_methylation/aviary', '--cores', '20', '--rerun-incomplete', '--configfile', '/localdata/researchdrive/gkanaan/seaice_methylation/aviary/config.yaml', '--nolock', '--conda-frontend', 'mamba', '--resources', 'mem_mb=256000', '--use-conda', '--conda-prefix', '/Accounts/gkanaan/.conda/envs', 'flye_assembly']' returned non-zero exit status 1.                                                                      
@rhysnewell
Owner

Apologies, it looks like a typo slipped into the reference filter workflow a couple of weeks ago and evaded testing. I'm working on a fix for you now.
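
For readers tracing the failure in the log above: the shell command for map_reads_ref places data/raw_mapped_ref.bai directly after the redirect target, and samtools view then reports that it cannot open that file. The sketch below reproduces only the shell behaviour involved, under the assumption that this stray filename is the typo being referred to. show_args is a hypothetical stand-in for samtools view, and the file names are placeholders, so this is an illustration rather than the actual fix applied to aviary.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical stand-in for `samtools view`;
# it only reports the arguments the shell hands to it.
show_args() { printf 'args=%s\n' "$*"; }

workdir=$(mktemp -d)

# Same shape as the failing rule: a second filename follows the redirect target.
# The shell sends stdout to out.bam and passes out.bai as an ordinary argument,
# which a tool like samtools view would then try to open as an input file.
show_args -b > "$workdir/out.bam" "$workdir/out.bai"
typo_args=$(cat "$workdir/out.bam")

# Without the stray filename, the command receives only its flags.
show_args -b > "$workdir/out.bam"
fixed_args=$(cat "$workdir/out.bam")

echo "with stray filename:    $typo_args"
echo "without stray filename: $fixed_args"
```

In the first call the extra path shows up in the argument list even though it was written after the redirect, which matches the "failed to open ... for reading" message in the log.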

@Ge0rges
Author

Ge0rges commented Sep 12, 2023

Thank you, I appreciate the prompt support.

@Ge0rges
Author

Ge0rges commented Sep 13, 2023

Hi @rhysnewell, I noticed the fix, thanks! Is there a nightly build I can install to get it?

@rhysnewell
Owner

It's not currently ready, but it might work for your use case since you are using long reads only. You can clone the repo, check out the branch that commit is on, and then run pip install . from within the aviary directory.

It might be easier to just wait, though.

@Ge0rges
Author

Ge0rges commented Sep 13, 2023

Got it, I'll wait, thanks!

@rhysnewell
Owner

This should be fixed as of v0.8.0; the release is available via pip.
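
A minimal sketch for checking whether a given version is at or past the fixed release. version_ge is a hypothetical helper, not part of aviary, and it assumes a sort that supports -V (as GNU coreutils does). The two version strings are the ones that appear in this thread: 0.7.2 from the log and 0.8.0 from the comment above.

```shell
#!/usr/bin/env bash
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` (version sort) being available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed_in="0.8.0"    # release named in the comment above
installed="0.7.2"   # version reported in the log at the top of this issue

if version_ge "$installed" "$fixed_in"; then
    echo "aviary $installed should include the fix"
else
    echo "aviary $installed predates $fixed_in; upgrade via pip"
fi
```

Version sort is used instead of plain string comparison so that a later release such as 0.10.0 is still ordered after 0.8.0.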

@rhysnewell
Owner

Also, for future reference, the reference filter flag should now accept multiple files as input.
