Adjustments for #76, rename certain options to be more explicit #81

Merged
merged 16 commits into from
Nov 19, 2018
8 changes: 4 additions & 4 deletions docs/usage.md
@@ -383,17 +383,17 @@ Turn this on to utilize BWA Mem instead of `bwa aln` for alignment. Can be quite

Users can configure to keep/discard/extract certain groups of reads efficiently in the nf-core/eager pipeline.

-### `--bam_keep_mapped_only`
+### `--bam_analyse_mapped_only`

-This can be used to only keep mapped reads for downstream analysis. By default turned off, all reads are kept in the BAM file. Unmapped reads are stored both in BAM and FastQ format e.g. for different downstream processing.
+This can be used to only keep mapped reads in the BAM file for downstream analysis. By default turned off, all reads are kept in the BAM file. Unmapped reads are stored both in BAM and FastQ format e.g. for different downstream processing.
Member

@jfy133 jfy133 Nov 5, 2018

Maybe change the following line to: 'By default turned off, where all reads are kept in the bam file.'

Member Author

Done!


### `--bam_keep_all`
Member

What is the difference between this flag and `--bam_retain_all_reads`?

Member Author

`--bam_discard_unmapped_entirely`

This removes only the unmapped reads output: the BAM file contains only mapped reads, and unmapped reads are discarded entirely (no FastQ or BAM file for them at all).

I guess I'll have to give this another proper thought. Right now `bam_retain_all_reads` is just the switch that turns on BAM filtering in general.


Turned on by default, keeps all reads that were mapped in the dataset.

-### `--bam_filter_reads`
+### `--bam_retain_all_reads`

-Specify this, if you want to filter reads for downstream analysis.
+Specify this, if you want to filter reads for downstream analysis. This keeps all mapped and unmapped reads in the output, but allows for quality threshold filtering using `--bam_mapping_quality_threshold`.

### `--bam_mapping_quality_threshold`

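For illustration, the renamed options might be combined as in the sketch below. It only assembles and prints a command line rather than launching the pipeline. `EXTRA_ARGS` is a hypothetical placeholder for the input and reference options a real run would also need, and the threshold of 30 is an arbitrary example value; neither comes from this PR. Going by the `when:` guard and the `if` branch in the `main.nf` changes of this PR, `--bam_retain_all_reads` appears to switch the filtering step on and `--bam_analyse_mapped_only` to select the mapped/unmapped split within it.

```shell
# Assemble (but do not run) a pipeline invocation using the renamed options.
# EXTRA_ARGS is a hypothetical placeholder for input/reference options;
# it is not part of this diff.
EXTRA_ARGS=""
MIN_MAPQ=30   # example value only

cmd="nextflow run nf-core/eager"
if [ -n "$EXTRA_ARGS" ]; then
  cmd="$cmd $EXTRA_ARGS"
fi
cmd="$cmd --bam_retain_all_reads --bam_analyse_mapped_only"
cmd="$cmd --bam_mapping_quality_threshold $MIN_MAPQ"

# Print the command for inspection instead of executing it.
echo "$cmd"
```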
12 changes: 6 additions & 6 deletions main.nf
@@ -76,8 +76,8 @@ def helpMessage() {
--bwamem Turn on BWA Mem instead of CM/BWA aln for mapping

BAM Filtering
-    --bam_keep_mapped_only Only consider mapped reads for downstream analysis. Unmapped reads are extracted to separate output.
-    --bam_filter_reads Keep all reads in BAM file for downstream analysis
+    --bam_analyse_mapped_only Only consider mapped reads for downstream analysis. Unmapped reads are extracted to separate output.
+    --bam_retain_all_reads Keep all reads in BAM file for downstream analysis
--bam_mapping_quality_threshold Minimum mapping quality for reads filter

DeDuplication
@@ -173,9 +173,9 @@ params.circularfilter = false
params.bwamem = false

//BAM Filtering steps (default = keep mapped and unmapped in BAM file)
-params.bam_keep_mapped_only = false
+params.bam_analyse_mapped_only = false
params.bam_keep_all = true
-params.bam_filter_reads = false
+params.bam_retain_all_reads = false
params.bam_mapping_quality_threshold = 0

//DamageProfiler settings
@@ -715,12 +715,12 @@ process samtools_filter {
file "*.unmapped.bam" optional true
file "*.bai"

-when: "${params.bam_filter_reads}"
+when: "${params.bam_retain_all_reads}"

script:
prefix="$bam" - ~/(\.bam)?/

-if("${params.bam_keep_mapped_only}"){
+if("${params.bam_analyse_mapped_only}"){
"""
samtools view -h $bam | tee >(samtools view - -@ ${task.cpus} -f4 -q ${params.bam_mapping_quality_threshold} -o ${prefix}.unmapped.bam) >(samtools view - -@ ${task.cpus} -F4 -q ${params.bam_mapping_quality_threshold} -o ${prefix}.filtered.bam)
samtools fastq -tn "${prefix}.unmapped.bam" | gzip > "${prefix}.unmapped.fq.gz"
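To make the flag arithmetic in the `samtools view` calls above concrete: `-f4` keeps records whose FLAG has bit 0x4 (read unmapped) set, `-F4` keeps records where that bit is unset, and `-q` drops records whose MAPQ is below the given value. The sketch below mimics that selection with `awk` on a tiny hand-written SAM file, so it runs without samtools. The helper names `mapped_reads` and `unmapped_reads` and the three example records are made up for illustration; this is not how the pipeline itself filters reads.

```shell
# Hypothetical helpers, not part of nf-core/eager.
# $1 = SAM file, $2 = minimum MAPQ (plays the role of -q).

# Rough equivalent of: samtools view -F4 -q <min>  (bit 0x4 unset)
mapped_reads() {
  awk -F'\t' -v q="$2" \
    '!/^@/ && int($2 / 4) % 2 == 0 && ($5 + 0) >= (q + 0) { print $1 }' "$1"
}

# Rough equivalent of: samtools view -f4 -q <min>  (bit 0x4 set)
unmapped_reads() {
  awk -F'\t' -v q="$2" \
    '!/^@/ && int($2 / 4) % 2 == 1 && ($5 + 0) >= (q + 0) { print $1 }' "$1"
}

# Tiny example input: a header line plus three made-up records.
# Columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
sam="$(mktemp)"
printf '@HD\tVN:1.6\n' > "$sam"
printf 'r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$sam"
printf 'r2\t16\tchr1\t200\t10\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$sam"
printf 'r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n' >> "$sam"

echo "mapped, MAPQ >= 0:";   mapped_reads "$sam" 0
echo "unmapped, MAPQ >= 0:"; unmapped_reads "$sam" 0
# Remove "$sam" when it is no longer needed.
```

With a minimum MAPQ of 0, the default of `params.bam_mapping_quality_threshold`, the mapped set is r1 and r2 and the unmapped set is r3. Because the sketch applies the threshold to both branches, as the command above does, a threshold of 20 leaves only r1 in the mapped set and nothing in the unmapped set, since the unmapped example record carries MAPQ 0.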