Skip to content

Commit

Permalink
Merge pull request #56 from ctuni/feature/seqfu
Browse files Browse the repository at this point in the history
added seqfu stats
  • Loading branch information
ctuni authored Nov 11, 2024
2 parents 5e56fc3 + e3117f6 commit 46d39d6
Show file tree
Hide file tree
Showing 12 changed files with 295 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ Initial release of nf-core/seqinspector, created with the [nf-core](https://nf-c
- [#20](https://github.com/nf-core/seqinspector/pull/20) Use tags to generate group reports
- [#13](https://github.com/nf-core/seqinspector/pull/13) Generate reports per run, per project and per lane.
- [#49](https://github.com/nf-core/seqinspector/pull/49) Merge with template 3.0.2.
- [#56](https://github.com/nf-core/seqinspector/pull/56) Added SeqFu stats module.
- [#50](https://github.com/nf-core/seqinspector/pull/50) Add an optional subsampling step.
- [#51](https://github.com/nf-core/seqinspector/pull/51) Add nf-test to CI.
- [#63](https://github.com/nf-core/seqinspector/pull/63) Contribution guidelines added about displaying results for new tools
Expand Down
4 changes: 4 additions & 0 deletions CITATIONS.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,10 @@

> Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data [Online].
- [SeqFu](https://telatin.github.io/seqfu2/)

> Telatin A, Fariselli P, Birolo G. SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering 2021, 8, 59. doi.org/10.3390/bioengineering8050059
- [MultiQC](https://pubmed.ncbi.nlm.nih.gov/27312411/)

> Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016 Oct 1;32(19):3047-8. doi: 10.1093/bioinformatics/btw354. Epub 2016 Jun 16. PubMed PMID: 27312411; PubMed Central PMCID: PMC5039924.
Expand Down
9 changes: 9 additions & 0 deletions conf/modules.config
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,15 @@ process {
ext.args = '--quiet'
}

withName: 'SEQFU_STATS' {
ext.args = ''
publishDir = [
path: { "${params.outdir}/seqfu_stats" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
}

withName: 'MULTIQC_GLOBAL' {
ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' }
publishDir = [
Expand Down
14 changes: 14 additions & 0 deletions docs/output.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d

- [Seqtk](#seqtk) - Subsample a specific number of reads per sample
- [FastQC](#fastqc) - Raw read QC
- [SeqFu Stats](#seqfu_stats) - Statistics for FASTA or FASTQ files
- [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline
- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution

Expand Down Expand Up @@ -40,6 +41,19 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d

[FastQC](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) gives general quality metrics about your sequenced reads. It provides information about the quality score distribution across your reads, per base sequence content (%A/T/G/C), adapter contamination and overrepresented sequences. For further reading and documentation see the [FastQC help pages](http://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/).

#### SeqFu Stats

<details markdown="1">
<summary>Output files</summary>

- `seqfu/`
- `*.tsv`: Tab-separated file containing quality metrics.
- `*_mqc.txt`: File containing the same quality metrics as the TSV file, ready to be read by MultiQC.

</details>

[SeqFu](https://telatin.github.io/seqfu2/) is general-purpose program to manipulate and parse information from FASTA/FASTQ files, supporting gzipped input files. Includes functions to interleave and de-interleave FASTQ files, to rename sequences and to count and print statistics on sequence lengths. In this pipeline, the `seqfu stats` module is used to produce general quality metrics statistics.

### MultiQC

nf-core/seqinspector will generate the following MultiQC reports:
Expand Down
5 changes: 5 additions & 0 deletions modules.json
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,11 @@
"git_sha": "cf17ca47590cc578dfb47db1c2a44ef86f89976d",
"installed_by": ["modules"]
},
"seqfu/stats": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
"installed_by": ["modules"]
},
"seqtk/sample": {
"branch": "master",
"git_sha": "666652151335353eef2fcd58880bcef5bc2928e1",
Expand Down
7 changes: 7 additions & 0 deletions modules/nf-core/seqfu/stats/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

51 changes: 51 additions & 0 deletions modules/nf-core/seqfu/stats/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

60 changes: 60 additions & 0 deletions modules/nf-core/seqfu/stats/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

75 changes: 75 additions & 0 deletions modules/nf-core/seqfu/stats/tests/main.nf.test

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

56 changes: 56 additions & 0 deletions modules/nf-core/seqfu/stats/tests/main.nf.test.snap

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions modules/nf-core/seqfu/stats/tests/tags.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 11 additions & 0 deletions workflows/seqinspector.nf
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@

include { SEQTK_SAMPLE } from '../modules/nf-core/seqtk/sample/main'
include { FASTQC } from '../modules/nf-core/fastqc/main'
include { SEQFU_STATS } from '../modules/nf-core/seqfu/stats'

include { MULTIQC as MULTIQC_GLOBAL } from '../modules/nf-core/multiqc/main'
include { MULTIQC as MULTIQC_PER_TAG } from '../modules/nf-core/multiqc/main'
Expand Down Expand Up @@ -58,6 +59,16 @@ workflow SEQINSPECTOR {
ch_multiqc_files = ch_multiqc_files.mix(FASTQC.out.zip)
ch_versions = ch_versions.mix(FASTQC.out.versions.first())


//
// Module: Run SeqFu stats
//
SEQFU_STATS (
ch_samplesheet
)
ch_multiqc_files = ch_multiqc_files.mix(SEQFU_STATS.out.multiqc)
ch_versions = ch_versions.mix(SEQFU_STATS.out.versions.first())

//
// Collate and save software versions
//
Expand Down

0 comments on commit 46d39d6

Please sign in to comment.