Support aggregate reporting for demultiplexed FASTQ files #124

mtomko · 2023-10-17T17:35:10Z

Our group has long generated FastQC reports for a single lane of sequencing at a time. Our sequencing provider now provides only demultiplexed FASTQs, which means that we need to look at hundreds of FastQC reports instead of just two. We would be interested in a FastQC option that generates one aggregate report for all of the demultiplexed FASTQ files, summarizing their overall quality. This would be akin to the report generated by simply concatenating all the FASTQ files and running FastQC on the result.

I would consider implementing this myself if it would be welcome.

mtomko · 2023-10-17T17:40:37Z

Ah, my coworker has pointed out that it's possible to do this by reading from standard input:

If you want to run fastqc on a stream of data to be read from standard input then you
can do this by specifying 'stdin' as the name of the file to be processed and then
streaming uncompressed fastq format data to the program. For example:
zcat *fastq.gz | fastqc stdin
If you want the results from a streamed analysis sent to a file with a name other than
stdin then you can add a colon and put the file name you want, for example:
zcat *fastq.gz | fastqc stdin:my_results
..would write results to my_results.html and my_results.zip.
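
If you need more control over which files go into each aggregate (for example, one report per lane), the same streaming idea can be sketched in Python. This is only a minimal sketch, not part of FastQC: the function name `stream_fastq` and the script name in the usage comment are hypothetical, and it assumes every input file ends with a newline so that records from adjacent files do not run together.

```python
import gzip
import shutil
import sys
from pathlib import Path
from typing import BinaryIO, Iterable


def stream_fastq(paths: Iterable[Path], out: BinaryIO) -> None:
    """Write the decompressed contents of each FASTQ file to `out`, in order."""
    for path in paths:
        # Files ending in .gz are decompressed; anything else is copied as-is.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rb") as handle:
            shutil.copyfileobj(handle, out)


if __name__ == "__main__":
    # Hypothetical usage, with this file saved as stream_fastq.py:
    #   python stream_fastq.py sample*_L001_*.fastq.gz | fastqc stdin:lane1
    files = sorted(Path(p) for p in sys.argv[1:])
    stream_fastq(files, sys.stdout.buffer)
```

For the simple case of "everything in this directory", the zcat one-liner above does the same job with less code.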

s-andrews · 2023-10-18T08:11:29Z

You've found one option for this, which will combine the full set of results. To be honest, if you're just looking at data quality, it's pretty unlikely that you'll see a difference in quality between the different split subsets of reads, so any one of the reports is likely to be representative.

The other option to consider is MultiQC (https://multiqc.info/), which you can run in a directory containing multiple reports from FastQC (and other programs), and it will aggregate them into a single combined report. We use this at the end of our sequencing pipelines and it works great for this purpose.
