MultiQC Version 1.8
A huge release, this one has been a long time coming. Due to @ewels being away on paternity leave for over six months it was very delayed and has been nearly a year in the making! During that time there have been 344 commits with 3,370 lines of code added and 1,194 deletions by 19 contributors. That's a lot of changes.
Highlights include:
- Finally removing the annoying YAML warning
- Six new modules, and many large updates to existing modules
- Code restructuring allowing MultiQC to be imported into Python environments and easier running on Windows
- Lots of tiny bug fixes all over the place.
Enjoy the update! And I promise I'll try not to make everyone wait so long for the next release...
Full changelog
New Modules:
- fgbio
- Process family size count hist data from GroupReadsByUmi
- biobambam2
- Added submodule for bamsormadup tool
- Totally cheating - it uses Picard MarkDuplicates but with a custom search pattern and naming
- SeqyClean
- Adds analysis for seqyclean files
- mtnucratio
- Added little helper tool to compute mt to nuclear ratios for NGS data.
- mosdepth
- fast BAM/CRAM depth calculation for WGS, exome, or targeted sequencing
- SexDetErrmine
- Relative coverage and error rate of X and Y chromosomes
Module updates:
- bcl2fastq
- Added handling of demultiplexing of more than 2 reads
- Allow bcl2fastq to parse undetermined barcode information in situations when lane indexes do not start at 1
- BBMap
- Support for scafstats output marked as not yet implemented in docs
- DeDup
- Added handling of clusterfactor and JSON logfiles
- damageprofiler
- Added writing metrics to data output file.
- DeepTools
- fastp
- Fix faulty column handling for the after filtering Q30 rate (#936)
- FastQC
- When including a FastQC section multiple times in one report, the Per Base Sequence Content heatmaps now behave as you would expect.
- Added heatmap showing FastQC status checks for every section report across all samples
- Made sequence content individual plots work after samples have been renamed (#777)
- Highlighting samples from status - respect chosen highlight colour in the toolbox (#742)
- FastQ Screen
- When including a FastQ Screen section multiple times in one report, the plots now behave as you would expect.
- GATK
- Refactored BaseRecalibrator code to be more consistent with MultiQC Python style
- Handle zero count errors in BaseRecalibrator
- HiC Explorer
- Fixed bug where module tries to parse QC_table.txt, a new log file in hicexplorer v2.2.
- HTSeq
- Fixed bug where module would crash if a sample had zero reads (#1006)
- LongRanger
- Added support for the LongRanger Align pipeline.
- miRTrace
- Fixed bug where a sample in some plots was missed. (#932)
- Peddy
- Fixed bug where sample name cleaning could lead to error. (#1024)
- All plots (including Het Check and Sex Check) now hidden if no data
- Picard
- Modified OxoGMetrics.py so that it will find files created with GATK CollectMultipleMetrics and ConvertSequencingArtifactToOxoG.
- QoRTs
- Fixed bug where --dirs broke certain input files. (#821)
- Qualimap
- Added in mean coverage computation for general statistics report
- Now creates tables of collected data in multiqc_data
- RNA-SeQC
- Updated broken URL link
- RSeQC
- Fixed bug where the Junction Saturation plot mislabelled the lines when clicking a single sample.
- When including an RSeQC section multiple times in one report, clicking the Junction Saturation plot now behaves as you would expect.
- Fixed bug where exported data in multiqc_rseqc_read_distribution.txt files had incorrect values for _kb fields (#1017)
- Samtools
- Utilize in-built read_count_multiplier functionality to plot flagstat results more nicely
- SnpEff
- Increased the default summary csv file-size limit from 1MB to 5MB.
- Stacks
- Fixed bug so that multi-population sum stats are now parsed correctly (#906)
- TopHat
- Fixed bug where TopHat would try to run with files from Bowtie2 or HiSAT2 and crash
- VCFTools
- Fixed a bug where tstv_by_qual.py produced invalid JSON from infinity-values.
- snpEff
- Added plot of effects
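The VCFTools fix listed above touches on a general pitfall: Python's json module writes float('inf') as the bare token Infinity, which strict JSON parsers reject. Below is a minimal sketch of one way to guard against this, assuming it is acceptable to replace non-finite values with null. The helper name sanitise_non_finite and the sample data are hypothetical and are not taken from the MultiQC code.

```python
import json
import math


def sanitise_non_finite(value):
    """Recursively replace inf, -inf and NaN floats with None (hypothetical helper)."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitise_non_finite(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitise_non_finite(item) for item in value]
    return value


# Illustrative data only: a ratio becomes infinite when its denominator is zero.
tstv = {"sample_1": [2.1, float("inf")], "sample_2": [1.9, 2.0]}

# By default json.dumps emits the non-standard token Infinity.
loose = json.dumps(tstv)

# allow_nan=False makes json.dumps raise ValueError rather than write invalid
# JSON, so it also serves as a check that the sanitised data is clean.
strict = json.dumps(sanitise_non_finite(tstv), allow_nan=False)
```

Whether null is the right substitute depends on how the plotting code treats missing points, so treat the replacement value as a design choice rather than a rule.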
New MultiQC Features:
- Added some installation docs for Windows
- Added some docs about using MultiQC in bioinformatics pipelines
- Rewrote Docker image
- New base image czentye/matplotlib-minimal reduces image size from ~200MB to ~80MB
- Proper installation method ensures latest version of the code
- New entrypoint allows easier command-line usage
- Support opening MultiQC on websites with CSP script-src 'self' with some sha256 exceptions
- Plot data is no longer intertwined with JavaScript code so hashes stay the same
- Made config.report_section_order work for module sub-sections as well as just modules.
- New config options exclude_modules and run_modules to complement -e and -m cli flags.
- Command line output is now coloured by default 🌈 (use --no-ansi to turn this off)
- Better launch compatibility due to code refactoring by @KerstenBreuer and @ewels
- Windows support for base multiqc command
- Support for running as a Python module: python -m multiqc .
- Support for running within a script: import multiqc and multiqc.run('/path/to/files')
- Config option custom_plot_config now works for bargraph category configs as well (#1044)
- Config table_columns_visible can now be given a module namespace and it will hide all columns from that module (#541)
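The launch options listed above (python -m multiqc, the -m and -e flags, and --no-ansi) can be combined when calling MultiQC from a pipeline script. The following is a rough sketch only: build_multiqc_command is a hypothetical helper written for illustration, not part of MultiQC, and it only assembles the argument list so that it can be inspected without MultiQC being installed.

```python
import sys


def build_multiqc_command(analysis_dir, run_modules=(), exclude_modules=(), ansi=True):
    """Assemble an argument list for `python -m multiqc` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "multiqc", analysis_dir]
    for name in run_modules:
        cmd += ["-m", name]  # -m: only run the named module
    for name in exclude_modules:
        cmd += ["-e", name]  # -e: skip the named module
    if not ansi:
        cmd.append("--no-ansi")  # turn off coloured terminal output
    return cmd


# Example: scan the current directory, skip FastQC, plain (uncoloured) output.
cmd = build_multiqc_command(".", exclude_modules=["fastqc"], ansi=False)

# To actually launch it (requires MultiQC to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

When MultiQC is importable in the same interpreter, the in-script form from the list above (import multiqc followed by multiqc.run('/path/to/files')) avoids the subprocess entirely.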
Bug Fixes:
- MultiQC now ignores all .md5 files
- Use SafeLoader for PyYAML load calls, avoiding recent warning messages.
- Hide multiqc_config_example.yaml in the test directory to stop people from using it without modification.
- Fixed matplotlib background colour issue (@epakarin - #886)
- Table rows that are empty due to hidden columns are now properly hidden on page load (#835)
- Sample name cleaning: All sample names are now truncated to their basename, without a path.
- This now applies to regex and replace as well (previously it applied only to the default, truncate).
- Only affects modules that take sample names from file contents, such as cutadapt.
- See #897 for discussion.
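To illustrate the sample name change above: when a tool records a full input path in its log, the sample name is now reduced to the final path component. This is a minimal sketch of that one step, assuming POSIX-style paths. The function strip_path and the example file names are hypothetical, and MultiQC's real cleaning does more than this (for example, trimming configured extensions).

```python
import os


def strip_path(sample_name):
    """Reduce a sample name to its basename (hypothetical helper, not MultiQC's own code)."""
    return os.path.basename(sample_name)


# Illustrative names only: one taken from a log that recorded a full path,
# one that was already a bare file name.
names = ["/data/run1/sampleA.fastq.gz", "sampleB.fastq.gz"]
cleaned = [strip_path(name) for name in names]
```

Names without a directory component pass through unchanged, so the step is safe to apply to every sample name.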