nf-core/methylseq is a bioinformatics analysis pipeline for methylation (bisulfite) sequencing data. It pre-processes raw data from FastQ inputs, aligns the reads and performs extensive quality control on the results.
The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker containers, making installation trivial and results highly reproducible.
The pipeline allows you to choose between running either Bismark or bwa-meth / MethylDackel.
Choose between workflows by using `--aligner bismark` (default, uses bowtie2 for alignment), `--aligner bismark_hisat` or `--aligner bwameth`.
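As a rough illustration of the aligner choice, the sketch below prints (it does not run) one command per workflow. The helper name `methylseq_cmd` is hypothetical, and the `docker` profile, input glob and genome are just the example values used elsewhere in this README; only flags that appear in this README are used.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the pipeline command for a given aligner.
# The profile, input glob and genome are example values, not requirements.
methylseq_cmd() {
  local aligner="${1:-bismark}"   # bismark is the documented default
  case "$aligner" in
    bismark|bismark_hisat|bwameth) ;;   # the three documented choices
    *) echo "unknown aligner: $aligner" >&2; return 1 ;;
  esac
  printf '%s\n' "nextflow run nf-core/methylseq -profile docker --input '*_R{1,2}.fastq.gz' --genome GRCh37 --aligner $aligner"
}

# Show the command for each workflow without executing anything.
for a in bismark bismark_hisat bwameth; do
  methylseq_cmd "$a"
done
```

Copy the printed line for the workflow you want and adjust the profile, input and genome to your setup.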
| Step | Bismark workflow | bwa-meth workflow |
|---|---|---|
| Generate Reference Genome Index (optional) | Bismark | bwa-meth |
| Raw data QC | FastQC | FastQC |
| Adapter sequence trimming | Trim Galore! | Trim Galore! |
| Align Reads | Bismark | bwa-meth |
| Deduplicate Alignments | Bismark | Picard MarkDuplicates |
| Extract methylation calls | Bismark | MethylDackel |
| Sample report | Bismark | - |
| Summary Report | Bismark | - |
| Alignment QC | Qualimap | Qualimap |
| Sample complexity | Preseq | Preseq |
| Project Report | MultiQC | MultiQC |
1. Install `nextflow`.
2. Install any of Docker, Singularity, Podman, Shifter or Charliecloud for full pipeline reproducibility (please only use Conda as a last resort; see docs).
3. Download the pipeline and test it on a minimal dataset with a single command:

   `nextflow run nf-core/methylseq -profile test,<docker/singularity/podman/shifter/charliecloud/conda/institute>`

   Please check nf-core/configs to see if a custom config file to run nf-core pipelines already exists for your Institute. If so, you can simply use `-profile <institute>` in your command. This will enable either `docker` or `singularity` and set the appropriate execution settings for your local compute environment.
4. Start running your own analysis!

   `nextflow run nf-core/methylseq -profile <docker/singularity/podman/shifter/charliecloud/conda/institute> --input '*_R{1,2}.fastq.gz' --genome GRCh37`
See usage docs for all of the available options when running the pipeline.
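The quick-start steps can be chained so that the real analysis only starts if the test run succeeds. This is a minimal sketch, not part of the pipeline: the variable names `PROFILE`, `INPUT`, `GENOME` and `DRY_RUN` and the helpers `run` and `quickstart` are illustrative assumptions. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Illustrative settings; these are shell variables, not pipeline options.
PROFILE="docker"
INPUT='*_R{1,2}.fastq.gz'   # kept literal so Nextflow expands the glob itself
GENOME="GRCh37"
DRY_RUN="${DRY_RUN:-1}"     # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
# Note: the printed form does not show the quoting of the input glob.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Run the minimal test dataset first; continue only if it succeeds.
quickstart() {
  run nextflow run nf-core/methylseq -profile "test,$PROFILE" &&
  run nextflow run nf-core/methylseq -profile "$PROFILE" --input "$INPUT" --genome "$GENOME"
}

quickstart
```

Setting `DRY_RUN=0` in the environment before running the script executes the two commands instead of printing them, which requires Nextflow and the chosen container engine to be installed.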
The nf-core/methylseq pipeline comes with documentation about the pipeline: usage and output.
These scripts were originally written for use at the National Genomics Infrastructure at SciLifeLab in Stockholm, Sweden.
- Main author: Phil Ewels (@ewels)
- Contributors:
If you would like to contribute to this pipeline, please see the contributing guidelines.
For further information or help, don't hesitate to get in touch on the Slack `#methylseq` channel (you can join with this invite).
If you use nf-core/methylseq for your analysis, please cite it using the following doi: 10.5281/zenodo.2555454
You can cite the `nf-core` publication as follows:
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
In addition, references for the tools and data used in this pipeline are as follows:
- FastQC - https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- Trim Galore! - https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/
- Bismark - 10.1093/bioinformatics/btr167
- bwa-meth - arXiv:1401.1129
- Picard - http://broadinstitute.github.io/picard/
- Qualimap - 10.1093/bioinformatics/btv566
- Preseq - 10.1038/nmeth.2375
- MultiQC - 10.1093/bioinformatics/btw354