GitHub - nf-core/metaboigniter: Pre-processing of mass spectrometry-based metabolomics data with quantification and identification based on MS1 and MS2 data.

Introduction

nf-core/metaboigniter is a bioinformatics pipeline that ingests raw mass spectrometry data in mzML format, typically in the form of peak lists and MS2 spectral data, for comprehensive metabolomics analysis. The key stages involve centroiding, feature detection, adduct detection, alignment, and linking, which progressively refine and align the data. The pipeline can also perform requantification to compensate for missing values and leverages MS2Query for compound identification based on MS2 data, outputting a comprehensive list of detected and potentially identified metabolites.

Centroiding: Converts the continuous mass spectra into a series of discrete points.
Feature Detection: Identifies unique signals or 'features' in the spectra.
Adduct Detection: Identifies adduct ions, which are formed by the interaction of the sample with the ion source.
Alignment: Ensures that the same features across different samples are matched together.
Linking: Establishes connections between features across different ionization modes or adducts.
Requantification: Fills in missing values in the data set for a more complete analysis.
Identification: Uses MS2Query and SIRIUS to identify compounds based on their MS2 spectral data.
Output Generation: Produces a comprehensive list of detected and potentially identified metabolites.

Usage

Note

If you are new to Nextflow and nf-core, please refer to this page on how to set-up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.

First, prepare a samplesheet with your input data that looks as follows:

samplesheet.csv:

sample,level,type,msfile
CONTROL_REP1,MS1,normal,mzML_POS_Quant/X2_Rep1.mzML
CONTROL_REP2,MS1,normal,mzML_POS_Quant/X2_Rep2.mzML
POOL_MS2,MS2,normal,mzML_POS_ID/POOL_MS2.mzML

Each row in this CSV file represents a unique sample, with the details provided in the columns.

sample: This column should contain unique names for each sample. No two samples should share the same name in this column.
level: This column should specify the level of mass spectrometry data contained in each sample file. This can be 'MS1' for files containing only MS1 data, 'MS2' for files containing only MS2 data, and 'MS12' for files containing both MS1 and MS2 data.
type: This column can contain any descriptor of your choice, such as 'normal', 'disease', etc. This is usually used to provide some classification or group identification to your samples.
msfile: This column should contain the path to the mzML file for each sample.

Now, you can run the pipeline using:

nextflow run nf-core/metaboigniter \
   -profile <docker/singularity/.../institute> \
   --input samplesheet.csv \
   --outdir <OUTDIR>

Warning

Please provide pipeline parameters via the CLI or Nextflow -params-file option. Custom config files including those provided by the -c Nextflow option can be used to provide any configuration except for parameters; see docs.

For more details and further functionality, please refer to the usage documentation and the parameter documentation.

Pipeline output

To see the results of an example test run with a full size dataset refer to the results tab on the nf-core website pipeline page. For more details about the output files and reports, please refer to the output documentation.

Credits

nf-core/metaboigniter was originally written by Payam Emami. The DSL2 version was developed with significant contributions from Axel Walter and Efi Kontou.

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #metaboigniter channel (you can join with this invite).

Citations

If you use nf-core/metaboigniter for your analysis, please cite it using the following doi: 10.5281/zenodo.4743790

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.

Name		Name	Last commit message	Last commit date
Latest commit History 261 Commits
.devcontainer		.devcontainer
.github		.github
assets		assets
bin		bin
conf		conf
docs		docs
modules		modules
subworkflows		subworkflows
workflows		workflows
.editorconfig		.editorconfig
.gitattributes		.gitattributes
.gitignore		.gitignore
.gitpod.yml		.gitpod.yml
.nf-core-lint.yaml		.nf-core-lint.yaml
.nf-core.yml		.nf-core.yml
.pre-commit-config.yaml		.pre-commit-config.yaml
.prettierignore		.prettierignore
.prettierrc.yml		.prettierrc.yml
CHANGELOG.md		CHANGELOG.md
CITATIONS.md		CITATIONS.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE		LICENSE
README.md		README.md
main.nf		main.nf
modules.json		modules.json
nextflow.config		nextflow.config
nextflow_schema.json		nextflow_schema.json
pyproject.toml		pyproject.toml
tower.yml		tower.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Introduction

Usage

Pipeline output

Credits

Contributions and Support

Citations

About

Releases 4

Contributors 2

Languages

License

nf-core/metaboigniter

Folders and files

Latest commit

History

Repository files navigation

Introduction

Usage

Pipeline output

Credits

Contributions and Support

Citations

About

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases 4

Contributors 2

Languages