From 4004f28c5a38f669b303359765b0bcae1550474b Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Wed, 11 Dec 2024 12:15:07 -0800 Subject: [PATCH 01/20] Add draft of lipidomics documentation --- docs/index_lipid.rst | 114 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 114 insertions(+) create mode 100644 docs/index_lipid.rst diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst new file mode 100644 index 0000000..eece248 --- /dev/null +++ b/docs/index_lipid.rst @@ -0,0 +1,114 @@ +Lipidomics Workflow +============================== + +Summary +------- + +The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow ( part of MetaMS) +is built using PNNL’s CoreMS software framework. The workflow leverages many features of CoreMS as well as PNNL’s MetabRef LC-MS database to process LC-MS/MS data and identify lipids. +The initial signal processing includes peak picking, integration of mass features, deconvolution of MS1, and calculation of peak shape metrics. +The workflow associates MS1 spectra with their corresponding MS2 spectra (only for data-dependent acqusition, currently). +It uses the MS2 spectra to search an in-silico spectra database for lipids and uses the MS1 data to assign a molecular formula. +Each candidate lipid assignment is given two confidence scores: one for its match to the predicted molecular formula based on the mass accuracy and fine isotopic structure +and a second for the MS2 spectral matching for filtering and selecting the best match. + + +Workflow Diagram +------------------ + +.. 
image:: metamsworkflow.png + + +Workflow Dependencies +--------------------- + +Third party software +~~~~~~~~~~~~~~~~~~~~ + +- CoreMS version 3.0 or greater (2-clause BSD) +- Click (BSD 3-Clause "New" or "Revised" License) + +Database +~~~~~~~~~~~~~~~~ +- PNNL Metabolomics LC-MS in silico Spectral Database (https://metabref.emsl.pnnl.gov/) + +Workflow Availability +--------------------- + +The workflow is available in GitHub: +https://github.com/microbiomedata/metaMS + +The container is available at Docker Hub (microbiomedata/metaMS): +https://hub.docker.com/r/microbiomedata/metams + +The python package is available on PyPi: +https://pypi.org/project/metaMS/ + +The database is available by request. +Please contact NMDC (support@microbiomedata.org) for access. + +Test datasets +------------- +#TODO KRH: add test datasets somewhere + +Execution Details +--------------------- + +This workflow should be executed using the provided wdl file (wdl/metaMS_lipidomics.wdl). + +Example command to run the workflow: +``` +miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files +``` + +Inputs +~~~~~~~~ +Only data-dependent acquisition is supported at this time, but both HCD and CID fragmentation are supported. +Data must be collected in profile mode for MS1. + +To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl. + +The following inputs are required: +- Supported file formats for LC-MS data: + - ThermoFisher mass spectrometry data files (.raw) + - mzML mass spectrometry data files (.mzml) +- Parameter files: + - CoreMS Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_corems_params.toml. + - Scan Translator Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. + - MetabRef configuration key (metabref.token). 
See MetabRef documentation (https://metabref.emsl.pnnl.gov/api) for how to generate a token. +- Cores (optional): + - How many cores to use for processing. Default is 1. + +Outputs +~~~~~~~~ + +- Metabolites data-table + - Peak data table with annotated lipids (.csv) + - HDF: CoreMS HDF5 format +- Workflow Metadata: + - TOML : CoreMS TOML format + + +Requirements for Execution +-------------------------- + +- Docker Container Runtime + + or +- Python Environment >= 3.11 +- .NET or appropriate runtime (i.e. pythonnet). Only if processing ThermoFisher raw files. +- Python Dependencies are listed on requirements.txt + +Hardware Requirements +-------------------------- +- To run this application, we reccomend a processor with at least 2.0 GHz speed, 8GB of RAM, 10GB of free hard disk space + +Version History +--------------- + +- #TODO KRH: add version history + +Point of contact +---------------- + +Package maintainer: Katherine R. Heal From 3b7dbc7c5a9b6f2b8d1962a7153486a964ed107b Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Wed, 11 Dec 2024 17:04:15 -0800 Subject: [PATCH 02/20] Edit documentation for lipid workflow --- docs/index_lipid.rst | 40 ++++++++++++++++++++++------------------ 1 file changed, 22 insertions(+), 18 deletions(-) diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst index eece248..a8cbcef 100644 --- a/docs/index_lipid.rst +++ b/docs/index_lipid.rst @@ -1,22 +1,23 @@ -Lipidomics Workflow +Lipidomics Workflow (v1.0.0) ============================== Summary ------- -The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow ( part of MetaMS) +The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow (part of MetaMS) is built using PNNL’s CoreMS software framework. The workflow leverages many features of CoreMS as well as PNNL’s MetabRef LC-MS database to process LC-MS/MS data and identify lipids. 
-The initial signal processing includes peak picking, integration of mass features, deconvolution of MS1, and calculation of peak shape metrics. -The workflow associates MS1 spectra with their corresponding MS2 spectra (only for data-dependent acqusition, currently). -It uses the MS2 spectra to search an in-silico spectra database for lipids and uses the MS1 data to assign a molecular formula. +The initial signal processing includes peak picking, integration of mass features, deconvolution of MS1, and calculation of peak shape metrics. +The workflow associates MS1 spectra with their corresponding MS2 spectra (only for data-dependent acqusition, currently). +It uses the MS2 spectra to search an in-silico spectra database for lipids and uses the MS1 data to assign a molecular formula. Each candidate lipid assignment is given two confidence scores: one for its match to the predicted molecular formula based on the mass accuracy and fine isotopic structure -and a second for the MS2 spectral matching for filtering and selecting the best match. +and a second for the MS2 spectral matching for filtering and selecting the best match. Workflow Diagram ------------------ .. image:: metamsworkflow.png +#TODO KRH: add lipidomics workflow diagram Workflow Dependencies @@ -27,6 +28,7 @@ Third party software - CoreMS version 3.0 or greater (2-clause BSD) - Click (BSD 3-Clause "New" or "Revised" License) +- miniwdl (MIT License) Database ~~~~~~~~~~~~~~~~ @@ -57,25 +59,26 @@ Execution Details This workflow should be executed using the provided wdl file (wdl/metaMS_lipidomics.wdl). Example command to run the workflow: -``` -miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files -``` + +.. 
code-block:: bash + + miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files Inputs ~~~~~~~~ -Only data-dependent acquisition is supported at this time, but both HCD and CID fragmentation are supported. -Data must be collected in profile mode for MS1. +Only data collected in profile mode for MS1 and data-dependent acquisition for MS2 is supported at this time. To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl. The following inputs are required: -- Supported file formats for LC-MS data: + +- LC-MS data in one of the following formats: - ThermoFisher mass spectrometry data files (.raw) - mzML mass spectrometry data files (.mzml) -- Parameter files: - - CoreMS Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_corems_params.toml. - - Scan Translator Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. - - MetabRef configuration key (metabref.token). See MetabRef documentation (https://metabref.emsl.pnnl.gov/api) for how to generate a token. +- Workflow inputs: + - CoreMS Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_corems_params.toml. + - Scan Translator Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. + - MetabRef configuration key (metabref.token). See MetabRef documentation (https://metabref.emsl.pnnl.gov/api) for how to generate a token. - Cores (optional): - How many cores to use for processing. Default is 1. @@ -86,13 +89,14 @@ Outputs - Peak data table with annotated lipids (.csv) - HDF: CoreMS HDF5 format - Workflow Metadata: - - TOML : CoreMS TOML format + - CoreMS Parameter file (.toml), the full set of parameters used in the workflow, some of which are set dynamically within the workflow. 
Requirements for Execution -------------------------- - Docker Container Runtime +- miniwdl (v1, https://pypi.org/project/miniwdl/) or - Python Environment >= 3.11 @@ -101,7 +105,7 @@ Requirements for Execution Hardware Requirements -------------------------- -- To run this application, we reccomend a processor with at least 2.0 GHz speed, 8GB of RAM, 10GB of free hard disk space +- To run this application, we recommend a processor with at least 2.0 GHz speed, 8GB of RAM, 10GB of free hard disk space Version History --------------- From 710c60f2ddbba6cdffce4f57bfba0b9597b27928 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Wed, 11 Dec 2024 17:13:20 -0800 Subject: [PATCH 03/20] Modify Readme and separate workflow readmes --- README.md | 137 ++------------------------------ docs/README_GCMS.md | 187 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 192 insertions(+), 132 deletions(-) create mode 100644 docs/README_GCMS.md diff --git a/README.md b/README.md index bfa0625..a37a417 100644 --- a/README.md +++ b/README.md @@ -17,145 +17,18 @@ - [Docker Container](#metams-docker-container) # MetaMS -**MetaMS** is a workflow for metabolomics data processing and annotation +**MetaMS** is a repository of workflows for metabolomics data processing and annotation in association with the NMDC ([National Microbiome Data Collaborative](https://microbiomedata.org/)) ## Current Version ### `2.2.3` -### Data input formats +## Available Workflows -- ANDI NetCDF for GC-MS (.cdf) -- CoreMS self-containing Hierarchical Data Format (.hdf5) -- ChemStation Agilent (Ongoing) +- [GC/MS metabolomics workflow](docs/README_GCMS.md). 
+- LC/MS lipidomics workflow -### Data output formats - -- Pandas data frame (can be saved using pickle, h5, etc) -- Text Files (.csv, tab separated .txt, etc) -- Microsoft Excel (xlsx) -- JSON for workflow metadata -- Self-containing Hierarchical Data Format (.hdf5) including raw data and ime-series data-point for processed data-sets with all associated metadata stored as json attributes - -### Data structure types - -- GC-MS - -## Available features - -### Signal Processing - -- Baseline detection, subtraction, smoothing -- m/z based Chromatogram Peak Deconvolution, -- Manual and automatic noise threshold calculation -- First and second derivatives peak picking methods -- Peak Area Calculation - - -### Calibration - -- Retention Index Linear XXX method - -### Compound Identification - -- Automatic local (SQLite) or external (MongoDB or PostgreSQL) database check, generation, and search -- Automatic molecular match algorithm with all spectral similarity methods - -## MetaMS Installation - -Make sure you have python 3.9.13 installed before continue - -- PyPi: -```bash -pip3 install metams -``` - -- From source: - ```bash -pip3 install --editable . 
-``` - -To be able to open chemstation files a installation of pythonnet is needed: -- Windows: - ```bash - pip3 install pythonnet - ``` - -- Mac and Linux: - ```bash - brew install mono - pip3 install pythonnet - ``` - -## Execution - -```bash -metaMS dump-gcms-toml-template gcms_metams.toml -``` -```bash -metaMS dump-gcms-corems-toml-template gcms_corems.toml -``` - - Modify the gcms_metams.toml and gcms_corems.toml accordingly to your dataset and workflow parameters -make sure to include gcms_corems.json path inside the gcms_metams.toml: "corems_toml_path": "path_to_corems.toml" - -```bash -metaMS run-gcms-workflow path_to_gcms_metams.toml -``` - -## MiniWDL - -Make sure you have python 3.9.13 installed before continue - -MiniWDL uses the microbiome/metaMS image so there is not need to install metaMS - -- Change wdl/gcms_metams_input.json to specify the data location - -- Change data/gcms_corems.toml to specify the workflow parameters - -Install miniWDL: -```bash -pip3 install miniwdl -``` - -Call: -```bash -miniwdl run wdl/metaMS_gcms.wdl -i wdl/metams_input_gcms.json --verbose --no-cache --copy-input-files -``` -## MetaMS Docker Container - -You will need docker and docker compose: - -If you don't have it installed, the easiest way is to [install docker for desktop](https://www.docker.com/products/docker-desktop/) - -- Pull from Docker Registry: - - ```bash - docker pull microbiomedata/metams:latest - - ``` -- or Build the image from source: - - ```bash - docker build -t microbiomedata/metams:latest . 
- ``` -- Run Workflow from Container: - - $(data_dir) = full path of directory containing the gcms data - $(config_dir) = full path of directory containing configuration and parameters metams.toml and corems.toml - ```bash - docker run -v $(data_dir):/metaB/data -v $(config_dir):/metaB/configuration microbiomedata/metams:latest metaMS run-gcms-workflow /metaB/configuration/metams.toml - ``` - -- Getting the parameters templates: - - ```bash - docker run -v $(config_dir):/metaB/configuration microbiomedata/metams:latest metaMS dump-json-template /metaB/configuration/metams.toml - ``` - - ```bash - docker run -v $(config_dir):/metaB/configuration microbiomedata/metams:latest metaMS dump-corems-json-template /metaB/configuration/corems.toml - ``` +For information about data input, output, and execution for the individual workflows, see above. ## Disclaimer diff --git a/docs/README_GCMS.md b/docs/README_GCMS.md new file mode 100644 index 0000000..7f82d7e --- /dev/null +++ b/docs/README_GCMS.md @@ -0,0 +1,187 @@ +# Table of Contents +- Introduction + - [MetaMS's GC/MS Metabolomics Workflow](#metamss-gcms-metabolomics-workflow) + - [Version](#current-version) + - [Data Input](#data-input-formats) + - [Data Output](#data-output-formats) + - [Data Structure](#data-structure-types) + - [Features](#available-features) + - [Code Documentation](https://emsl-computing.github.io/MetaMS/) + +- Installation + - [PyPi](#metams-installation) + +- Execution: + - [CLI](#execution) + - [MiniWDL](#MiniWDL) + - [Docker Container](#metams-docker-container) + +# MetaMS's GC/MS Metabolomics Workflow + +## Current Version + +### `2.2.3` + +## Available Workflows + +- GC/MS metabolomics workflow +- LC/MS lipidomics workflow + +### Data input formats + +- ANDI NetCDF for GC-MS (.cdf) +- CoreMS self-containing Hierarchical Data Format (.hdf5) +- ChemStation Agilent (Ongoing) + +### Data output formats + +- Pandas data frame (can be saved using pickle, h5, etc) +- Text Files (.csv, tab separated 
.txt, etc) +- Microsoft Excel (xlsx) +- JSON for workflow metadata +- Self-containing Hierarchical Data Format (.hdf5) including raw data and ime-series data-point for processed data-sets with all associated metadata stored as json attributes + +### Data structure types + +- GC-MS + +## Available features + +### Signal Processing + +- Baseline detection, subtraction, smoothing +- m/z based Chromatogram Peak Deconvolution, +- Manual and automatic noise threshold calculation +- First and second derivatives peak picking methods +- Peak Area Calculation + + +### Calibration + +- Retention Index Linear XXX method + +### Compound Identification + +- Automatic local (SQLite) or external (MongoDB or PostgreSQL) database check, generation, and search +- Automatic molecular match algorithm with all spectral similarity methods + +## MetaMS Installation + +Make sure you have python 3.9.13 installed before continue + +- PyPi: +```bash +pip3 install metams +``` + +- From source: + ```bash +pip3 install --editable . 
+``` + +To be able to open chemstation files a installation of pythonnet is needed: +- Windows: + ```bash + pip3 install pythonnet + ``` + +- Mac and Linux: + ```bash + brew install mono + pip3 install pythonnet + ``` + +## Execution + +```bash +metaMS dump-gcms-toml-template gcms_metams.toml +``` +```bash +metaMS dump-gcms-corems-toml-template gcms_corems.toml +``` + + Modify the gcms_metams.toml and gcms_corems.toml accordingly to your dataset and workflow parameters +make sure to include gcms_corems.json path inside the gcms_metams.toml: "corems_toml_path": "path_to_corems.toml" + +```bash +metaMS run-gcms-workflow path_to_gcms_metams.toml +``` + +## MiniWDL + +Make sure you have python 3.9.13 installed before continue + +MiniWDL uses the microbiome/metaMS image so there is not need to install metaMS + +- Change wdl/gcms_metams_input.json to specify the data location + +- Change data/gcms_corems.toml to specify the workflow parameters + +Install miniWDL: +```bash +pip3 install miniwdl +``` + +Call: +```bash +miniwdl run wdl/metaMS_gcms.wdl -i wdl/metams_input_gcms.json --verbose --no-cache --copy-input-files +``` +## MetaMS Docker Container + +You will need docker and docker compose: + +If you don't have it installed, the easiest way is to [install docker for desktop](https://www.docker.com/products/docker-desktop/) + +- Pull from Docker Registry: + + ```bash + docker pull microbiomedata/metams:latest + + ``` +- or Build the image from source: + + ```bash + docker build -t microbiomedata/metams:latest . 
+ ``` +- Run Workflow from Container: + + $(data_dir) = full path of directory containing the gcms data + $(config_dir) = full path of directory containing configuration and parameters metams.toml and corems.toml + ```bash + docker run -v $(data_dir):/metaB/data -v $(config_dir):/metaB/configuration microbiomedata/metams:latest metaMS run-gcms-workflow /metaB/configuration/metams.toml + ``` + +- Getting the parameters templates: + + ```bash + docker run -v $(config_dir):/metaB/configuration microbiomedata/metams:latest metaMS dump-json-template /metaB/configuration/metams.toml + ``` + + ```bash + docker run -v $(config_dir):/metaB/configuration microbiomedata/metams:latest metaMS dump-corems-json-template /metaB/configuration/corems.toml + ``` + +## Disclaimer + +This material was prepared as an account of work sponsored by an agency of the +United States Government. Neither the United States Government nor the United +States Department of Energy, nor Battelle, nor any of their employees, nor any +jurisdiction or organization that has cooperated in the development of these +materials, makes any warranty, express or implied, or assumes any legal +liability or responsibility for the accuracy, completeness, or usefulness or +any information, apparatus, product, software, or process disclosed, or +represents that its use would not infringe privately owned rights. + +Reference herein to any specific commercial product, process, or service by +trade name, trademark, manufacturer, or otherwise does not necessarily +constitute or imply its endorsement, recommendation, or favoring by the United +States Government or any agency thereof, or Battelle Memorial Institute. The +views and opinions of authors expressed herein do not necessarily state or +reflect those of the United States Government or any agency thereof. 
+ + PACIFIC NORTHWEST NATIONAL LABORATORY + operated by + BATTELLE + for the + UNITED STATES DEPARTMENT OF ENERGY + under Contract DE-AC05-76RL01830 \ No newline at end of file From d4911257309e1f28e96b75ce529989a9b17cbe94 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Wed, 11 Dec 2024 17:15:40 -0800 Subject: [PATCH 04/20] Fix Readme table of contents --- README.md | 17 +++-------------- 1 file changed, 3 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index a37a417..32072d2 100644 --- a/README.md +++ b/README.md @@ -1,20 +1,9 @@ # Table of Contents -- Introduction - [MetaMS](#MetaMS) - [Version](#current-version) - - [Data Input](#data-input-formats) - - [Data Output](#data-output-formats) - - [Data Structure](#data-structure-types) - - [Features](#available-features) - - [Code Documentation](https://emsl-computing.github.io/MetaMS/) - -- Installation - - [PyPi](#metams-installation) - -- Execution: - - [CLI](#execution) - - [MiniWDL](#MiniWDL) - - [Docker Container](#metams-docker-container) + - [Available Workflows](#available-workflows) + - [Disclaimer](#disclaimer) + # MetaMS **MetaMS** is a repository of workflows for metabolomics data processing and annotation in association with the NMDC ([National Microbiome Data Collaborative](https://microbiomedata.org/)) From f39cf82ea69214383c32c87d8d10f8be91c5ab86 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 09:20:22 -0800 Subject: [PATCH 05/20] Add lipid workflow specific documentation and make file command to generate --- Makefile | 3 +- README.md | 8 +-- docs/README_LCMS_LIPID.md | 148 ++++++++++++++++++++++++++++++++++++++ docs/convert_rst_to_md.py | 14 ++++ requirements-dev.txt | 3 +- 5 files changed, 170 insertions(+), 6 deletions(-) create mode 100644 docs/README_LCMS_LIPID.md create mode 100644 docs/convert_rst_to_md.py diff --git a/Makefile b/Makefile index c0cc5d5..3c54f50 100644 --- a/Makefile +++ b/Makefile @@ -67,5 +67,6 @@ wdl-run : miniwdl run 
wdl/metaMS.wdl -i wdl/metams_input.json --verbose --no-cache --copy-input-files - +convert_lipid_rst_to_md: + python docs/convert_rst_to_md.py \ No newline at end of file diff --git a/README.md b/README.md index 32072d2..68b978c 100644 --- a/README.md +++ b/README.md @@ -3,7 +3,7 @@ - [Version](#current-version) - [Available Workflows](#available-workflows) - [Disclaimer](#disclaimer) - + # MetaMS **MetaMS** is a repository of workflows for metabolomics data processing and annotation in association with the NMDC ([National Microbiome Data Collaborative](https://microbiomedata.org/)) @@ -14,10 +14,10 @@ ## Available Workflows -- [GC/MS metabolomics workflow](docs/README_GCMS.md). -- LC/MS lipidomics workflow +- [GC/MS metabolomics workflow](docs/README_GCMS.md) +- [LC/MS lipidomics workflow](docs/README_LCMS_LIPID.md) -For information about data input, output, and execution for the individual workflows, see above. +For information about data input, output, and execution for the individual workflows, follow the linked readmes above. ## Disclaimer diff --git a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md new file mode 100644 index 0000000..3fc6816 --- /dev/null +++ b/docs/README_LCMS_LIPID.md @@ -0,0 +1,148 @@ +# Lipidomics Workflow (v1.0.0) + +## Summary + +The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics +workflow (part of MetaMS) is built using PNNL's CoreMS software +framework. The workflow leverages many features of CoreMS as well as +PNNL's MetabRef LC-MS database to process LC-MS/MS data and identify +lipids. The initial signal processing includes peak picking, integration +of mass features, deconvolution of MS\1\, and calculation +of peak shape metrics. The workflow associates MS\1\ +spectra with their corresponding MS\2\ spectra (only for +data-dependent acqusition, currently). It uses the MS\2\ +spectra to search an in-silico spectra database for lipids and uses the +MS\1\ data to assign a molecular formula. 
Each candidate +lipid assignment is given two confidence scores: one for its match to +the predicted molecular formula based on the mass accuracy and fine +isotopic structure and a second for the MS\2\ spectral +matching for filtering and selecting the best match. + +## Workflow Diagram + +![image](metamsworkflow.png) + +#TODO KRH: add lipidomics workflow diagram + +## Workflow Dependencies + +### Third party software + +- CoreMS version 3.0 or greater (2-clause BSD) +- Click (BSD 3-Clause \"New\" or \"Revised\" License) +- miniwdl (MIT License) + +### Database + +- PNNL Metabolomics LC-MS in silico Spectral Database + (<https://metabref.emsl.pnnl.gov/>) + +## Workflow Availability + +The workflow is available in GitHub: +<https://github.com/microbiomedata/metaMS> + +The container is available at Docker Hub (microbiomedata/metaMS): +<https://hub.docker.com/r/microbiomedata/metams> + +The python package is available on PyPi: +<https://pypi.org/project/metaMS/> + +The database is available by request. Please contact NMDC +(<support@microbiomedata.org>) for access. + +## Test datasets + +#TODO KRH: add test datasets somewhere + +## Execution Details + +This workflow should be executed using the provided wdl file +(wdl/metaMS_lipidomics.wdl). + +Example command to run the workflow: + +``` bash +miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files +``` + +### Inputs + +Only data collected in profile mode for MS\1\ and +data-dependent acquisition for MS\2\ is supported at this +time. + +To use the wdl, inputs should be specified in a json file. See example +input json file in wdl/metaMS_lipidomics.wdl. + +The following inputs are required: + +- + + LC-MS data in one of the following formats: + + : - ThermoFisher mass spectrometry data files (.raw) + - mzML mass spectrometry data files (.mzml) + +- + + Workflow inputs: + + : - CoreMS Parameter file (.toml). See example in + configuration/lipid_configs/emsl_lipidomics_corems_params.toml. + - Scan Translator Parameter file (.toml). See example in + configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. 
+ - MetabRef configuration key (metabref.token). See MetabRef + documentation (<https://metabref.emsl.pnnl.gov/api>) for how + to generate a token. + +- + + Cores (optional): + + : - How many cores to use for processing. Default is 1. + +### Outputs + +- + + Metabolites data-table + + : - Peak data table with annotated lipids (.csv) + - HDF: CoreMS HDF5 format + +- + + Workflow Metadata: + + : - CoreMS Parameter file (.toml), the full set of parameters + used in the workflow, some of which are set dynamically + within the workflow. + +## Requirements for Execution + +- Docker Container Runtime + +- miniwdl (v1, <https://pypi.org/project/miniwdl/>) + + or + +- Python Environment \>= 3.11 + +- .NET or appropriate runtime (i.e. pythonnet). Only if processing + ThermoFisher raw files. + +- Python Dependencies are listed on requirements.txt + +## Hardware Requirements + +- To run this application, we recommend a processor with at least 2.0 + GHz speed, 8GB of RAM, 10GB of free hard disk space + +## Version History + +- #TODO KRH: add version history + +## Point of contact + +Package maintainer: Katherine R. 
Heal \<\> diff --git a/docs/convert_rst_to_md.py b/docs/convert_rst_to_md.py new file mode 100644 index 0000000..3f8cd96 --- /dev/null +++ b/docs/convert_rst_to_md.py @@ -0,0 +1,14 @@ +import pypandoc + +def convert_rst_to_md(input_file, output_file): + # Convert RST to Markdown + output = pypandoc.convert_file(input_file, 'md', format='rst') + + # Write the output to the MD file + with open(output_file, 'w') as f: + f.write(output) + +if __name__ == "__main__": + input_file = 'docs/index_lipid.rst' + output_file = 'docs/README_LCMS_LIPID.md' + convert_rst_to_md(input_file, output_file) \ No newline at end of file diff --git a/requirements-dev.txt b/requirements-dev.txt index 6361449..b9705a7 100644 --- a/requirements-dev.txt +++ b/requirements-dev.txt @@ -4,4 +4,5 @@ pytest-cov pyprof2calltree memory_profiler twine -bumpversion \ No newline at end of file +bumpversion +pypandoc \ No newline at end of file From 33d270ae9231394f7c6a3c72df993d75d5293717 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 09:55:49 -0800 Subject: [PATCH 06/20] WIP convert from md to rst --- Makefile | 4 +- docs/README_LCMS_LIPID.md | 8 ++- docs/convert_rst_to_md.py | 14 ---- docs/index_lipid.rst | 148 ++++++++++++++++++++++++-------------- 4 files changed, 103 insertions(+), 71 deletions(-) delete mode 100644 docs/convert_rst_to_md.py diff --git a/Makefile b/Makefile index 3c54f50..bc6d01b 100644 --- a/Makefile +++ b/Makefile @@ -64,9 +64,9 @@ docker-run: @docker run -v $(data_dir):/metams/data -v $(config_dir):/metams/configuration microbiomedata/metams:latest metaMS run-gcms-workflow /metams/configuration/metams.toml wdl-run : - miniwdl run wdl/metaMS.wdl -i wdl/metams_input.json --verbose --no-cache --copy-input-files convert_lipid_rst_to_md: - python docs/convert_rst_to_md.py + # convert the lipid documentation from rst to md + pandoc -f rst -t markdown -o docs/README_LCMS_LIPID.md docs/index_lipid.rst \ No newline at end of file diff --git 
a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md index 3fc6816..ef88688 100644 --- a/docs/README_LCMS_LIPID.md +++ b/docs/README_LCMS_LIPID.md @@ -1,5 +1,7 @@ # Lipidomics Workflow (v1.0.0) +MS$^{1}$ + ## Summary The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics @@ -8,10 +10,10 @@ framework. The workflow leverages many features of CoreMS as well as PNNL's MetabRef LC-MS database to process LC-MS/MS data and identify lipids. The initial signal processing includes peak picking, integration of mass features, deconvolution of MS\1\, and calculation -of peak shape metrics. The workflow associates MS\1\ +of peak shape metrics. The workflow associates MS1 spectra with their corresponding MS\2\ spectra (only for -data-dependent acqusition, currently). It uses the MS\2\ -spectra to search an in-silico spectra database for lipids and uses the +data-dependent acqusition, currently). It uses the MS^2^ spectra to +search an in-silico spectra database for lipids and uses the MS\1\ data to assign a molecular formula. 
Each candidate lipid assignment is given two confidence scores: one for its match to the predicted molecular formula based on the mass accuracy and fine diff --git a/docs/convert_rst_to_md.py b/docs/convert_rst_to_md.py deleted file mode 100644 index 3f8cd96..0000000 --- a/docs/convert_rst_to_md.py +++ /dev/null @@ -1,14 +0,0 @@ -import pypandoc - -def convert_rst_to_md(input_file, output_file): - # Convert RST to Markdown - output = pypandoc.convert_file(input_file, 'md', format='rst') - - # Write the output to the MD file - with open(output_file, 'w') as f: - f.write(output) - -if __name__ == "__main__": - input_file = 'docs/index_lipid.rst' - output_file = 'docs/README_LCMS_LIPID.md' - convert_rst_to_md(input_file, output_file) \ No newline at end of file diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst index a8cbcef..713784c 100644 --- a/docs/index_lipid.rst +++ b/docs/index_lipid.rst @@ -1,24 +1,36 @@ Lipidomics Workflow (v1.0.0) -============================== +============================ + +MS\ :math:`^{1}` Summary ------- -The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow (part of MetaMS) -is built using PNNL’s CoreMS software framework. The workflow leverages many features of CoreMS as well as PNNL’s MetabRef LC-MS database to process LC-MS/MS data and identify lipids. -The initial signal processing includes peak picking, integration of mass features, deconvolution of MS1, and calculation of peak shape metrics. -The workflow associates MS1 spectra with their corresponding MS2 spectra (only for data-dependent acqusition, currently). -It uses the MS2 spectra to search an in-silico spectra database for lipids and uses the MS1 data to assign a molecular formula. -Each candidate lipid assignment is given two confidence scores: one for its match to the predicted molecular formula based on the mass accuracy and fine isotopic structure -and a second for the MS2 spectral matching for filtering and selecting the best match. 
- +The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics +workflow (part of MetaMS) is built using PNNL’s CoreMS software +framework. The workflow leverages many features of CoreMS as well as +PNNL’s MetabRef LC-MS database to process LC-MS/MS data and identify +lipids. The initial signal processing includes peak picking, integration +of mass features, deconvolution of MS1, and calculation of +peak shape metrics. The workflow associates MS1 spectra with their +corresponding MS2 spectra (only for data-dependent +acqusition, currently). It uses the MS\ :sup:`2` spectra to search an +in-silico spectra database for lipids and uses the MS1 data +to assign a molecular formula. Each candidate lipid assignment is given +two confidence scores: one for its match to the predicted molecular +formula based on the mass accuracy and fine isotopic structure and a +second for the MS2 spectral matching for filtering and +selecting the best match. Workflow Diagram ------------------- +---------------- -.. image:: metamsworkflow.png -#TODO KRH: add lipidomics workflow diagram +.. 
figure:: metamsworkflow.png + :alt: image + image + +#TODO KRH: add lipidomics workflow diagram Workflow Dependencies --------------------- @@ -26,13 +38,15 @@ Workflow Dependencies Third party software ~~~~~~~~~~~~~~~~~~~~ -- CoreMS version 3.0 or greater (2-clause BSD) -- Click (BSD 3-Clause "New" or "Revised" License) -- miniwdl (MIT License) +- CoreMS version 3.0 or greater (2-clause BSD) +- Click (BSD 3-Clause "New" or "Revised" License) +- miniwdl (MIT License) + +Database +~~~~~~~~ -Database -~~~~~~~~~~~~~~~~ -- PNNL Metabolomics LC-MS in silico Spectral Database (https://metabref.emsl.pnnl.gov/) +- PNNL Metabolomics LC-MS in silico Spectral Database + (https://metabref.emsl.pnnl.gov/) Workflow Availability --------------------- @@ -46,71 +60,101 @@ https://hub.docker.com/r/microbiomedata/metams The python package is available on PyPi: https://pypi.org/project/metaMS/ -The database is available by request. -Please contact NMDC (support@microbiomedata.org) for access. +The database is available by request. Please contact NMDC +(support@microbiomedata.org) for access. Test datasets ------------- + #TODO KRH: add test datasets somewhere Execution Details ---------------------- +----------------- -This workflow should be executed using the provided wdl file (wdl/metaMS_lipidomics.wdl). +This workflow should be executed using the provided wdl file +(wdl/metaMS_lipidomics.wdl). Example command to run the workflow: -.. code-block:: bash +.. code:: bash - miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files + miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files Inputs -~~~~~~~~ -Only data collected in profile mode for MS1 and data-dependent acquisition for MS2 is supported at this time. +~~~~~~ -To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl. 
+Only data collected in profile mode for MS1 and +data-dependent acquisition for MS2 is supported at this time. + +To use the wdl, inputs should be specified in a json file. See example +input json file in wdl/metaMS_lipidomics.wdl. The following inputs are required: -- LC-MS data in one of the following formats: - - ThermoFisher mass spectrometry data files (.raw) - - mzML mass spectrometry data files (.mzml) -- Workflow inputs: - - CoreMS Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_corems_params.toml. - - Scan Translator Parameter file (.toml). See example in configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. - - MetabRef configuration key (metabref.token). See MetabRef documentation (https://metabref.emsl.pnnl.gov/api) for how to generate a token. -- Cores (optional): - - How many cores to use for processing. Default is 1. +- + + LC-MS data in one of the following formats: + - ThermoFisher mass spectrometry data files (.raw) + - mzML mass spectrometry data files (.mzml) + +- + + Workflow inputs: + - CoreMS Parameter file (.toml). See example in + configuration/lipid_configs/emsl_lipidomics_corems_params.toml. + - Scan Translator Parameter file (.toml). See example in + configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. + - MetabRef configuration key (metabref.token). See MetabRef + documentation (https://metabref.emsl.pnnl.gov/api) for how to + generate a token. + +- + + Cores (optional): + - How many cores to use for processing. Default is 1. Outputs -~~~~~~~~ +~~~~~~~ -- Metabolites data-table - - Peak data table with annotated lipids (.csv) - - HDF: CoreMS HDF5 format -- Workflow Metadata: - - CoreMS Parameter file (.toml), the full set of parameters used in the workflow, some of which are set dynamically within the workflow. 
+- + Metabolites data-table + - Peak data table with annotated lipids (.csv) + - HDF: CoreMS HDF5 format + +- + + Workflow Metadata: + - CoreMS Parameter file (.toml), the full set of parameters used + in the workflow, some of which are set dynamically within the + workflow. Requirements for Execution -------------------------- -- Docker Container Runtime -- miniwdl (v1, https://pypi.org/project/miniwdl/) - - or -- Python Environment >= 3.11 -- .NET or appropriate runtime (i.e. pythonnet). Only if processing ThermoFisher raw files. -- Python Dependencies are listed on requirements.txt +- Docker Container Runtime + +- miniwdl (v1, https://pypi.org/project/miniwdl/) + + or + +- Python Environment >= 3.11 + +- .NET or appropriate runtime (i.e. pythonnet). Only if processing + ThermoFisher raw files. + +- Python Dependencies are listed on requirements.txt Hardware Requirements --------------------------- -- To run this application, we recommend a processor with at least 2.0 GHz speed, 8GB of RAM, 10GB of free hard disk space +--------------------- + +- To run this application, we recommend a processor with at least 2.0 + GHz speed, 8GB of RAM, 10GB of free hard disk space Version History --------------- -- #TODO KRH: add version history +- #TODO KRH: add version history Point of contact ---------------- From c829356abe1074844acd22570bed2026b3a1705b Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 09:57:17 -0800 Subject: [PATCH 07/20] WIP convert from md to rst --- docs/README_LCMS_LIPID.md | 2 +- docs/index_lipid.rst | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md index ef88688..7c0d7a7 100644 --- a/docs/README_LCMS_LIPID.md +++ b/docs/README_LCMS_LIPID.md @@ -1,6 +1,6 @@ # Lipidomics Workflow (v1.0.0) -MS$^{1}$ +MS^1^ ## Summary diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst index 713784c..532b819 100644 --- a/docs/index_lipid.rst +++ b/docs/index_lipid.rst @@ -1,7 
+1,7 @@ Lipidomics Workflow (v1.0.0) ============================ -MS\ :math:`^{1}` +MS\ :sup:`1` Summary ------- From 876e6691a573ab21580c9bb778be0a32f1866442 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 14:00:09 -0800 Subject: [PATCH 08/20] Add most of lipid documentation --- docs/README_LCMS_LIPID.md | 149 ------------------------------------ docs/index_lipid.html | 106 ++++++++++++++++++++++++++ docs/index_lipid.md | 116 ++++++++++++++++++++++++++++ docs/index_lipid.rst | 156 +++++++++++++++----------------------- 4 files changed, 283 insertions(+), 244 deletions(-) create mode 100644 docs/index_lipid.html create mode 100644 docs/index_lipid.md diff --git a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md index 7c0d7a7..8b13789 100644 --- a/docs/README_LCMS_LIPID.md +++ b/docs/README_LCMS_LIPID.md @@ -1,150 +1 @@ -# Lipidomics Workflow (v1.0.0) -MS^1^ - -## Summary - -The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics -workflow (part of MetaMS) is built using PNNL's CoreMS software -framework. The workflow leverages many features of CoreMS as well as -PNNL's MetabRef LC-MS database to process LC-MS/MS data and identify -lipids. The initial signal processing includes peak picking, integration -of mass features, deconvolution of MS\1\, and calculation -of peak shape metrics. The workflow associates MS1 -spectra with their corresponding MS\2\ spectra (only for -data-dependent acqusition, currently). It uses the MS^2^ spectra to -search an in-silico spectra database for lipids and uses the -MS\1\ data to assign a molecular formula. Each candidate -lipid assignment is given two confidence scores: one for its match to -the predicted molecular formula based on the mass accuracy and fine -isotopic structure and a second for the MS\2\ spectral -matching for filtering and selecting the best match. 
- -## Workflow Diagram - -![image](metamsworkflow.png) - -#TODO KRH: add lipidomics workflow diagram - -## Workflow Dependencies - -### Third party software - -- CoreMS version 3.0 or greater (2-clause BSD) -- Click (BSD 3-Clause \"New\" or \"Revised\" License) -- miniwdl (MIT License) - -### Database - -- PNNL Metabolomics LC-MS in silico Spectral Database - () - -## Workflow Availability - -The workflow is available in GitHub: - - -The container is available at Docker Hub (microbiomedata/metaMS): - - -The python package is available on PyPi: - - -The database is available by request. Please contact NMDC -() for access. - -## Test datasets - -#TODO KRH: add test datasets somewhere - -## Execution Details - -This workflow should be executed using the provided wdl file -(wdl/metaMS_lipidomics.wdl). - -Example command to run the workflow: - -``` bash -miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files -``` - -### Inputs - -Only data collected in profile mode for MS\1\ and -data-dependent acquisition for MS\2\ is supported at this -time. - -To use the wdl, inputs should be specified in a json file. See example -input json file in wdl/metaMS_lipidomics.wdl. - -The following inputs are required: - -- - - LC-MS data in one of the following formats: - - : - ThermoFisher mass spectrometry data files (.raw) - - mzML mass spectrometry data files (.mzml) - -- - - Workflow inputs: - - : - CoreMS Parameter file (.toml). See example in - configuration/lipid_configs/emsl_lipidomics_corems_params.toml. - - Scan Translator Parameter file (.toml). See example in - configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. - - MetabRef configuration key (metabref.token). See MetabRef - documentation () for how - to generate a token. - -- - - Cores (optional): - - : - How many cores to use for processing. Default is 1. 
- -### Outputs - -- - - Metabolites data-table - - : - Peak data table with annotated lipids (.csv) - - HDF: CoreMS HDF5 format - -- - - Workflow Metadata: - - : - CoreMS Parameter file (.toml), the full set of parameters - used in the workflow, some of which are set dynamically - within the workflow. - -## Requirements for Execution - -- Docker Container Runtime - -- miniwdl (v1, ) - - or - -- Python Environment \>= 3.11 - -- .NET or appropriate runtime (i.e. pythonnet). Only if processing - ThermoFisher raw files. - -- Python Dependencies are listed on requirements.txt - -## Hardware Requirements - -- To run this application, we recommend a processor with at least 2.0 - GHz speed, 8GB of RAM, 10GB of free hard disk space - -## Version History - -- #TODO KRH: add version history - -## Point of contact - -Package maintainer: Katherine R. Heal \<\> diff --git a/docs/index_lipid.html b/docs/index_lipid.html new file mode 100644 index 0000000..158f6ab --- /dev/null +++ b/docs/index_lipid.html @@ -0,0 +1,106 @@ +

Lipidomics Workflow (v1.0.0)

+
+metamsworkflow.png +
image
+
+

#TODO KRH: replace with lipid diagram when available

+

Overview

+

The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics +workflow (part of MetaMS) is built using PNNL’s CoreMS software +framework. The workflow leverages many features of CoreMS as well as +PNNL’s MetabRef LC-MS database to process LC-MS/MS data and identify +lipids. The initial signal processing includes peak picking, integration +of mass features, deconvolution of MS1 spectra, and calculation of peak +shape metrics. The workflow associates MS1 spectra with their +corresponding MS2 spectra. It uses the MS2 spectra to search an +in-silico spectra database for lipids and uses the deconvoluted MS1 +spectra to assign a molecular formula. Each candidate lipid assignment +is given two confidence scores: one for its match to the predicted +molecular formula based on the mass accuracy and fine isotopic structure +and a second for the MS2 spectral matching for filtering and selecting +the best match.

+

Workflow Availability

+

The workflow is available in GitHub: https://github.com/microbiomedata/metaMS

+

The container is available at Docker Hub (microbiomedata/metaMS): https://hub.docker.com/r/microbiomedata/metams

+

The python package is available on PyPi: https://pypi.org/project/metaMS/

+

The database is available by request. Please contact NMDC (support@microbiomedata.org) +for access.

+

Requirements for Execution

+

The recommended way to run the workflow is via the provided wdl file +and the miniwdl package. Using the wdl file requires the following:

+

Hardware Requirements

+

To run this application, we recommend a processor with at least 2.0 +GHz speed, 8GB of RAM, 10GB of free hard disk space.

+

Software Requirements

+ +

Note that the wdl file will automatically pull the necessary +Docker image with the required software dependencies.

+

Database

+ +

Test datasets

+

#TODO KRH: add test datasets here

+

Execution Details

+

This workflow should be executed using the wdl file provided in the +MetaMS package (wdl/metaMS_lipidomics.wdl).

+

Example command to run the workflow:

+
miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files
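The command above can also be assembled from a script, which helps when the same workflow is launched for many input files. The following is a minimal sketch, not part of MetaMS: the function names `build_miniwdl_command` and `run_workflow` are hypothetical, and only the flags shown in the documented command are used.

```python
import shlex
import subprocess
from typing import List


def build_miniwdl_command(wdl_path: str, input_json: str) -> List[str]:
    """Return the argv list for the documented `miniwdl run` invocation.

    Hypothetical helper: it only mirrors the flags from the example command.
    """
    return [
        "miniwdl", "run", wdl_path,
        "-i", input_json,
        "--verbose", "--no-cache", "--copy-input-files",
    ]


def run_workflow(wdl_path: str, input_json: str) -> int:
    """Launch the workflow and return its exit code.

    Assumes miniwdl and a Docker runtime are installed and on PATH.
    """
    cmd = build_miniwdl_command(wdl_path, input_json)
    print("Running:", shlex.join(cmd))
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Print the command only; call run_workflow(...) to actually execute it.
    example = build_miniwdl_command(
        "wdl/metaMS_lipidomics.wdl", "wdl/metams_input_lipidomics.json"
    )
    print(shlex.join(example))
```

Passing the arguments as a list avoids shell quoting issues when a path contains spaces.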
+

Inputs

+

Only data collected in profile mode for MS1 and data-dependent +acquisition for MS2 is supported at this time.

+

To use the wdl, inputs should be specified in a json file. See +example input json file in wdl/metams_input_lipidomics.json.

+

The following inputs are required (declared in the input json +file):

+
    +
  • LC-MS/MS data file locations in one of the following formats +
      +
    • ThermoFisher mass spectrometry data files (.raw)
    • +
    • mzML mass spectrometry data files (.mzml)
    • +
  • +
  • Workflow inputs +
      +
    • CoreMS Parameter file (.toml)
    • +
    • Scan Translator Parameter file (.toml)
    • +
    • MetabRef configuration key (metabref.token). See the MetabRef +documentation (https://metabref.emsl.pnnl.gov/api) +for how to generate a token.
    • +
  • +
  • Cores (optional input) +
      +
    • How many cores to use for processing. Default is 1.
    • +
  • +
+
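The input constraints listed above can be checked before launching a run. The sketch below is an illustration only: `check_inputs` and the keys of the dictionary it returns are hypothetical and are not the input names declared in the wdl file. It checks file extensions only and does not verify that the files exist or that the MetabRef token is valid.

```python
from pathlib import Path
from typing import Dict, List

DATA_SUFFIXES = {".raw", ".mzml"}  # documented LC-MS/MS data formats
PARAM_SUFFIX = ".toml"             # CoreMS and Scan Translator parameter files


def check_inputs(
    data_files: List[str],
    corems_params: str,
    scan_translator_params: str,
    cores: int = 1,
) -> Dict[str, object]:
    """Validate the documented input constraints and return a summary dict.

    The returned keys are hypothetical placeholders, not WDL input names.
    Raises ValueError listing every problem found.
    """
    problems = []
    if not data_files:
        problems.append("no data files given")
    for name in data_files:
        # Compare case-insensitively so both .mzml and .mzML pass.
        if Path(name).suffix.lower() not in DATA_SUFFIXES:
            problems.append(f"unsupported data file: {name}")
    for label, name in (
        ("CoreMS parameter file", corems_params),
        ("Scan Translator parameter file", scan_translator_params),
    ):
        if Path(name).suffix.lower() != PARAM_SUFFIX:
            problems.append(f"{label} must be a .toml file: {name}")
    if cores < 1:
        problems.append("cores must be at least 1")
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "data_files": list(data_files),
        "corems_params": corems_params,
        "scan_translator_params": scan_translator_params,
        "cores": cores,  # documented default is 1
    }
```

Collecting all problems before raising reports every issue in one pass instead of stopping at the first.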

Outputs

+
    +
  • Lipidomics data +
      +
    • Peak data table with annotated lipids (.csv)
    • +
    • HDF: CoreMS HDF5 format of CoreMS LCMS object for further +analysis
    • +
  • +
  • Workflow Metadata +
      +
    • CoreMS Parameter file (.toml), the full set of parameters used in +the workflow, some of which are set dynamically within the +workflow.
    • +
  • +
+

Version History

+
    +
  • #TODO KRH: add version history
  • +
+

Point of contact

+

Package maintainer: Katherine R. Heal <katherine.heal@pnnl.gov>

diff --git a/docs/index_lipid.md b/docs/index_lipid.md new file mode 100644 index 0000000..0842d62 --- /dev/null +++ b/docs/index_lipid.md @@ -0,0 +1,116 @@ +# Lipidomics Workflow (v1.0.0) + +
+metamsworkflow.png +
image
+
+ +#TODO KRH: replace with lipid diagram when available + +## Overview + +The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics +workflow (part of MetaMS) is built using PNNL's CoreMS software +framework. The workflow leverages many features of CoreMS as well as +PNNL's MetabRef LC-MS database to process LC-MS/MS data and identify +lipids. The initial signal processing includes peak picking, integration +of mass features, deconvolution of MS1 spectra, and calculation of peak +shape metrics. The workflow associates MS1 spectra with their +corresponding MS2 spectra. It uses the MS2 spectra to search an +in-silico spectra database for lipids and uses the deconvoluted MS1 +spectra to assign a molecular formula. Each candidate lipid assignment +is given two confidence scores: one for its match to the predicted +molecular formula based on the mass accuracy and fine isotopic structure +and a second for the MS2 spectral matching for filtering and selecting +the best match. + +## Workflow Availability + +The workflow is available in GitHub: + + +The container is available at Docker Hub (microbiomedata/metaMS): + + +The python package is available on PyPi: + + +The database is available by request. Please contact NMDC +() for access. + +## Requirements for Execution + +The recommended way to run the workflow is via the provided wdl file and +the miniwdl package. Using the wdl file requires the following: + +### Hardware Requirements + +To run this application, we recommend a processor with at least 2.0 GHz +speed, 8GB of RAM, 10GB of free hard disk space. 
+ +### Software Requirements + +- Docker Container Runtime +- miniwdl (v1, ) + +*Note that the wdl file will automatically pull the necessary docker +with the required software dependencies.* + +### Database + +- PNNL Metabolomics LC-MS *in silico* Spectral Database + () + +## Test datasets + +#TODO KRH: add test datasets here + +## Execution Details + +This workflow should be executed using the wdl file provided in the +MetaMS package (wdl/metaMS_lipidomics.wdl). + +Example command to run the workflow: + + miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files + +### Inputs + +Only data collected in profile mode for MS1 and data-dependent +acquisition for MS2 is supported at this time. + +To use the wdl, inputs should be specified in a json file. See example +input json file in wdl/metaMS_lipidomics.wdl. + +The following inputs are required (declared in the input json file): + +- LC-MS/MS data file locations in one of the following formats + - ThermoFisher mass spectrometry data files (.raw) + - mzML mass spectrometry data files (.mzml) +- Workflow inputs + - CoreMS Parameter file (.toml) + - Scan Translator Parameter file (.toml) + - MetabRef configuration key (metabref.token). See \[MetabRef + documentation\] () for how + to generate a token. +- Cores (optional input) + - How many cores to use for processing. Default is 1. + +### Outputs + +- Lipidomics data + - Peak data table with annotated lipids (.csv) + - HDF: CoreMS HDF5 format of CoreMS LCMS object for further + analysis +- Workflow Metadata + - CoreMS Parameter file (.toml), the full set of parameters used + in the workflow, some of which are set dynamically within the + workflow. + +## Version History + +- #TODO KRH: add version history + +## Point of contact + +Package maintainer: Katherine R. 
Heal \<\> diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst index 532b819..d27f896 100644 --- a/docs/index_lipid.rst +++ b/docs/index_lipid.rst @@ -1,53 +1,31 @@ Lipidomics Workflow (v1.0.0) ============================ -MS\ :sup:`1` +.. figure:: metamsworkflow.png + :alt: image + + image -Summary -------- +#TODO KRH: replace with lipid diagram when available + +Overview +-------- The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow (part of MetaMS) is built using PNNL’s CoreMS software framework. The workflow leverages many features of CoreMS as well as PNNL’s MetabRef LC-MS database to process LC-MS/MS data and identify lipids. The initial signal processing includes peak picking, integration -of mass features, deconvolution of MS1, and calculation of +of mass features, deconvolution of MS1 spectra, and calculation of peak shape metrics. The workflow associates MS1 spectra with their -corresponding MS2 spectra (only for data-dependent -acqusition, currently). It uses the MS\ :sup:`2` spectra to search an -in-silico spectra database for lipids and uses the MS1 data +corresponding MS2 spectra. It uses the MS2 spectra to search an +in-silico spectra database for lipids and uses the deconvoluted MS1 spectra to assign a molecular formula. Each candidate lipid assignment is given two confidence scores: one for its match to the predicted molecular formula based on the mass accuracy and fine isotopic structure and a -second for the MS2 spectral matching for filtering and +second for the MS2 spectral matching for filtering and selecting the best match. -Workflow Diagram ----------------- - -.. 
figure:: metamsworkflow.png - :alt: image - - image - -#TODO KRH: add lipidomics workflow diagram - -Workflow Dependencies ---------------------- - -Third party software -~~~~~~~~~~~~~~~~~~~~ - -- CoreMS version 3.0 or greater (2-clause BSD) -- Click (BSD 3-Clause "New" or "Revised" License) -- miniwdl (MIT License) - -Database -~~~~~~~~ - -- PNNL Metabolomics LC-MS in silico Spectral Database - (https://metabref.emsl.pnnl.gov/) - Workflow Availability --------------------- @@ -63,98 +41,86 @@ https://pypi.org/project/metaMS/ The database is available by request. Please contact NMDC (support@microbiomedata.org) for access. +Requirements for Execution +-------------------------- +The recommended way to run the workflow is via the provided wdl file and the miniwdl package. +Using the wdl file requires the following: + +Hardware Requirements +~~~~~~~~~~~~~~~~~~~~~ +To run this application, we recommend a processor with at least 2.0 GHz speed, 8GB of RAM, 10GB of free hard disk space. + +Software Requirements +~~~~~~~~~~~~~~~~~~~~~ +- Docker Container Runtime +- miniwdl (v1, https://pypi.org/project/miniwdl/) + +*Note that the wdl file will automatically pull the necessary docker with the required software dependencies.* + +Database +~~~~~~~~ + +- PNNL Metabolomics LC-MS *in silico* Spectral Database + (https://metabref.emsl.pnnl.gov/) + +The in-silico lipid spectra in the database are generated from the LipidBlast database (v68), found at https://systemsomicslab.github.io/compms/msdial/main.html. +Note that there is no retention time in the PNNL version of the database. 
+ Test datasets ------------- -#TODO KRH: add test datasets somewhere +- An example dataset can be downloaded from here: https://nmdcdemo.emsl.pnnl.gov/lipidomics/blanchard_11_8ws97026/Blanch_Nat_Lip_H_32_AB_O_19_NEG_25Jan18_Brandi-WCSH5801.raw +- Example CoreMS Parameter file (applicable to the example dataset) +- Example Scan Translator file (applicable to the example dataset) Execution Details ----------------- -This workflow should be executed using the provided wdl file +This workflow should be executed using the wdl file provided in the MetaMS package (wdl/metaMS_lipidomics.wdl). Example command to run the workflow: -.. code:: bash +.. code-block:: - miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files + miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files Inputs ~~~~~~ -Only data collected in profile mode for MS1 and -data-dependent acquisition for MS2 is supported at this time. +Only data collected in profile mode for MS1 and +data-dependent acquisition for MS2 is supported at this time. To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl. -The following inputs are required: +The following inputs are required (declared in the input json file): -- - - LC-MS data in one of the following formats: - - ThermoFisher mass spectrometry data files (.raw) - - mzML mass spectrometry data files (.mzml) - -- - - Workflow inputs: - - CoreMS Parameter file (.toml). See example in - configuration/lipid_configs/emsl_lipidomics_corems_params.toml. - - Scan Translator Parameter file (.toml). See example in - configuration/lipid_configs/emsl_lipidomics_scan_translator_params.toml. - - MetabRef configuration key (metabref.token). See MetabRef - documentation (https://metabref.emsl.pnnl.gov/api) for how to - generate a token. - -- - - Cores (optional): - - How many cores to use for processing. 
Default is 1. +- LC-MS/MS data file locations in one of the following formats + - ThermoFisher mass spectrometry data files (.raw) + - mzML mass spectrometry data files (.mzml) +- Workflow inputs + - CoreMS Parameter file (.toml) + - Scan Translator Parameter file (.toml) + - MetabRef configuration key (metabref.token). See [MetabRef documentation] (https://metabref.emsl.pnnl.gov/api) for how to generate a token. +- Cores (optional input) + - How many cores to use for processing. Default is 1. Outputs ~~~~~~~ -- - - Metabolites data-table - - Peak data table with annotated lipids (.csv) - - HDF: CoreMS HDF5 format - -- +- Lipidomics data + - Peak data table with annotated lipids (.csv) + - HDF: CoreMS HDF5 format of CoreMS LCMS object for further analysis - Workflow Metadata: - - CoreMS Parameter file (.toml), the full set of parameters used - in the workflow, some of which are set dynamically within the - workflow. - -Requirements for Execution --------------------------- - -- Docker Container Runtime - -- miniwdl (v1, https://pypi.org/project/miniwdl/) - - or - -- Python Environment >= 3.11 - -- .NET or appropriate runtime (i.e. pythonnet). Only if processing - ThermoFisher raw files. - -- Python Dependencies are listed on requirements.txt - -Hardware Requirements ---------------------- -- To run this application, we recommend a processor with at least 2.0 - GHz speed, 8GB of RAM, 10GB of free hard disk space +- Workflow Metadata + - CoreMS Parameter file (.toml), the full set of parameters used in the workflow, some of which are set dynamically within the workflow. 
Version History --------------- -- #TODO KRH: add version history +#TODO KRH: add version history Point of contact ---------------- From ba201d873bb2cb0fb7d6c9533061c59cef3cb4b0 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 14:06:09 -0800 Subject: [PATCH 09/20] Add example files to rst documentation --- docs/index_lipid.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst index d27f896..4d19ebc 100644 --- a/docs/index_lipid.rst +++ b/docs/index_lipid.rst @@ -70,8 +70,8 @@ Test datasets ------------- - An example dataset can be downloaded from here: https://nmdcdemo.emsl.pnnl.gov/lipidomics/blanchard_11_8ws97026/Blanch_Nat_Lip_H_32_AB_O_19_NEG_25Jan18_Brandi-WCSH5801.raw -- Example CoreMS Parameter file (applicable to the example dataset) -- Example Scan Translator file (applicable to the example dataset) +- Example CoreMS Parameter file (applicable to the example dataset): https://nmdcdemo.emsl.pnnl.gov/lipidomics/parameter_files/emsl_lipidomics_corems_params.toml #TODO KRH: still needs to be uploaded +- Example Scan Translator file (applicable to the example dataset): https://nmdcdemo.emsl.pnnl.gov/lipidomics/parameter_files/emsl_lipidomics_scan_translator.toml Execution Details ----------------- From 2bb48671cf14dc31956a8dfb29f3964839ef0532 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 14:12:48 -0800 Subject: [PATCH 10/20] Add first draft of lipidomics workflow documentation --- docs/README_LCMS_LIPID.md | 130 ++++++++++++++++++++++++++++++++++++++ docs/index_lipid.html | 23 +++++-- docs/index_lipid.md | 116 ---------------------------------- docs/index_lipid.rst | 10 +-- 4 files changed, 154 insertions(+), 125 deletions(-) delete mode 100644 docs/index_lipid.md diff --git a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md index 8b13789..3cf1019 100644 --- a/docs/README_LCMS_LIPID.md +++ b/docs/README_LCMS_LIPID.md @@ -1 +1,131 @@ +# Lipidomics 
Workflow (v1.0.0) +
+metamsworkflow.png +
image
+
+ +#TODO KRH: replace with lipid diagram when available + +## Overview + +The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics +workflow (part of MetaMS) is built using PNNL's CoreMS software +framework. The workflow leverages many features of CoreMS as well as +PNNL's MetabRef LC-MS database to process LC-MS/MS data and identify +lipids. The initial signal processing includes peak picking, integration +of mass features, deconvolution of MS1 spectra, and calculation of peak +shape metrics. The workflow associates MS1 spectra with their +corresponding MS2 spectra. It uses the MS2 spectra to search an +in-silico spectra database for lipids and uses the deconvoluted MS1 +spectra to assign a molecular formula. Each candidate lipid assignment +is given two confidence scores: one for its match to the predicted +molecular formula based on the mass accuracy and fine isotopic structure +and a second for the MS2 spectral matching for filtering and selecting +the best match. + +Note that only data collected in profile mode for MS1 and data-dependent +acquisition for MS2 is supported at this time. + +## Workflow Availability + +The workflow is available in GitHub: + + +The container is available at Docker Hub (microbiomedata/metaMS): + + +The python package is available on PyPi: + + +The database is available by request. Please contact NMDC +() for access. + +## Requirements for Execution + +The recommended way to run the workflow is via the provided wdl file and +the miniwdl package. Using the wdl file requires the following: + +### Hardware Requirements + +To run this application, we recommend a processor with at least 2.0 GHz +speed, 8GB of RAM, 10GB of free hard disk space. 
+ +### Software Requirements + +- Docker Container Runtime +- miniwdl (v1, ) + +*Note that the wdl file will automatically pull the necessary docker +with the required software dependencies.* + +### Database + +- PNNL Metabolomics LC-MS *in silico* Spectral Database + () + +The in-silico lipid spectra in the database are generated from the +LipidBlast database (v68), found at +. Note that +there is no retention time in the PNNL version of the database and the +workflow does not use retention time scoring. + +## Test datasets + +- An example dataset can be downloaded from here: + +- Example CoreMS Parameter file (applicable to the example dataset): + + #TODO KRH: still needs to be uploaded +- Example Scan Translator file (applicable to the example dataset): + + +## Execution Details + +This workflow should be executed using the wdl file provided in the +MetaMS package (wdl/metaMS_lipidomics.wdl). + +Example command to run the workflow: + +``` +miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files +``` + +### Inputs + +To use the wdl, inputs should be specified in a json file. See example +input json file in wdl/metaMS_lipidomics.wdl. + +The following inputs are required (declared in the input json file): + +- LC-MS/MS data file locations in one of the following formats + - ThermoFisher mass spectrometry data files (.raw) + - mzML mass spectrometry data files (.mzml) +- Workflow inputs + - CoreMS Parameter file (.toml) + - Scan Translator Parameter file (.toml) + - MetabRef configuration key (metabref.token). See \[MetabRef + documentation\] () for how + to generate a token. +- Cores (optional input) + - How many cores to use for processing. Default is 1. 
+ +### Outputs + +- Lipidomics data + - Peak data table with annotated lipids (.csv) + - HDF: CoreMS HDF5 format of CoreMS LCMS object for further + analysis +- Workflow Metadata + - CoreMS Parameter file (.toml), the full set of parameters used + in the workflow, some of which are set dynamically within the + workflow. + +## Version History + +- v1.0.0: Initial release of the lipidomics workflow #TODO KRH: update + with release date when available + +## Point of contact + +Package maintainer: Katherine R. Heal \<\> diff --git a/docs/index_lipid.html b/docs/index_lipid.html index 158f6ab..950df1a 100644 --- a/docs/index_lipid.html +++ b/docs/index_lipid.html @@ -19,6 +19,8 @@

Overview

molecular formula based on the mass accuracy and fine isotopic structure and a second for the MS2 spectral matching for filtering and selecting the best match.

+

Note that only data collected in profile mode for MS1 and +data-dependent acquisition for MS2 is supported at this time.

Workflow Availability

The workflow is available in GitHub: https://github.com/microbiomedata/metaMS

@@ -48,16 +50,28 @@

Database

  • PNNL Metabolomics LC-MS in silico Spectral Database (https://metabref.emsl.pnnl.gov/)
  • +

    The in-silico lipid spectra in the database are generated from the +LipidBlast database (v68), found at https://systemsomicslab.github.io/compms/msdial/main.html. +Note that there is no retention time in the PNNL version of the database +and the workflow does not use retention time scoring.

    Test datasets

    -

    #TODO KRH: add test datasets here

    +

    Execution Details

    This workflow should be executed using the wdl file provided in the MetaMS package (wdl/metaMS_lipidomics.wdl).

    Example command to run the workflow:

    miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files

    Inputs

    -

    Only data collected in profile mode for MS1 and data-dependent -acquisition for MS2 is supported at this time.

    To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl.

    The following inputs are required (declared in the input json
    @@ -99,7 +113,8 @@

    Outputs

    Version History

      -
    • #TODO KRH: add version history
    • +
    • v1.0.0: Initial release of the lipidomics workflow #TODO KRH: update +wtih releease date when available

    Point of contact

    Package maintainer: Katherine R. Heal < -metamsworkflow.png -

    image
    - - -#TODO KRH: replace with lipid diagram when available - -## Overview - -The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics -workflow (part of MetaMS) is built using PNNL's CoreMS software -framework. The workflow leverages many features of CoreMS as well as -PNNL's MetabRef LC-MS database to process LC-MS/MS data and identify -lipids. The initial signal processing includes peak picking, integration -of mass features, deconvolution of MS1 spectra, and calculation of peak -shape metrics. The workflow associates MS1 spectra with their -corresponding MS2 spectra. It uses the MS2 spectra to search an -in-silico spectra database for lipids and uses the deconvoluted MS1 -spectra to assign a molecular formula. Each candidate lipid assignment -is given two confidence scores: one for its match to the predicted -molecular formula based on the mass accuracy and fine isotopic structure -and a second for the MS2 spectral matching for filtering and selecting -the best match. - -## Workflow Availability - -The workflow is available in GitHub: - - -The container is available at Docker Hub (microbiomedata/metaMS): - - -The python package is available on PyPi: - - -The database is available by request. Please contact NMDC -() for access. - -## Requirements for Execution - -The recommended way to run the workflow is via the provided wdl file and -the miniwdl package. Using the wdl file requires the following: - -### Hardware Requirements - -To run this application, we recommend a processor with at least 2.0 GHz -speed, 8GB of RAM, 10GB of free hard disk space. 
- -### Software Requirements - -- Docker Container Runtime -- miniwdl (v1, ) - -*Note that the wdl file will automatically pull the necessary docker -with the required software dependencies.* - -### Database - -- PNNL Metabolomics LC-MS *in silico* Spectral Database - () - -## Test datasets - -#TODO KRH: add test datasets here - -## Execution Details - -This workflow should be executed using the wdl file provided in the -MetaMS package (wdl/metaMS_lipidomics.wdl). - -Example command to run the workflow: - - miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files - -### Inputs - -Only data collected in profile mode for MS1 and data-dependent -acquisition for MS2 is supported at this time. - -To use the wdl, inputs should be specified in a json file. See example -input json file in wdl/metaMS_lipidomics.wdl. - -The following inputs are required (declared in the input json file): - -- LC-MS/MS data file locations in one of the following formats - - ThermoFisher mass spectrometry data files (.raw) - - mzML mass spectrometry data files (.mzml) -- Workflow inputs - - CoreMS Parameter file (.toml) - - Scan Translator Parameter file (.toml) - - MetabRef configuration key (metabref.token). See \[MetabRef - documentation\] () for how - to generate a token. -- Cores (optional input) - - How many cores to use for processing. Default is 1. - -### Outputs - -- Lipidomics data - - Peak data table with annotated lipids (.csv) - - HDF: CoreMS HDF5 format of CoreMS LCMS object for further - analysis -- Workflow Metadata - - CoreMS Parameter file (.toml), the full set of parameters used - in the workflow, some of which are set dynamically within the - workflow. - -## Version History - -- #TODO KRH: add version history - -## Point of contact - -Package maintainer: Katherine R. 
Heal \<\> diff --git a/docs/index_lipid.rst b/docs/index_lipid.rst index 4d19ebc..d4890e1 100644 --- a/docs/index_lipid.rst +++ b/docs/index_lipid.rst @@ -26,6 +26,9 @@ formula based on the mass accuracy and fine isotopic structure and a second for the MS2 spectral matching for filtering and selecting the best match. +Note that only data collected in profile mode for MS1 and +data-dependent acquisition for MS2 is supported at this time. + Workflow Availability --------------------- @@ -64,7 +67,7 @@ Database (https://metabref.emsl.pnnl.gov/) The in-silico lipid spectra in the database are generated from the LipidBlast database (v68), found at https://systemsomicslab.github.io/compms/msdial/main.html. -Note that there is no retention time in the PNNL version of the database. +Note that there is no retention time in the PNNL version of the database and the workflow does not use retention time scoring. Test datasets ------------- @@ -88,9 +91,6 @@ Example command to run the workflow: Inputs ~~~~~~ -Only data collected in profile mode for MS1 and -data-dependent acquisition for MS2 is supported at this time. - To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl. 
@@ -120,7 +120,7 @@ Outputs Version History --------------- -#TODO KRH: add version history +- v1.0.0: Initial release of the lipidomics workflow #TODO KRH: update wtih releease date when available Point of contact ---------------- From a5ab96d1e2fe264501c8646d1be7c586fe032549 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 14:30:02 -0800 Subject: [PATCH 11/20] Edit makefile for help with rendering documentation for lipid workflow --- Makefile | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/Makefile b/Makefile index bc6d01b..e980cba 100644 --- a/Makefile +++ b/Makefile @@ -69,4 +69,5 @@ wdl-run : convert_lipid_rst_to_md: # convert the lipid documentation from rst to md pandoc -f rst -t markdown -o docs/README_LCMS_LIPID.md docs/index_lipid.rst - \ No newline at end of file + # render the lipid documentation into html from the rst file + pandoc -f rst -t html -o docs/index_lipid.html docs/index_lipid.rst \ No newline at end of file From c5d7cea1b7688381f4b7e3311d3635bbcb52a134 Mon Sep 17 00:00:00 2001 From: Katherine Heal Date: Thu, 12 Dec 2024 15:07:36 -0800 Subject: [PATCH 12/20] Edit lipid documentation --- docs/README_LCMS_LIPID.md | 16 ++++++---------- docs/index_lipid.html | 12 +++++------- docs/index_lipid.rst | 12 +++++------- requirements-dev.txt | 2 +- 4 files changed, 17 insertions(+), 25 deletions(-) diff --git a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md index 3cf1019..b7f5157 100644 --- a/docs/README_LCMS_LIPID.md +++ b/docs/README_LCMS_LIPID.md @@ -1,9 +1,6 @@ # Lipidomics Workflow (v1.0.0) -
    -metamsworkflow.png -
    image
    -
    +![](metamsworkflow.png) #TODO KRH: replace with lipid diagram when available @@ -76,7 +73,6 @@ workflow does not use retention time scoring. - Example CoreMS Parameter file (applicable to the example dataset): - #TODO KRH: still needs to be uploaded - Example Scan Translator file (applicable to the example dataset): @@ -88,7 +84,7 @@ MetaMS package (wdl/metaMS_lipidomics.wdl). Example command to run the workflow: ``` -miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files +miniwdl run wdl/metaMS_lipidomics.wdl -i metams_input_lipidomics.json --verbose --no-cache --copy-input-files ``` ### Inputs @@ -104,9 +100,9 @@ The following inputs are required (declared in the input json file): - Workflow inputs - CoreMS Parameter file (.toml) - Scan Translator Parameter file (.toml) - - MetabRef configuration key (metabref.token). See \[MetabRef - documentation\] () for how - to generate a token. + - MetabRef configuration key (metabref.token). See MetabRef + documentation () for how to + generate a token. - Cores (optional input) - How many cores to use for processing. Default is 1. @@ -124,7 +120,7 @@ The following inputs are required (declared in the input json file): ## Version History - v1.0.0: Initial release of the lipidomics workflow #TODO KRH: update - wtih releease date when available + wtih release date when available ## Point of contact diff --git a/docs/index_lipid.html b/docs/index_lipid.html index 950df1a..f2d2276 100644 --- a/docs/index_lipid.html +++ b/docs/index_lipid.html @@ -1,7 +1,6 @@

    Lipidomics Workflow (v1.0.0)

    metamsworkflow.png -
    image

    #TODO KRH: replace with lipid diagram when available

    Overview

    @@ -61,8 +60,7 @@

    Test datasets

    href="https://nmdcdemo.emsl.pnnl.gov/lipidomics/blanchard_11_8ws97026/Blanch_Nat_Lip_H_32_AB_O_19_NEG_25Jan18_Brandi-WCSH5801.raw">https://nmdcdemo.emsl.pnnl.gov/lipidomics/blanchard_11_8ws97026/Blanch_Nat_Lip_H_32_AB_O_19_NEG_25Jan18_Brandi-WCSH5801.raw
  • Example CoreMS Parameter file (applicable to the example dataset): https://nmdcdemo.emsl.pnnl.gov/lipidomics/parameter_files/emsl_lipidomics_corems_params.toml -#TODO KRH: still needs to be uploaded
  • +href="https://nmdcdemo.emsl.pnnl.gov/lipidomics/parameter_files/emsl_lipidomics_corems_params.toml">https://nmdcdemo.emsl.pnnl.gov/lipidomics/parameter_files/emsl_lipidomics_corems_params.toml
  • Example Scan Translator file (applicable to the example dataset): https://nmdcdemo.emsl.pnnl.gov/lipidomics/parameter_files/emsl_lipidomics_scan_translator.toml
  • @@ -70,7 +68,7 @@

    Execution Details

    This workflow should be executed using the wdl file provided in the MetaMS package (wdl/metaMS_lipidomics.wdl).

    Example command to run the workflow:

    -
    miniwdl run wdl/metaMS_lipidomics.wdl -i wdl/metams_input_lipidomics.json --verbose --no-cache --copy-input-files
    +
    miniwdl run wdl/metaMS_lipidomics.wdl -i metams_input_lipidomics.json --verbose --no-cache --copy-input-files

    Inputs

    To use the wdl, inputs should be specified in a json file. See example input json file in wdl/metaMS_lipidomics.wdl.

    @@ -86,8 +84,8 @@

    Inputs

    @@ -114,7 +112,7 @@

    Outputs

    Version History

    • v1.0.0: Initial release of the lipidomics workflow #TODO KRH: update -wtih releease date when available
+with release date when available

    Point of contact

    Package maintainer: Katherine R. Heal < Date: Thu, 12 Dec 2024 15:35:47 -0800 Subject: [PATCH 13/20] Add lipid workflow diagrams --- docs/README_LCMS_LIPID.md | 16 +- docs/index_lipid.html | 14 +- docs/index_lipid.rst | 21 +-- docs/lipid_workflow_v1.png | Bin 0 -> 157532 bytes docs/lipid_workflow_v1.svg | 298 +++++++++++++++++++++++++++++++++++++ 5 files changed, 317 insertions(+), 32 deletions(-) create mode 100644 docs/lipid_workflow_v1.png create mode 100644 docs/lipid_workflow_v1.svg diff --git a/docs/README_LCMS_LIPID.md b/docs/README_LCMS_LIPID.md index b7f5157..50c080e 100644 --- a/docs/README_LCMS_LIPID.md +++ b/docs/README_LCMS_LIPID.md @@ -1,10 +1,8 @@ # Lipidomics Workflow (v1.0.0) -![](metamsworkflow.png) +![](lipid_workflow_v1.png) -#TODO KRH: replace with lipid diagram when available - -## Overview +## Workflow Overview The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow (part of MetaMS) is built using PNNL's CoreMS software @@ -27,14 +25,12 @@ acquisition for MS2 is supported at this time. ## Workflow Availability The workflow is available in GitHub: - + #TODO KRH: update with direct +link to lipidomics wdl The container is available at Docker Hub (microbiomedata/metaMS): -The python package is available on PyPi: - - The database is available by request. Please contact NMDC () for access. @@ -54,7 +50,7 @@ speed, 8GB of RAM, 10GB of free hard disk space. - miniwdl (v1, ) *Note that the wdl file will automatically pull the necessary docker -with the required software dependencies.* +with the required workflow dependencies.* ### Database @@ -67,7 +63,7 @@ LipidBlast database (v68), found at there is no retention time in the PNNL version of the database and the workflow does not use retention time scoring. 
-## Test datasets +## Sample datasets - An example dataset can be downloaded from here: diff --git a/docs/index_lipid.html b/docs/index_lipid.html index f2d2276..8541948 100644 --- a/docs/index_lipid.html +++ b/docs/index_lipid.html @@ -1,9 +1,8 @@

    Lipidomics Workflow (v1.0.0)

    -metamsworkflow.png +lipid_workflow_v1.png
    -

    #TODO KRH: replace with lipid diagram when available

    -

    Overview

    +

    Workflow Overview

    The liquid chromatography-mass spectrometry (LC-MS)-based lipidomics workflow (part of MetaMS) is built using PNNL’s CoreMS software framework. The workflow leverages many features of CoreMS as well as
    @@ -22,11 +21,10 @@

    Overview

    data-dependent acquisition for MS2 is supported at this time.

    Workflow Availability

    The workflow is available in GitHub: https://github.com/microbiomedata/metaMS

    +href="https://github.com/microbiomedata/metaMS">https://github.com/microbiomedata/metaMS +#TODO KRH: update with direct link to lipidomics wdl

    The container is available at Docker Hub (microbiomedata/metaMS): https://hub.docker.com/r/microbiomedata/metams

    -

    The python package is available on PyPi: https://pypi.org/project/metaMS/

    The database is available by request. Please contact NMDC (support@microbiomedata.org) for access.

    @@ -43,7 +41,7 @@

    Software Requirements

    href="https://pypi.org/project/miniwdl/">https://pypi.org/project/miniwdl/)

    Note that the wdl file will automatically pull the necessary -docker with the required software dependencies.

    +docker with the required workflow dependencies.

    Database