Merge pull request #614 from subinamehta/clinical-mp
Add clinicalmp discovery workflow
mvdbeek authored Dec 9, 2024
2 parents 490160d + d66c8c4 commit 5ed1732
Showing 6 changed files with 1,411 additions and 0 deletions.
@@ -0,0 +1,11 @@
version: 1.2
workflows:
- name: main
  subclass: Galaxy
  publish: true
  primaryDescriptorPath: /iwc-clinicalmp-discovery-workflow.ga
  testParameterFiles:
  - /iwc-clinicalmp-discovery-workflow-tests.yml
  authors:
  - name: Subina Mehta
    orcid: 0000-0001-9818-0537
@@ -0,0 +1,4 @@
# Changelog

## [0.1] 2024-11-18
First release.
25 changes: 25 additions & 0 deletions workflows/proteomics/clinicalmp/clinicalmp-discovery/README.md
@@ -0,0 +1,25 @@
# Clinical Metaproteomics 2: Discovery

Discovery in clinical metaproteomics benefits from a well-curated database, particularly one generated with the **MetaNovo tool**. MetaNovo produces a compact database by retaining only proteins relevant to the dataset, which reduces the complexity of downstream analysis. For best results, the MetaNovo-generated database can be merged with reviewed proteins from **Human SwissProt** and known contaminants from the **cRAP (common Repository of Adventitious Proteins)** database, yielding a compact yet comprehensive database of approximately 21,200 protein sequences. This database is the foundation for peptide identification: mass spectrometry (MS) data are searched against it to identify peptides. Reducing redundancy and focusing on clinically relevant sequences lowers noise and false positives, which supports biomarker discovery in clinical studies, where precision and relevance are critical for diagnostics and therapeutic research.

This workflow performs discovery using the SearchGUI/PeptideShaker and MaxQuant tools. A Galaxy Training Network (GTN) tutorial accompanies this workflow:
[https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/clinical-mp-2-discovery/tutorial.html](https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/clinical-mp-2-discovery/tutorial.html)

## Input datasets

- `MSMS datasets`: dataset collection of RAW files
- `Databases for discovery`: FASTA file (protein sequences for database searching)
- `Experimental-Design Discovery MaxQuant`: tabular file

## Input values

For MaxQuant and SearchGUI/PeptideShaker:

- Peptide length
- Variable modifications
- Labeled element

## Processing

- Extract microbial proteins and peptides using text-formatting tools
- Group duplicates using the Group tool
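
The two processing steps above can be illustrated with a minimal Python sketch. It is not the workflow's actual implementation (the workflow uses Galaxy text-manipulation tools and the Group tool); it only shows the idea. The function name, the `(protein_ids, peptide)` row layout, and the exclusion markers are assumptions made for illustration: `_HUMAN` is the UniProt entry-name suffix for human proteins, and `CON__`/`REV__` are the prefixes MaxQuant uses for contaminants and reverse decoys.

```python
from typing import Iterable, List, Tuple


def extract_microbial_peptides(
    rows: Iterable[Tuple[str, str]],
    exclude_markers: Tuple[str, ...] = ("_HUMAN", "CON__", "REV__"),
) -> List[str]:
    """Return unique peptides whose protein IDs carry none of the markers.

    rows: (protein_ids, peptide_sequence) pairs, e.g. parsed from a
    tabular peptide report (hypothetical layout, assumed for this sketch).
    """
    seen = set()
    unique_peptides = []
    for protein_ids, peptide in rows:
        # Step 1: drop human, contaminant, and decoy entries (assumed markers).
        if any(marker in protein_ids for marker in exclude_markers):
            continue
        # Step 2: group duplicates, keeping the first occurrence of each peptide.
        if peptide not in seen:
            seen.add(peptide)
            unique_peptides.append(peptide)
    return unique_peptides


# Illustrative rows only; the protein identifiers are invented placeholders.
example_rows = [
    ("sp|X00001|EXAMPLE_HUMAN", "HUMANPEPTIDEK"),
    ("tr|X00002|EXAMPLE_MICROBE", "AAFPNVTAMNITTNNGK"),
    ("tr|X00002|EXAMPLE_MICROBE", "AAFPNVTAMNITTNNGK"),
    ("CON__X00003", "CONTAMINANTPEPTIDER"),
]
microbial = extract_microbial_peptides(example_rows)
```

The result is a single-column list of peptide sequences, which matches the shape that the workflow test expects for the `SGPS MQ Peptides` output.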
@@ -0,0 +1,24 @@
- doc: Test outline for iwc-clinicalmp-discovery-workflow
  job:
    Human UniProt Microbial Proteins from MetaNovo and cRAP:
      class: File
      location: https://zenodo.org/records/10720030/files/Human_UniProt_Microbial_Proteins_(from_MetaNovo)_and_cRAP.fasta
      filetype: fasta
    Experimental Design Discovery MaxQuant:
      class: File
      path: test-data/Experimental Design Discovery MaxQuant.tabular
      filetype: tabular
    Tandem Mass Spectrometry MSMS files:
      class: Collection
      collection_type: list
      elements:
      - class: File
        identifier: PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.raw
        location: https://zenodo.org/records/14182981/files/PTRC_Skubitz_Plex2_F10_9Aug19_Rage_Rep-19-06-08.raw
  outputs:
    SGPS MQ Peptides:
      asserts:
      - has_n_columns:
          n: 1
      - has_text:
          text: "AAFPNVTAMNITTNNGK"
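
To make the two output assertions in the test file concrete, here is a simplified Python stand-in for what they check on the `SGPS MQ Peptides` output. This is a sketch, not Galaxy's own assertion code: the helper functions below are hypothetical re-implementations, and the assumption that the column count is taken from the first line with a tab separator is a simplification.

```python
def has_n_columns(output: str, n: int, sep: str = "\t") -> bool:
    """Check that the first line of a tabular output has exactly n columns."""
    lines = output.splitlines()
    first_line = lines[0] if lines else ""
    return len(first_line.split(sep)) == n


def has_text(output: str, text: str) -> bool:
    """Check that the given text occurs somewhere in the output."""
    return text in output


def check_peptide_output(output: str) -> bool:
    """Mirror the two assertions from the test file: one column, known peptide."""
    return has_n_columns(output, n=1) and has_text(output, "AAFPNVTAMNITTNNGK")


# Hypothetical output content; only the first peptide comes from the test file.
sample_output = "AAFPNVTAMNITTNNGK\nPLACEHOLDERPEPTIDEK\n"
passed = check_peptide_output(sample_output)
```

Taken together, the assertions require a single-column peptide list that contains the expected peptide `AAFPNVTAMNITTNNGK`.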