Merge branch 'master' into jpfeuffer-patch-6
poshul committed Feb 25, 2024
2 parents ad83953 + 6a0aab2 commit 075dff7
Showing 40 changed files with 1,931 additions and 62 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/build-push-notebooks.yaml
@@ -1,9 +1,9 @@
name: Building latest jupyter Notebooks and push to master+ipynb

on:
schedule:
#schedule:
# Trigger 5:30 UTC
- cron: '30 5 * * *'
# - cron: '30 5 * * *'
push:
branches: [ master, merge-workflows ]
workflow_dispatch:
2 changes: 1 addition & 1 deletion .github/workflows/code-blocks-linting.yaml
@@ -16,7 +16,7 @@ jobs:
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v35
uses: tj-actions/changed-files@v41
with:
dir_names_max_depth: 0
files: |
2 changes: 1 addition & 1 deletion .github/workflows/test-pr.yaml
@@ -19,7 +19,7 @@ jobs:

- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v35
uses: tj-actions/changed-files@v41
with:
dir_names_max_depth: 0
files: |
2 changes: 1 addition & 1 deletion README.md
@@ -34,4 +34,4 @@ Are created by CI and stored in master+ipynb to not clutter the master branch.

Binder integration
=============
Binder uses the Jupyter Notebooks in master+ipynb. The conda environment is described in environment.yml, the post-build event installs the nightly pyopenms wheel. Currently, only environment.yml is used by binder. Note: You can test a branch "jpfeuffer-patch-6" using https://mybinder.org/v2/gh/OpenMS/pyopenms-docs/jpfeuffer-patch-6
Binder uses the Jupyter Notebooks in master+ipynb. The conda environment is described in environment.yml, the post-build event installs the nightly pyopenms wheel. Currently, only environment.yml is used by binder. Note: You can test a branch "jpfeuffer-patch-6" using https://notebooks.gesis.org/binder/v2/gh/OpenMS/pyopenms-docs/jpfeuffer-patch-6
2 changes: 1 addition & 1 deletion docs/requirements.txt
@@ -1,7 +1,7 @@
# Defining the exact version will make sure things don't break
sphinx==6.1.0
pydata_sphinx_theme
readthedocs-sphinx-search==0.3.1
readthedocs-sphinx-search==0.3.2
sphinx-copybutton==0.5.1
sphinx-hoverxref
sphinx-remove-toctrees
2 changes: 1 addition & 1 deletion docs/source/_templates/navbar-run-binder.html
@@ -1,6 +1,6 @@
<ul class="navbar-icon-links navbar-nav" aria-label="Custom Icon Links">
<li class="nav-item">
<a href="https://mybinder.org/v2/gh/{{ github_user }}/{{ github_repo }}/{{ github_version }}+ipynb?urlpath=lab/tree/{{ doc_path }}{{ pagename }}.ipynb" class="nav-link" rel="noopener" target="_blank" data-bs-toggle="tooltip" data-bs-original-title="Launch on Binder" data-bs-placement="bottom"><span><i class="fa fa-rocket fa-beat fa-lg"></i></span>
<a href="https://notebooks.gesis.org/binder/v2/gh/{{ github_user }}/{{ github_repo }}/{{ github_version }}+ipynb?urlpath=lab/tree/{{ doc_path }}{{ pagename }}.ipynb" class="nav-link" rel="noopener" target="_blank" data-bs-toggle="tooltip" data-bs-original-title="Launch on Binder" data-bs-placement="bottom"><span><i class="fa fa-rocket fa-beat fa-lg"></i></span>
<label class="sr-only">Launch on Binder</label></a>
</li>
</ul>
25 changes: 18 additions & 7 deletions docs/source/community/build_from_source.rst
@@ -35,22 +35,33 @@ Depending on your systems setup, it may make sense to do this inside a virtual e
virtualenv pyopenms_venv
source pyopenms_venv/bin/activate
Next, configure OpenMS with pyOpenMS: execute ``cmake`` as usual, but with
parameters ``DPYOPENMS=ON``. Also, if using virtualenv or using a specific
Python version, add ``-DPYTHON_EXECUTABLE:FILEPATH=/path/to/python`` to ensure
Next, we will configure the CMake-based OpenMS build system
to enable the pyOpenMS target with the configuration option ``-DPYOPENMS=ON``.
If you are using virtualenv or a specific Python version,
add ``-DPYTHON_EXECUTABLE:FILEPATH=/path/to/python`` to ensure
that the correct Python executable is used. Compiling pyOpenMS can use a lot of
memory and take some time, however you can reduce the memory consumption by
breaking up the compilation into multiple units and compiling in parallel, for
example ``-DPY_NUM_THREADS=2 -DPY_NUM_MODULES=4`` will build 4 modules with 2
threads. You can then configure pyOpenMS:
threads. You can now configure pyOpenMS (inside your build folder) with:

.. code-block:: bash
cmake -DPYOPENMS=ON
make pyopenms
Remember that you can pass the other options described above to the first
command by adding ``-DOPTION=VALUE`` statements if you need them.

Next, build pyOpenMS (pyOpenMS-specific build targets should now be available).
If you are still inside your build folder, you can use "." as the build
folder parameter.

.. code-block:: bash
cmake --build $YOURBUILDFOLDER --target pyopenms --config Release
Build pyOpenMS (now there should be pyOpenMS specific build targets).
Afterwards, test that all went well by running the tests:

.. code-block:: bash
86 changes: 86 additions & 0 deletions docs/source/user_guide/adduct_detection.rst
@@ -0,0 +1,86 @@
Adduct Detection
================

In mass spectrometry it is crucial to ionize analytes prior to detection, because they are accelerated and manipulated in electric fields, allowing their separation based on mass-to-charge ratio.
This happens by addition of protons in positive mode or loss of protons in negative mode. Other ions present in the buffer solution can ionize the analyte as well, e.g. sodium, potassium or formic acid.
Depending on the size and chemical composition, multiple adducts can bind, leading to multiple charges on the analyte. In metabolomics, with its smaller analytes, the number of charges is typically low (one or two), whereas in proteomics the number of charges is much higher.
Furthermore, analytes can lose functional groups during ionization, e.g. a neutral water loss.
Since the ionization happens after liquid chromatography, different adducts for an analyte have similar retention times.
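To make the effect of different adducts concrete, the following minimal sketch computes the expected m/z of a few positive-mode ion species for one neutral molecule. It is an illustration only and does not use pyOpenMS: the helper ``adduct_mz``, the rounded mass constants and the glucose example are assumptions added here, not part of the original page.

```python
# Illustrative only: rounded monoisotopic masses in Da; not part of pyOpenMS.
PROTON = 1.007276       # H+
SODIUM_ION = 22.989221  # Na+ (sodium atom minus one electron)
WATER = 18.010565       # H2O


def adduct_mz(neutral_mass, adduct_mass, charge, neutral_shift=0.0):
    """m/z of a molecule carrying `charge` copies of a singly charged adduct.

    `neutral_shift` models neutral gains or losses, e.g. -WATER for a water loss.
    Only positive charges are handled in this sketch.
    """
    return (neutral_mass + neutral_shift + charge * adduct_mass) / charge


glucose = 180.063388  # C6H12O6, hypothetical example analyte

species = {
    "[M+H]+": adduct_mz(glucose, PROTON, 1),
    "[M+Na]+": adduct_mz(glucose, SODIUM_ION, 1),
    "[M+2H]2+": adduct_mz(glucose, PROTON, 2),
    "[M+H-H2O]+": adduct_mz(glucose, PROTON, 1, neutral_shift=-WATER),
}

for label, mz in species.items():
    print(f"{label:12s} {mz:.4f}")
```

All four species stem from the same analyte and would therefore be expected at similar retention times, which is the property adduct grouping relies on.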

.. image:: img/adduct_detection.png

In pyOpenMS, :py:class:`~.MetaboliteFeatureDeconvolution` takes a :term:`feature map` as input and adds adduct information as additional meta values. Features belonging to an adduct group will be stored in a :term:`consensus map`. The most important parameters are explained in the comments.

| **Input file generation:**
| The input :term:`feature map` can be obtained using a `feature finder algorithm <feature_detection.html>`_.
| **Suggested follow up step:**
| The resulting feature map can be exported to a pandas DataFrame with adduct information from the *dc_charge_adducts* feature meta values.
| Multiple feature maps can be `combined using the feature linking algorithms <feature_linking.html>`_. Each consensus feature will get a new meta value *best ion* based on the most common annotated adduct within the consensus feature group.
.. code-block:: python
from urllib.request import urlretrieve
import pyopenms as poms
# get example data file with metabolomics feature map
gh = "https://raw.githubusercontent.com/OpenMS/pyopenms-docs/master"
urlretrieve(gh + "/src/data/MetaboliteFeatureDeconvolution_input.featureXML", "example.featureXML")
# open example input feature map
feature_map = poms.FeatureMap()
poms.FeatureXMLFile().load("example.featureXML", feature_map)
# initialize MetaboliteFeatureDeconvolution
mfd = poms.MetaboliteFeatureDeconvolution()
# get default parameters
params = mfd.getDefaults()
# update/explain most important parameters
# adducts to expect: elements, charge and probability separated by colon
# the total probability of all charged adducts needs to be 1
# e.g. positive mode:
# proton adduct "H:+:0.6", sodium adduct "Na:+:0.4" and neutral water loss "H-2O-1:0:0.2"
# e.g. negative mode:
# with neutral formic acid adduct: "H-1:-:1", "CH2O2:0:0.5"
# multiples don't need to be specified separately:
# e.g. [M+H2]2+ and double water loss will be detected as well!
# optionally, retention time shifts caused by adducts can be added
# e.g. a formic acid adduct causes 3 seconds earlier elution "CH2O2:0:0.5:-3"
params.setValue("potential_adducts", ["H:+:0.6", "Na:+:0.4", "H-2O-1:0:0.2"])
# expected charge range
# e.g. for positive mode metabolomics:
# minimum of 1, maximum of 3, maximum charge span for a single feature 3
# for negative mode:
# charge_min = -3, charge_max = -1
params.setValue("charge_min", 1, "Minimal possible charge")
params.setValue("charge_max", 3, "Maximal possible charge")
params.setValue("charge_span_max", 3)
# maximum RT difference between any two features for grouping
# maximum RT difference between two co-features, after adduct shifts have been accounted for
# (if you do not have any adduct shifts, this value should be equal to "retention_max_diff")
params.setValue("retention_max_diff", 3.0)
params.setValue("retention_max_diff_local", 3.0)
# set updated parameters object
mfd.setParameters(params)
# result feature map: will store features with adduct information
feature_map_MFD = poms.FeatureMap()
# result consensus map: will store grouped features belonging to a charge group
groups = poms.ConsensusMap()
# result consensus map: will store paired features connected by an edge
edges = poms.ConsensusMap()
# compute adducts
mfd.compute(feature_map, feature_map_MFD, groups, edges)
# export feature map as pandas DataFrame and append adduct information
df = feature_map_MFD.get_df(export_peptide_identifications=False)
df["adduct"] = [f.getMetaValue("dc_charge_adducts") for f in feature_map_MFD]
# display data
print(df.head())
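The adduct strings passed to ``potential_adducts`` follow the colon-separated layout described in the comments above (elements, charge, probability and an optional retention time shift). As a rough sketch, a list of such strings can be checked before it is handed to the algorithm. The helpers ``parse_adduct`` and ``charged_probability_sum`` are hypothetical names introduced here for illustration; they are not part of pyOpenMS and do not reproduce its internal parsing.

```python
def parse_adduct(spec):
    """Split 'elements:charge:probability[:rt_shift]' into a dict (illustrative)."""
    parts = spec.split(":")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected adduct string: {spec!r}")
    elements, charge, probability = parts[:3]
    return {
        "elements": elements,
        # "0" marks a neutral adduct or loss; anything else is treated as charged
        "charged": charge != "0",
        "probability": float(probability),
        "rt_shift": float(parts[3]) if len(parts) == 4 else 0.0,
    }


def charged_probability_sum(specs):
    """Sum the probabilities of all charged adducts in a list of adduct strings."""
    parsed = (parse_adduct(s) for s in specs)
    return sum(a["probability"] for a in parsed if a["charged"])


adducts = ["H:+:0.6", "Na:+:0.4", "H-2O-1:0:0.2"]
total = charged_probability_sum(adducts)
if abs(total - 1.0) > 1e-9:
    raise ValueError(f"charged adduct probabilities sum to {total}, expected 1")
print("charged adduct probabilities sum to 1")
```

Neutral entries such as the water loss are deliberately excluded from the sum, matching the rule that only the charged adducts need to add up to 1.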
2 changes: 1 addition & 1 deletion docs/source/user_guide/charge_isotope_deconvolution.rst
@@ -4,7 +4,7 @@ Charge and Isotope Deconvolution
A single mass spectrum contains measurements of one or more analytes and the
m/z values recorded for these analytes. Most analytes produce multiple signals
in the mass spectrometer, due to the natural abundance of carbon :math:`13` (naturally
occurring at ca. :math:`1%` frequency) and the large amount of carbon atoms in most
occurring at ca. :math:`1\%` frequency) and the large amount of carbon atoms in most
organic molecules, most analytes produce a so-called isotopic pattern with a
monoisotopic peak (all carbon are :chem:`^{12}C`) and a first isotopic peak (exactly one
carbon atom is a :chem:`^{13}C`), a second isotopic peak (exactly two atoms are :chem:`^{13}C`) etc.
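The isotopic pattern described above can be approximated with a binomial model that considers carbon only. The following minimal sketch is independent of pyOpenMS; the function name ``carbon_isotope_pattern`` and the abundance value of 1.07% for carbon 13 are assumptions made for illustration, and contributions from other elements are ignored.

```python
from math import comb

C13_ABUNDANCE = 0.0107  # approximate natural abundance of carbon-13 (assumption)


def carbon_isotope_pattern(n_carbons, n_peaks=4, p=C13_ABUNDANCE):
    """Probability of exactly k carbon-13 atoms among n_carbons, for k = 0..n_peaks-1."""
    return [
        comb(n_carbons, k) * p**k * (1 - p) ** (n_carbons - k)
        for k in range(n_peaks)
    ]


for n in (10, 50, 100):
    pattern = carbon_isotope_pattern(n)
    print(n, [round(x, 3) for x in pattern])
```

Under this simplified model the monoisotopic peak dominates for small molecules, while for molecules with roughly a hundred carbon atoms the first isotopic peak becomes slightly more intense than the monoisotopic one.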
2 changes: 1 addition & 1 deletion docs/source/user_guide/chemistry.rst
@@ -137,7 +137,7 @@ The isotope distribution of oxygen and sulfur can be displayed with the followin
from matplotlib import pyplot as plt
# very simple overlappping correction of annotations
# very simple overlapping correction of annotations
def adjustText(x1, y1, x2, y2):
if y1 > y2:
plt.annotate(
2 changes: 1 addition & 1 deletion docs/source/user_guide/export_files_GNPS.rst
@@ -5,7 +5,7 @@ With pyOpenMS you can automatically generate all files needed for GNPS Feature-B
Ion Identity Molecular Networking (IIMN).

Pre-requisites are your input :term:`mzML` files and a :py:class:`~.ConsensusMap`, generated by an
`untargeted metabolomics pre-processing workflow <metabolomics_preprocessing.html>`_.
`untargeted metabolomics pre-processing workflow <untargeted_metabolomics_preprocessing.html>`_.
Ensure that :term:`MS2` data has been mapped to the :py:class:`~.FeatureMap` objects with :py:class:`~.IDMapper`.
For IIMN adduct detection must have been performed on the :py:class:`~.FeatureMap`
objects during pre-processing with :py:class:`~.MetaboliteFeatureDeconvolution`.
1 change: 0 additions & 1 deletion docs/source/user_guide/feature_detection.rst
@@ -13,7 +13,6 @@ FeatureFinders are available in pyOpenMS:
- :py:class:`~.FeatureFinderMultiplexAlgorithm` (e.g., :term:`SILAC`, Dimethyl labeling, (and label-free), identification free feature detection of peptides)
- :py:class:`~.FeatureFinderAlgorithmPicked` (Label-free, identification free feature detection of peptides)
- :py:class:`~.FeatureFinderIdentificationAlgorithm` (Label-free identification-guided feature detection of peptides)
- :py:class:`~.FeatureFinderAlgorithmIsotopeWavelet` (old instruments)
- :py:class:`~.FeatureFindingMetabo` (Label-free, identification free feature detection of metabolites)
- :py:class:`~.FeatureFinderAlgorithmMetaboIdent` (Label-free, identification guided feature detection of metabolites)

5 changes: 0 additions & 5 deletions docs/source/user_guide/fragment_spectrum_generation.rst
@@ -52,7 +52,6 @@ which you could plot with :py:meth:`pyopenms.plotting.plot_spectrum`, automatica
import matplotlib.pyplot as plt
from pyopenms.plotting import plot_spectrum
import matplotlib.pyplot as plt
plot_spectrum(spec1)
plt.show()
@@ -122,10 +121,6 @@ which you can again visualize with:
.. code-block:: python
:linenos:
import matplotlib.pyplot as plt
from pyopenms.plotting import plot_spectrum
import matplotlib.pyplot as plt
plot_spectrum(spec2, annotate_ions=False)
plt.show()
Binary file modified docs/source/user_guide/img/DFPIANGER_theo.png
Binary file modified docs/source/user_guide/img/DFPIANGER_theo_full.png
Binary file not shown.
Binary file added docs/source/user_guide/img/adduct_detection.png
Binary file added docs/source/user_guide/img/nlargest.png
Binary file modified docs/source/user_guide/img/spec_alignment_1.png
Binary file modified docs/source/user_guide/img/spec_alignment_2.png
Binary file added docs/source/user_guide/img/spec_averaging.png
Binary file added docs/source/user_guide/img/spec_merging_1.png
Binary file added docs/source/user_guide/img/spec_merging_2.png
Binary file added docs/source/user_guide/img/spec_merging_3.png
Binary file added docs/source/user_guide/img/threshold_mower.png
Binary file added docs/source/user_guide/img/window_mower.png
2 changes: 2 additions & 0 deletions docs/source/user_guide/index.rst
@@ -43,9 +43,11 @@ headings and structure.
smoothing
centroiding
spectrum_normalization
spectrum_merging
charge_isotope_deconvolution
feature_detection
map_alignment
adduct_detection
feature_linking
peptide_search
chromatographic_analysis
2 changes: 1 addition & 1 deletion docs/source/user_guide/interactive_plots.rst
@@ -94,7 +94,7 @@ Result:


With this you can also easily create whole dashboards like the one
hosted `here <https://mybinder.org/v2/gh/OpenMS/pyopenms-docs/master+ipynb?urlpath=msbokehapps>`_ on a Binder instance.
hosted `here <https://notebooks.gesis.org/binder/v2/gh/OpenMS/pyopenms-docs/master+ipynb?urlpath=msbokehapps>`_ on a Binder instance.
If you are reading/executing this on Binder already, execute the next cell to get a link to your current instance.

.. code-block:: python
114 changes: 110 additions & 4 deletions docs/source/user_guide/ms_data.rst
@@ -188,7 +188,6 @@ We can also visualize our mass spectrum from before using the :py:func:`~.plot_s
import matplotlib.pyplot as plt
from pyopenms.plotting import plot_spectrum
import matplotlib.pyplot as plt
plot_spectrum(spectrum)
plt.show()
@@ -639,8 +638,8 @@ But first, we will load some test data:
oms.MzMLFile().load("test.mzML", inp)
Filtering Mass Spectra by :term`MS` Level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Filtering Mass Spectra by MS Level
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We will filter the data from ``test.mzML`` file by only retaining
mass spectra that are not :term:`MS1` spectra
@@ -707,4 +706,111 @@ Similarly we could only retain peaks above a certain
intensity or keep only the top N peaks in each mass spectrum.

For more advanced filtering tasks pyOpenMS provides special algorithm classes.
We will take a closer look at some of them in the algorithm section.
We will take a closer look at some of them in the next section.


Filtering Mass Spectra with TOPP Tools
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We can also use predefined TOPP tools to filter our data. First we need to load in the data:

.. code-block:: python
:linenos:
import matplotlib.pyplot as plt
from pyopenms.plotting import plot_spectrum, mirror_plot_spectrum
gh = "https://raw.githubusercontent.com/OpenMS/pyopenms-docs/master"
urlretrieve(
gh + "/src/data/YIC(Carbamidomethyl)DNQDTISSK.mzML", "observed.mzML"
)
exp = oms.MSExperiment()
# Load mzML file and obtain spectrum for peptide YIC(Carbamidomethyl)DNQDTISSK
oms.MzMLFile().load("observed.mzML", exp)
# Get first spectrum
spectra = exp.getSpectra()
observed_spectrum = spectra[0]
The :py:class:`~.WindowMower` tool can be used to remove peaks in a sliding or jumping window. The window size,
the number of highest peaks to keep and the move type can be set with a :py:class:`~.Param` object:

.. code-block:: python
:linenos:
from copy import deepcopy
window_mower_filter = oms.WindowMower()
# Copy the original spectrum
mowed_spectrum = deepcopy(observed_spectrum)
# Set parameters
params = oms.Param()
# Defines the m/z range of the sliding window
params.setValue("windowsize", 100.0, "")
# Defines the number of highest peaks to keep in the sliding window
params.setValue("peakcount", 1, "")
# Defines the type of window movement: jump (window size steps) or slide (one peak steps)
params.setValue("movetype", "jump", "")
# Apply window mowing
window_mower_filter.setParameters(params)
window_mower_filter.filterPeakSpectrum(mowed_spectrum)
# Visualize the resulting data together with the original spectrum
mirror_plot_spectrum(observed_spectrum, mowed_spectrum)
plt.show()
.. image:: img/window_mower.png
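To illustrate what the "jump" move type does conceptually, here is a plain Python sketch that keeps the most intense peaks per consecutive m/z window. It is not the OpenMS implementation: the function ``window_mower_jump``, the choice to start the first window at the lowest m/z, and the toy peak list are all assumptions made for illustration, and the actual window boundaries used by :py:class:`~.WindowMower` may differ.

```python
def window_mower_jump(peaks, windowsize=100.0, peakcount=1):
    """Keep the `peakcount` most intense peaks in each consecutive m/z window.

    `peaks` is a list of (mz, intensity) tuples. Windows are assumed to start
    at the lowest m/z value and advance in steps of `windowsize`.
    """
    if not peaks:
        return []
    ordered = sorted(peaks)
    start = ordered[0][0]
    windows = {}
    for mz, intensity in ordered:
        index = int((mz - start) // windowsize)
        windows.setdefault(index, []).append((mz, intensity))
    kept = []
    for index in sorted(windows):
        by_intensity = sorted(windows[index], key=lambda p: p[1], reverse=True)
        kept.extend(by_intensity[:peakcount])
    return sorted(kept)


toy_peaks = [
    (100.0, 5.0), (150.0, 20.0), (199.0, 1.0),
    (210.0, 3.0), (290.0, 8.0), (350.0, 2.0),
]
print(window_mower_jump(toy_peaks, windowsize=100.0, peakcount=1))
```

With a window size of 100 and a peak count of 1, each 100 m/z segment of the toy spectrum contributes only its single most intense peak.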


Noise can be easily removed with :py:class:`~.ThresholdMower` by setting a threshold value for the intensity of peaks
and cutting off everything below.

.. code-block:: python
:linenos:
# Copy spectrum
threshold_mower_spectrum = deepcopy(observed_spectrum)
threshold_mower_filter = oms.ThresholdMower()
# Set parameters
params = oms.Param()
params.setValue("threshold", 20.0, "")
# Apply threshold mowing
threshold_mower_filter.setParameters(params)
threshold_mower_filter.filterPeakSpectrum(threshold_mower_spectrum)
mirror_plot_spectrum(observed_spectrum, threshold_mower_spectrum)
plt.show()
.. image:: img/threshold_mower.png
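The idea behind threshold-based filtering can be sketched without pyOpenMS as a simple intensity cutoff. The function ``threshold_filter`` and the toy peak list are hypothetical and only mirror the behaviour described above (everything below the threshold is removed); they are not the :py:class:`~.ThresholdMower` implementation.

```python
def threshold_filter(peaks, threshold):
    """Return the (mz, intensity) peaks whose intensity is not below `threshold`."""
    return [(mz, intensity) for mz, intensity in peaks if intensity >= threshold]


toy_peaks = [(100.0, 5.0), (150.0, 20.0), (210.0, 35.0), (290.0, 19.9)]
print(threshold_filter(toy_peaks, threshold=20.0))
```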


We can also use e.g. :py:class:`~.NLargest` to keep only the N highest peaks in a spectrum.

.. code-block:: python
:linenos:
# Copy spectrum
nlargest_spectrum = deepcopy(observed_spectrum)
nlargest_filter = oms.NLargest()
# Set parameters
params = oms.Param()
params.setValue("n", 4, "")
# Apply N-Largest filter
nlargest_filter.setParameters(params)
nlargest_filter.filterPeakSpectrum(nlargest_spectrum)
mirror_plot_spectrum(observed_spectrum, nlargest_spectrum)
plt.show()
# Two peaks are overlapping, so only three peaks are really visible in the plot
.. image:: img/nlargest.png
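Keeping only the N most intense peaks can likewise be sketched in plain Python. The helper ``n_largest_peaks`` and the toy data are assumptions for illustration and are not taken from pyOpenMS; in particular, how :py:class:`~.NLargest` breaks ties between equally intense peaks is not modelled here.

```python
import heapq


def n_largest_peaks(peaks, n):
    """Return the `n` most intense (mz, intensity) peaks, sorted by m/z."""
    top = heapq.nlargest(n, peaks, key=lambda p: p[1])
    return sorted(top)


toy_peaks = [(100.0, 5.0), (150.0, 20.0), (210.0, 35.0), (290.0, 8.0), (350.0, 2.0)]
print(n_largest_peaks(toy_peaks, n=3))
```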