Merge pull request #46 from MetaboHUB-MetaToul-FluxoMet/31-developmen…
…t-guide Add guidelines for further development
Showing 17 changed files with 573 additions and 15 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
sphinx-rtd-theme==2.0.0 | ||
sphinx-design==0.6.0 | ||
metomi-rose | ||
cylc-flow | ||
cylc-flow==8.3.0 | ||
metomi-rose==2.3.0 | ||
cylc-sphinx-extensions |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -46,6 +46,8 @@ | |
"exec", | ||
] | ||
|
||
pygments_style = "dracula" # 🧛🏻♂️ | ||
|
||
templates_path = ["_templates"] | ||
exclude_patterns = [] | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
.. _development.add-config_option: | ||
|
||
==================================== | ||
Adding an item to user configuration | ||
==================================== | ||
|
||
.. note:: | ||
Prerequisites: | ||
* :ref:`tutorial.user-config` | ||
* :ref:`reference.user-config` | ||
* :ref:`development.add-task` | ||
|
||
Write a new item in :rose:file:`rose-suite.conf` | ||
================================================ | ||
|
||
ThermoRawFileParser can output the metadata in text or JSON format. Right now, the workflow only | ||
outputs metadata in JSON. We can give the user the option to choose between the two formats. | ||
|
||
At the end of the :strong:`[template variables]` section, add the following line: | ||
|
||
.. code-block:: ini | ||
:caption: :file:`rose-suite.conf` | ||
# ... | ||
cfg__raw_meta_format='txt' | ||
Use the template variable in the workflow definition | ||
==================================================== | ||
|
||
In the :strong:`[convert_raw]` task, change the :strong:`metadata` environment variable to: | ||
|
||
.. code-block:: jinja | ||
:caption: :file:`flow.cylc` | ||
[runtime] | ||
[[convert_raw]] | ||
[[[environment]]] | ||
- metadata = json | ||
+ metadata = {{ cfg__raw_meta_format }} | ||
During run installation, the value will now be replaced by the one set in :rose:file:`rose-suite.conf`. | ||
If you want to change the value at runtime, you can follow the instructions in :ref:`tutorial.user-config`. | ||
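As a rough illustration of the substitution described above, the sketch below reads a ``[template variables]`` section and fills in the ``flow.cylc`` line. It only uses the Python standard library: ``configparser`` stands in for the Rose configuration reader and a small regular expression stands in for Jinja2, so this is not how Cylc or Rose implement it. The quoted value ``'json'`` and the helper names ``load_template_variables`` and ``render`` are assumptions made for the example.

```python
import ast
import configparser
import re

# Assumed content of rose-suite.conf, with the value written as a quoted literal.
ROSE_SUITE_CONF = """
[template variables]
cfg__raw_meta_format='json'
"""

FLOW_LINE = "metadata = {{ cfg__raw_meta_format }}"


def load_template_variables(text):
    """Return the [template variables] section as a dict of Python values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["template variables"]
    # Assumption: values are quoted Python literals, so "'json'" becomes json.
    return {key: ast.literal_eval(value) for key, value in section.items()}


def render(line, variables):
    """Replace each {{ name }} placeholder (hypothetical stand-in for Jinja2)."""
    pattern = re.compile(r"\{\{\s*(\w+)\s*\}\}")
    return pattern.sub(lambda match: str(variables[match.group(1)]), line)


variables = load_template_variables(ROSE_SUITE_CONF)
rendered = render(FLOW_LINE, variables)
print(rendered)  # metadata = json
```

Changing the value in the configuration string is enough to change the rendered line, which is the behaviour the template variable is meant to provide.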
|
||
Validate the new configuration item | ||
=================================== | ||
|
||
Rose (the configuration manager) allows us to validate the user configuration. It is done at runtime | ||
at cyclepoint 0 with the :strong:`[validate_cfg]` task. Let's add a new validation rule for our item. | ||
Locate the :file:`meta/rose-meta.conf` file in the workflow source directory, and add the following: | ||
|
||
.. code-block:: ini | ||
[template variables=cfg__raw_meta_format] | ||
compulsory=true | ||
type=character | ||
values='json', 'txt' | ||
The :strong:`[validate_cfg]` task will now check that the value of :strong:`cfg__raw_meta_format` is | ||
either 'json' or 'txt', and that the item is indeed present. | ||
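To make the two checks concrete, here is a minimal Python sketch of the same rule. It is not how Rose performs validation; the function name ``validate_raw_meta_format`` and the error messages are hypothetical, and only the item name and the allowed values come from the metadata above.

```python
ALLOWED_FORMATS = ("json", "txt")


def validate_raw_meta_format(template_variables):
    """Return a list of error messages; an empty list means the item is valid."""
    name = "cfg__raw_meta_format"
    if name not in template_variables:
        # Mirrors compulsory=true: the item must be present.
        return [f"{name}: compulsory item is missing"]
    value = template_variables[name]
    if value not in ALLOWED_FORMATS:
        # Mirrors values='json', 'txt': only these two values are accepted.
        return [f"{name}: {value!r} is not one of {ALLOWED_FORMATS}"]
    return []


print(validate_raw_meta_format({"cfg__raw_meta_format": "txt"}))
print(validate_raw_meta_format({"cfg__raw_meta_format": "xml"}))
print(validate_raw_meta_format({}))
```

The first call reports no error, while the other two each report one: an unsupported value and a missing item.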
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,149 @@ | ||
.. _development.add-task: | ||
|
||
============================= | ||
Adding a task to the workflow | ||
============================= | ||
|
||
In this tutorial, we will see how to add a new task to the workflow. We will use the example of a | ||
task that extracts the number of scans from an mzML file, using the pyOpenMS library. | ||
|
||
Adding a Python script to the workflow executables | ||
================================================== | ||
|
||
In :file:`cylc-src/bioreactor-workflow/bin/`, create a new file named :file:`get-scans-number` and | ||
paste the following content: | ||
|
||
.. code-block:: python | ||
:caption: :file:`bin/get-scans-number` | ||
#!/usr/bin/env python | ||
import os | ||
import sys | ||
from pathlib import Path | ||
from pyopenms import MzMLFile, MSExperiment | ||
# Path of the mzML file, read from the task environment. | ||
MZML = os.getenv("mzml") | ||
def main(): | ||
    """ | ||
    Usage: | ||
        ./get-scans-number | ||
    Get number of scans from mzML file. `$mzml` shell | ||
    environment variable must be set to the path of the file. | ||
    """ | ||
    exp = MSExperiment() | ||
    MzMLFile().load(MZML, exp) | ||
    sys.stdout.write(str(exp.getNrSpectra())) | ||
if __name__ == "__main__": | ||
    # Exit with a non-zero status on misuse so that the Cylc task fails. | ||
    if len(sys.argv) > 1: | ||
        sys.stderr.write(main.__doc__) | ||
        sys.exit(1) | ||
    elif not MZML: | ||
        sys.stderr.write("$mzml environment variable not set.\n") | ||
        sys.exit(1) | ||
    elif not Path(MZML).exists(): | ||
        sys.stderr.write(f"mzML file not found: {MZML}\n") | ||
        sys.exit(1) | ||
    main() | ||
Make the script executable: | ||
|
||
.. code-block:: console | ||
$ chmod +x get-scans-number | ||
Creating a new task in the [runtime] section | ||
================================================ | ||
|
||
Open :file:`cylc-src/bioreactor-workflow/flow.cylc` and add the following task definition at the end: | ||
|
||
.. code-block:: cylc | ||
:caption: :file:`flow.cylc` | ||
:emphasize-lines: 3- | ||
[runtime] | ||
# ... | ||
[[get_scans_number]] | ||
# The task will run in the wf-openms conda environment | ||
# Adding None makes the task appear at the root in the TUI/GUI | ||
inherit = None, CONDA_OPENMS | ||
script = """ | ||
echo "The script launched by this task will extract the number of scans from the mzML file." | ||
get-scans-number > ${output_file} | ||
echo "The number of scans has been saved to ${output_file}" | ||
echo "Number of scans: $(cat ${output_file})" | ||
""" | ||
[[[environment]]] | ||
# The python script will use the $mzml environment | ||
# variable to get the path of the file. | ||
mzml = ${MAIN_RESULTS_DIR}/${RAWFILE_STEM}.mzML | ||
output_file = ${MAIN_RESULTS_DIR}/scans_number.txt | ||
This task will run the :file:`get-scans-number` script and save the output to a file named | ||
:file:`scans_number.txt` in the main results directory. This directory | ||
(:file:`share/cycle/n/dataflow/`) is specific to each cyclepoint ``n``. | ||
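The per-cyclepoint layout can be sketched in Python as follows. This is only an illustration of the path pattern quoted above; the helper names ``scans_number_file`` and ``read_scans_number`` are hypothetical, and a temporary directory stands in for a real run directory.

```python
import tempfile
from pathlib import Path


def scans_number_file(run_dir, cyclepoint):
    """Build the path of scans_number.txt for a given cyclepoint."""
    results_dir = Path(run_dir) / "share" / "cycle" / str(cyclepoint) / "dataflow"
    return results_dir / "scans_number.txt"


def read_scans_number(run_dir, cyclepoint):
    """Return the number of scans written by the task, as an int."""
    return int(scans_number_file(run_dir, cyclepoint).read_text().strip())


# Demonstration with a temporary directory standing in for the run directory.
with tempfile.TemporaryDirectory() as run_dir:
    target = scans_number_file(run_dir, 1)
    target.parent.mkdir(parents=True)
    target.write_text("35")
    scans = read_scans_number(run_dir, 1)

print(scans)  # 35
```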
|
||
Adding the task to the graph | ||
============================ | ||
|
||
Add a new graph string to the :strong:`+P1/P1` recurrence, inside the :strong:`[graph]` section | ||
of the workflow definition: | ||
|
||
.. code-block:: cylc | ||
:caption: :file:`flow.cylc` | ||
:emphasize-lines: 8 | ||
[[graph]] | ||
R1/^ = validate_cfg => validate_compounds_db & validate_met_model => is_setup | ||
R1/+P1 = convert_raw => get_instrument => extract_features | ||
+P1/P1 = """ | ||
is_setup[^] => _catch_raw | ||
@catch_raw => _catch_raw => convert_raw => get_timestamp & | ||
trim_spectra => extract_features => annotate => quantify | ||
convert_raw => get_scans_number | ||
""" | ||
The task will be executed for each cyclepoint (/P1) starting from the second one (+P1). It will run after the | ||
:strong:`convert_raw` task as it depends on the mzML file generated by it. No other task depends on | ||
the one we just added. | ||
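The dependencies expressed by this graph string can be sketched with the ``graphlib`` module from the Python standard library (Python 3.9 or later). This is a simplified model for one cyclepoint, not how Cylc schedules tasks: the ``@catch_raw`` xtrigger and the ``is_setup[^]`` dependency are left out, and the ``DEPENDENCIES`` mapping is written by hand from the graph above.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on, within one cyclepoint.
DEPENDENCIES = {
    "convert_raw": {"_catch_raw"},
    "get_timestamp": {"convert_raw"},
    "trim_spectra": {"convert_raw"},
    "extract_features": {"get_timestamp", "trim_spectra"},
    "annotate": {"extract_features"},
    "quantify": {"annotate"},
    "get_scans_number": {"convert_raw"},
}

# One valid execution order that respects every dependency.
order = list(TopologicalSorter(DEPENDENCIES).static_order())

# Tasks that list get_scans_number among their dependencies (expected: none).
dependents = [task for task, deps in DEPENDENCIES.items() if "get_scans_number" in deps]

print(order)
print(dependents)
```

In any order produced this way, ``get_scans_number`` comes after ``convert_raw``, and the list of dependents is empty.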
|
||
You can check that the task has been added correctly by running: | ||
|
||
.. code-block:: console | ||
$ cylc graph bioreactor-workflow 0 1 | ||
.. figure:: /_static/graphs/added-task-graph.png | ||
:alt: Graph with the new task added | ||
:scale: 50% | ||
:align: center | ||
|
||
Testing the new task | ||
==================== | ||
|
||
Install and start a new run of the workflow, and add an mzML file to the :file:`raws/` directory. The task should | ||
start immediately after the :strong:`convert_raw` task and generate a :file:`scans_number.txt` file | ||
in the :file:`cylc-run/your_run_name/share/cycle/1/dataflow/` directory. | ||
|
||
.. code-block:: output | ||
:caption: :file:`job.out` in logs | ||
Workflow : bioreactor-workflow/task-added | ||
Job : 1/get_scans_number/01 (try 1) | ||
User@Host: [email protected] | ||
2024-07-22T14:18:50+02:00 INFO - started | ||
The script launched by this task will extract the number of scans from the mzML file. | ||
The number of scans has been saved to /Users/elliotfontaine/cylc-run/bioreactor-workflow/task-added/share/cycle/1/dataflow/scans_number.txt | ||
Number of scans: 35 | ||
2024-07-22T14:18:52+02:00 INFO - succeeded | ||
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,77 @@ | ||
.. _development.coding-style: | ||
|
||
============ | ||
Coding style | ||
============ | ||
|
||
:file:`bin/` scripts: environment variables or command line arguments? | ||
====================================================================== | ||
|
||
When writing scripts (Python, R, Bash) for the workflow, you have the choice between loading | ||
environment variables from inside the script, or parsing command line arguments. | ||
|
||
As a rule of thumb, use environment variables when you don't expect the script to be reused outside | ||
the workflow, and command line arguments with strong input validation when you want to make the script | ||
more portable. | ||
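The two approaches can be compared with a short Python sketch. The function names ``mzml_from_env`` and ``mzml_from_args`` and the file extension check are assumptions chosen for illustration; they are not part of the workflow.

```python
import argparse
from pathlib import Path


def mzml_from_env(environ):
    """Workflow-only style: read the path from the `mzml` environment variable."""
    path = environ.get("mzml")
    if not path:
        raise ValueError("$mzml environment variable not set.")
    return Path(path)


def mzml_from_args(argv):
    """Portable style: parse and validate a positional command line argument."""
    parser = argparse.ArgumentParser(description="Get number of scans from mzML file.")
    parser.add_argument("mzml", type=Path, help="path of the mzML file")
    args = parser.parse_args(argv)
    if args.mzml.suffix.lower() != ".mzml":
        # parser.error prints the usage message and exits with status 2.
        parser.error(f"expected a .mzML file, got: {args.mzml}")
    return args.mzml


print(mzml_from_env({"mzml": "results/sample.mzML"}))
print(mzml_from_args(["results/sample.mzML"]))
```

The environment variable version is shorter but depends on the task's ``[environment]`` block, while the ``argparse`` version documents itself through ``--help`` and rejects unexpected input.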
|
||
Cylc | ||
==== | ||
|
||
In general, follow the Cylc :doc:`cylc:workflow-design-guide/style-guide`. When creating tasks, | ||
set the :strong:`[meta]` title and description fields to describe what the task does. You can also | ||
add custom fields like :strong:`categories` if you want. | ||
|
||
Use uppercase for: | ||
* family tasks (notably the conda ones, e.g. :strong:`CONDA_OPENMS`), | ||
* global environment variables set in :strong:`[runtime][root]` and broadcast ones (e.g. | ||
:strong:`RAWFILE_STEM`). | ||
|
||
Use lowercase for: | ||
* local environment variables set in :strong:`[environment]` blocks inside tasks. | ||
* task names. | ||
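These naming rules can be checked mechanically. The sketch below is one possible way to do it in Python; the function ``follows_convention``, the ``kind`` labels and the exact regular expressions are assumptions, and a leading underscore is allowed for task names because the workflow contains tasks such as ``_catch_raw``.

```python
import re

UPPERCASE_NAME = re.compile(r"[A-Z][A-Z0-9_]*")
LOWERCASE_NAME = re.compile(r"_?[a-z][a-z0-9_]*")


def follows_convention(name, kind):
    """Check a name against the convention for its kind.

    kind is "family" or "global_env" (uppercase names),
    or "task" or "local_env" (lowercase names).
    """
    if kind in ("family", "global_env"):
        return bool(UPPERCASE_NAME.fullmatch(name))
    if kind in ("task", "local_env"):
        return bool(LOWERCASE_NAME.fullmatch(name))
    raise ValueError(f"unknown kind: {kind}")


print(follows_convention("CONDA_OPENMS", "family"))
print(follows_convention("get_scans_number", "task"))
print(follows_convention("Get_Scans", "task"))
```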
|
||
Add :strong:`None` before the name of inherited family tasks to make the task in question appear at | ||
the root when using the TUI or GUI. Otherwise, the task will be nested under the family task. | ||
InfluxDB tasks are the exception: they are always nested under the :strong:`INFLUXDB` family task. | ||
|
||
|
||
When using global environment variables or Jinja2 template variables to build CLI arguments, | ||
do it in the :strong:`[environment]` block of the task, not in the script itself: | ||
|
||
.. code-block:: cylc | ||
:caption: :file:`flow.cylc` | ||
:emphasize-lines: 4, 7-9 | ||
[[trim_spectra]] | ||
inherit = None, CONDA_OPENMS | ||
script = """ | ||
trimms ${mzml} ${n_start} ${n_end} | ||
""" | ||
[[[environment]]] | ||
mzml = ${MAIN_RESULTS_DIR}/${RAWFILE_STEM}.mzML | ||
n_start = {{ cfg__trim_values[0] }} | ||
n_end = {{ cfg__trim_values[1] }} | ||
[[[meta]]] | ||
title = Trim Spectra | ||
description = """ | ||
Remove the first `n_start` and last `n_end` scans from the mzML file. This is useful | ||
if the shape of the flowgram is not stable at the beginning or end of the run. | ||
""" | ||
categories = bioinformatics | ||
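To illustrate what the task description means by removing the first ``n_start`` and last ``n_end`` scans, here is a small Python sketch on a plain list. It is not the implementation of the ``trimms`` script; the function ``trim_scans`` is hypothetical and integers stand in for scans.

```python
def trim_scans(scans, n_start, n_end):
    """Return scans without the first n_start and the last n_end items."""
    if n_start < 0 or n_end < 0:
        raise ValueError("n_start and n_end must be non-negative")
    # Use an explicit end index so that n_end == 0 keeps the tail,
    # and clamp it at 0 so that an oversized n_end yields an empty list.
    end = max(len(scans) - n_end, 0)
    return scans[n_start:end]


scans = list(range(10))  # stand-in for ten scans, numbered 0 to 9
trimmed = trim_scans(scans, 2, 3)
print(trimmed)  # [2, 3, 4, 5, 6]
```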
Python | ||
====== | ||
|
||
Python code should follow the `PEP 8`_ style guide. The `Black`_ code formatter should be used to | ||
automatically format the code. | ||
|
||
You should also use a linter / static code analyser like `Pylint`_ to catch potential bugs, | ||
commented-out code, code smells, etc. | ||
|
||
Bash | ||
==== | ||
[TODO] | ||
|
||
R | ||
= | ||
[TODO] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
.. _development: | ||
|
||
=========== | ||
Development | ||
=========== | ||
|
||
This section discusses some of the choices made during the development of the project (coding styles for | ||
different languages, patterns used in Cylc, etc.). | ||
|
||
You'll also find some guidelines on how to add a new task or configuration option to the workflow. | ||
|
||
.. note:: | ||
It is assumed that you have a basic understanding of: | ||
* Cylc, | ||
* Python, R and Bash. | ||
|
||
For further information on Cylc, please consult their :ref:`cylc:user guide`. | ||
|
||
.. toctree:: | ||
:maxdepth: 2 | ||
|
||
workflow_design | ||
coding_style | ||
add_task | ||
add_config_option | ||
|
||
|
||
|