add intro detail of app package cwl
fmigneault committed Jun 8, 2020
1 parent 2bdcc3f commit 60f85af
Showing 5 changed files with 111 additions and 17 deletions.
8 changes: 0 additions & 8 deletions docs/source/configuration.rst
@@ -201,11 +201,3 @@ Starting the Application
- need to start ``gunicorn/pserve`` (example `Dockerfile-manager`_)
- need to start ``celery`` worker (example `Dockerfile-worker`_)


.. _weaver.config: ../../../config
.. _weaver.ini.example: ../../../config/weaver.ini.example
.. _data_sources.json.example: ../../../config/data_sources.json.example
.. _wps_processes.yml.example: ../../../config/wps_processes.yml.example
.. _request_options.yml.example: ../../../config/request_options.yml.example
.. _Dockerfile-manager: ../../../docker/Dockerfile-manager
.. _Dockerfile-worker: ../../../docker/Dockerfile-worker
58 changes: 54 additions & 4 deletions docs/source/package.rst
@@ -1,5 +1,5 @@
.. package:
.. application-package:
.. _package:
.. _application-package:
.. include:: references.rst

*************************
@@ -19,17 +19,67 @@ internal execution of the process allows it to run multiple type of applications
section and existing `Weaver Issues`_. Ultimately if no solution can be found, open a new issue about your specific
problem.

.. |pkg-req| replace:: ``GET /processes/{id}/package``

All processes deployed locally into `Weaver` using a `CWL` package definition will have their full package definition
available with ``GET /processes/{id}/package`` |pkg-req|_ request.

.. |pkg-req| replace:: Package
.. _pkg-req: https://pavics-weaver.readthedocs.io/en/setup-docs/api.html#tag/Processes%2Fpaths%2F~1processes~1%7Bprocess_id%7D~1package%2Fget

.. note::

    |pkg-req|_ is a `Weaver`-specific implementation and is therefore not necessarily available on other `ADES`/`EMS`
    implementations, as this feature is not part of the |ogc-proc-api|_ specification.
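
As a minimal sketch of how a client could retrieve a package definition, the following builds the request URL
described above. The helper names ``package_url`` and ``fetch_package`` are hypothetical (they are not part of
`Weaver`), and the sketch assumes that the response body is JSON.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def package_url(weaver_url: str, process_id: str) -> str:
    """Return the URL of the package endpoint for a deployed process."""
    # Percent-encode the identifier so that it stays a single path segment.
    return f"{weaver_url.rstrip('/')}/processes/{quote(process_id, safe='')}/package"


def fetch_package(weaver_url: str, process_id: str) -> dict:
    """Request the package definition (assumed here to be returned as JSON)."""
    with urlopen(package_url(weaver_url, process_id)) as response:
        return json.load(response)
```

For example, ``fetch_package("http://localhost:4001", "my-process-reference")`` would query a local instance, assuming
it runs at that address and that a process with this identifier was deployed.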


Typical CWL Package Definition
===========================================

.. todo:: CommandLineTool
CWL CommandLineTool
------------------------

The following CWL package definition represents the :py:mod:`weaver.processes.builtin.jsonarray2netcdf` process.


.. literalinclude:: ../../weaver/processes/builtin/jsonarray2netcdf.cwl
:language: YAML




CWL Workflow
------------------------


Correspondance between CWL and WPS fields
===========================================

Because the `CWL` definition and the `WPS` process description inherently provide "duplicate" information, many fields
can be mapped between one another. In order to handle any metadata provided in the various locations supported by both
specifications, as well as to extend details of deployed processes, each `Application Package` gets its details merged
with the complementary `WPS` description.

In some cases, complementary details are only documentation-related, but some information directly affects the format
or execution behaviour of some parameters. A common example is the ``maxOccurs`` field provided by `WPS` that does not
have a corresponding specification in `CWL` (any-sized array). On the other hand, `CWL` also provides data preparation
steps such as initial staging (i.e.: ``InitialWorkDirRequirement``) that do not have an equivalent under the `WPS`
process description. For this reason, complementary details are merged and reflected on both sides (as applicable),
when non-ambiguous resolution is possible.

In case of conflicting metadata, the `CWL` specification will most of the time prevail over the `WPS` metadata fields,
simply because it is expected that a strict `CWL` specification is provided upon deployment. The only exceptions to
this situation are when the `WPS` specification helps resolve some ambiguity or when `WPS` reinforces the
parametrisation of some elements, such as with the ``maxOccurs`` field.

.. note::

    The metadata merge operation between `CWL` and `WPS` is accomplished on a *per-mapped-field* basis. In other words,
    more explicit details such as ``abstract`` could be obtained from `WPS` *while* an input file format could be
    obtained from the `CWL` side. The merge occurs bidirectionally for corresponding information.
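
The per-field resolution can be illustrated with the following sketch. It is not `Weaver`'s actual implementation:
the function ``merge_fields``, the ``WPS_PRIORITY`` set and the sample values are hypothetical, and both descriptions
are assumed to be already normalized into flat dictionaries that share field names, which is a simplification.

```python
# Fields for which the WPS value is kept when both sides define them
# (illustrative choice based on the ``maxOccurs`` example).
WPS_PRIORITY = {"maxOccurs"}


def merge_fields(cwl_fields: dict, wps_fields: dict) -> dict:
    """Merge the fields of one input or output, one field at a time."""
    # Start from the WPS side so that WPS-only fields (e.g. abstract) are kept.
    merged = dict(wps_fields)
    for name, value in cwl_fields.items():
        if name in merged and name in WPS_PRIORITY:
            continue  # conflict on a WPS-priority field: keep the WPS value
        merged[name] = value  # otherwise the CWL value is added or prevails
    return merged


cwl_side = {"format": "application/x-netcdf", "maxOccurs": 1}
wps_side = {"abstract": "Input NetCDF file.", "format": "text/plain", "maxOccurs": 3}
result = merge_fields(cwl_side, wps_side)
```

With these sample values, ``abstract`` comes from the `WPS` side, ``format`` from the `CWL` side, and ``maxOccurs``
keeps the `WPS` value.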

In order to help understand the resolution methodology, the following sub-sections cover the supported mapping between
the two specifications, and more specifically, how each field impacts the mapped equivalent metadata.

Input / Outputs
-----------------------

50 changes: 47 additions & 3 deletions docs/source/processes.rst
@@ -38,6 +38,9 @@ As of the latest release, following `builtin` processes are available:
- :py:mod:`weaver.processes.builtin.jsonarray2netcdf`


All `builtin` processes are marked with :py:data:`weaver.processes.constants.CWL_REQUIREMENT_APP_BUILTIN` in the `CWL`
*hints* section.

WPS-1/2
-------

@@ -55,6 +58,7 @@ A minimal `Deploy`_ request body for this kind of process could be as follows:
"processDescription": {
"process": {
"id": "my-process-reference"
}
},
"executionUnit": [
{
@@ -84,7 +88,47 @@ Please refer to `Configuration of WPS Processes`_ section for more details on th
WPS-REST
--------

.. todo:: wps-rest process doc
This process type is the main component of `Weaver`. All other process types are converted to this one, either
through some parsing (e.g.: `WPS-1/2`_) or with some requirement indicators (e.g.: `Builtin`_, `Workflow`_) for
special handling.

When deploying one such process directly, it is expected to have a reference to a CWL `Application Package`_. This is
most often used to wrap a process that references a Docker image. The reference package can be provided in multiple
ways, as presented below.

.. note::

    When a process is deployed with any of the supported `Application Package` formats below, both this `CWL` and the
    complementary details provided directly within the `WPS` deployment body are parsed.
    See `Correspondance between CWL and WPS fields`_ section for more details.


Package as Literal Unit Block
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this situation, the `CWL` definition is provided as is, using the JSON-formatted package embedded within the
|deploy-req|_ request. The request payload would take the following shape:

.. code-block:: json

    {
      "processDescription": {
        "process": {
          "id": "my-process-reference"
        }
      },
      "executionUnit": [
        {
          "unit": {
            "cwlVersion": "v1.0",
            "class": "CommandLineTool",
            "inputs": [<...>],
            "outputs": [<...>],
            [<...>]
          }
        }
      ]
    }

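
A request body of this shape could be assembled programmatically as in the sketch below. The helper
``build_deploy_body`` is hypothetical, and the ``echo`` tool is only a placeholder `CWL` definition chosen for
illustration. It does not correspond to any process shipped with `Weaver`.

```python
import json


def build_deploy_body(process_id: str, cwl_package: dict) -> dict:
    """Embed a CWL package as a literal unit within a deployment body."""
    return {
        "processDescription": {"process": {"id": process_id}},
        "executionUnit": [{"unit": cwl_package}],
    }


# Placeholder CWL CommandLineTool that echoes a single string input.
cwl_package = {
    "cwlVersion": "v1.0",
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "inputs": [{"id": "message", "type": "string", "inputBinding": {"position": 1}}],
    "outputs": [],
}

body = build_deploy_body("my-process-reference", cwl_package)
print(json.dumps(body, indent=2))
```

The printed JSON document would then be sent as the body of the ``POST {WEAVER_URL}/processes`` request.
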
ESGF-CWT
@@ -177,13 +221,13 @@ Following steps represent the typical steps applied to deploy a process, execute
Register a new process (Deploy)
-----------------------------------------

Deployment of a new process is accomplished through the ``POST {WEAVER_URL}/processes`` |deploy-req|_.
Deployment of a new process is accomplished through the ``POST {WEAVER_URL}/processes`` |deploy-req|_ request.
The request body requires mainly two components:

- ``processDescription``: defines the process identifier, metadata, inputs, outputs, and some execution specifications.
- ``executionUnit``: defines the core details of the `Application Package`_.

.. |deploy-req| replace:: request
.. |deploy-req| replace:: Deploy
.. _deploy-req: https://pavics-weaver.readthedocs.io/en/latest/api.html#tag/Processes%2Fpaths%2F~1processes%2Fpost
.. _Application Package: docs/source/package.rst

8 changes: 8 additions & 0 deletions docs/source/references.rst
@@ -28,3 +28,11 @@
.. _Celery: https://docs.celeryproject.org/en/latest/
.. _Gunicorn: https://gunicorn.org/
.. _MongoDB: https://www.mongodb.com/

.. _weaver.config: ../../../config
.. _weaver.ini.example: ../../../config/weaver.ini.example
.. _data_sources.json.example: ../../../config/data_sources.json.example
.. _wps_processes.yml.example: ../../../config/wps_processes.yml.example
.. _request_options.yml.example: ../../../config/request_options.yml.example
.. _Dockerfile-manager: ../../../docker/Dockerfile-manager
.. _Dockerfile-worker: ../../../docker/Dockerfile-worker
4 changes: 2 additions & 2 deletions docs/source/running.rst
@@ -32,8 +32,8 @@ It will be available under the configured URL endpoint in ``weaver/config/weaver
If everything was configured correctly, calling this URL (default: ``http://localhost:4001``) should
provide a response containing a JSON body with basic information about Weaver.

Details
----------------
Execution Details
----------------------

To execute, `Weaver` requires two types of applications running in parallel. First, it requires a WSGI HTTP server
that will run the application to provide API endpoints. This is referred to as ``weaver-manager`` in the provided