diff --git a/docs/source/configuration.rst b/docs/source/configuration.rst index f10937d53..a06aae88e 100644 --- a/docs/source/configuration.rst +++ b/docs/source/configuration.rst @@ -201,11 +201,3 @@ Starting the Application - need to start ``gunicorn/pserve`` (example `Dockerfile-manager`_) - need to start ``celery`` worker (example `Dockerfile-worker`_) - -.. _weaver.config: ../../../config -.. _weaver.ini.example: ../../../config/weaver.ini.example -.. _data_sources.json.example: ../../../config/data_sources.json.example -.. _wps_processes.yml.example: ../../../config/wps_processes.yml.example -.. _request_options.yml.example: ../../../config/request_options.yml.example -.. _Dockerfile-manager: ../../../docker/Dockerfile-manager -.. _Dockerfile-worker: ../../../docker/Dockerfile-worker diff --git a/docs/source/package.rst b/docs/source/package.rst index 80af42b92..b14fa69ef 100644 --- a/docs/source/package.rst +++ b/docs/source/package.rst @@ -1,5 +1,5 @@ -.. package: -.. application-package: +.. _package: +.. _application-package: .. include:: references.rst ************************* @@ -19,17 +19,67 @@ internal execution of the process allows it to run multiple type of applications section and existing `Weaver Issues`_. Ultimately if no solution can be found, open an new issue about your specific problem. -.. |pkg-req| replace:: ``GET /processes/{id}/package`` + +All processes deployed locally into `Weaver` using a `CWL` package definition will have their full package definition +available with the ``GET /processes/{id}/package`` |pkg-req|_ request. + +.. |pkg-req| replace:: Package .. _pkg-req: https://pavics-weaver.readthedocs.io/en/setup-docs/api.html#tag/Processes%2Fpaths%2F~1processes~1%7Bprocess_id%7D~1package%2Fget +.. note:: + + |pkg-req|_ is a `Weaver`-specific implementation and is therefore not necessarily available on other `ADES`/`EMS` + implementations, as this feature is not part of the |ogc-proc-api|_ specification. 
+ + Typical CWL Package Definition =========================================== -.. todo:: CommandLineTool +CWL CommandLineTool +------------------------ + +The following CWL package definition represents the :py:mod:`weaver.processes.builtin.jsonarray2netcdf` process. + + +.. literalinclude:: ../../weaver/processes/builtin/jsonarray2netcdf.cwl + :language: YAML + + + + +CWL Workflow +------------------------ + Correspondance between CWL and WPS fields =========================================== +Because the `CWL` definition and the `WPS` process description inherently provide "duplicate" information, many fields +can be mapped between one another. In order to handle any metadata provided in the various locations supported by both +specifications, as well as to extend details of deployed processes, each `Application Package` gets its details merged +with the complementary `WPS` description. + +In some cases, complementary details are only documentation-related, but some information directly affects the format or +execution behaviour of some parameters. A common example is the ``maxOccurs`` field provided by `WPS` that does not +have a corresponding specification in `CWL` (any-sized array). On the other hand, `CWL` also provides data preparation +steps such as initial staging (e.g.: ``InitialWorkDirRequirement``) that do not have an equivalent under the `WPS` +process description. For this reason, complementary details are merged and reflected on both sides (as applicable), +when an unambiguous resolution is possible. + +In case of conflicting metadata, the `CWL` specification will most of the time prevail over the `WPS` metadata fields +simply because it is expected that a strict `CWL` specification is provided upon deployment. The only exceptions to this +situation are when the `WPS` specification helps resolve some ambiguity or when `WPS` reinforces the parametrisation of +some elements, such as with the ``maxOccurs`` field. + +.. 
note:: + + Metadata merge operation between `CWL` and `WPS` is accomplished on a *per-mapped-field* basis. In other words, more + explicit details such as ``abstract`` could be obtained from `WPS` *while* an input file format could be obtained + from the `CWL` side. Merge occurs bidirectionally for corresponding information. + +In order to help understand the resolution methodology, the following sub-sections cover the supported mapping between +the two specifications, and more specifically, how each field impacts the mapped equivalent metadata. + Input / Outputs ----------------------- diff --git a/docs/source/processes.rst b/docs/source/processes.rst index 5ff93063e..d13662130 100644 --- a/docs/source/processes.rst +++ b/docs/source/processes.rst @@ -38,6 +38,9 @@ As of the latest release, following `builtin` processes are available: - :py:mod:`weaver.processes.builtin.jsonarray2netcdf` +All `builtin` processes are marked with :py:data:`weaver.processes.constants.CWL_REQUIREMENT_APP_BUILTIN` in the `CWL` +*hints* section. + WPS-1/2 ------- @@ -55,6 +58,7 @@ A minimal `Deploy`_ request body for this kind of process could be as follows: "processDescription": { "process": { "id": "my-process-reference" + } }, "executionUnit": [ { @@ -84,7 +88,47 @@ Please refer to `Configuration of WPS Processes`_ section for more details on th WPS-REST -------- -.. todo:: wps-rest process doc +This process type is the main component of `Weaver`. All other process types are converted to this one either +through some parsing (e.g.: `WPS-1/2`_) or with some requirement indicators (e.g.: `Builtin`_, `Workflow`_) for +special handling. + +When deploying one such process directly, it is expected to have a reference to a CWL `Application Package`_. This is +most often employed to wrap a reference Docker image process. The reference package can be provided in multiple +ways as presented below. + +.. 
note:: + + When a process is deployed with any of the `Application Package` formats supported below, additional parsing of + this `CWL`, as well as of complementary details provided directly within the `WPS` deployment body, is performed. + See the `Correspondance between CWL and WPS fields`_ section for more details. + + +Package as Literal Unit Block +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In this situation, the `CWL` definition is provided as is using the JSON-formatted package embedded within the +|deploy-req|_ request. The request payload would take the following shape: + +.. code-block:: json + + { + "processDescription": { + "process": { + "id": "my-process-reference" + } + }, + "executionUnit": [ + { + "unit": { + "cwlVersion": "v1.0", + "class": "CommandLineTool", + "inputs": [<...>], + "outputs": [<...>], + [<...>] + } + } + ] + } ESGF-CWT @@ -177,13 +221,13 @@ Following steps represent the typical steps applied to deploy a process, execute Register a new process (Deploy) ----------------------------------------- -Deployment of a new process is accomplished through the ``POST {WEAVER_URL}/processes`` |deploy-req|_. +Deployment of a new process is accomplished through the ``POST {WEAVER_URL}/processes`` |deploy-req|_ request. The request body requires mainly two components: - ``processDescription``: defines the process identifier, metadata, inputs, outputs, and some execution specifications. - ``executionUnit``: defines the main core details of the `Application Package`_. -.. |deploy-req| replace:: request +.. |deploy-req| replace:: Deploy .. _deploy-req: https://pavics-weaver.readthedocs.io/en/latest/api.html#tag/Processes%2Fpaths%2F~1processes%2Fpost .. _Application Package: docs/source/package.rst diff --git a/docs/source/references.rst b/docs/source/references.rst index 18d384828..1b601d8f4 100644 --- a/docs/source/references.rst +++ b/docs/source/references.rst @@ -28,3 +28,11 @@ .. _Celery: https://docs.celeryproject.org/en/latest/ .. 
_Gunicorn: https://gunicorn.org/ .. _MongoDB: https://www.mongodb.com/ + +.. _weaver.config: ../../../config +.. _weaver.ini.example: ../../../config/weaver.ini.example +.. _data_sources.json.example: ../../../config/data_sources.json.example +.. _wps_processes.yml.example: ../../../config/wps_processes.yml.example +.. _request_options.yml.example: ../../../config/request_options.yml.example +.. _Dockerfile-manager: ../../../docker/Dockerfile-manager +.. _Dockerfile-worker: ../../../docker/Dockerfile-worker diff --git a/docs/source/running.rst b/docs/source/running.rst index 58e629fd5..c542780e3 100644 --- a/docs/source/running.rst +++ b/docs/source/running.rst @@ -32,8 +32,8 @@ It will be available under the configured URL endpoint in ``weaver/config/weaver If everything was configured correctly, calling this URL (default: ``http://localhost:4001``) should provide a response containing a JSON body with basic information about Weaver. -Details ----------------- +Execution Details +---------------------- To execute, `Weaver` requires two type of application executed in parallel. First, it requires a WSGI HTTP server that will run the application to provide API endpoints. This is referred to as ``weaver-manager`` in the provided