Improve documentation on Params (#20567)
I think that this doc could be improved by adding examples of how to reference the params in your dag. (Also, the current example code causes this: #20559.)

While trying to find the right place to work a few reference examples in, I ended up rewriting quite a lot of it.
Let me know if you think that this is an improvement.

I haven't yet figured out how to build this and view it locally, and I'd want to do that as a sanity check before merging it, but I figured I'd get feedback on what I've written before I do that.
MatrixManAtYrService authored Jan 4, 2022
1 parent 8b2299b commit 064efbe
Showing 1 changed file with 119 additions and 27 deletions.
146 changes: 119 additions & 27 deletions docs/apache-airflow/concepts/params.rst
@@ -15,50 +15,142 @@
specific language governing permissions and limitations
under the License.
.. _concepts:params:

Params
======

Params are Airflow's way of providing runtime configuration to tasks when a DAG is triggered manually.
Params are configured while defining the DAG and its tasks, and can be altered when triggering the DAG manually. The
ability to update params while triggering a DAG depends on the flag ``core.dag_run_conf_overrides_params``,
so if that flag is ``False``, params behave like constants.
Params are how Airflow provides runtime configuration to tasks.
When you trigger a DAG manually, you can modify its Params before the dagrun starts.
If the user-supplied values don't pass validation, Airflow shows a warning instead of creating the dagrun.
(For scheduled runs, the default values are used.)

Adding Params to a DAG
----------------------

To use them, one can use the ``Param`` class for complex trigger-time validations or simply use primitive types,
which are not validated.
To add Params to a :class:`~airflow.models.dag.DAG`, initialize it with the ``params`` kwarg.
Use a dictionary that maps Param names to either a :class:`~airflow.models.param.Param` or an object indicating the parameter's default value.

.. code-block::

    from airflow import DAG
    from airflow.models.param import Param

    with DAG(
        'my_dag',
        "the_dag",
        params={
            'int_param': Param(10, type='integer', minimum=0, maximum=20),  # an int param with default value
            'str_param': Param(type='string', minLength=2, maxLength=4),  # a mandatory str param
            'dummy_param': Param(type=['null', 'number', 'string']),  # a param which can be None as well
            'old_param': 'old_way_of_passing',  # i.e. no data or type validations
            'simple_param': Param('im_just_like_old_param'),  # i.e. no data or type validations
            'email_param': Param(
                default='[email protected]',
                type='string',
                format='idn-email',
                minLength=5,
                maxLength=255,
            ),
            "x": Param(5, type="integer", minimum=3),
            "y": 6
        },
    ) as the_dag:

Referencing Params in a Task
----------------------------

Params are stored as ``params`` in the :ref:`template context <templates-ref>`.
So you can reference them in a template.

.. code-block::

    PythonOperator(
        task_id="from_template",
        op_args=[
            "{{ params.int_param + 10 }}",
        ],
        python_callable=(
            lambda x: print(x)
        ),
    )

Even though Params can use a variety of types, the default behavior of templates is to provide your task with a string.
You can change this by setting ``render_template_as_native_obj=True`` while initializing the :class:`~airflow.models.dag.DAG`.

.. code-block::

    with DAG(
        "the_dag",
        params={"x": Param(5, type="integer", minimum=3)},
        render_template_as_native_obj=True
    ) as the_dag:

This way, the Param's type is respected when it's provided to your task.

.. code-block::

    # prints <class 'str'> by default
    # prints <class 'int'> if render_template_as_native_obj=True
    PythonOperator(
        task_id="template_type",
        op_args=[
            "{{ params.int_param }}",
        ],
        python_callable=(
            lambda x: print(type(x))
        ),
    )

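The difference between the two rendering modes can be illustrated without Airflow. The following is only a rough, standard-library sketch: ``render_as_string`` and ``render_as_native`` are hypothetical helpers, not Airflow or Jinja functions, and the sketch assumes that native rendering amounts to parsing the rendered text back into a Python literal.

```python
import ast


def render_as_string(value):
    # Default template behaviour: the rendered result is always text.
    return str(value)


def render_as_native(value):
    # Hypothetical stand-in for native rendering: parse the rendered text
    # back into a Python literal, falling back to the raw string.
    rendered = str(value)
    try:
        return ast.literal_eval(rendered)
    except (ValueError, SyntaxError):
        return rendered


params = {"x": 5}
as_text = render_as_string(params["x"])
as_native = render_as_native(params["x"])
print(type(as_text).__name__, type(as_native).__name__)
```

In this sketch the same Param value reaches the callable as text in the first case and as an integer in the second, which is the distinction the ``render_template_as_native_obj`` setting controls.
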
``Param`` makes use of `json-schema <https://json-schema.org/>`__ to define the properties and do the
validation, so one can use the full json-schema specification described at
https://json-schema.org/draft/2020-12/json-schema-validation.html to define ``Param``
objects.

Another way to access your param is via a task's ``context`` kwarg.

Also, it is worthwhile to note that if you have any DAG which uses a mandatory param value, i.e. a ``Param``
object with no default value or ``null`` as an allowed type, that DAG's schedule has to be ``None``. However,
if such a ``Param`` has been defined at task level, Airflow has no way to restrict that and the task will
fail at execution time.

.. code-block::

    def print_x(**context):
        print(context["params"]["x"])


    PythonOperator(
        task_id="print_x",
        python_callable=print_x,
    )

Task-level Params
-----------------

You can also add Params to individual tasks.

.. code-block::

    PythonOperator(
        task_id="print_x",
        params={"x": 10},
        python_callable=print_x,
    )

If there's already a DAG-level Param with that name, the task-level default takes precedence over the DAG-level default.
If a user supplies their own value when triggering the DAG, Airflow ignores all defaults and uses the user's value.
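
The precedence described above can be pictured as a layered dictionary merge. This is a minimal sketch and not Airflow's implementation: ``resolve_params`` and its arguments are hypothetical names chosen for illustration.

```python
def resolve_params(dag_params, task_params, run_conf=None):
    """Merge params in increasing order of precedence (illustrative only)."""
    resolved = dict(dag_params)      # DAG-level defaults come first
    resolved.update(task_params)     # task-level defaults override them
    resolved.update(run_conf or {})  # user-supplied values override both
    return resolved


dag_level = {"x": 5, "y": 6}
task_level = {"x": 10}

scheduled = resolve_params(dag_level, task_level)
triggered = resolve_params(dag_level, task_level, run_conf={"x": 7})
print(scheduled, triggered)
```

Here ``scheduled`` stands for a run with no user input, where the task-level default for ``x`` wins, and ``triggered`` stands for a manual run where the user's value for ``x`` replaces both defaults.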

JSON Schema Validation
----------------------

:class:`~airflow.models.param.Param` makes use of `json-schema <https://json-schema.org/>`__, so you can use the full json-schema specification described at https://json-schema.org/draft/2020-12/json-schema-validation.html to define ``Param`` objects.

.. code-block::

    with DAG(
        "my_dag",
        params={
            # an int with a default value
            "int_param": Param(10, type="integer", minimum=0, maximum=20),
            # a required param which can be of multiple types
            "dummy": Param(type=["null", "number", "string"]),
            # a param which uses json-schema formatting
            "email": Param(
                default="[email protected]",
                type="string",
                format="idn-email",
                minLength=5,
                maxLength=255,
            ),
        },
    ) as my_dag:

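To give a feel for what a constraint such as ``type="integer", minimum=0, maximum=20`` means at trigger time, here is a hand-rolled sketch that checks only those three keywords. It is an assumption-laden illustration: ``validate_integer`` is a hypothetical helper, it covers a tiny subset of JSON Schema, and it is not how Airflow performs validation.

```python
def validate_integer(value, minimum=None, maximum=None):
    """Check a value against a small subset of JSON Schema keywords."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{value!r} is not an integer")
    if minimum is not None and value < minimum:
        raise ValueError(f"{value!r} is less than the minimum of {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{value!r} is greater than the maximum of {maximum}")
    return value


accepted = validate_integer(10, minimum=0, maximum=20)

try:
    validate_integer(25, minimum=0, maximum=20)
    rejected = False
except ValueError as err:
    rejected = True
    print(f"validation failed: {err}")
```

A value that violates the constraints raises an error in this sketch, which mirrors the idea that a user-supplied value failing validation prevents the dagrun from being created.
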
.. note::
    As of now, for security reasons, one cannot use Param objects derived from custom classes. We are
    planning to have a registration system for custom Param classes, just like we have for Operator ExtraLinks.

Disabling Runtime Param Modification
------------------------------------

The ability to update params while triggering a DAG depends on the flag ``core.dag_run_conf_overrides_params``.
Setting this config to ``False`` will effectively turn your default params into constants.
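
As a rough illustration of the effect of this flag, the sketch below parses an INI fragment shaped like an Airflow configuration file and uses the flag to decide whether user-supplied values are applied. ``effective_params`` is a hypothetical helper, and the gating logic is an assumption made for illustration rather than Airflow's own code.

```python
import configparser

# An INI fragment shaped like the relevant part of an Airflow config file.
CFG_TEXT = """
[core]
dag_run_conf_overrides_params = False
"""

parser = configparser.ConfigParser()
parser.read_string(CFG_TEXT)
overrides_allowed = parser.getboolean("core", "dag_run_conf_overrides_params")


def effective_params(defaults, run_conf, overrides_allowed):
    # When overrides are disabled, user-supplied values are ignored,
    # so the defaults behave like constants.
    if not overrides_allowed:
        return dict(defaults)
    merged = dict(defaults)
    merged.update(run_conf)
    return merged


locked = effective_params({"x": 5}, {"x": 9}, overrides_allowed)
unlocked = effective_params({"x": 5}, {"x": 9}, True)
print(locked, unlocked)
```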
