Improve documentation on Params (#20567)
I think that this doc could be improved by adding examples of how to reference the params in your DAG. (Also, the current example code causes this: #20559.) While trying to find the right place to work in a few reference examples, I ended up rewriting quite a lot of it. Let me know if you think that this is an improvement. I haven't yet figured out how to build this and view it locally, and I'd want to do that as a sanity check before merging it, but I figured I'd get feedback on what I've written before I do that.
1 parent 8b2299b, commit 064efbe
Showing 1 changed file with 119 additions and 27 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,50 +15,142 @@ | |
specific language governing permissions and limitations | ||
under the License. | ||
.. _concepts:params: | ||
|
||
Params | ||
====== | ||
|
||
Params are Airflow's concept of providing runtime configuration to tasks when a DAG gets triggered manually. | ||
Params are configured while defining the DAG & tasks, that can be altered while doing a manual trigger. The | ||
ability to update params while triggering a DAG depends on the flag ``core.dag_run_conf_overrides_params``, | ||
so if that flag is ``False``, params would behave like constants. | ||
Params are how Airflow provides runtime configuration to tasks. | ||
When you trigger a DAG manually, you can modify its Params before the dagrun starts. | ||
If the user-supplied values don't pass validation, Airflow shows a warning instead of creating the dagrun. | ||
(For scheduled runs, the default values are used.) | ||
|
||
Adding Params to a DAG | ||
---------------------- | ||
|
||
To use them, one can use the ``Param`` class for complex trigger-time validations or simply use primitive types, | ||
which won't be doing any such validations. | ||
To add Params to a :class:`~airflow.models.dag.DAG`, initialize it with the ``params`` kwarg. | ||
Use a dictionary that maps Param names to either a :class:`~airflow.models.param.Param` or an object indicating the parameter's default value.
|
||
.. code-block:: | ||
from airflow import DAG | ||
from airflow.models.param import Param | ||
with DAG( | ||
'my_dag', | ||
"the_dag", | ||
params={ | ||
'int_param': Param(10, type='integer', minimum=0, maximum=20), # a int param with default value | ||
'str_param': Param(type='string', minLength=2, maxLength=4), # a mandatory str param | ||
'dummy_param': Param(type=['null', 'number', 'string']) # a param which can be None as well | ||
'old_param': 'old_way_of_passing', # i.e. no data or type validations | ||
'simple_param': Param('im_just_like_old_param'), # i.e. no data or type validations | ||
'email_param': Param( | ||
default='[email protected]', | ||
type='string', | ||
format='idn-email', | ||
minLength=5, | ||
maxLength=255, | ||
), | ||
"x": Param(5, type="integer", minimum=3), | ||
"y": 6 | ||
}, | ||
) as the_dag: | ||
Referencing Params in a Task | ||
---------------------------- | ||
|
||
Params are stored as ``params`` in the :ref:`template context <templates-ref>`. | ||
So you can reference them in a template. | ||
|
||
.. code-block::

    # "x" is the Param defined on the DAG above
    PythonOperator(
        task_id="from_template",
        op_args=[
            "{{ params.x + 10 }}",
        ],
        python_callable=(
            lambda x: print(x)
        ),
    )

Even though Params can use a variety of types, the default behavior of templates is to provide your task with a string. | ||
You can change this by setting ``render_template_as_native_obj=True`` while initializing the :class:`~airflow.models.dag.DAG`. | ||
|
||
.. code-block:: | ||
with DAG( | ||
"the_dag", | ||
params={"x": Param(5, type="integer", minimum=3)}, | ||
render_template_as_native_obj=True | ||
) as the_dag: | ||
This way, the Param's type is respected when it's provided to your task.
|
||
.. code-block::

    # prints <class 'str'> by default
    # prints <class 'int'> if render_template_as_native_obj=True
    PythonOperator(
        task_id="template_type",
        op_args=[
            "{{ params.x }}",
        ],
        python_callable=(
            lambda x: print(type(x))
        ),
    )

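The difference between the two rendering modes can be illustrated without Airflow. A template always renders to text first; native rendering then tries to read that text back as a Python literal. The sketch below assumes that behaviour and uses only the standard library. ``render_as_string`` and ``render_as_native`` are hypothetical helpers for illustration, not Airflow or Jinja APIs.

```python
import ast


def render_as_string(value):
    """Mimic default rendering: the task always receives text."""
    return str(value)


def render_as_native(value):
    """Mimic native rendering: parse the rendered text as a Python literal."""
    text = str(value)
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Text that is not a valid literal stays a plain string.
        return text


default_result = render_as_string(5)
native_result = render_as_native(5)
print(type(default_result), type(native_result))
```

Text that does not parse as a literal, such as a plain word, is left as a string in this sketch.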
``Param`` makes use of `json-schema <https://json-schema.org/>`__ to define properties and perform
validation, so one can use the full json-schema specification at
https://json-schema.org/draft/2020-12/json-schema-validation.html to define ``Param``
objects.
Another way to access your param is via a task's ``context`` kwarg. | ||
|
||
Also, it is worthwhile to note that if you have any DAG which uses a mandatory param value, i.e. a ``Param``
object with no default value or ``null`` as an allowed type, that DAG's schedule has to be ``None``. However,
if such a ``Param`` has been defined at the task level, Airflow has no way to enforce that, and the task will
fail at execution time.
.. code-block::

    def print_x(**context):
        print(context["params"]["x"])

    PythonOperator(
        task_id="print_x",
        # pass the function defined above
        python_callable=print_x,
    )

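To try a callable like this outside of Airflow, it can be called with a hand-built dictionary standing in for the context. This is only a sketch for local experimentation: ``fake_context`` is a hypothetical stand-in, and a real context contains many more keys.

```python
def print_x(**context):
    # Params live under the "params" key of the context.
    print(context["params"]["x"])


# Hypothetical stand-in for the context Airflow would pass to the callable.
fake_context = {"params": {"x": 5, "y": 6}}

print_x(**fake_context)
```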
Task-level Params | ||
----------------- | ||
|
||
You can also add Params to individual tasks. | ||
|
||
.. code-block::

    PythonOperator(
        task_id="print_x",
        params={"x": 10},
        # print_x is the callable that reads context["params"]["x"]
        python_callable=print_x,
    )

If there's already a DAG-level Param with that name, the task-level default takes precedence over the DAG-level default.
If a user supplies their own value when triggering the DAG, Airflow ignores all defaults and uses the user's value.
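The precedence order can be sketched in plain Python. This only illustrates the ordering described here and is not Airflow's actual implementation; ``resolve_params`` and its arguments are hypothetical names.

```python
def resolve_params(dag_defaults, task_defaults, user_conf=None):
    """Merge Param values from lowest to highest precedence (illustrative only)."""
    resolved = dict(dag_defaults)     # DAG-level defaults apply first
    resolved.update(task_defaults)    # task-level defaults override DAG-level ones
    resolved.update(user_conf or {})  # user-supplied values override every default
    return resolved


# No user value: the task-level default for "x" wins over the DAG-level one.
scheduled = resolve_params({"x": 5, "y": 6}, {"x": 10})

# A value supplied at trigger time replaces both defaults for "x".
manual = resolve_params({"x": 5, "y": 6}, {"x": 10}, {"x": 7})

print(scheduled, manual)
```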
|
||
JSON Schema Validation | ||
---------------------- | ||
|
||
:class:`~airflow.models.param.Param` makes use of `json-schema <https://json-schema.org/>`__, so you can use the full json-schema specification at https://json-schema.org/draft/2020-12/json-schema-validation.html to define ``Param`` objects.
|
||
.. code-block:: | ||
with DAG( | ||
"my_dag", | ||
params={ | ||
# a int with a default value | ||
"int_param": Param(10, type="integer", minimum=0, maximum=20), | ||
# a required param which can be of multiple types | ||
"dummy": Param(type=["null", "number", "string"]), | ||
# a param which uses json-schema formatting | ||
"email": Param( | ||
default="[email protected]", | ||
type="string", | ||
format="idn-email", | ||
minLength=5, | ||
maxLength=255, | ||
), | ||
}, | ||
) as my_dag: | ||
.. note::
    As of now, for security reasons, one cannot use Param objects derived from custom classes. We are
    planning to have a registration system for custom Param classes, just like the one we have for Operator ExtraLinks.

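To give a feel for what the schema keywords in this section mean, here is a hand-rolled check covering only ``type``, ``minimum``, ``maximum``, ``minLength`` and ``maxLength``. It is a simplified sketch, not the json-schema library that Airflow relies on, and ``validate`` is a hypothetical helper.

```python
JSON_TYPES = {
    "integer": int,
    "number": (int, float),
    "string": str,
    "null": type(None),
}


def validate(value, schema):
    """Return a list of error messages; an empty list means the value passes."""
    errors = []
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)

    allowed = schema.get("type")
    if allowed is not None:
        names = [allowed] if isinstance(allowed, str) else list(allowed)
        # bool is a subclass of int in Python, but not a JSON integer or number.
        matches = any(
            isinstance(value, JSON_TYPES[name])
            and not (isinstance(value, bool) and name in ("integer", "number"))
            for name in names
        )
        if not matches:
            errors.append(f"{value!r} is not of type {names}")

    if is_number:
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{value!r} is less than the minimum")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{value!r} is greater than the maximum")

    if isinstance(value, str):
        if "minLength" in schema and len(value) < schema["minLength"]:
            errors.append(f"{value!r} is too short")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append(f"{value!r} is too long")

    return errors


int_schema = {"type": "integer", "minimum": 0, "maximum": 20}
print(validate(10, int_schema))
print(validate(25, int_schema))
```

A value that produces a non-empty error list here corresponds to the case where Airflow warns the user instead of creating the dagrun.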
|
||
Disabling Runtime Param Modification | ||
------------------------------------ | ||
|
||
The ability to update params while triggering a DAG depends on the flag ``core.dag_run_conf_overrides_params``. | ||
Setting this config to ``False`` will effectively turn your default params into constants. |
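As a rough illustration, the flag lives in the ``core`` section of an INI-style configuration. The snippet below parses such a fragment with the standard library to show how the value reads as a boolean. The ``AIRFLOW_CFG`` string is a made-up fragment for this example, and the parsing is only a sketch, not how Airflow itself loads its configuration.

```python
import configparser

# Hypothetical fragment of a configuration file, for illustration only.
AIRFLOW_CFG = """
[core]
dag_run_conf_overrides_params = False
"""

parser = configparser.ConfigParser()
parser.read_string(AIRFLOW_CFG)

# With the flag off, values supplied at trigger time would not replace the defaults.
overrides_allowed = parser.getboolean("core", "dag_run_conf_overrides_params")
print(overrides_allowed)
```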