A Hook consists of a Hook specification and a Hook implementation.
Kedro defines Hook specifications for particular execution points where users can inject additional behaviour. Currently, the following Hook specifications are provided in kedro.framework.hooks:
after_context_created
after_catalog_created
before_pipeline_run
before_dataset_loaded
after_dataset_loaded
before_node_run
after_node_run
before_dataset_saved
after_dataset_saved
after_pipeline_run
on_node_error
on_pipeline_error
The naming convention for non-error Hooks is `<before/after>_<noun>_<past_participle>`, in which:
- `<before/after>` and `<past_participle>` refer to when the Hook is executed, e.g. `before <something> was run` or `after <something> was created`.
- `<noun>` refers to the relevant component in the Kedro execution timeline for which this Hook adds extra behaviour, e.g. `catalog`, `node` and `pipeline`.
The naming convention for error Hooks is `on_<noun>_error`, in which `<noun>` refers to the relevant component in the Kedro execution timeline that throws the error.
kedro.framework.hooks lists the full specifications for which you can inject additional behaviours by providing an implementation.
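As a rough illustration of the `on_<noun>_error` convention, the sketch below records and logs the name of a failing node from an `on_node_error` implementation. It assumes the specification passes `error` and `node` arguments and that the node object exposes a `name` attribute. `ErrorLoggingHooks` and `failed_nodes` are hypothetical names, and the `ImportError` fallback is only there so the snippet can run without Kedro installed.

```python
import logging

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # Kedro not installed: no-op stand-in so the sketch still runs
    def hook_impl(function):
        return function


class ErrorLoggingHooks:
    """Hypothetical Hook class that reports node failures."""

    def __init__(self):
        self.failed_nodes = []

    @hook_impl
    def on_node_error(self, error, node):
        # Only a subset of the specification's parameters is requested here.
        self.failed_nodes.append(node.name)
        logging.getLogger(__name__).error("Node %s failed: %s", node.name, error)
```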
This diagram illustrates the execution order of Hooks during `kedro run`:
Kedro defines a small set of CLI hooks that inject additional behaviour around execution of a Kedro CLI command:
before_command_run
after_command_run
This is what the `kedro-telemetry` plugin relies on under the hood to collect CLI usage statistics.
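A CLI Hook implementation might look roughly like the sketch below. It assumes that the `cli_hook_impl` marker is importable from `kedro.framework.cli.hooks`, that `before_command_run` receives `project_metadata` and `command_args`, and that `after_command_run` additionally receives `exit_code`; check these against the CLI Hook specifications of your Kedro version. `CommandAuditHooks` and its `seen` list are hypothetical, and the `ImportError` fallback is only there so the snippet can run without Kedro installed.

```python
import logging

try:
    from kedro.framework.cli.hooks import cli_hook_impl  # assumed import path
except ImportError:  # Kedro not installed: no-op stand-in so the sketch still runs
    def cli_hook_impl(function):
        return function

logger = logging.getLogger(__name__)


class CommandAuditHooks:
    """Hypothetical CLI Hook class that records which commands were run."""

    def __init__(self):
        self.seen = []

    @cli_hook_impl
    def before_command_run(self, project_metadata, command_args):
        self.seen.append(("before", list(command_args)))
        logger.info("About to run: kedro %s", " ".join(command_args))

    @cli_hook_impl
    def after_command_run(self, project_metadata, command_args, exit_code):
        self.seen.append(("after", list(command_args), exit_code))
        logger.info("Finished with exit code %s", exit_code)
```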
To add Hooks to your Kedro project, you must:
- Create or modify the file `src/<package_name>/hooks.py` to define a Hook implementation for the particular Hook specification that describes the point at which you want to inject additional behaviour
- Register that Hook implementation in the `src/<package_name>/settings.py` file under the `HOOKS` key
The Hook implementation should have the same name as the specification. The Hook must provide a concrete implementation with a subset of the corresponding specification's parameters (you do not need to use them all).
To declare a Hook implementation, use the `@hook_impl` decorator.
For example, the full signature of the `after_catalog_created` Hook specification is:
@hook_spec
def after_catalog_created(
    self,
    catalog: DataCatalog,
    conf_catalog: Dict[str, Any],
    conf_creds: Dict[str, Any],
    save_version: str,
    load_versions: Dict[str, str],
) -> None:
    pass
However, if you just want to use this Hook to list the contents of a data catalog after it is created, your Hook implementation can be as simple as:
# src/<package_name>/hooks.py
import logging

from kedro.framework.hooks import hook_impl
from kedro.io import DataCatalog


class DataCatalogHooks:
    @property
    def _logger(self):
        return logging.getLogger(__name__)

    @hook_impl
    def after_catalog_created(self, catalog: DataCatalog) -> None:
        self._logger.info(catalog.list())
The name of the module that contains Hook implementations is arbitrary and is not restricted to `hooks.py`.
We recommend that you group related Hook implementations under a namespace, preferably a class, within a `hooks.py` file that you create in your project.
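To make the grouping recommendation concrete, here is a minimal sketch of one class that holds two related implementations, `before_node_run` and `after_node_run`, and uses them together to time each node. It assumes both specifications pass a `node` argument with a `name` attribute. `NodeTimerHooks` and `durations` are hypothetical names, and the `ImportError` fallback is only there so the snippet can run without Kedro installed.

```python
import logging
import time

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # Kedro not installed: no-op stand-in so the sketch still runs
    def hook_impl(function):
        return function


class NodeTimerHooks:
    """Hypothetical Hook class that measures how long each node takes."""

    def __init__(self):
        self._started = {}
        self.durations = {}

    @hook_impl
    def before_node_run(self, node):
        # Remember when this node started.
        self._started[node.name] = time.perf_counter()

    @hook_impl
    def after_node_run(self, node):
        # Compute the elapsed time since the matching before_node_run call.
        elapsed = time.perf_counter() - self._started.pop(node.name)
        self.durations[node.name] = elapsed
        logging.getLogger(__name__).info("Node %s ran in %.3f s", node.name, elapsed)
```

Keeping the shared state (`_started`) on the instance is the main reason to prefer a class here: both implementations need it, and it stays private to this group of Hooks.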
Hook implementations should be registered with Kedro using the `src/<package_name>/settings.py` file under the `HOOKS` key.
You can register more than one implementation for the same specification. They will be called in LIFO (last-in, first-out) order.
The following example sets up a Hook so that the `after_catalog_created` implementation is called every time a data catalog is created.
# src/<package_name>/settings.py
from <package_name>.hooks import ProjectHooks, DataCatalogHooks
HOOKS = (ProjectHooks(), DataCatalogHooks())
Kedro also has auto-discovery enabled by default. This means that any installed plugins that declare a Hooks entry-point will be registered. To learn more about how to enable this for your custom plugin, see our plugin development guide.
Auto-discovered Hooks will run *first*, followed by the ones specified in `settings.py`.
You can auto-register a Hook (pip-installable) by creating a Kedro plugin. Kedro provides `kedro.hooks` entry points to extend this easily.
Auto-registered plugins' Hooks can be disabled via `settings.py` as follows:
# src/<package_name>/settings.py
DISABLE_HOOKS_FOR_PLUGINS = ("<plugin_name>",)
where `<plugin_name>` is the name of an installed plugin for which the auto-registered Hooks must be disabled.
Hooks follow a Last-In-First-Out (LIFO) order, which means the first registered Hook will be executed last.
Hooks are registered in the following order:
- Project Hooks in `settings.py`
  - If you have `HOOKS = (hook_a, hook_b,)`, `hook_b` will be executed before `hook_a`
- Plugin Hooks registered in `kedro.hooks`, which follow alphabetical order
In general, Hook execution order is not guaranteed and you should not rely on it. If you need to make sure a particular Hook is executed first or last, you can use the `tryfirst` or `trylast` argument for `hook_impl`.
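The ordering rules can be observed with pluggy directly. The sketch below is not Kedro code: the project name `demo`, the spec class and the three Hook classes are all hypothetical, and it only demonstrates how pluggy orders calls for plain and `tryfirst` implementations.

```python
import pluggy

# "demo" is a hypothetical project name used only for this sketch.
hook_spec = pluggy.HookspecMarker("demo")
hook_impl = pluggy.HookimplMarker("demo")

calls = []  # records the order in which implementations are invoked


class DemoSpecs:
    @hook_spec
    def before_node_run(self, node_name):
        """Called before a node runs."""


class HookA:
    @hook_impl
    def before_node_run(self, node_name):
        calls.append("hook_a")


class HookB:
    @hook_impl
    def before_node_run(self, node_name):
        calls.append("hook_b")


class EagerHook:
    @hook_impl(tryfirst=True)
    def before_node_run(self, node_name):
        calls.append("eager")


manager = pluggy.PluginManager("demo")
manager.add_hookspecs(DemoSpecs)
# EagerHook is registered first, so plain LIFO would run it last.
for plugin in (EagerHook(), HookA(), HookB()):
    manager.register(plugin)

manager.hook.before_node_run(node_name="split_data")
print(calls)  # ['eager', 'hook_b', 'hook_a']
```

`hook_b` runs before `hook_a` because it was registered later, while `tryfirst=True` moves `EagerHook` to the front even though it was registered first.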
Under the hood, we use pytest's pluggy to implement Kedro's Hook mechanism. We recommend reading their documentation to find out more about the underlying implementation.
Plugin Hooks are registered using `importlib_metadata`'s `EntryPoints` API.
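To see which plugins declare entry points in the `kedro.hooks` group in your environment, something like the following sketch can help. It uses the standard library's `importlib.metadata` rather than the `importlib_metadata` backport, and it only inspects entry points; it is not how Kedro itself performs registration. The function name `list_entry_point_names` is hypothetical.

```python
from importlib.metadata import entry_points


def list_entry_point_names(group="kedro.hooks"):
    """Return the sorted names of installed entry points in the given group."""
    discovered = entry_points()
    if hasattr(discovered, "select"):
        # Python 3.10+ exposes a selection API.
        selected = discovered.select(group=group)
    else:
        # Python 3.8 and 3.9 return a dict keyed by group name.
        selected = discovered.get(group, [])
    return sorted(entry.name for entry in selected)


# Prints an empty list if no installed package declares this group.
print(list_entry_point_names())
```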