diff --git a/docs/docker-stack/entrypoint.rst b/docs/docker-stack/entrypoint.rst
index 3f4d8e4e382b8..b403a49c9db5f 100644
--- a/docs/docker-stack/entrypoint.rst
+++ b/docs/docker-stack/entrypoint.rst
@@ -194,6 +194,72 @@ If there are any other arguments - they are simply passed to the "airflow" comma
   optional arguments:
     -h, --help  show this help message and exit
 
+Execute custom code before the Airflow entrypoint
+-------------------------------------------------
+
+If you want to run custom code before Airflow's entrypoint, you can use your own entrypoint script
+that calls Airflow's entrypoint as its last instruction, using ``exec``. Remember to start your
+script through ``dumb-init`` in the same way as Airflow's entrypoint is started; otherwise you
+might have problems with proper signal propagation (see the "Signal propagation" section
+below).
+
+
+.. code-block:: Dockerfile
+
+    FROM airflow:2.3.0.dev0
+    COPY my_entrypoint.sh /
+    ENTRYPOINT ["/usr/bin/dumb-init", "--", "/my_entrypoint.sh"]
+
+Your entrypoint might, for example, modify or add variables on the fly. The entrypoint below
+sets the maximum number of DB connection checks from the first argument passed to the image
+(a somewhat contrived example, but it shows how you could use a custom entrypoint).
+
+.. code-block:: bash
+
+    #!/bin/bash
+    export CONNECTION_CHECK_MAX_COUNT="${1}"
+    shift
+    exec /entrypoint "${@}"
+
+Make sure Airflow's entrypoint is run as ``exec /entrypoint "${@}"``, as the last command in your
+custom entrypoint. This way signals will be properly propagated and arguments will be passed
+to the entrypoint as usual (you can use ``shift`` as above if you need to pass some extra
+arguments). Note that passing secret values this way or storing secrets inside the image is a bad
+idea from a security point of view, as both the image and the parameters it is run with are
+accessible to anyone who has access to the logs of your Kubernetes or to your image registry.
+
+Also be aware that code executed before Airflow's entrypoint should not create any files or
+directories inside the image, and that not everything works the same way as it does after the
+entrypoint has run. Before Airflow's entrypoint is executed, the following are not yet in place:
+
+* umask is not yet set to allow ``group`` write access
+* the user is not yet created in ``/etc/passwd`` when an arbitrary user is used to run the image
+* the database and brokers might not be available yet
+
+Adding custom image behaviour
+-----------------------------
+
+The Airflow image executes a lot of steps in the entrypoint and sets the right environment, but
+you might want to run additional code after the entrypoint has created the user, set the umask,
+set the variables, and checked that the database is running.
+
+Rather than running the regular commands (``scheduler``, ``webserver``), you can run a *custom* script
+that you embed into the image. You can even execute the usual Airflow components
+(``scheduler``, ``webserver``) in your custom script when you finish your custom setup.
+Similarly to the custom entrypoint, the script can be added to the image by extending it.
+
+.. code-block:: Dockerfile
+
+    FROM airflow:2.3.0.dev0
+    COPY my_after_entrypoint_script.sh /
+
+
+Then, using the name of the image you built in place of the one below, run the script:
+
+.. code-block:: bash
+
+    docker run -it apache/airflow:2.3.0.dev0-python3.6 bash -c "/my_after_entrypoint_script.sh"
+
 Signal propagation
 ------------------