Add description on how you can customize image entrypoint

apache · Oct 14, 2021 · bb4fd13 · bb4fd13
1 parent 52cc84c
commit bb4fd13
Showing 1 changed file with 66 additions and 0 deletions.
diff --git a/docs/docker-stack/entrypoint.rst b/docs/docker-stack/entrypoint.rst
@@ -194,6 +194,72 @@ If there are any other arguments - they are simply passed to the "airflow" comma
     optional arguments:
       -h, --help         show this help message and exit
 
+Execute custom code before the Airflow entrypoint
+-------------------------------------------------
+
+If you want to execute custom code before Airflow's entrypoint, you can use your own entrypoint
+script and call Airflow's entrypoint as the last ``exec`` instruction in it. Remember to use
+``dumb-init`` in the same way as it is used with Airflow's entrypoint, otherwise you might have
+problems with proper signal propagation (see the next section, "Signal propagation").
+
+
+.. code-block:: Dockerfile
+
+    FROM airflow:2.3.0.dev0
+    COPY my_entrypoint.sh /
+    ENTRYPOINT ["/usr/bin/dumb-init", "--", "/my_entrypoint.sh"]
+
+Your entrypoint might, for example, modify or add variables on the fly. The entrypoint below
+sets the maximum count of DB checks from the first argument passed to the image (not a very
+useful example, but it shows how the mechanism works).
+
+.. code-block:: bash
+
+    #!/bin/bash
+    export CONNECTION_CHECK_MAX_COUNT=${1}
+    shift
+    exec /entrypoint "${@}"
+
+Make sure Airflow's entrypoint is run with ``exec /entrypoint "${@}"`` as the last command in your
+custom entrypoint. This way signals are properly propagated and arguments are passed
+to the entrypoint as usual (you can use ``shift`` as above if your script consumes some extra
+arguments). Note that passing secret values this way or storing secrets inside the image is a bad
+idea from a security point of view, as both the image and the parameters it is run with are accessible
+to anyone who has access to the logs of your Kubernetes cluster or to your image registry.
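+
+As a sketch of a slightly more defensive variant of the entrypoint above, the first argument can be
+validated before it is exported. The helper names ``is_non_negative_integer`` and
+``set_max_check_count`` are hypothetical and are not part of the Airflow image.
+
+.. code-block:: bash
+
+    #!/bin/bash
+
+    # Hypothetical helper: succeeds only for a non-empty string of digits.
+    is_non_negative_integer() {
+        [[ "${1:-}" =~ ^[0-9]+$ ]]
+    }
+
+    # Hypothetical helper: exports CONNECTION_CHECK_MAX_COUNT from the first
+    # argument, or prints an error and returns non-zero if it is not a number.
+    set_max_check_count() {
+        if ! is_non_negative_integer "${1:-}"; then
+            echo "Expected a non-negative integer, got: '${1:-}'" >&2
+            return 1
+        fi
+        export CONNECTION_CHECK_MAX_COUNT="${1}"
+    }
+
+In a custom entrypoint you would call ``set_max_check_count "${1:-}" || exit 1``, then ``shift``,
+and finish with ``exec /entrypoint "${@}"`` as the last command, exactly as in the example above.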
+
+Also be aware that code executed before Airflow's entrypoint should not create any files or
+directories inside the image, and that not everything might work the same way as it does after
+the entrypoint has run. Before Airflow's entrypoint is executed, the following limitations apply:
+
+* umask is not set properly to allow ``group`` write access
+* user is not yet created in /etc/passwd in case arbitrary user is used to run the image
+* the database and brokers might not be available yet
+
+Adding custom image behaviour
+-----------------------------
+
+The Airflow image executes a lot of steps in the entrypoint and sets up the right environment, but
+you might want to run additional code after the entrypoint has created the user, set the umask, set the
+variables and checked that the database is running.
+
+Rather than running the regular commands (``scheduler``, ``webserver``), you can run a *custom* script that
+you embed into the image. You can even execute the usual components of Airflow
+(``scheduler``, ``webserver``) in your custom script when you finish your custom setup.
+Similarly to a custom entrypoint, the script can be added to the image by extending it.
+
+.. code-block:: Dockerfile
+
+    FROM airflow:2.3.0.dev0
+    COPY my_after_entrypoint_script.sh /
+
+
+After building the image, you can run the script with a command like the one below (use the tag
+of the image you built in place of the image name shown):
+
+.. code-block:: bash
+
+  docker run -it apache/airflow:2.3.0.dev0-python3.6 bash -c "/my_after_entrypoint_script.sh"
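+
+The contents of ``my_after_entrypoint_script.sh`` are up to you. A minimal sketch, assuming you
+want to do some setup and then start a regular Airflow component, could look like the following.
+The function names ``custom_setup`` and ``run_after_entrypoint`` are hypothetical, and the setup
+step only prints a message as a placeholder for your own logic.
+
+.. code-block:: bash
+
+    #!/bin/bash
+    set -euo pipefail
+
+    # Hypothetical setup step: replace the body with your own logic. At this
+    # point Airflow's entrypoint has already set the umask and created the user.
+    custom_setup() {
+        echo "Custom setup running as UID $(id -u) with umask $(umask)"
+    }
+
+    # Hypothetical helper: run the setup, then replace the shell with an Airflow
+    # component (the scheduler unless another one is given as the first argument).
+    run_after_entrypoint() {
+        custom_setup
+        exec airflow "${1:-scheduler}"
+    }
+
+The script would end with a call such as ``run_after_entrypoint "${@}"``. Using ``exec`` for the
+final command means the Airflow component replaces the shell process rather than running as its child.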
+
 
 Signal propagation
 ------------------