Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup Docker operator logging #33914

Merged

Conversation

wolfdn
Copy link
Contributor

@wolfdn wolfdn commented Aug 30, 2023

We are using the DockerOperator and noticed that the logs are a bit confusing when there is an error in a Docker container (return code != 0) because Airflow shows the log output of the Docker container 3 times.

The logs of a failed run with the DockerOperator currently look like this:

[2023-08-29T13:12:32.261+0000] {docker.py:160} INFO - << log line from Docker container >>
[2023-08-29T13:12:34.308+0000] {docker.py:474} INFO - << log line from Docker container >>
[2023-08-29T13:12:34.308+0000] {docker.py:474} INFO - << log line from Docker container >>
...
...
...
[2023-08-29T13:39:15.959+0000] {taskinstance.py:1943} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 476, in execute
    return self._run_image()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 338, in _run_image
    return self._run_image_with_mounts(self.mounts + [tmp_mount], add_tmp_variable=True)
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/docker/operators/docker.py", line 411, in _run_image_with_mounts
    raise AirflowException(f"Docker container failed: {repr(result)} lines {joined_log_lines}")
airflow.exceptions.AirflowException: Docker container failed: {'StatusCode': 1} lines << log line from Docker container >>
<< log line from Docker container >>
<< log line from Docker container >>
...
...
...
[2023-08-29T13:39:15.991+0000] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=<< dag id >>, task_id=<< task id >>, execution_date=20230829T131212, start_date=20230829T131229, end_date=20230829T133915
[2023-08-29T13:39:16.226+0000] {standard_task_runner.py:104} ERROR - Failed to execute job 1948 for task << task id >> (Docker container failed: {'StatusCode': 1} lines << log line from Docker container >>
<< log line from Docker container >>
<< log line from Docker container >>
...
...
...
[2023-08-29T13:39:16.248+0000] {local_task_job_runner.py:228} INFO - Task exited with return code 1
[2023-08-29T13:39:16.463+0000] {taskinstance.py:2784} INFO - 0 downstream tasks scheduled from follow-on schedule check

This PR removes the output of the Docker container from the Airflow exception that is thrown. With this change the log output of the Docker container is only shown one single time in Airflow (regardless of the return code).


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg
Copy link

boring-cyborg bot commented Aug 30, 2023

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

@eladkal
Copy link
Contributor

eladkal commented Aug 30, 2023

Does this PR aim to solve #33692 ?

@wolfdn
Copy link
Contributor Author

wolfdn commented Aug 30, 2023

Does this PR aim to solve #33692 ?

I think #33692 is unrelated to the issue that I'm trying to solve here.

This PR is about the logging output from Docker containers which is printed multiple times to the Airflow logs, whereas #33692 is about an issue that Python 3.11 prints exceptions differently, which messes up the logs in Airflow.
At least that's how I understood the description of that bug ticket - however I would assume that the issue in #33692 is not specific to the Docker operator.

@wolfdn
Copy link
Contributor Author

wolfdn commented Aug 31, 2023

I just noticed that there are some unit tests that check if the raised Exception of the Docker operator contains the container logs.

Before I'm adapting / removing those checks: Is there actually a reason why we want to have the container logs attached to the raised exception? I mean the logs of the Docker container are displayed anyways in Airflow (even before the exception is raised).
I currently only see the downside that the logs are printed 3 times in Airflow if we also attach them to the exception.
But maybe there is a good reason why we want/need this?

@uranusjr
Copy link
Member

Including logs in the exception message was added in #14761 (confusingly in a pull request that otherwise has nothing to do with the Docker operator) here: 5661273#diff-5ec82e1fcaaa38e04471f56b64ec319f67b146d3a18d8bdb69c2f973ea3dd585

But the behaviour was not explicitly tested until much later, in another seemingly unrelated pull request #21175.

So I think there’s really no particular reason to maintain the behaviour if it is problematic. It is probably still a good idea to keep the log somewhere (for example, raise a custom exception that has a logs attribute that holds the log message).

airflow/exceptions.py Outdated Show resolved Hide resolved
Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @eladkal on moving the exceptions to a separate module. Otherwise, LGTM

@potiuk potiuk merged commit 3d27504 into apache:main Sep 4, 2023
42 checks passed
@boring-cyborg
Copy link

boring-cyborg bot commented Sep 4, 2023

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants