Tasks occasionally receive SIGTERM at the end of a successful run on Celery Executor #17300

Closed
jhh3000 opened this issue Jul 29, 2021 · 2 comments
Labels
duplicate Issue that is duplicated kind:bug This is a clearly a bug

Comments

jhh3000 commented Jul 29, 2021

Apache Airflow version: 2.1.0 using Celery Executor

Environment: Docker

  • Cloud provider or hardware configuration: AWS ECS Fargate
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others: docker base python:3.8.10 using direct pip install of apache-airflow[sentry]==2.1.0

What happened:

At the end of task execution, Airflow occasionally sends a SIGTERM to a task even though the task has already finished and been marked as successful. It does not happen to every task, nor on every run.

The strange part is that the log says the task's state was externally set to success, but that is not the case here: task status and execution are managed by the Airflow worker and scheduler alone.

One thing I've tried as a Hail Mary (I know it shouldn't affect anything) is setting killed_task_cleanup_time = 99999999... so I'm out of ideas.

[2021-07-28 23:06:27,264] {{python.py:151}} INFO - Done. Returned value was: None
[2021-07-28 23:06:27,296] {{taskinstance.py:1184}} INFO - Marking task as SUCCESS. dag_id=sample_dag_id, task_id=sample_task_id, execution_date=20210727T230600, start_date=20210728T230621, end_date=20210728T230627
[2021-07-28 23:06:27,585] {{local_task_job.py:196}} WARNING - State of this instance has been externally set to success. Terminating instance.
[2021-07-28 23:06:27,665] {{process_utils.py:100}} INFO - Sending Signals.SIGTERM to GPID 6562
[2021-07-28 23:06:27,666] {{taskinstance.py:1264}} ERROR - Received SIGTERM. Terminating subprocesses.
[2021-07-28 23:06:28,405] {{process_utils.py:66}} INFO - Process psutil.Process(pid=6562, status='terminated', exitcode=1, started='23:06:21') (6562) terminated with exit code 1
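
Because the problem is intermittent, it can help to scan task logs for this exact pattern: a "Marking task as SUCCESS" line followed by "Received SIGTERM". The following is only a rough sketch, not part of Airflow. The function name `find_sigterm_after_success` and the regular expression are assumptions based on the log format shown above, and the sample lines are copied from that excerpt (with the SUCCESS line shortened).

```python
import re
from typing import Iterable, List, Tuple

# Assumed line layout, taken from the excerpt above:
# "[<timestamp>] {{<file>:<line>}} <LEVEL> - <message>"
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\{+(?P<src>[^}]+)\}+\s+(?P<level>[A-Z]+)\s+-\s+(?P<msg>.*)$"
)


def find_sigterm_after_success(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (success_timestamp, sigterm_timestamp) pairs.

    A pair is recorded whenever a "Received SIGTERM" message appears after
    a "Marking task as SUCCESS" message in the same stream of log lines.
    """
    hits: List[Tuple[str, str]] = []
    success_ts = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if msg.startswith("Marking task as SUCCESS"):
            success_ts = match.group("ts")
        elif msg.startswith("Received SIGTERM") and success_ts is not None:
            hits.append((success_ts, match.group("ts")))
            success_ts = None  # reset so each SUCCESS is paired at most once
    return hits


# Sample lines copied from the log excerpt in this issue.
SAMPLE = """\
[2021-07-28 23:06:27,264] {{python.py:151}} INFO - Done. Returned value was: None
[2021-07-28 23:06:27,296] {{taskinstance.py:1184}} INFO - Marking task as SUCCESS. dag_id=sample_dag_id, task_id=sample_task_id
[2021-07-28 23:06:27,585] {{local_task_job.py:196}} WARNING - State of this instance has been externally set to success. Terminating instance.
[2021-07-28 23:06:27,665] {{process_utils.py:100}} INFO - Sending Signals.SIGTERM to GPID 6562
[2021-07-28 23:06:27,666] {{taskinstance.py:1264}} ERROR - Received SIGTERM. Terminating subprocesses.
"""

if __name__ == "__main__":
    for success_ts, sigterm_ts in find_sigterm_after_success(SAMPLE.splitlines()):
        print(f"SUCCESS at {success_ts}, SIGTERM at {sigterm_ts}")
```

Running something like this over the worker's task log files would give a count of affected runs, which might help when comparing behavior before and after an upgrade.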

What you expected to happen:

These tasks should exit cleanly without being sent a SIGTERM.

How to reproduce it:

This bug is hard to reproduce since it randomly affects a running task at the end of its execution. I have been unable to find a reliable way of reproducing it, but I'll keep trying.

@jhh3000 jhh3000 added the kind:bug This is a clearly a bug label Jul 29, 2021

boring-cyborg bot commented Jul 29, 2021

Thanks for opening your first issue here! Be sure to follow the issue template!

@jhh3000 jhh3000 changed the title Tasks occasionally receive SIGTERM at the end of a successful run Tasks occasionally receive SIGTERM at the end of a successful run on Celery Executor Jul 29, 2021
jedcunningham (Member) commented

Thanks @jhh3000. This was already reported in #16227 and fixed in #16289, which will be in the next release (2.1.3).

This is a workaround until it is available: #16227 (comment)

@jedcunningham jedcunningham added the duplicate Issue that is duplicated label Jul 29, 2021