Task stuck in "scheduled" when running in backfill job #23145
Comments
Thanks for opening your first issue here! Be sure to follow the issue template!
I tried the workaround in #13542 (comment) and it got the stuck tasks running again.
Is there any update regarding this issue? Apache Airflow 2.3.4 / Kubernetes Executor
I think there were quite a few changes to backfill, so you might try 2.4.0rc1 @wookiist and see if it solves the problem. It might not, but it is worth trying.
Is there any update regarding this issue? Apache Airflow 2.4.0 / Kubernetes Executor
There will be no further update on this issue unless the author retests on the latest version of Airflow, confirms whether it was solved, and provides more evidence from that version. This issue is presumed closed, and your problem might or might not be the same even if it looks similar. If you want help with your problem, open a new issue with all the details you can provide: a reproducible path, circumstances, logs, screenshots, and any other evidence you can find, ideally from the latest released version (2.5.0 as of now). Any fix will be implemented in 2.5.*, so you will have to upgrade anyway; better to do it sooner and report against 2.5.*. You may even find that the issue is already solved there, which will save time for both you and anyone who looks at the issue. No action can be taken on a comment stating "I have the same issue" without any evidence, and it is really difficult to assess whether the issue is the same. The only way to get help is to report a new issue with the details. Feel free to refer to this issue by number as a likely similar one; that might help establish whether they are related.
Apache Airflow version
2.2.4
What happened
We are running Airflow 2.2.4 with KubernetesExecutor. I created a DAG that runs the airflow backfill command with SubprocessHook. When I started to backfill a few days' DAG runs, the backfill got stuck: some DAG runs had tasks that stayed in the "scheduled" state and never started running.
We are using the default pool, and the pool was totally free when the tasks got stuck.
I could find some logs saying:
TaskInstance: <TaskInstance: test_dag_2.task_1 backfill__2022-03-29T00:00:00+00:00 [queued]> found in queued state but was not launched, rescheduling
and nothing else in the log.
What you think should happen instead
The tasks stuck in "scheduled" should start running when there is a free slot in the pool.
How to reproduce
Airflow 2.2.4 with Python 3.8.13, KubernetesExecutor running in AWS EKS.
One backfill command example is:
airflow dags backfill test_dag_2 -s 2022-03-01 -e 2022-03-10 --rerun-failed-tasks
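For readers trying to reproduce this, here is a minimal sketch of how a command like the one above could be launched through SubprocessHook, as the report describes. It is not the reporter's actual code: the helper names `build_backfill_command` and `run_backfill` are hypothetical, and the sketch assumes `airflow.hooks.subprocess.SubprocessHook` is importable in the environment where the task runs.

```python
import shlex
from typing import List


def build_backfill_command(dag_id: str, start: str, end: str) -> List[str]:
    """Build the argv for the backfill command quoted in this report (hypothetical helper)."""
    return [
        "airflow", "dags", "backfill", dag_id,
        "-s", start,
        "-e", end,
        "--rerun-failed-tasks",
    ]


def run_backfill(dag_id: str, start: str, end: str) -> int:
    """Run the backfill via SubprocessHook and return its exit code (hypothetical helper)."""
    # Imported lazily so the command builder above works without Airflow installed.
    from airflow.hooks.subprocess import SubprocessHook

    result = SubprocessHook().run_command(
        command=build_backfill_command(dag_id, start, end)
    )
    return result.exit_code


# Build and print the command only; nothing is executed here.
command = build_backfill_command("test_dag_2", "2022-03-01", "2022-03-10")
print(shlex.join(command))
```

In a DAG, `run_backfill` could be used as the callable of a Python task; a non-zero exit code would then indicate that the backfill process itself failed rather than hung.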
The test_dag_2 dag is like:
Operating System
Debian GNU/Linux
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==3.0.0
apache-airflow-providers-celery==2.1.0
apache-airflow-providers-cncf-kubernetes==3.0.2
apache-airflow-providers-docker==2.4.1
apache-airflow-providers-elasticsearch==2.2.0
apache-airflow-providers-ftp==2.0.1
apache-airflow-providers-google==6.4.0
apache-airflow-providers-grpc==2.0.1
apache-airflow-providers-hashicorp==2.1.1
apache-airflow-providers-http==2.0.3
apache-airflow-providers-imap==2.2.0
apache-airflow-providers-microsoft-azure==3.6.0
apache-airflow-providers-microsoft-mssql==2.1.0
apache-airflow-providers-odbc==2.0.1
apache-airflow-providers-postgres==3.0.0
apache-airflow-providers-redis==2.0.1
apache-airflow-providers-sendgrid==2.0.1
apache-airflow-providers-sftp==2.4.1
apache-airflow-providers-slack==4.2.0
apache-airflow-providers-snowflake==2.5.0
apache-airflow-providers-sqlite==2.1.0
apache-airflow-providers-ssh==2.4.0
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct