trigger_rule=TriggerRule.ONE_FAILED doesn't work properly with task_groups #30333
Comments
Checked on 2.5.3; the issue still reproduces.
It looks like this happens when trying to expand on a task_group.
@pankajastro yes, this happens only with mapped task groups, and I confirm that there is a bug in this part. Unfortunately I didn't have time to work on it.
@hussein-awala @eladkal I can take it and the related issue #30334.
@dzhigimont are you still working on this issue?
Hi @eladkal, I am working on it. I had some pauses in the process.
OK, so I went on a debug adventure, and here’s my conclusion. The analysis in #32701 (comment) is spot on, but the problem is deeper. Currently, we first check whether a task has all its dependencies satisfied, and only expand it if so (airflow/airflow/models/dagrun.py, lines 853 to 869 in 9f3af9c).
Whether a trigger rule dependency is satisfied, however, is not known before expansion, since we only want to consider “relevant” upstreams. This is why #32802 doesn’t work. The only way to work around this is to “fake” a dependency passage to allow expansion to happen. After expansion, the scheduling code would re-check deps, which would work as expected since the tis now have correct map index values.
I actually don’t think #32701 completely fixes the issue. If I understand the code correctly, it will fail when we add nested task mapping (i.e. a mapped task inside a mapped task group), since that would make another dep fail and break the workaround. But I guess that’s a problem we can address when (if) we ever get there. I’ll think about whether it’s possible to enhance #32701, or how we can best build on it if that would involve too much change.
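The ordering problem described above can be illustrated with a small toy model. This is only a sketch, not Airflow's scheduler code: `UpstreamTI`, `relevant_upstreams`, `schedule`, the `fake_pass` flag, and the `-1` placeholder map index are all hypothetical names chosen for illustration, and the model assumes that an unexpanded task finds no "relevant" upstreams.

```python
from dataclasses import dataclass
from typing import Dict, List

UNEXPANDED = -1  # assumed placeholder map index for a not-yet-expanded task


@dataclass
class UpstreamTI:
    """Hypothetical stand-in for an upstream task instance."""
    map_index: int
    state: str


def relevant_upstreams(upstreams: List[UpstreamTI], map_index: int) -> List[UpstreamTI]:
    # Before expansion there is no real map index, so nothing counts as relevant.
    if map_index == UNEXPANDED:
        return []
    return [ti for ti in upstreams if ti.map_index == map_index]


def one_failed(upstreams: List[UpstreamTI], map_index: int) -> bool:
    # Toy ONE_FAILED: satisfied if any relevant upstream has failed.
    return any(ti.state == "failed" for ti in relevant_upstreams(upstreams, map_index))


def schedule(upstreams: List[UpstreamTI], length: int, fake_pass: bool) -> Dict[int, str]:
    # Step 1: deps are checked on the unexpanded placeholder first.
    if not fake_pass and not one_failed(upstreams, UNEXPANDED):
        return {UNEXPANDED: "skipped"}
    # Step 2: expand, then re-check deps with real map indexes.
    return {
        i: ("scheduled" if one_failed(upstreams, i) else "skipped")
        for i in range(length)
    }


upstreams = [UpstreamTI(0, "success"), UpstreamTI(1, "failed")]
print(schedule(upstreams, 2, fake_pass=False))  # never expands
print(schedule(upstreams, 2, fake_pass=True))   # expands, then evaluates per index
```

In this model, the watcher is skipped before it is ever expanded unless the pre-expansion check is allowed to pass, which mirrors the "fake a dependency passage" workaround in spirit only.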
On further consideration, I don’t think #32701 should be merged as-is. While marking all upstream tis as relevant fixes ONE_FAILED, it would create a bug in the reverse direction for rules such as ALL_FAILED, since in that case TriggerRuleDep would over-consider upstream tis. I think TriggerRuleDep needs a much more significant rewrite to implement a proper fix.
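To make the over-consideration concern concrete, here is a minimal sketch under stated assumptions. It is not TriggerRuleDep's implementation: `rule_satisfied`, `states_for`, and the `treat_all_as_relevant` flag are hypothetical, and the rule semantics are simplified to "any failed" and "all failed".

```python
from typing import Dict, List


def rule_satisfied(rule: str, states: List[str]) -> bool:
    # Simplified trigger-rule semantics, for illustration only.
    if rule == "one_failed":
        return any(s == "failed" for s in states)
    if rule == "all_failed":
        return bool(states) and all(s == "failed" for s in states)
    raise ValueError(f"unsupported rule: {rule}")


# Assumed upstream states inside a mapped task group, keyed by map index.
upstream_states: Dict[int, str] = {0: "failed", 1: "success"}


def states_for(map_index: int, treat_all_as_relevant: bool) -> List[str]:
    # Either consider only the upstream sharing this map index,
    # or (as a blanket workaround would) every upstream ti.
    if treat_all_as_relevant:
        return list(upstream_states.values())
    return [upstream_states[map_index]]


for rule in ("one_failed", "all_failed"):
    narrow = rule_satisfied(rule, states_for(0, treat_all_as_relevant=False))
    broad = rule_satisfied(rule, states_for(0, treat_all_as_relevant=True))
    print(rule, "relevant-only:", narrow, "all-upstreams:", broad)
```

For map index 0, whose own upstream failed, the toy ONE_FAILED gives the same answer either way, while the toy ALL_FAILED flips to unsatisfied once the unrelated successful upstream at index 1 is counted.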
Is there any kind of workaround for this issue?
The DAG in the |
Apache Airflow version
2.5.2 (checked on 2.5.3 also)
What happened
I'd like to set up the "watcher" pattern inside a task_group, but the watcher task is always marked as "skipped".
Reference to a similar issue: #30334
What you think should happen instead
No response
How to reproduce
Operating System
Arch Linux (kernel 6.2.6)
Versions of Apache Airflow Providers
apache-airflow-providers-amazon==7.3.0
apache-airflow-providers-common-sql==1.3.4
apache-airflow-providers-ftp==3.3.1
apache-airflow-providers-google==8.11.0
apache-airflow-providers-http==4.2.0
apache-airflow-providers-imap==3.1.1
apache-airflow-providers-postgres==5.4.0
apache-airflow-providers-sqlite==3.3.1
Deployment
Docker-Compose
Deployment details
No response
Anything else
the result graph screenshot:
task instance details screenshot:
Are you willing to submit PR?
Code of Conduct