Task group gets marked as upstream_failed when dynamically mapped with expand_kwargs even though all upstream tasks were skipped or successfully finished. #33446
Labels
affected_version:2.6 (Issues Reported for 2.6)
area:dynamic-task-mapping (AIP-42)
kind:bug (This is clearly a bug)
Comments
danielcham added the area:core, kind:bug, and needs-triage labels on Aug 16, 2023
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
RNHTTR added the area:dynamic-task-mapping label and removed the area:core and needs-triage labels on Aug 16, 2023
eladkal added the affected_version:2.6 and area:dynamic-task-mapping (AIP-42) labels and removed the area:dynamic-task-mapping label on Aug 16, 2023
ephraimbuddy added a commit to astronomer/airflow that referenced this issue on Aug 29, 2023:
When a MappedTaskGroup has upstream dependencies, the tasks in the group don't wait for the upstream tasks before they start running, which causes the tasks to fail. From my investigation, the tasks inside the MappedTaskGroup don't have upstream tasks, while the MappedTaskGroup has its upstream tasks properly set. Due to this, the task's dependencies are met even though the group has upstreams that haven't finished. The fix was to set upstreams after creating the task group with the factory. Closes: apache#33446
ephraimbuddy added a commit that referenced this issue on Aug 29, 2023:
Fix MappedTaskGroup tasks not respecting upstream dependency (same commit message as above, plus: set the relationship in __exit__). Closes: #33446
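The commit message describes the mechanism but no code is quoted here. The following is a rough, simplified sketch of the idea only, using made-up FakeTask and FakeMappedGroup classes rather than the actual Airflow internals: dependencies recorded on the group are propagated to its child tasks when the group's context manager exits.

```python
# Simplified illustration of the idea behind the fix, NOT the actual Airflow
# source: FakeTask and FakeMappedGroup are made-up stand-ins. Dependencies set
# on the group are copied onto its child tasks when the group's context
# manager exits, so a scheduler-like dependency check sees the edges on the
# tasks themselves instead of finding their dependencies already met.
from __future__ import annotations


class FakeTask:
    def __init__(self, name: str) -> None:
        self.name = name
        self.upstream: list[FakeTask] = []


class FakeMappedGroup:
    def __init__(self) -> None:
        self.upstream: list[FakeTask] = []  # edges pointing at the group itself
        self.children: list[FakeTask] = []  # tasks declared inside the group

    def add_task(self, name: str) -> FakeTask:
        task = FakeTask(name)
        self.children.append(task)
        return task

    def __enter__(self) -> FakeMappedGroup:
        return self

    def __exit__(self, *exc_info) -> bool:
        # The gist of the fix: when the group closes, wire its upstream edges
        # onto the tasks created inside it, so their dependencies are checked
        # against the real upstream tasks.
        for child in self.children:
            child.upstream.extend(self.upstream)
        return False


producer = FakeTask("producer")
group = FakeMappedGroup()
group.upstream.append(producer)  # the group depends on `producer`
with group:
    group.add_task("extract")
    group.add_task("load")
# After __exit__, the child tasks also depend on `producer`.
assert all(producer in child.upstream for child in group.children)
```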
ephraimbuddy added a commit that referenced this issue on Sep 1, 2023 (same commit message as above; cherry picked from commit fe27031).
ahidalgob pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue on May 15, 2024 (same commit message as above; cherry picked from commit fe27031382e2034b59a23db1c6b9bdbfef259137, GitOrigin-RevId: 22df7b111261c78fbeeb38191226f9694986bd05).
kosteev pushed the same commit to GoogleCloudPlatform/composer-airflow, referencing this issue, on Jul 18, 2024, Sep 20, 2024, and Nov 8, 2024 (GitOrigin-RevId: fe27031382e2034b59a23db1c6b9bdbfef259137).
Apache Airflow version
2.6.3
What happened
I am writing a DAG that transfers data from MSSQL to BigQuery. The part of the ETL process that actually fetches the data from MSSQL and moves it to BQ needs to be parallelized.
I am trying to write it as a task group where the first task moves data from MSSQL to GCS, and the second task loads the file into BQ.
For some odd reason, when I expand the task group it is automatically marked as upstream_failed at the very first moment the DAG is triggered.
I have tested this with a simple DAG (provided below) as well, and the bug was reproduced.
I found a similar issue here, but the bug seems to persist even after configuring
AIRFLOW__SCHEDULER__SCHEDULE_AFTER_TASK_EXECUTION=False
What you think should happen instead
The task group should be dynamically expanded only after all upstream tasks have finished, since expand_kwargs needs the previous task's output.
How to reproduce
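The reporter's simple DAG is not included in this copy of the issue. As a stand-in, here is a minimal sketch of the pattern described above; the DAG id, task names, schedule, and payloads are illustrative assumptions, not the original code:

```python
# Minimal illustrative sketch, not the reporter's actual DAG: a task group with
# an extract step and a load step, expanded with expand_kwargs over the output
# of an upstream task.
import pendulum
from airflow.decorators import dag, task, task_group


@dag(schedule=None, start_date=pendulum.datetime(2023, 8, 1), catchup=False)
def mapped_group_repro():
    @task
    def build_kwargs():
        # Stand-in for the task that lists what to transfer; each dict becomes
        # one expanded instance of the group below.
        return [{"table": "orders"}, {"table": "customers"}]

    @task_group
    def transfer(table: str):
        @task
        def extract(table: str) -> str:
            # Stand-in for the MSSQL -> GCS step.
            return f"gs://bucket/{table}.csv"

        @task
        def load(path: str) -> None:
            # Stand-in for the GCS -> BigQuery step.
            print(f"loading {path}")

        load(extract(table))

    # The group is expanded over the upstream task's output, so it should only
    # run after build_kwargs finishes; the reported bug is that the expanded
    # group is marked upstream_failed as soon as the DAG is triggered.
    transfer.expand_kwargs(build_kwargs())


mapped_group_repro()
```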
Operating System
macOS 13.4.1
Versions of Apache Airflow Providers
No providers needed to reproduce
Deployment
Docker-Compose
Deployment details
Docker-compose
Airflow image: apache/airflow:2.6.3-python3.9
Executor: Celery
Messaging queue: redis
Metadata DB: MySQL 5.7
Anything else
The problem occurs every time.
Here are some of the scheduler logs that may be relevant.
As can be seen from the logs, no upstream tasks are in a done state, yet the expanded task is set as upstream_failed. See the slack discussion.
Are you willing to submit PR?
Code of Conduct