Use back_populates instead of sqlalchemy.orm.backref in bidirectional relationships #39430
Only makes sense for SchedulerJob and BackfillJob instances.
"""
Why remove this? Let's move it to the task_instances_enqueued. wdyt?
This is not required anymore; you can achieve the same with explicit bidirectional relationship expressions.
IMHO, this construction looks clearer:
primaryjoin="Job.id == foreign(DagRun.creating_job_id)"
than this one:
primaryjoin=lambda: Job.id == foreign(_resolve_dagrun_model().creating_job_id)
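To illustrate the string form, here is a minimal sketch with simplified stand-ins for the Job and DagRun models (only the columns needed for the join; these are not Airflow's actual definitions). It assumes SQLAlchemy 1.4 or newer. The string is evaluated lazily against the declarative class registry, so neither class has to import the other, and foreign() marks the column that acts as the foreign key because no ForeignKey constraint is declared.

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Job(Base):
    """Simplified stand-in for the job table (assumption, not Airflow's model)."""

    __tablename__ = "job"
    id = Column(Integer, primary_key=True)

    # The string is resolved when mappers are configured, so DagRun does not
    # need to be defined or imported at this point.
    dag_runs = relationship(
        "DagRun",
        primaryjoin="Job.id == foreign(DagRun.creating_job_id)",
        back_populates="creating_job",
    )


class DagRun(Base):
    """Simplified stand-in for the dag_run table (assumption, not Airflow's model)."""

    __tablename__ = "dag_run"
    id = Column(Integer, primary_key=True)
    # No ForeignKey constraint here, which is why foreign() is needed above.
    creating_job_id = Column(Integer)

    creating_job = relationship(
        "Job",
        primaryjoin="Job.id == foreign(DagRun.creating_job_id)",
        back_populates="dag_runs",
    )


engine = create_engine("sqlite://")  # in-memory database for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    job = Job(id=2)
    run = DagRun(id=1)
    run.creating_job = job  # back_populates also appends run to job.dag_runs
    session.add(run)  # job is cascaded into the session through the relationship
    session.flush()  # the flush copies job.id into run.creating_job_id
    linked_runs = [r.id for r in job.dag_runs]
    stored_job_id = run.creating_job_id
```

Setting one side keeps the other side in sync in Python, and the flush writes the join column, without either model referencing the other class object directly.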
I'm talking about the comment
Oh... my bad. It has already been moved to L104:
https://github.com/apache/airflow/blob/9ed927a0cfdcc47e50a03bbbde50dcf991b7c15e/airflow/jobs/job.py#L104
The original comment held information about two different relationships:
"TaskInstances which have been enqueued by this Job." -> task_instances_enqueued
"Only makes sense for SchedulerJob and BackfillJob" -> dag_runs
Some proofs
DAG
from __future__ import annotations
from datetime import datetime, timezone
from airflow.decorators import task
from airflow.models.dag import DAG
START_DATE = datetime(2024, 2, 1, tzinfo=timezone.utc)
with DAG("pr_39430", schedule="@once", tags=["pr", "39430", "bi-directional"], start_date=START_DATE):

    @task
    def do_nothing(): ...

    do_nothing()
The DAG was run once by the scheduler and once manually, by triggering it from the UI. The resulting table contents:
Table job
column | row 1 | row 2 | row 3 | row 4 |
---|---|---|---|---|
id | 3 | 4 | 1 | 2 |
dag_id | pr_39430 | pr_39430 | null | null |
state | success | success | running | running |
job_type | LocalTaskJob | LocalTaskJob | TriggererJob | SchedulerJob |
start_date | 2024-05-22 11:45:23.867096 +00:00 | 2024-05-22 11:45:29.782799 +00:00 | 2024-05-22 11:44:49.709191 +00:00 | 2024-05-22 11:44:49.813874 +00:00 |
end_date | 2024-05-22 11:45:24.143576 +00:00 | 2024-05-22 11:45:30.040894 +00:00 | null | null |
latest_heartbeat | 2024-05-22 11:45:23.853896 +00:00 | 2024-05-22 11:45:29.766046 +00:00 | 2024-05-22 11:45:35.885734 +00:00 | 2024-05-22 11:45:37.722701 +00:00 |
executor_class | null | null | null | null |
hostname | cd2591016c8c | cd2591016c8c | cd2591016c8c | cd2591016c8c |
unixname | root | root | root | root |
Table task_instance
column | row 1 | row 2 |
---|---|---|
task_id | do_nothing | do_nothing |
dag_id | pr_39430 | pr_39430 |
run_id | scheduled__2024-02-01T00:00:00+00:00 | manual__2024-05-22T11:45:28.453199+00:00 |
map_index | -1 | -1 |
start_date | 2024-05-22 11:45:23.921193 +00:00 | 2024-05-22 11:45:29.821869 +00:00 |
end_date | 2024-05-22 11:45:24.085409 +00:00 | 2024-05-22 11:45:29.979149 +00:00 |
duration | 0.164216 | 0.15728 |
state | success | success |
try_number | 1 | 1 |
max_tries | 0 | 0 |
hostname | cd2591016c8c | cd2591016c8c |
unixname | root | root |
job_id | 3 | 4 |
pool | default_pool | default_pool |
pool_slots | 1 | 1 |
queue | default | default |
priority_weight | 1 | 1 |
operator | _PythonDecoratedOperator | _PythonDecoratedOperator |
custom_operator_name | @task | @task |
queued_dttm | 2024-05-22 11:45:23.394469 +00:00 | 2024-05-22 11:45:29.546020 +00:00 |
queued_by_job_id | 2 | 2 |
pid | 2375 | 2377 |
executor | null | null |
executor_config | 0x80057D942E | 0x80057D942E |
updated_at | 2024-05-22 11:45:23.930163 +00:00 | 2024-05-22 11:45:29.831787 +00:00 |
rendered_map_index | null | null |
external_executor_id | null | null |
trigger_id | null | null |
trigger_timeout | null | null |
next_method | null | null |
next_kwargs | null | null |
task_display_name | do_nothing | do_nothing |
Table dag_run
column | row 1 | row 2 |
---|---|---|
id | 1 | 2 |
dag_id | pr_39430 | pr_39430 |
queued_at | 2024-05-22 11:45:23.351323 +00:00 | 2024-05-22 11:45:28.490357 +00:00 |
execution_date | 2024-02-01 00:00:00.000000 +00:00 | 2024-05-22 11:45:28.453199 +00:00 |
start_date | 2024-05-22 11:45:23.373351 +00:00 | 2024-05-22 11:45:29.516604 +00:00 |
end_date | 2024-05-22 11:45:24.476332 +00:00 | 2024-05-22 11:45:30.609079 +00:00 |
state | success | success |
run_id | scheduled__2024-02-01T00:00:00+00:00 | manual__2024-05-22T11:45:28.453199+00:00 |
creating_job_id | 2 | null |
external_trigger | false | true |
run_type | scheduled | manual |
conf | 0x80057D942E | 0x80057D942E |
data_interval_start | 2024-02-01 00:00:00.000000 +00:00 | 2024-05-22 11:45:28.453199 +00:00 |
data_interval_end | 2024-02-01 00:00:00.000000 +00:00 | 2024-05-22 11:45:28.453199 +00:00 |
last_scheduling_decision | 2024-05-22 11:45:24.474830 +00:00 | 2024-05-22 11:45:30.606623 +00:00 |
dag_hash | cd6f55b5c736bf50d2f341663c0ad83f | cd6f55b5c736bf50d2f341663c0ad83f |
log_template_id | 2 | 2 |
updated_at | 2024-05-22 11:45:24.477225 +00:00 | 2024-05-22 11:45:30.609766 +00:00 |
clear_number | 0 | 0 |
Relationships
LocalTaskJob (id: 3) created TI (run_id: scheduled__2024-02-01T00:00:00+00:00)
LocalTaskJob (id: 4) created TI (run_id: manual__2024-05-22T11:45:28.453199+00:00)
SchedulerJob (id: 2) queued TI (run_id: scheduled__2024-02-01T00:00:00+00:00)
SchedulerJob (id: 2) queued TI (run_id: manual__2024-05-22T11:45:28.453199+00:00), even though it is a manual DAG run
DagRun (run_id: scheduled__2024-02-01T00:00:00+00:00) was created by SchedulerJob (id: 2)
DagRun (run_id: manual__2024-05-22T11:45:28.453199+00:00) was not created by any Job
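These relationships can be re-derived from the tables with plain joins. The sketch below uses the standard-library sqlite3 module and copies only the key columns (ids, run_id, job_type, job_id, queued_by_job_id, creating_job_id) from the tables above. The reduced schema is an assumption made for illustration, not Airflow's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE job (id INTEGER PRIMARY KEY, job_type TEXT);
    CREATE TABLE task_instance (run_id TEXT, job_id INTEGER, queued_by_job_id INTEGER);
    CREATE TABLE dag_run (id INTEGER PRIMARY KEY, run_id TEXT, creating_job_id INTEGER);

    INSERT INTO job VALUES
        (1, 'TriggererJob'), (2, 'SchedulerJob'),
        (3, 'LocalTaskJob'), (4, 'LocalTaskJob');
    INSERT INTO task_instance VALUES
        ('scheduled__2024-02-01T00:00:00+00:00', 3, 2),
        ('manual__2024-05-22T11:45:28.453199+00:00', 4, 2);
    INSERT INTO dag_run VALUES
        (1, 'scheduled__2024-02-01T00:00:00+00:00', 2),
        (2, 'manual__2024-05-22T11:45:28.453199+00:00', NULL);
    """
)

# Which job queued each task instance (task_instance.queued_by_job_id -> job.id).
queued_by = conn.execute(
    """
    SELECT ti.run_id, j.id, j.job_type
    FROM task_instance AS ti
    JOIN job AS j ON j.id = ti.queued_by_job_id
    ORDER BY ti.job_id
    """
).fetchall()

# Which job created each DAG run (dag_run.creating_job_id -> job.id).
# LEFT JOIN keeps the manual run, whose creating_job_id is NULL.
created_by = conn.execute(
    """
    SELECT dr.run_id, j.id, j.job_type
    FROM dag_run AS dr
    LEFT JOIN job AS j ON j.id = dr.creating_job_id
    ORDER BY dr.id
    """
).fetchall()

for row in queued_by + created_by:
    print(row)
```

Both task instances resolve to the SchedulerJob through queued_by_job_id, while only the scheduled DAG run resolves to a job through creating_job_id.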
There’s a merge conflict though
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.
backref is a legacy approach to constructing relationships; the preferred way is to use the back_populates keyword argument in relationship to build them explicitly.
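As a rough illustration of the difference, the sketch below declares the same one-to-many relationship both ways. The Parent/Child class names are hypothetical and are not taken from the Airflow code base; SQLAlchemy 1.4 or newer is assumed. With backref, the reverse attribute is injected onto the other class when mappers are configured; with back_populates, both sides are written out in their own classes.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import backref, declarative_base, relationship

Base = declarative_base()


# Legacy style: only one side is declared. The "parent" attribute on
# LegacyChild is created implicitly by the backref.
class LegacyParent(Base):
    __tablename__ = "legacy_parent"
    id = Column(Integer, primary_key=True)
    children = relationship("LegacyChild", backref=backref("parent"))


class LegacyChild(Base):
    __tablename__ = "legacy_child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("legacy_parent.id"))


# Explicit style: each class declares its own side and names the counterpart.
class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


# Both styles keep the two sides in sync in Python.
legacy_child = LegacyChild()
legacy_parent = LegacyParent(children=[legacy_child])

child = Child()
parent = Parent(children=[child])

# Only the explicit style makes the reverse attribute visible in the class body.
explicit_declares_parent = "parent" in Child.__dict__
```

The runtime behavior is the same in both cases; the explicit form mainly helps readers and static tooling, since every attribute appears in the class that owns it.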