Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This fixes an issue were a database session got stuck #31128

Merged
merged 4 commits into from
Jun 14, 2023

Conversation

stijndehaes
Copy link
Contributor

The xcom_pull functionality got changed to be loaded lazy. This introduced a bug where a session is not properly closed. This only happens when inlets are defined on a task. It pops up on the airflow workers launched by the KubernetesExecutor. There is a database session stuck in idle in transaction for as long as the job is running. This uses precious resources.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@potiuk
Copy link
Member

potiuk commented May 8, 2023

Which version of the airflow did you observe it in (you mentioned a "change" -> in which version of airflow it started to show that behaviour for you?

@potiuk potiuk added this to the Airflow 2.6.1 milestone May 8, 2023
@stijndehaes
Copy link
Contributor Author

@potiuk thank you for responding so quickly.
So I initially observed this in Airflow 2.5.3, but reverted to 2.5.2 and 2.5.1 and still observed the issue. I didn't go back to 2.4 though, but we only recently noticed the issue since multiple long running jobs now use inlets, and before they did not.

My guess it that the following PR created the issue: #27699
More specifically the usage of LazyXComAccess

@stijndehaes stijndehaes force-pushed the bugfix/session-not-properly-closed branch from c4e5d82 to bb7891d Compare May 8, 2023 09:12
@stijndehaes stijndehaes force-pushed the bugfix/session-not-properly-closed branch from bb7891d to 5ccc4bc Compare May 9, 2023 08:17
@stijndehaes
Copy link
Contributor Author

@uranusjr I see this is approved but not merged, is there anything else I need to do? :)

@uranusjr
Copy link
Member

This change is close to core Airflow enough I think it needs two maintainer reviews.

Comment on lines +140 to +144
with create_session() as session:
_inlets = self.xcom_pull(
context, task_ids=task_ids, dag_id=self.dag_id, key=PIPELINE_OUTLETS, session=session
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with create_session() as session:
_inlets = self.xcom_pull(
context, task_ids=task_ids, dag_id=self.dag_id, key=PIPELINE_OUTLETS, session=session
)
_inlets = self.xcom_pull(
context, task_ids=task_ids, dag_id=self.dag_id, key=PIPELINE_OUTLETS
)

The call to create_session is no longer needed since you decorated xcom_pull with @provide_session. The session will be automatically provided

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am i missing something @uranusjr ?

Copy link
Member

@uranusjr uranusjr Jun 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both work (and using the default session from @provide_session is cleaner so you’re correct)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the create_session will actually remove the bug fix. As explained in the comment above the create session: the issue is that xcom_pull returns a LazyXComAccess iterator. Since we have returned from the xcom_pull function, the session that would have been automatically created there will be closed and thus a new one that leaks will be started when we iterate. By wrapping this in a manually created session the leak is fixed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see, the point is to make the session live slightly longer so line 147 below can still use it.

Copy link
Member

@uranusjr uranusjr Jun 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a comment above this block would help future readers understand the context better.

(Also I think the blank line + comments in lines 144-146 might actually hurt readability a bit since they make the two lines feel less associated. Maybe moving them above the entire block to make the create_session, xcom_pull, andinlets.extend lines stick close together would help as well.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remove the empty lines and 2 comments. I don't think the added much here.

There already is a comment above the create_session block to explain things, so I am not sure I correctly understand what you mean with Maybe a comment above this block would help future readers understand the context better.. Can you clarify it for me?

@eladkal eladkal modified the milestones: Airflow 2.6.2, Airlfow 2.6.3 Jun 8, 2023
stijndehaes and others added 3 commits June 8, 2023 15:40
The xcom_pull functionality got changed to be loaded lazy.
This introduced a bug where a session is not properly closed.
This only happens when inlets are defined on a task.
It pops up on the airflow workers launched by the KubernetesExecutor.
There is a database session stuck in idle in transaction for as long
as the job is running. This uses precious resources.
Co-authored-by: Tzu-ping Chung <[email protected]>
Co-authored-by: Ephraim Anierobi <[email protected]>
Remove the newlines and small comments. Should make it
clear that the comment above the block is really important
@stijndehaes stijndehaes force-pushed the bugfix/session-not-properly-closed branch from eb67cab to 9e2701c Compare June 8, 2023 13:41
@uranusjr uranusjr merged commit a6f859e into apache:main Jun 14, 2023
@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Jul 6, 2023
ephraimbuddy pushed a commit that referenced this pull request Jul 6, 2023
Co-authored-by: Tzu-ping Chung <[email protected]>
Co-authored-by: Ephraim Anierobi <[email protected]>
Co-authored-by: eladkal <[email protected]>
(cherry picked from commit a6f859e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:lineage type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants