AIP-44 make database isolation mode work in Breeze #40894
Merged
potiuk requested review from ashb, ryanahamilton, bbovenzi, pierrejeambrun, kaxil, bolkedebruin and XD-DENG as code owners on July 19, 2024 17:07
boring-cyborg (bot) added the area:CLI, area:dev-tools, area:providers, area:serialization, area:webserver (Webserver related Issues) and provider:fab labels on Jul 19, 2024
I ran quite a few DAGs in "DB isolation mode", and with these changes the Celery worker runs them nicely (and it does not even look very slow). For now the "mini scheduler" is disabled. It could likely be brought back if we decide to serialize the DAG (currently we don't).
vincbeck approved these changes on Jul 19, 2024
jscheffl approved these changes on Jul 19, 2024
Cool!
dstandish reviewed on Jul 19, 2024
potiuk force-pushed the breeze-working-in-isolation-mode branch from 017d434 to 2810bc0 on July 20, 2024 07:18
With this PR, it is possible to get a working "DB isolation" solution with the Celery executor. From my tests, it runs about as fast as the executor without DB isolation.

Things changed here:

* Remove the "schedule_downstream_tasks" endpoint. It is currently not possible to get it working, because the DAG object is removed during serialization and this method relies on it to calculate which tasks to schedule.
* When we force DB access in DB isolation mode, we print a log message saying that we are switching to using the DB for the appropriate components. We also make sure to remove the DB configuration in case it is set (this allows running tests in the Breeze environment with more certainty).
* The detection of whether to force direct DB access is made in _main. This way, regular commands run in Breeze (migrate, user, etc.) can use the DB while initializing the environment, and actions can be logged to the DB or via RPC calls.
* Improved diagnostics.

Co-authored-by: Vincent <[email protected]>
potiuk force-pushed the breeze-working-in-isolation-mode branch from 2810bc0 to 07e5b10 on July 20, 2024 17:18
ephraimbuddy added the changelog:skip (Changes that should be skipped from the changelog (CI, tests, etc..)) label on Jul 22, 2024
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request on Jul 26, 2024
With this PR, it is possible to get a working "DB isolation" solution with the Celery executor. From my tests, it runs about as fast as the executor without DB isolation.
Things changed here:
* Remove the "schedule_downstream_tasks" endpoint. It is currently not possible to get it working, because the DAG object is removed during serialization and this method relies on it to calculate which tasks to schedule.
* When we force DB access in DB isolation mode, we print a log message saying that we are switching to using the DB for the appropriate components. We also make sure to remove the DB configuration in case it is set (this allows running tests in the Breeze environment with more certainty).
* The detection of whether to force direct DB access is made in _main. This way, regular commands run in Breeze (migrate, user, etc.) can use the DB while initializing the environment, and actions can be logged to the DB or via RPC calls.
* Improved diagnostics.
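The "force direct DB access" decision described above can be pictured with a minimal sketch. This is not the code from this PR: the function name `resolve_db_access`, the `DIRECT_DB_COMMANDS` set, and the choice of environment keys are assumptions made for illustration, and which components lose their DB configuration is one reading of the description. The idea is that a single entry point decides, per command, whether the process may talk to the DB directly or must go through RPC, logs the switch, and drops any stray DB configuration otherwise.

```python
import logging

log = logging.getLogger("db_isolation_sketch")

# All names below are illustrative assumptions, not taken from the PR.
ISOLATION_FLAG = "AIRFLOW__CORE__DATABASE_ACCESS_ISOLATION"
DB_CONN_KEY = "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN"
# Hypothetical set of commands that are allowed to use the DB directly.
DIRECT_DB_COMMANDS = frozenset({"db", "users", "scheduler", "webserver"})


def resolve_db_access(command: str, env: dict[str, str]) -> str:
    """Return "direct" or "rpc" for a command; may modify env in place."""
    isolated = env.get(ISOLATION_FLAG, "false").lower() == "true"
    if not isolated:
        # Isolation mode is off: everything uses the DB as usual.
        return "direct"
    if command in DIRECT_DB_COMMANDS:
        # Isolation mode is on, but this component needs the DB.
        log.info("DB isolation mode: forcing direct DB access for %r", command)
        return "direct"
    # Any other component must use RPC; remove DB config if it was set.
    if env.pop(DB_CONN_KEY, None) is not None:
        log.info("DB isolation mode: removed DB configuration for %r", command)
    return "rpc"
```

Making the decision in one place, before any command runs, means setup commands and isolated workers can share the same environment without a worker silently picking up a DB connection string.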