More optimized lazy-loading of provider information #17304
With this change we truly lazy-load hooks and external_links only when we need them. Previously they were loaded when any of the properties of ProvidersManager was used. Now initialization is much faster in scenarios where only extra links are used, or where we only need the list of providers and not the details of which custom hooks are needed. This mainly helps some CLI commands (for example, `airflow providers list` is much faster now). It can also help tasks: in scenarios where .get_conn() is never used in tasks, the tasks might never need to import/load the hooks, so they might run faster and with a smaller memory footprint.
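To illustrate the idea described above, here is a minimal sketch of per-property lazy loading. It is not the code from this PR: `LazyRegistry`, `load_log`, and the example provider data are hypothetical names made up for illustration. The point is only that each expensive piece of data is built on first access to its own property, so reading the provider list does not trigger hook or extra-link loading.

```python
class LazyRegistry:
    """Hypothetical stand-in for a providers manager (not Airflow's actual class)."""

    def __init__(self):
        self.load_log = []        # records which loaders actually ran, for illustration
        self._providers = None
        self._hooks = None
        self._extra_links = None

    def _discover_providers(self):
        # Assumption: discovery is cheap compared with importing hook classes.
        self.load_log.append("providers")
        return {
            "example-provider": {
                "hooks": ["ExampleHook"],
                "extra_links": ["ExampleLink"],
            }
        }

    @property
    def providers(self):
        # Built once, on first access.
        if self._providers is None:
            self._providers = self._discover_providers()
        return self._providers

    @property
    def hooks(self):
        # Only built when something actually asks for hooks.
        if self._hooks is None:
            self.load_log.append("hooks")
            self._hooks = {
                hook: provider
                for provider, meta in self.providers.items()
                for hook in meta["hooks"]
            }
        return self._hooks

    @property
    def extra_links(self):
        # Independent of hooks: reading links never loads hooks.
        if self._extra_links is None:
            self.load_log.append("extra_links")
            self._extra_links = sorted(
                link
                for meta in self.providers.values()
                for link in meta["extra_links"]
            )
        return self._extra_links


registry = LazyRegistry()
print(list(registry.providers))  # only provider discovery has run so far
print(registry.load_log)
```

In this sketch, a caller that only lists providers (similar in spirit to a `providers list` command) never pays for the hooks or extra-links loaders.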
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
Transient errors only. It seems local_task_job became flaky after the recent "QUEUED" state intro; might be worth looking at, @ephraimbuddy. I will make issues for those quickly (and quarantine them).
Should this be in 2.1.3 vs 2.2?
Easy cherry-pick and it is rather safe. I added it to 2.1.3.
With this change we truly lazy-load hooks and external_links only when we need them. Previously they were loaded when any of the properties of ProvidersManager was used. Now initialization is much faster in scenarios where only extra links are used, or where we only need the list of providers and not the details of which custom hooks are needed. This mainly helps some CLI commands (for example, `airflow providers list` is much faster now). It can also help tasks: in scenarios where .get_conn() is never used in tasks, the tasks might never need to import/load the hooks, so they might run faster and with a smaller memory footprint. (cherry picked from commit 2dc7aa8)
# This is never executed, but tricks static analyzers (PyDev, PyCharm,)
# into knowing the types of these symbols, and what
# they contain.
@potiuk Was this line meant to be removed? (Sorry if I don't fully understand this PR, the comment just reads a bit funny.)
Not at all - you can add a PR to bring it back :).
Merged! Quickest contribution EVER!
Thanks! I did not expect my first contribution to be this quick 🙂
Fixes comment deleted in apache#17304