[AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time #4005
Aren't we still making all the same queries? Also, what's the time until the page is fully populated, please?
Good call on the RBAC templates, will make a fix.
The queries are slightly different: previously we made one query per DAG, and now it's a single query, so the constant per-query overhead is removed.
Locally, the page is fully populated almost instantly. In prod it is within a second. The query is essentially a more lightweight version of the one already used to fetch and display task instance state on the same page, so performance shouldn't be a concern.
dfcaf50 to f5593a2
@ashb I've addressed your comments and am retrying CI now; the two build jobs (out of nine) that failed were caused by CI flakiness.
Codecov Report
@@ Coverage Diff @@
## master #4005 +/- ##
==========================================
- Coverage 75.79% 75.74% -0.05%
==========================================
Files 199 199
Lines 15946 15972 +26
==========================================
+ Hits 12086 12098 +12
- Misses 3860 3874 +14
Continue to review full report at Codecov.
CI has passed, ready to be merged.
f5593a2 to 62aa5a7
62aa5a7 to 6607e48
Ready to merge FYI.
I'm happy to merge myself and take responsibility for fixing it if there is a breakage, FYI.
Merged @aoen. High risk, high reward! 💪
@Fokko thanks! I owe you a PR review :).
@aoen Speak of the devil :D This seems to have been causing test failures since this PR was merged, but, somewhat oddly, only on MySQL. Can you take a look, please? (We might revert this PR temporarily.)
Examples that (we think) started happening after this PR was merged: https://travis-ci.org/apache/incubator-airflow/jobs/451428747#L4660 and the same build on MySQL: https://travis-ci.org/apache/incubator-airflow/jobs/451428748#L4660. Going to try reverting this PR to see if it fixes things, even though the error doesn't make any sense.
This reverts commit 0287cce.
…4145) * Revert "[AIRFLOW-3190] Make flake8 compliant" This reverts commit 1691c98. * Revert "[AIRFLOW-3160] Load latest_dagruns asynchronously (#4005)" This reverts commit 0287cce.
…" (apache#4145) * Revert "[AIRFLOW-3190] Make flake8 compliant" This reverts commit 1691c98. * Revert "[AIRFLOW-3160] Load latest_dagruns asynchronously (apache#4005)" This reverts commit 0287cce.
@ashb Tests passed for me several times before I merged this, and it looks like the old logs are gone. Did the issue disappear after the commit was reverted? It does look suspicious, I admit. This has been running without issue in our prod. Let me try running this 10 or so times in CI to see if it still fails. If it does not, I propose re-merging, and at least getting a traceback the next time it fails in master CI to help fix it (I can monitor CI for a while to make sure). Also, sorry for the late reply; I need to figure out why I'm not getting email notifications anymore...
:D I don't remember exactly what was going on anymore, sorry. It's probable that this PR wasn't at fault.
…ously, speed up front page load time apache#4005
…synchronously, speed up front page load time apache#4005 (#7) * fb64f2e: [TWTR][AIRFLOW-XXX] Twitter Airflow Customizations + Fixup job scheduling without explicit_defaults_for_timestamp * reformat * 6607e48(airflow:master): [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time apache#4005 * flake8 fix
… Default Retries and fix a small DAG refresh bug (#8) * fb64f2e: [TWTR][AIRFLOW-XXX] Twitter Airflow Customizations + Fixup job scheduling without explicit_defaults_for_timestamp * reformat * 6607e48(airflow:master): [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time apache#4005 * a93d550: * a93d550: (HEAD, twitter/1.10+twtr) [TWTR][[AIRFLOW-4939]] Add Default Retries and fix a small DAG refresh bug (#3) (2 weeks ago) * flake8 fix
Make sure you have checked all steps below.
Jira
Description
The front page loads very slowly when the DB has latency because one blocking query is made per DAG against the DB. Even if we were to add indexes, the queries should still be batched to avoid an overhead of RTT * # of DAGs.
The latest dagruns should be loaded asynchronously and in batch like the other UI elements that query the database.
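To illustrate the difference between one query per DAG and a single batched query, here is a minimal sketch using the standard-library `sqlite3` module. The `dag_run` table layout, the sample rows, and the function names are assumptions made for illustration; this is not the code from this PR, and Airflow's real schema and query differ.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory table standing in for DAG run metadata (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, state TEXT)")
    conn.executemany(
        "INSERT INTO dag_run VALUES (?, ?, ?)",
        [
            ("etl", "2018-10-01", "success"),
            ("etl", "2018-10-02", "failed"),
            ("report", "2018-10-01", "success"),
            ("report", "2018-10-03", "running"),
        ],
    )
    return conn


def latest_runs_per_dag(conn, dag_ids):
    """Old shape: one blocking query (one round trip) per DAG."""
    result = {}
    for dag_id in dag_ids:
        row = conn.execute(
            "SELECT execution_date, state FROM dag_run "
            "WHERE dag_id = ? ORDER BY execution_date DESC LIMIT 1",
            (dag_id,),
        ).fetchone()
        if row is not None:
            result[dag_id] = row
    return result


def latest_runs_batched(conn):
    """New shape: a single query returns the latest run of every DAG."""
    rows = conn.execute(
        """
        SELECT dr.dag_id, dr.execution_date, dr.state
        FROM dag_run AS dr
        JOIN (
            SELECT dag_id, MAX(execution_date) AS max_date
            FROM dag_run
            GROUP BY dag_id
        ) AS latest
          ON dr.dag_id = latest.dag_id AND dr.execution_date = latest.max_date
        """
    ).fetchall()
    return {dag_id: (execution_date, state) for dag_id, execution_date, state in rows}
```

Both functions return the same mapping, but the batched version pays the per-query overhead once rather than once per DAG, which is what matters when the DB is far away.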
In my tests, this change reduced the front page load time from 8s to ~0.7s, with the DB in the western USA and the Airflow cluster in the eastern USA.
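The "RTT * # of DAGs" overhead mentioned above can be written as a rough back-of-the-envelope model. The function names and the sample numbers are hypothetical and are not measurements from this PR; the model also ignores query execution time and rendering.

```python
def per_dag_overhead(rtt_seconds, num_dags):
    """Network overhead when each DAG costs one blocking round trip."""
    return rtt_seconds * num_dags


def batched_overhead(rtt_seconds):
    """Network overhead when all DAGs are covered by a single round trip."""
    return rtt_seconds


# Hypothetical inputs: 50 ms round trip, 100 DAGs on the front page.
rtt = 0.05
dags = 100
saved = per_dag_overhead(rtt, dags) - batched_overhead(rtt)
print(f"estimated round-trip overhead saved: {saved:.2f}s")
```

The saving grows linearly with the number of DAGs, so indexes alone (which only shorten each query) would not remove it.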
The load on the DB should also be slightly reduced by this change.
Tests
UI changes only, tested on local webservers and our prod hosts.
Commits
Documentation
Code Quality
git diff upstream/master -u -- "*.py" | flake8 --diff
cc @ashb