Only count bad refs when moved table exists #23491
Merged: ashb merged 12 commits into apache:main from astronomer:dont-count-unless-table-already-there on May 6, 2022
ashb approved these changes on May 5, 2022
github-actions (bot) added the "full tests needed" label on May 5, 2022
The PR most likely needs to run the full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly, please rebase it onto the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.
dstandish force-pushed the dont-count-unless-table-already-there branch from 2276083 to 2c09f1b on May 5, 2022 19:58
jedcunningham added the "use public runners" label on May 6, 2022
jedcunningham approved these changes on May 6, 2022
jedcunningham removed the "use public runners" label on May 6, 2022
Co-authored-by: Jed Cunningham <[email protected]>
ashb force-pushed the dont-count-unless-table-already-there branch from c314182 to aa719bd on May 6, 2022 11:49
Test failures were random. Merging now.
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request on May 6, 2022
This keeps the logic to fail without upgrading when (A) there are bad rows and (B) the "moved" table already exists. But we optimize so that we don't count the bad rows unless the "moved" table is there. Previously we counted always, but the first time a user attempts upgrade, the tables won't be there so there's no point in counting.
Instead what we do is skip right to the CTAS, creating the _airflow_moved tables. If there aren't any rows in the "moved" table, then we delete the table immediately.
Also included here is a delete optimization, where we join to the moved table instead of running the not exists query again.
Co-authored-by: Jed Cunningham <[email protected]>
Co-authored-by: Ash Berlin-Taylor <[email protected]>
(cherry picked from commit 6cc41ab)
ephraimbuddy pushed a commit that referenced this pull request on May 8, 2022
ephraimbuddy pushed a commit that referenced this pull request on May 21, 2022
Labels: full tests needed (We need to run full set of tests for this PR to merge), type:bug-fix (Changelog: Bug Fixes)
This keeps the logic to fail without upgrading when (A) there are bad rows and (B) the "moved" table already exists. But we optimize so that we don't count the bad rows unless the "moved" table is there. Previously we always counted, but the first time a user attempts an upgrade the tables won't be there, so there's no point in counting.
Instead, we skip right to the CTAS (CREATE TABLE AS SELECT), creating the _airflow_moved tables. If there aren't any rows in the "moved" table, we delete the table immediately.
Also included here is a delete optimization, where we join to the moved table instead of running the NOT EXISTS query again.
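As a rough illustration of the control flow described above, here is a minimal sketch using Python's built-in sqlite3 module. It is not the code from this PR: the table names (task_instance, dag_run, _airflow_moved__task_instance), the key columns, and the helpers table_exists and move_bad_refs are all assumptions made for the example, and removing the moved rows from the source table is left out.

```python
import sqlite3

# Hypothetical names, chosen only for this sketch (not taken from the PR).
SOURCE = "task_instance"
PARENT = "dag_run"
MOVED = "_airflow_moved__task_instance"

# A row is "bad" when it references a parent row that does not exist.
BAD_ROW = (
    f"NOT EXISTS (SELECT 1 FROM {PARENT} p "
    "WHERE p.dag_id = s.dag_id AND p.run_id = s.run_id)"
)


def table_exists(conn, name):
    """Return True if a table with this name exists (SQLite-specific lookup)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def move_bad_refs(conn):
    """Return an error message if the upgrade should stop, otherwise None."""
    if table_exists(conn, MOVED):
        # Counting only matters here: bad rows plus an existing "moved"
        # table is the case that fails without upgrading.
        (bad,) = conn.execute(
            f"SELECT COUNT(*) FROM {SOURCE} s WHERE {BAD_ROW}"
        ).fetchone()
        if bad:
            return f"{bad} bad row(s) in {SOURCE}, but {MOVED} already exists"
        return None

    # First attempt: no "moved" table yet, so skip the count and go
    # straight to the CTAS.
    conn.execute(
        f"CREATE TABLE {MOVED} AS SELECT s.* FROM {SOURCE} s WHERE {BAD_ROW}"
    )
    (moved,) = conn.execute(f"SELECT COUNT(*) FROM {MOVED}").fetchone()
    if moved == 0:
        # Nothing was moved, so drop the empty table immediately.
        conn.execute(f"DROP TABLE {MOVED}")
    return None
```

On a clean database the function creates and immediately drops the empty "moved" table. With dangling rows present it leaves the table in place, and a later call reports the conflict instead of overwriting it.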
Tested on all 4 dialects with small sample data to ensure that behavior is correct.
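As a hedged illustration of the delete optimization mentioned in the description, the sketch below uses Python's built-in sqlite3 module to remove rows from the source table by matching them against the already-populated "moved" table, rather than re-evaluating the NOT EXISTS check against the parent table. SQLite's DELETE does not take a join clause, so a correlated EXISTS against the moved table stands in for the join here. The table names, the key columns, and the delete_moved_rows helper are assumptions for illustration and are not taken from the PR.

```python
import sqlite3

# Hypothetical names and key columns, chosen only for this sketch.
SOURCE = "task_instance"
MOVED = "_airflow_moved__task_instance"
KEY_COLS = ("dag_id", "task_id", "run_id")


def delete_moved_rows(conn):
    """Delete source rows that have a copy in the moved table.

    Returns the number of rows deleted.
    """
    # Match each source row to a moved row on the key columns.
    match = " AND ".join(f"m.{col} = {SOURCE}.{col}" for col in KEY_COLS)
    cur = conn.execute(
        f"DELETE FROM {SOURCE} "
        f"WHERE EXISTS (SELECT 1 FROM {MOVED} m WHERE {match})"
    )
    return cur.rowcount
```

On dialects whose DELETE can reference a second table (for example PostgreSQL's DELETE ... USING), the same match could be written as a join. Either way, the lookup runs against the small moved table instead of repeating the original check.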