-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix part of Google system tests #18494
Merged
Merged
Changes from 11 commits
Commits
Show all changes
18 commits
Select commit
Hold shift + click to select a range
0d46bc1
Fix CloudBuildExampleDagsSystemTest
mnojek d53dcc2
Fix DataCatalog tests
mnojek 20b502d
Fix Dataprep test
mnojek dd75210
Fix GCS test
mnojek 6f78a89
Fix Dataflow test
mnojek bd2a6e1
Fix GC Functions test
mnojek 5b2b061
Merge branch 'main' into fix-system-tests
mnojek fb89759
Fix Cloud SQL tests
mnojek 4edd6b9
Fix Data Fusion test
mnojek 473e3cd
Merge branch 'main' into fix-system-tests
mnojek f40a6fb
Change env var in Dataprep system test
mnojek a9a3d27
Fix config for dataflow system tests
e5830ed
Merge branch 'dataflow-system-tests' into fix-system-tests
mnojek 31bd140
Add unit test for datafusion
mnojek 0386c4e
Fix Data Fusion utest, add optional parameter to data fusion pipeline…
mnojek 1ed5341
Update code with black suggestions
mnojek 4700b5a
Fix flake8 issue
mnojek ae90ce4
Revert changes to example_cloud_sql.py according to review feedback
mnojek File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unique value should be generated by the test run script, because the DAG file is loaded multiple times by Airflow, so we have to make sure that it had the same structure each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @mik-laj - we use this value at least two times and it will be different every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this because the instance name is unique only for the specific Airflow container so each time I run the test within 1 Airflow run, it's the same. The problem was that if this system test failed before deleting the instance, it couldn't be run again because it wanted to create instance with the same instance while it was still present. That's why I added it. So it works only when it passes within the first execution.
That is a simple workaround to make each test execution independent but overall it's not the desired solution. Can you point me where else this value is used?
Ideally, I would want to have independent tests and also a mechanism that will cleanup all the values created by the test if it fails in the middle. This is not the scope of this PR, but in near future I plan to work on that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is HIGHLY annoying with cloudsql that name cannot ber reused for a day or so even if db is deleted.
However I agree random POSTFIX generation in the DAG is a bad idea.
What we used to have in the past is that we had
variables.env
file infiles/airflow-breeze-config
where we sourced files with variables and we had a very simple script that generated the random postfix if it was missing.Then you could do step-by-step testing with keeping the randomly generated postfix even across breeze restarts.
When you needed to change the database name you'd simply remove the file and it would be re-generated automatically at breeze entry.
https://github.com/apache/airflow/blob/main/BREEZE.rst#customize-your-environment
Something like that might work (writing it from memory so I am not sure if it is correct) in variables.env:
This has several nice properties:
In the past we even shared the whole
airflow-breeze-config
directory was actually checked out separate repository where we kept all variables used by the team. This way you could share different variables between same users who have access to the same repo - at the same time each of the users will have different postifx as the random.env would not be part of the repo.Just an inspiration if you would like to optimize your development workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mnojek Airflow is a distributed application, which means that one DAG file is loaded multiple times on different nodes, so we have to make sure that this instance name has the same value on all nodes. These examples are used in system tests, where this condition is not necessary because we have a common memory, but these examples are also the inspiration for novice users who can use another executor e.g. CeleryExecutor, so each DAG will be loaded on each node separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example project that we used to run system tests for Google Cloud on Google Cloud Build:
https://github.com/politools/airflow-system-tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestions, @potiuk @mik-laj! I will not implement them now but I will definitely use your comments when designing the new approach for running system tests. Hopefully we can make it better and more reliable 😃
In the meantime I will remove this change and hopefully this will not be the case in the new approach.