-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sync v2-3-stable
with v2-3-test
to release 2.3.3
#24762
Conversation
(cherry picked from commit 4daf51a)
The dependency we have for flask-appbuilder oauth authentication (for github/google authentication) should follow the limits that flask-appbuilder current version has. We added authlib there but apparently FAB currently limits authlib to <= 1.0 - we should follow fab rather than have our own dependency here. This has been pointed out in dpgaspar/Flask-AppBuilder#1861 (cherry picked from commit 5674491)
The TimedJSONWebSignatureSerializer has been deprecated from the itsdangerous library and they recommended to use dedicated libraries for it. pallets/itsdangerous#129 Since we are going to move to FAB 4+ with #22397 where newer version of itsdangerous is used, we need to switch to another library. We are already using PyJWT so the choice is obvious. Additionally to switching, the following improvements were done: * the use of JWT claims has been fixed to follow JWT standard. We were using "iat" header wrongly. The specification of JWT only expects the header to be there and be valid UTC timestamp, but the claim does not impact maturity of the signature - the signature is valid if iat is in the future. Instead "nbf" - "not before" claim should be used to verify if the request is not coming from the future. We now require all claims to be present in the request. * rather than using salt/signing_context we switched to standard JWT "audience" claim (same end result) * we have now much better diagnostics on the server side of the reason why request is forbidden - explicit error messages are printed in server logs and details of the exception. This is secure, we do not spill the information about the reason to the client, it's only available in server logs, so there is no risk attacker could use it. * the JWTSigner is "use-agnostic". We should be able to use the same class for any other signatures (Internal API from AIP-44) with just different audience * Short, 5 seconds default clock skew is allowed, to account for systems that have "almost" synchronized time * more tests addded with proper time freezing testing both expiry and immaturity of the request This change is not a breaking one because the JWT authentication details are not "public API" - but in case someone reverse engineered our claims and implemented their own log file retrieval, we should add a change in our changelog - therefore newsfragment is added. (cherry picked from commit 1f8e4c9)
That Also includes regenerating the breeze output images. (cherry picked from commit 13c62b5)
(cherry picked from commit 4d25365)
(cherry picked from commit 88363b5)
Breeze when started produced three warnings that were harmless, but we should fix them to remove "false positives". (cherry picked from commit ac8a790)
Choosing version of backend were broken when command line switches were used. The _VERSION variables were "hard-coded" to defaults rather than taken from command line. This is a remnant of initial implementation and converting the parameters to "cacheable" ones. While looking at the versions we also found that PARAM_NAME_FLAG is not used any more so we took the opportunity to remove it. (cherry picked from commit 4482eae)
(cherry picked from commit 2d687e4)
Until we have debian suppor tof MySQL and MSSQL ARM, runnign those on ARM platform is not supported. However error about it was not clear (pulling docker image failed). This PR adds platform checking also in breeze and fails fast without even attempting to enter breeze shell when you are on ARM and wants to run MsSQL or MySQL breeze shell. Also some errors with running different backend versions via breeze have been removed. (cherry picked from commit 00d2a3c)
The links to example sources in exampleinclude have been broken in a number of providers and they were additionally broken by AIP-47. This PR fixes it. Fixes: #23632 Fixes: apache/airflow-site#536 (cherry picked from commit 08b675c)
This is the first step to run breeze tests in parallel in CI. This flag adds "limited progress" output when running tests which means that the runnig tests will just print few lines with percent progress and color status indication from last few progress lines of Pytest output, but when it completes, the whole output is printed in a CI group - colored depending on status. The final version (wnen we implement parallel test execution) should also defer writing the output to until all tests are completed, but this should be a follow-up PR. (cherry picked from commit 41fefa1)
Few improvements: * direct link to workflow in the docs * "green" success screenshots (separate screenshot for rc) * job and step names contain both Airflow version and python version * running subsequent build with same version will cancel past in-progress build (in case you quickly make constraint fis for example and re-run) (cherry picked from commit 8ad18bf)
The old breeze-legacy used to have a possibility of very easy reproduction of CI failures by executing the right breeze command that contained the commit hash of the PR being tested. This has been broken for some time after we migrated to the new breeze, but finally it was the time when it was needed again. This PR brings back the capability by: * addding --image-tag parameters for tests, shell and start-airflow commands * if --image-tag is specified, then rather than building the image, it is pulled using the specified hash * if --image-tag is specified, the local sources are not mounted to breeze when started, but the sources already embedded in the image are used ("skipped" set for --mount-sources). * new "removed" command value is added to --mount-sources, it causes breeze command to remove the sources from the image (it is used when installing airflow during the tests for specified version (it's automatically used when --use-airflow-version is used). (cherry picked from commit 7dc794a)
After the images are pushed in CI we are running the verification of the AMD image now. This cannot be really done during building and pushing the image, because we are using multi-platform images using remote builders so the image is not even available locally, so we need to actually pull the images after they are built in order to verify them. This PR adds those features: * ability to pull images for verification with --pull-flag * ability to verify slim images (regular tests are skipped and we only expect the preinstalled providers to be available * the steps to verify the images (both regular and slim) are added to the workflow (cherry picked from commit 2936759)
In the new Breeze, switching to using parallelism is a ... breeze. This PR adds the capability of building the images in parallel in Breeze locally - for breeze command, but also uses this capability to build the images in parallel in our CI. Our builds are always executed on powerful, big machines with lots of CPU and docker run in memory filesystem with 32GB RAM, so it should be possible to run all builds in parallel on a single machine rather then spin off parallel machines to run the builds using the matrix strategy of Github Actions. Generally speaking - this will either speed up or get 4x cost saving for the build steps for all the "full test needed" PRs as well as all the main builds. There are a number of savings and improvements we can achieve this way: 1) less overhead for starting and runnning the machines 2) seems that with the new buildkit, the parallel builds are not suffering from some sequential locks (as it used to be, so we are basically do the same job using 25% resources for building the images. 3) we will stop having random "one image failed to build" cases - they will all either fail or succeed. 4) Less checks in the output 5) Production builds will additionally gain from single CI image pulled in order to perform the preparation of the packages and single package preparation step - it will save 4-5 minutes per image. The disadvantage is a less clear output of such parallel build where outputs from multiple builds will be interleaved in one CI output. (cherry picked from commit 893d935)
* Upgrade FAB to 4.1.1 The Flask Application Builder have been updated recently to support a number of newer dependencies. This PR is the attempt to migrate FAB to newer version. This includes: * update setup.py and setup.cfg upper and lower bounds to account for proper version of dependencies that FAB < 4.0.0 was blocking from upgrade * added typed Flask application retrieval with a custom application fields available for MyPy typing checks. * fix typing to account for typing hints added in multiple upgraded libraries optional values and content of request returned as Mapping * switch to PyJWT 2.* by using non-deprecated "required" claim as list rather than separate fields * add possibiliyt to install providers without constraints so that we could avoid errors on conflicting constraints when upgrade-to-newer-dependencies is used * add pre-commit to check that 2.4+ only get_airflow_app is not used in providers * avoid Bad Request in case the request sent to Flask 2.0 is not JSon content type * switch imports of internal classes to direct packages where classes are available rather than from "airflow.models" to satisfy MyPY * synchronize changes of FAB Security Manager 4.1.1 with our copy of the Security Manager. * add error handling for a few "None" cases detected by MyPY * corrected test cases that were broken by immutability of Flask 2 objects and better escaping done by Flask 2 * updated test cases to account for redirection to "path" rather than full URL by Flask2 Fixes: #22397 * fixup! Upgrade FAB to 4.1.1 (cherry picked from commit e2f1950)
(cherry picked from commit 8231add)
Recent changes to Breeze cause it to fail in certain situations, especially at self-upgrade (which was generated by today's upgrade with rich-click). There were two problems: * docker volume inspect missed 'volume' and it caused sometimes failures in CI * inputimeout dependency was missing after recent update to pre-commit venvs (cherry picked from commit 0905e38)
We were ignoring mypy error instead of fixing it. click had removed `get_terminal_size` and recommend using `shutil.get_terminal_size` (cherry picked from commit 5f67cc0)
(cherry picked from commit b8a4ac5)
Azure service bus uses uamqp which does not build for ARM architecture and we need to disable it as a dependency for ARM. (cherry picked from commit 477f907)
…24634) Today Python released new images that seems to break some of our dependencies (at least on M1/ARM). This PR adds a workaround possibility to add --python-image option to override the default, latest image with any other version released previously until we diagnose and fix the real issue. (cherry picked from commit d6e6e7d)
Instead of bash-based, complex logic script to perform PR selective checks we now integrated the whole logic into Breeze Python code. It is now much simplified, when it comes to algorithm. We've implemented simple rule-based decision tree. The rules describing the decision tree are now are now much easier to reason about and they correspond one-to-one with the rules that are implemented in the code in rather straightforward way. The code is much simpler and diagnostics of the selective checks has also been vastly improved: * The rule engine displays status of applying each rule and explains (with yellow warning message what decision was made and why. Informative messages are printed showing the resulting output * List of files impacting the decision are also displayed * The names of "ci file group" and "test type" were aligned * Unit tests covering wide range of cases are added. Each test describes what is the case they demonstrate * `breeze selective-checks` command that is used in CI can also be used locally by just providing commit-ish reference of the commit to check. This way you can very easily debug problems and fix them Fixes: #19971 (cherry picked from commit d7bd72f)
c7f625b
to
6a58b29
Compare
The recent changes in selective checks introduced a few problems in non-main branches: * constraints branch was not used by CI build in non-main branch this was not a proble, (until it started to fail) because other branches always work in "upgradeDepencies" case * default branch calculation did not account for the selective-check comment that was run before default branch was retrieved. Therefore we did not run build as from 2.3 branch * Some precomits should be skiped in non-main branch * Integration was also excluded from Provider Check * Two more commit outputs added: * docs-filter - depends whether you are with more packages * skip-pre-commit - allows to dynamically choose which pre-commits should be skipped (cherry picked from commit c8d3850)
…24780) When pushing cache we cannot use default builder because it has no capability to push cache. We need to use airflow_cache for it, which has additionally the capacity of forwarding the builds to a remote ARM instance in case cache is built for ARM images too. Recently we added a new --builder flag to allow to choose the builder, but our scripts and CI have not been modified to account for that (previously builder was chosen automatically but that has proven to be limiting in some manual operations and it also allows to choose different names for builders in case someone wants to build their own pipeline of builds. (cherry picked from commit a1679be)
Do we want #22658 included in 2.3.3 as well? |
This is a core change and I feel we should exclude it for not being merged before now. It's late IMO |
I was just wondering why the release doesn't include #24373. It seems like it is because the PR isn't labeled as |
7df10dc
to
a7a1511
Compare
There were errors with retieving constraints branch caused by using different convention for output names (sometimes dash, sometimes camelCase as suggested by most GitHub documents). The "dash-name" looks much better and is far more readable so we shoud unify all internal outputs to follow it. During that rename some old, unused outputs were removed, also it turned out that the new selective-check can replace previous "dynamic outputs" written in Bash as well. Additionally, the "defaults" are now retrieved via Python script, not bash script which will make it much more readable - both build_images and ci.yaml use it in the right place - before replacing the scripts and dev with the version coming in from PR in case of build_images.yaml. (cherry picked from commit 017507be1e1dbf39abcc94a44fab8869037893ea)
@hwooson12, generally things sometimes fall through the cracks unfortunately. I don't think there was anything specific keeping it from the next release, it just got missed. If you do notice PRs that should be bugfixes in the next release and they aren't in the milestone, just speak up. (In this specific case, it looks like a better fix is now being proposed anyways.) |
We're about to revert that commit and fix that issue properly in #24811. |
Time for
2.3.3rc1
!