Convert selective checks to Breeze Python #24610

potiuk · 2022-06-22T18:53:51Z

Instead of a complex, Bash-based script to perform PR selective
checks, we have now integrated the whole logic into Breeze Python code.

The algorithm is now much simpler. We've implemented a simple
rule-based decision tree. The rules describing the decision tree
are now much easier to reason about, and they correspond one-to-one
with the rules implemented in the code in a rather straightforward way.

The code is much simpler and the diagnostics of the selective checks
have also been vastly improved:

* The rule engine displays the status of applying each rule and
  explains (with a yellow warning message) what decision was made
  and why. Informative messages are printed showing the resulting
  output
* The list of files impacting the decision is also displayed
* The names of "ci file group" and "test type" were aligned
* Unit tests covering a wide range of cases were added. Each test
  describes the case it demonstrates
* The `breeze selective-checks` command that is used in CI can also
  be used locally by just providing a commit-ish reference of the
  commit to check. This way you can very easily debug problems and
  fix them
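
To make the idea of a rule-based decision tree with per-rule diagnostics more concrete, here is a minimal sketch. It is not the Breeze implementation: the file groups, regex patterns, test type names other than "Always" and "Providers", and the function names are all hypothetical and heavily simplified.

```python
import re
from typing import Callable, Dict, List

# Hypothetical file groups (name -> regex patterns). The real tables in Breeze
# are larger and may be organised differently.
FILE_GROUPS: Dict[str, List[str]] = {
    "Environment": [r"^Dockerfile", r"^setup\.py$", r"^\.github/workflows/"],
    "Core": [r"^airflow/(?!providers/)"],
    "Providers": [r"^airflow/providers/", r"^tests/providers/"],
    "Helm": [r"^chart/"],
}

# Hypothetical, simplified list of test types.
ALL_TEST_TYPES = ["Always", "Core", "Providers", "Helm"]


def matching_files(files: List[str], group: str) -> List[str]:
    """Return the changed files that belong to the given file group."""
    patterns = [re.compile(p) for p in FILE_GROUPS[group]]
    return [f for f in files if any(p.search(f) for p in patterns)]


def select_test_types(files: List[str], log: Callable[[str], None] = print) -> List[str]:
    """Apply the rules in order and explain every decision through `log`."""
    # Rule 1: an environment change invalidates everything, so run all tests.
    env_files = matching_files(files, "Environment")
    if env_files:
        log(f"[rule] environment files changed {env_files}: running all test types")
        return list(ALL_TEST_TYPES)

    # Rule 2: otherwise start from "Always" and add one test type per file group hit.
    selected = ["Always"]
    for group in ("Core", "Providers", "Helm"):
        hits = matching_files(files, group)
        if hits:
            log(f"[rule] {group} files changed {hits}: adding '{group}' tests")
            selected.append(group)
        else:
            log(f"[rule] no {group} files changed: skipping '{group}' tests")
    return selected


if __name__ == "__main__":
    print(select_test_types(["airflow/providers/amazon/hooks/s3.py"]))
```

Because each rule logs both the decision and the files that caused it, the output of a run doubles as an explanation of why a given set of tests was chosen.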

Fixes: #19971


potiuk · 2022-06-22T19:09:27Z

This is the next step in the Breeze conversion. This time -800 lines of Bash code (replaced with Python, and that includes lots of unit tests), and the result is far more readable and easier to reason about.

Plus very clear diagnostics of what decisions are made and why, including the `breeze selective-checks` command that can be used locally to debug and reproduce selective check behaviour with any local commit.

Regular PR that triggers all tests (core change) but does not run any Helm/K8S/UI JavaScript tests:

More complex change (this one) that triggers all possible tests because the environment changed:

Fragment of a more complex change (amazon + cncf.kubernetes) that triggers only a subset of unit tests ("Always", "Providers") but also all our K8S tests:

Tests are pretty comprehensive and we will be able to easily test and add any new cases in the future (especially when we split providers and want to make more fine-grained decisions about which tests to run).

potiuk · 2022-06-22T19:11:28Z

cc: @edithturn @Bowrna : 800 fewer lines of Bash and much easier "logic" to grasp for the selective checks implementation

edithturn · 2022-06-22T21:38:36Z

I wouldn't have thought of all those changes, really impressive @potiuk.

potiuk · 2022-06-22T22:42:11Z

> I wouldn't have thought of all those changes, really impressive @potiuk.

It's just that Python makes it way easier to write things better :)

potiuk · 2022-06-23T15:37:33Z

BTW, by converting the checks to Python I also found that the current selective checks were not "selective enough": I had missed cases where "Helm" tests were run unnecessarily in quite a number of cases (for example, they were run when only providers were modified). This should speed up PRs from contributors who only modified one or more providers and did not touch the core.
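
As a rough illustration of that fix, the decision to run Helm tests can be reduced to a check on the changed files alone. The sketch below assumes a hypothetical set of path patterns and a hypothetical function name; it is not the actual Breeze rule.

```python
import re
from typing import Iterable

# Hypothetical patterns for files that can affect the Helm chart tests.
HELM_PATTERNS = [re.compile(r"^chart/"), re.compile(r"^tests/charts/")]


def needs_helm_tests(changed_files: Iterable[str]) -> bool:
    """Run Helm tests only when at least one changed file matches a Helm pattern."""
    return any(p.search(f) for f in changed_files for p in HELM_PATTERNS)


if __name__ == "__main__":
    # A provider-only change does not match any Helm pattern.
    print(needs_helm_tests(["airflow/providers/http/hooks/http.py"]))
    print(needs_helm_tests(["chart/values.yaml"]))
```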


potiuk · 2022-06-25T01:13:30Z

All Green!

github-actions · 2022-06-25T05:11:20Z

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.


edithturn · 2022-07-04T15:17:23Z

It was merged, well done @potiuk Jarek 🥳

After implementing apache#24610 and a few follow-up fixes, it is now easy to add more optimizations to our unit test execution in CI (and to give this capability back to our contributors). This PR adds the capability of running tests for a selected set of providers - not for the whole "Providers" group. You can specify `--test-type "Providers[airbyte,http]"` to only run tests for the two selected providers. This is a step towards separating providers into separate repositories, but it also allows us to optimize the experience of contributors developing only single-provider changes (which is the vast majority of contributions). This also allows us to optimize the build and elapsed time needed to run tests for those PRs that only affect selected providers (again - the vast majority of PRs). The CI selection of which provider tests to run is now done in Selective Checks - they are a bit smarter than just selecting the providers that have been changed: they also check if there are any other providers that depend on them (we keep a dependencies.json file automatically updated by pre-commit, and this file determines which provider tests should be run).
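
The provider selection described here can be sketched as follows. This is only an illustration under stated assumptions: the `DEPENDENCIES` mapping is a hand-written stand-in for the generated dependencies.json file (its real structure is not shown in this thread), the helper names are hypothetical, and following dependencies transitively is an assumption rather than something the description confirms.

```python
import re
from typing import Dict, List, Optional, Set

# Hypothetical stand-in for dependencies.json: provider -> providers it depends on.
DEPENDENCIES: Dict[str, List[str]] = {
    "airbyte": ["http"],
    "http": [],
    "slack": ["http"],
    "amazon": [],
}


def affected_providers(changed: Set[str], dependencies: Dict[str, List[str]]) -> Set[str]:
    """Return the changed providers plus every provider that depends on them.

    Dependencies are followed transitively here, which is an assumption.
    """
    affected = set(changed)
    queue = list(changed)
    while queue:
        current = queue.pop()
        for provider, deps in dependencies.items():
            if current in deps and provider not in affected:
                affected.add(provider)
                queue.append(provider)
    return affected


def providers_test_type(providers: Set[str]) -> str:
    """Format a provider set the way the --test-type value is written above."""
    return f"Providers[{','.join(sorted(providers))}]"


def parse_test_type(test_type: str) -> Optional[List[str]]:
    """Return the provider list from 'Providers[a,b]', or None for any other value."""
    match = re.fullmatch(r"Providers\[([^\]]+)\]", test_type)
    if match is None:
        return None
    return [name.strip() for name in match.group(1).split(",")]


if __name__ == "__main__":
    selected = affected_providers({"http"}, DEPENDENCIES)
    print(providers_test_type(selected))
```

With the sample mapping, a change to the `http` provider also pulls in `airbyte` and `slack`, because both list `http` as a dependency, while `amazon` is left out.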


potiuk requested review from ashb, mik-laj, jedcunningham and kaxil as code owners June 22, 2022 18:53

boring-cyborg bot added the area:dev-tools label Jun 22, 2022

potiuk requested review from eladkal and uranusjr June 22, 2022 18:53

potiuk force-pushed the add-selective-check branch from ba98814 to d361703 on June 22, 2022 19:00

potiuk force-pushed the add-selective-check branch 3 times, most recently from 4aa85b2 to 5407a01 on June 23, 2022 15:33

potiuk force-pushed the add-selective-check branch 2 times, most recently from 7ef286d to f2215d3 on June 23, 2022 20:37

mik-laj reviewed Jun 23, 2022

SELECTIVE_CHECKS.md (outdated, resolved)

potiuk force-pushed the add-selective-check branch 2 times, most recently from 04ca504 to f123eb9 on June 24, 2022 18:41

mik-laj reviewed Jun 24, 2022

dev/breeze/src/airflow_breeze/commands/ci_commands.py (outdated, resolved)

potiuk force-pushed the add-selective-check branch from f123eb9 to 00748eb on June 25, 2022 00:14

uranusjr approved these changes Jun 25, 2022

github-actions bot added the "full tests needed" label (We need to run full set of tests for this PR to merge) Jun 25, 2022

potiuk merged commit d7bd72f into apache:main Jun 25, 2022

potiuk deleted the add-selective-check branch June 25, 2022 07:46

potiuk added a commit to potiuk/airflow that referenced this pull request Jun 25, 2022

Switch to new selective-checks in label-when-reviewed workflow

4b2a335

When apache#24610 was implemented I missed the label-when-reviewed workflow

potiuk added a commit that referenced this pull request Jun 25, 2022

Switch to new selective-checks in label-when-reviewed workflow (#24651)

2703874

When #24610 was implemented I missed the label-when-reviewed workflow

potiuk added a commit that referenced this pull request Jun 25, 2022

Cleanup references to selective checks (#24649)

aa8cd30

Selective checks docs have been moved to breeze as part of #24610 but some of the references were still left. This PR cleans it up.

potiuk mentioned this pull request Jun 26, 2022

Add more selective provider tests #24666

Merged

potiuk added a commit to potiuk/airflow that referenced this pull request Jun 29, 2022

Switch to new selective-checks in label-when-reviewed workflow (apach…

4fbea89

…e#24651) When apache#24610 was implemented I missed the label-when-reviewed workflow (cherry picked from commit 2703874)

ephraimbuddy added this to the Airflow 2.3.3 milestone Jun 30, 2022

ephraimbuddy added the "changelog:skip" label (Changes that should be skipped from the changelog (CI, tests, etc..)) Jun 30, 2022

edithturn mentioned this pull request Jul 7, 2022

Rewrite Selective Check in Python #22327

Closed
