Improve detection of when breeze CI image needs rebuilding #33603

Merged (2 commits) on Aug 22, 2023

@potiuk potiuk commented Aug 22, 2023

Previously, we used modification of a provider.yaml file as a sign that the Docker image needs rebuilding when starting the image. However, a modified provider.yaml file alone does not mean that the image needs rebuilding. The image needs rebuilding when provider dependencies change, but a provider.yaml file can change for many other reasons, especially now that provider.yaml files contain much more information and dependencies are only part of it. provider.yaml files can also be modified by the release manager when documentation is prepared, and none of those documentation changes is a reason to rebuild the image.

This PR optimizes the check for image building by introducing a two-step process:

  • first we check if provider.yaml files changed
  • if they did, we regenerate provider dependencies by manually running the pre-commit script
  • then provider_dependencies.json is used instead of all provider.yaml files to determine if the image needs rebuilding
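
The flow above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names (`file_digest`, `changed_files`, `image_needs_rebuild`), the `stored` digest mapping, and the `regenerate_dependencies` callback are all hypothetical, and it assumes that "changed" is detected by comparing content digests against previously stored ones.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def file_digest(path: Path) -> str:
    """Return a hex digest of the file content (empty string if the file is missing)."""
    return hashlib.sha256(path.read_bytes()).hexdigest() if path.exists() else ""


def changed_files(paths: Iterable[Path], stored: Dict[str, str]) -> List[Path]:
    """Return the paths whose current digest differs from the stored one."""
    return [p for p in paths if file_digest(p) != stored.get(str(p), "")]


def image_needs_rebuild(
    provider_yaml_files: List[Path],
    dependencies_json: Path,
    stored: Dict[str, str],
    regenerate_dependencies: Callable[[], None],
) -> bool:
    """Hypothetical decision function; all names here are illustrative assumptions."""
    # First: a cheap check whether any provider.yaml file changed at all.
    if changed_files(provider_yaml_files, stored):
        # If so: regenerate the derived dependency file (stands in for the pre-commit script).
        regenerate_dependencies()
    # Then: only the generated dependency file decides whether a rebuild is needed.
    return bool(changed_files([dependencies_json], stored))
```

With this shape, a documentation-only edit to a provider.yaml file triggers regeneration, but the regenerated provider_dependencies.json is identical to the previous one, so no rebuild is reported.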

This has several nice side effects:

  • the list of files that have been modified displayed to the user is potentially much smaller (no provider.yaml files)
  • provider_dependencies.json is regenerated automatically when you run any breeze command, which means that you do not need to have pre-commit installed to regenerate it
  • the notification "image needs rebuilding" will be printed less frequently to the user - only when it is really needed
  • preparing provider documentation in CI will not trigger image rebuilding (which might occasionally fail in such cases, especially when we bring back a provider from a long suspension, as happened in Resume yandex provider #33574)



@amoghrajesh amoghrajesh left a comment

Interesting PR. Don't have any major comments. LGTM

@potiuk potiuk mentioned this pull request Aug 22, 2023
@potiuk potiuk merged commit ac0d5b3 into apache:main Aug 22, 2023
62 of 64 checks passed
@potiuk potiuk deleted the optimize-when-to-rebuild-image branch August 22, 2023 11:27
@potiuk potiuk added this to the Airflow 2.7.1 milestone Aug 22, 2023
potiuk added a commit that referenced this pull request Aug 25, 2023
* Improve detection of when breeze CI image needs rebuilding

* Update dev/breeze/src/airflow_breeze/commands/developer_commands.py

(cherry picked from commit ac0d5b3)