Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add better diagnostics when provider.yaml check fails #37322

Merged

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Feb 10, 2024


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@rawwar
Copy link
Collaborator

rawwar commented Feb 11, 2024

@potiuk , Can you please confirm if this check only runs when there is a change in the provider.yaml file?

If so, can we try to actually pass --upgrade-to-newer-dependencies argument to the breeze directly in the pre-commit? If there are package upgrades, they will upgrade.

If not, would adding this help? I can look into this.

@potiuk
Copy link
Member Author

potiuk commented Feb 11, 2024

If so, can we try to actually pass --upgrade-to-newer-dependencies argument to the breeze directly in the pre-commit? If there are package upgrades, they will upgrade.

No - we do not want to do it automatically, because running it takes a long time and it takes a lot of time and networking, so you should not really run it automatically when provider.yaml changes, because the fact that it changes, does not mean that you HAVE TO do it. In many situations, running pre-commits will be fine even if you don't run it, and it's fine, because when you have slow network, develop while in a plane etc, you don't want to pull 700 MB of data.

The whole image rebuilding is designed around it - it should be "eventually consistent" - you are gently guided to rebuild the image, but you should not be forced to run the rebuild, because you might not want to do it just now.

And adding new providers/modifying dependencies is generallly very rare event that we should not optimize for. What you experience might seam common, but there is a very low percentage of PRs that add or modify dependencies or change provider.yaml file (and even less that result in absolute need or rebuilding the image).

This is explained in detail in the two ADRs (Architecture Decision Records): ADR4 where it is explained why we chose docker container as "common" environment and ADR12 - where the mechanism of eventual consistency and asking the user for confirmation is explained.

@potiuk
Copy link
Member Author

potiuk commented Feb 11, 2024

@potiuk , Can you please confirm if this check only runs when there is a change in the provider.yaml file?

Also it's not the case. In CI we always run "all" pre-commits. Pre-comit has two modes, one local (when it only run pre-commits that are related to locally modified files), and one usually used in ci where --all-files switch is used.

In CI in case dependencies were modified, we run --upgrade-to-newer-dependencies, but even there as of #37305 merged few hours ago, we run --upgrade-to-newer-dependencies not when provider.yaml change, but when we detect that the resulting pyproject.toml has changed dependencies.

And again - this is because running --upgrade-to-newer-dependencies with ~700 dependencies takes a lot of time (27 minutes in CI), not only because there is a lot of data, to pull but also because pip performs resolution of those dependencies and tries to find the newest versions of those dependencies. So --upgrade-to-newer-deps should be the last resort and deliberately run when you really need it.

@rawwar
Copy link
Collaborator

rawwar commented Feb 11, 2024

And adding new providers/modifying dependencies is generallly very rare event that we should not optimize for.

This makes sense and I agree.

And, Thank you so much for such a detailed response.

@potiuk potiuk merged commit 5ee75fb into apache:main Feb 11, 2024
54 checks passed
@potiuk potiuk deleted the better-diagnostics-when-provider-check-fails branch February 11, 2024 09:49
@potiuk potiuk added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Feb 12, 2024
@potiuk potiuk added this to the Airflow 2.8.2 milestone Feb 12, 2024
potiuk added a commit that referenced this pull request Feb 13, 2024
ephraimbuddy pushed a commit that referenced this pull request Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-tools changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants