Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix template rendering for Common SQL operators #28202

Merged

Conversation

jon-evergreen
Copy link
Contributor

@jon-evergreen jon-evergreen commented Dec 7, 2022

Closes: #28195

This patch fixes all the common SQL operators I could find which showed the same problem as reported in #28195, that statements are generated "too early", before the templated variables have been applied. I think all changes should have tests which break without the fix. Some of these tests are quite brittle in that they mock complex nested SQL which is not ideal. In my defence though, so do a lot of the tests in the file

This patch adds the self.sql variable as a templated parameter, allowing for templated table, partition_clause, checks etc.

@boring-cyborg
Copy link

boring-cyborg bot commented Dec 7, 2022

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

@jon-evergreen jon-evergreen force-pushed the fix-common-sql-template-rendering branch from dff8ea4 to 5dd5b8a Compare December 7, 2022 19:18
@potiuk
Copy link
Member

potiuk commented Dec 7, 2022

Static checks I am afraid

@jon-evergreen jon-evergreen force-pushed the fix-common-sql-template-rendering branch from 5dd5b8a to cc3b804 Compare December 8, 2022 10:22
@jon-evergreen
Copy link
Contributor Author

Yes, sorry, I missed those yesterday.

This code took the slightly more complex approach of moving things around such that the SQL gets generated only in the execute method. I did notice some of the operators take the approach of templating what would be self.sql, which allows templating basically everywhere in the SQL, and would also mean the full sql statement would appear in the "Rendered" tab in the UI, which could be nice. Though it also means you don't get the templated_fields highlighting that e.g. you can template the partition_clause argument. Though it would technically be more work for the task, I guess you could template all the things currently templated more for a documentation PoV and the sql field which the operator generates itself; the extra compute time would be relatively small, I imagine?

What do you think?

@jon-evergreen
Copy link
Contributor Author

jon-evergreen commented Dec 8, 2022

Ah, the static checks for BigQuery are failing because there's no longer a self.sql attribute. This suggests to me maybe I should keep self.sql around, and have that be templated.

The new version of the PR does template out self.sql as well (and tells airflow to render it as sql in the rendered tab)

@jon-evergreen jon-evergreen force-pushed the fix-common-sql-template-rendering branch from cc3b804 to bac62ee Compare December 8, 2022 11:27
@potiuk
Copy link
Member

potiuk commented Dec 8, 2022

Yeah. Cool. This is exactly why we have tests for :)

@potiuk
Copy link
Member

potiuk commented Dec 9, 2022

Static checks need to be fixed :( @jon-evergreen - I recommend to install pre-commit, it will fix everything for you.

@jon-evergreen
Copy link
Contributor Author

jon-evergreen commented Dec 9, 2022

@potiuk I did install the hooks after the first commit which got bounced. And when I make changes to the sql.py I get this in the pre-commit logs:

Check and update common.sql API stubs..............................(no files to check)Skipped

I think the precommit definition needs to change from

^scripts/ci/pre_commit/pre_commit_update_common_sql_api\.py|^airflow/providers/common/sql/.*\.py[i]$

to

 ^scripts/ci/pre_commit/pre_commit_update_common_sql_api\.py|^airflow/providers/common/sql/.*\.pyi?$

to have it run on changes to both sql.py and sql.pyi, as the static checks action is doing.

Closes: apache#28195

This patch fixes all the common SQL operators I could find which showed
the same problem as reported in apache#28195, that statements are generated
"too early", before the templated variables have been applied.  I think
all changes should have tests which break without the fix.  Some of
these tests are quite brittle in that they mock complex nested SQL which
is not ideal.

This patch adds the `self.sql` variable as a templated parameter,
allowing for templated `table`, `partition_clause`, `checks` etc.
@jon-evergreen jon-evergreen force-pushed the fix-common-sql-template-rendering branch from bac62ee to 1c3016c Compare December 9, 2022 13:47
@jon-evergreen
Copy link
Contributor Author

Also, in case it wasn't clear, I have now checked in the updated sql.pyi file

@potiuk
Copy link
Member

potiuk commented Dec 9, 2022

 ^scripts/ci/pre_commit/pre_commit_update_common_sql_api\.py|^airflow/providers/common/sql/.*\.pyi?$

to have it run on changes to both sql.py and sql.pyi, as the static checks action is doing.

^scripts/ci/pre_commit/pre_commit_update_common_sql_api.py|^airflow/providers/common/sql/.*.pyi?$

Correct 🤦 . The [i] is glob not regexp :)

@boring-cyborg
Copy link

boring-cyborg bot commented Dec 9, 2022

Awesome work, congrats on your first merged pull request!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SQLTableCheckOperator doesn't correctly handle templated partition clause
3 participants