Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hook_params to SqlSensor. #18431

Merged
merged 1 commit into from
Nov 26, 2021
Merged

Conversation

kazanzhy
Copy link
Contributor

@kazanzhy kazanzhy commented Sep 22, 2021

This PR is based on: #17592 and closes: #13750
SqlSensor relies on underlying hooks that are automatically injected (DI) depending on the connection type provided to the SqlSensor. However, from time to time (but mainly in BigqueryHook) it is needed to pre-configure the "to-be-injected" hook because some default params do not fit the current config.
For example, if using SqlSensor with Bigquery it by default configures the SQL dialect to "legacy SQL", and there is no way to change this through configuration. Instead, the SqlSensor has to be extended and some functions are overwritten.

@boring-cyborg boring-cyborg bot added the area:core-operators Operators, Sensors and hooks within Core Airflow label Sep 22, 2021
@kazanzhy
Copy link
Contributor Author

Hi, @SamWheating.
Could you help me with this PR?

airflow/sensors/sql.py Outdated Show resolved Hide resolved
airflow/sensors/sql.py Outdated Show resolved Hide resolved
@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
if self._extra and self.is_extra_encrypted:
self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()

def get_hook(self):
"""Return hook based on conn_type."""
def get_hook(self, hook_params: Dict = {}):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is discouraged to pass a mutable object as a default. See this SO https://stackoverflow.com/q/1132941/621058 . Instead you can null it by default and if it's null assign it later

def get_hook(self, hook_params: Dict = None):
  if hook_params is None:
    hook_params = {}
  # ...

Copy link
Contributor Author

@kazanzhy kazanzhy Sep 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've also had to deal with the problem described in SO.
But in this case, Dict is just a container and hasn't been updated in the method.

Could you please write an example in the context of SqlSensor when we can receive the behavior similar to SO?
Because now, in my opinion, it looks like code sophistication.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point @kazanzhy but I agree we should just stick with the practice of using None as the default for Optional[Dict] that is, as far as I know, uniformly adhered to in this repo.

If you leave it, people will always give it a second look when they see it. And pycharm complains about it. And using None for an optional arg is also a common pattern that I think is good aside from the mutability issue, just sends the message, you don't need to pass this, and we don't forward anything in this case.

@dinigo
Copy link
Contributor

dinigo commented Sep 30, 2021

Hello there. Been on holiday for some time, thats why #17592 was stalled. I see you get good progress here, including adding the missing DbHooks. Tell me If I can contribute somehow, just don't want to duplicate efforts

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 15, 2021
@kaxil kaxil requested a review from dstandish November 15, 2021 13:52
@kaxil kaxil added pinned Protect from Stalebot auto closing and removed stale Stale PRs per the .github/workflows/stale.yml policy file labels Nov 15, 2021
@kaxil kaxil requested a review from eladkal November 15, 2021 13:53
@eladkal
Copy link
Contributor

eladkal commented Nov 16, 2021

hi @kazanzhy recently #18718 was merged which is similar.
I see that you are depended on #17592 but since it's stale maybe you want to take over and integrate the needed changes to this PR?
This should be a relatively small change probably there is no real need to split the change into two PRs.

@kazanzhy
Copy link
Contributor Author

Hi @eladkal.
As you see this PR was before #18718 and I didn't see that it's referred to this PR.
I'll make rebase to the current main branch and update this PR.
Thank you for your comment

Copy link
Contributor

@dstandish dstandish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to be split into 2 prs. This one should only add hook_params. Changing the conn types should be done separately.

@@ -289,8 +289,11 @@ def rotate_fernet_key(self):
if self._extra and self.is_extra_encrypted:
self._extra = fernet.rotate(self._extra.encode('utf-8')).decode()

def get_hook(self):
"""Return hook based on conn_type."""
def get_hook(self, hook_params: Dict = {}):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your point @kazanzhy but I agree we should just stick with the practice of using None as the default for Optional[Dict] that is, as far as I know, uniformly adhered to in this repo.

If you leave it, people will always give it a second look when they see it. And pycharm complains about it. And using None for an optional arg is also a common pattern that I think is good aside from the mutability issue, just sends the message, you don't need to pass this, and we don't forward anything in this case.

success=None,
failure=None,
fail_on_empty=False,
hook_params={},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@kazanzhy kazanzhy changed the title Add hook_kwargs to SqlSensor. Add hook_kwargs and fix allowed_conn_types for SqlSensor. Nov 17, 2021
@kazanzhy kazanzhy changed the title Add hook_kwargs and fix allowed_conn_types for SqlSensor. Add hook_params and fix allowed_conn_types for SqlSensor. Nov 17, 2021
@kazanzhy kazanzhy force-pushed the sqlsensor_hook_params branch from ca1e4f9 to 14562be Compare November 17, 2021 07:28
@kazanzhy
Copy link
Contributor Author

I updated the description. Also changed name of the parameter to hook_params which is similar to BaseSqlOperator.
I think the parameter of the method Connection.get_hook() also might be changed to hook_params

airflow/sensors/sql.py Show resolved Hide resolved
airflow/sensors/sql.py Outdated Show resolved Hide resolved
@kazanzhy kazanzhy changed the title Add hook_params and fix allowed_conn_types for SqlSensor. Add hook_params to SqlSensor. Nov 17, 2021
@kazanzhy kazanzhy force-pushed the sqlsensor_hook_params branch from 459a652 to f7abbfc Compare November 17, 2021 10:11
@uranusjr uranusjr requested a review from dstandish November 17, 2021 11:25
@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Nov 17, 2021
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@uranusjr uranusjr closed this Nov 18, 2021
@uranusjr uranusjr reopened this Nov 18, 2021
@kazanzhy kazanzhy force-pushed the sqlsensor_hook_params branch from f7abbfc to a2fa615 Compare November 24, 2021 12:57
@kazanzhy kazanzhy force-pushed the sqlsensor_hook_params branch from a2fa615 to 648254d Compare November 25, 2021 21:35
@potiuk potiuk merged commit 314a4fe into apache:main Nov 26, 2021
@boring-cyborg
Copy link

boring-cyborg bot commented Nov 26, 2021

Awesome work, congrats on your first merged pull request!

@kazanzhy kazanzhy deleted the sqlsensor_hook_params branch November 27, 2021 22:30
dillonjohnson pushed a commit to dillonjohnson/airflow that referenced this pull request Dec 1, 2021
@ephraimbuddy ephraimbuddy added the kind:feature Feature Requests label Apr 14, 2022
@jedcunningham jedcunningham added the type:improvement Changelog: Improvements label Apr 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow full tests needed We need to run full set of tests for this PR to merge kind:feature Feature Requests pinned Protect from Stalebot auto closing type:improvement Changelog: Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Standard SQL in BigQuery Sensor
10 participants