Deprecate S3PrefixSensor and S3KeySizeSensor in favor of S3KeySensor #22737

Merged
merged 11 commits into from
Apr 12, 2022

@vincbeck (Contributor) commented Apr 4, 2022

Deprecate S3PrefixSensor and use S3KeySensor instead.

Why deprecate S3PrefixSensor?

S3PrefixSensor and S3KeySensor do nearly the same thing, so to avoid duplication we should deprecate one of them.
Also, S3PrefixSensor does not behave the way its docstring describes or the way its name suggests. It does not wait for a given prefix to exist in S3; it waits for a given folder to exist (assuming the delimiter is /). Here are some examples I ran while testing:

  • prefix="test". true when directory test exists
  • prefix="test". false when directory test does not exist and file test exists
  • prefix="test". false when directory test does not exist and file test2 exists
  • prefix="tes". false when directory test exists
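
The four cases above can be reproduced with a small in-memory model. This is a sketch, not the provider's code: `common_prefixes`, `folder_style_check`, and `plain_prefix_check` are hypothetical helpers, and the model assumes a single-character delimiter and that the sensor appends the delimiter to the prefix and then looks for the result among the "folders" (common prefixes) one level up.

```python
def common_prefixes(keys, prefix="", delimiter="/"):
    """Model the 'folders' of a delimiter-based listing: for every key under
    `prefix`, keep the text up to and including the next delimiter."""
    found = set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            found.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
    return sorted(found)


def folder_style_check(keys, prefix, delimiter="/"):
    """Hypothetical stand-in for the observed S3PrefixSensor behavior."""
    wanted = prefix if prefix.endswith(delimiter) else prefix + delimiter
    # Parent level: everything up to the delimiter before the last path segment.
    parent = wanted[: wanted.rstrip(delimiter).rfind(delimiter) + 1]
    return wanted in common_prefixes(keys, parent, delimiter)


def plain_prefix_check(keys, prefix):
    """What the sensor's name suggests: does any key start with `prefix`?"""
    return any(key.startswith(prefix) for key in keys)


if __name__ == "__main__":
    print(folder_style_check(["test/data.csv"], "test"))  # directory test exists
    print(folder_style_check(["test"], "test"))           # only a file named test
    print(folder_style_check(["test2"], "test"))          # only a file named test2
    print(folder_style_check(["test/data.csv"], "tes"))   # partial folder name
```

In this model only the first call succeeds, while `plain_prefix_check` would succeed in all four cases, which is the gap between the name and the behavior.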

This mismatch between expected and actual behavior confuses users. Example: a thread where a user does not understand why S3PrefixSensor behaves this way.

Why update S3KeySensor?

To stay backward compatible, we want S3KeySensor to cover what S3PrefixSensor offers. S3PrefixSensor accepts a list of files as input, so S3KeySensor should accept one as well.
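
A minimal sketch of what accepting either a single key or a list could look like. `all_keys_present` and the `key_exists` callable are hypothetical names used for illustration; the real sensor would query S3 rather than a Python set.

```python
from typing import Callable, List, Union


def all_keys_present(
    bucket_key: Union[str, List[str]],
    key_exists: Callable[[str], bool],
) -> bool:
    """Return True only when every requested key exists."""
    # Normalize a single key to a one-element list so both input shapes
    # go through the same code path.
    keys = [bucket_key] if isinstance(bucket_key, str) else list(bucket_key)
    return all(key_exists(key) for key in keys)


if __name__ == "__main__":
    stored = {"data/a.csv", "data/b.csv"}  # stand-in for the bucket contents
    print(all_keys_present("data/a.csv", stored.__contains__))
    print(all_keys_present(["data/a.csv", "data/c.csv"], stored.__contains__))
```

Normalizing the input up front keeps the single-key call unchanged for existing users while letting the list form reuse the same check.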

What is default_check_fn and why do we need it?

S3KeySizeSensor works this way: you can pass the sensor a custom function that applies an additional layer of checks. If you don't provide one, default_check_fn is used instead. We need it to stay backward compatible and keep the current behavior of S3KeySizeSensor unchanged.
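
A sketch of the fallback described above. The body of `default_check_fn` here is an assumption (every file must report a size greater than zero, with each file described as a dict holding a `Size` entry), and `run_check` is a hypothetical helper showing how a custom function would take precedence over the default.

```python
from typing import Callable, Dict, List, Optional


def default_check_fn(files: List[Dict]) -> bool:
    # Assumed default: every listed file is non-empty.
    return all(f.get("Size", 0) > 0 for f in files)


def run_check(
    files: List[Dict],
    check_fn: Optional[Callable[[List[Dict]], bool]] = None,
) -> bool:
    # Use the caller's function when one is given, otherwise the default.
    fn = check_fn if check_fn is not None else default_check_fn
    return fn(files)


if __name__ == "__main__":
    files = [{"Size": 120}, {"Size": 0}]
    print(run_check(files))                                  # default check
    print(run_check(files, lambda fs: len(fs) >= 2))         # custom check
```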

Update S3KeySensor to handle multiple files
@vincbeck vincbeck requested a review from mik-laj as a code owner April 4, 2022 20:32
@boring-cyborg boring-cyborg bot added area:core-operators Operators, Sensors and hooks within Core Airflow area:providers kind:documentation provider:amazon-aws AWS/Amazon - related issues labels Apr 4, 2022
@eladkal (Contributor) left a comment

Logic seems OK.
Left a few comments.

airflow/providers/amazon/aws/sensors/s3.py (outdated)
airflow/providers/amazon/aws/sensors/s3.py (outdated)
tests/providers/amazon/aws/sensors/test_s3_prefix.py (outdated)
airflow/providers/amazon/aws/sensors/s3.py
vincbeck and others added 2 commits April 7, 2022 11:13
Co-authored-by: Ash Berlin-Taylor <[email protected]>
Address feedbacks from PR apache#22737
@vincbeck (Contributor, Author) commented Apr 8, 2022

While addressing feedback on this PR, I realized that S3KeySizeSensor is also a subset of S3KeySensor. I therefore decided to deprecate it as well, and added an optional parameter to S3KeySensor.

@vincbeck (Contributor, Author)

I kept only the tests related to deprecation warnings and the default function in TestS3KeySizeSensor. I removed the other tests because those use cases are already covered in TestS3KeySensor.
I also moved default_check_fn into S3KeySizeSensor.

@vincbeck (Contributor, Author)

I also edited the PR description to add a section explaining the purpose of default_check_fn, since it might not be obvious.

@eladkal eladkal self-requested a review April 11, 2022 20:38
@eladkal eladkal changed the title Deprecate S3PrefixSensor Deprecate S3PrefixSensor and S3KeySizeSensor in favor of S3KeySensor Apr 12, 2022
@eladkal (Contributor) left a comment

Great job @vincbeck

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Apr 12, 2022
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@eladkal eladkal merged commit dffb0d2 into apache:main Apr 12, 2022
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) full tests needed We need to run full set of tests for this PR to merge kind:documentation provider:amazon-aws AWS/Amazon - related issues
4 participants