Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add use_glob to GCSObjectExistenceSensor #34137

Merged
merged 6 commits into from
Dec 17, 2023
Merged

Conversation

A-Costa
Copy link
Contributor

@A-Costa A-Costa commented Sep 6, 2023

Initial draft implementing #32650

Add the boolean parameter use_glob to GCSObjectExistenceSensor.

When set to True the sensor will use the hook.list method leveraging the newly introduced match_glob parameter to check if files are present, making it possible to define aGCSObjectExistenceSensor that checks if files that satisfy the given glob are present in a certain bucket.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@boring-cyborg boring-cyborg bot added area:providers provider:google Google (including GCP) related issues labels Sep 6, 2023
@boring-cyborg
Copy link

boring-cyborg bot commented Sep 6, 2023

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
    Apache Airflow is a community-driven project and together we are making it better 🚀.
    In case of doubts contact the developers at:
    Mailing List: [email protected]
    Slack: https://s.apache.org/airflow-slack

@potiuk
Copy link
Member

potiuk commented Sep 7, 2023

Can you please add unit tests for it?

@A-Costa
Copy link
Contributor Author

A-Costa commented Sep 7, 2023

Sure, that was just an initial draft. I will try to push a complete version of this PR by the end of the day

if object_response:
return "success"
if self.use_glob:
object_response = await bucket.list_blobs(match_glob=object_name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this depends on talkiq/gcloud-aio#634

@A-Costa A-Costa changed the title Add use_glob to GCSObjectExistenceSensor WIP Add use_glob to GCSObjectExistenceSensor Sep 7, 2023
Copy link
Contributor

@shahar1 shahar1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM :)
I hope that gcloud-aio will merge your PR there soon, so you can also merge it here.
Don't forget to add a test for GCSBlobTrigger

@potiuk
Copy link
Member

potiuk commented Sep 11, 2023

Needs fixes though.

@A-Costa
Copy link
Contributor Author

A-Costa commented Sep 13, 2023

talkiq/gcloud-aio#634 was merged today so this can be fixed and merged.

Apart from the missing tests, what else needs to be fixed? @potiuk

@eladkal
Copy link
Contributor

eladkal commented Oct 14, 2023

Apart from the missing tests, what else needs to be fixed?

I think that is all but we will take closer look in 2nd review once everything is ready
You'll need also to bump gcloud-aio version in provider.yaml so it will include the fix in the upstream library

Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 29, 2023
@A-Costa
Copy link
Contributor Author

A-Costa commented Nov 30, 2023

any update on this?

@potiuk
Copy link
Member

potiuk commented Nov 30, 2023

any update on this?

Well. The PR is waiting for your actions:

Needs fixes though

I think that is all but we will take closer look in 2nd review once everything is ready
You'll need also to bump gcloud-aio version in provider.yaml so it will include the fix in the upstream library

Tests are failed and @eladkal asked you to update libraries.

Why do you think you need any update before following these comments ?

@potiuk
Copy link
Member

potiuk commented Nov 30, 2023

Generallly - making PR green is a prerequisite for anyone to take further look - if tests are failing and you were told to fix them and nothing happened, well, then the ball is on your side.

@potiuk
Copy link
Member

potiuk commented Nov 30, 2023

(plus you have still WIP in name - usualy that is an indication that this is Work In Progress - I usually just skip over such PRs even if they are rebased by author, because they are well, Work in Progress

@A-Costa
Copy link
Contributor Author

A-Costa commented Nov 30, 2023

talkiq/gcloud-aio#634 was merged today so this can be fixed and merged.

Apart from the missing tests, what else needs to be fixed? @potiuk

Here i asked if something else needed fix

Apart from the missing tests, what else needs to be fixed?

I think that is all but we will take closer look in 2nd review once everything is ready You'll need also to bump gcloud-aio version in provider.yaml so it will include the fix in the upstream library

Here a 2nd review was mentioned that never arrived.

Needs fixes though.

this was your only message so far, not very helpful to a new contributor.

Will work on the things you pointed out now though...

@eladkal
Copy link
Contributor

eladkal commented Nov 30, 2023

Happy to review when you mark the PR as ready :)
Sometimes we do multiple runs of reviews that is very OK

@eladkal eladkal removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Nov 30, 2023
@potiuk
Copy link
Member

potiuk commented Nov 30, 2023

Here i asked if something else needed fix

Well. The CI tells you what needs to be fixed. I think you should take a look at the contributor's guide and understand that fixing and rebasing CI is what is the "first pass". You have one PR to submit. We regularly review 100s of PR a day. And CI and tests are the way to automate thing that must be fixed before we do next pass of reviews. So my comment - while it had been expressed with brevity - reflect the state -of matter. you need to fix tests before we move any furtther.

In case it is not clear - it's all described in our contributor's guide:

I understand it's a lot of information to digest, so I perfectly understand your lack of understanding how our PR process looks like and what is expected from author, but now that you got it explained in a little more than "tests need fixing" I hope you will continue contributing and follow those guidelines.

I think also what helps is understand that it's you who want to contribute something to the project, so it's a good idea to pay attention to your own PR and if in doubt consult the documentation.

Looking forward to your future contributions. No hard feelings from my side, I hope with those explanations you will understand why the ball was on your side all the time, you just did not follow up.

@A-Costa
Copy link
Contributor Author

A-Costa commented Nov 30, 2023

@potiuk of course no hard feelings on my side either! I appreciate the exchange and I'm sorry if I came up as rude, was never the intention!

Now that is more clear what is expected, and that action is required on my side, I'll be working on the things that are missing! Thanks

@potiuk
Copy link
Member

potiuk commented Nov 30, 2023

Apologies if MY comments were also seen as condescent (which I realise they might) - but this is simply coming from the sheer volume of things we have to deal with - it's very difficult to balance brevity (not helpful) with long messages (TLDR) and the time it needs to put, so we rely a lot on the docs we put together on contributing, but - yeah - it's difficult to follow, I understand that.

@A-Costa A-Costa force-pushed the dev/use_glob branch 2 times, most recently from d3419c4 to 41feec9 Compare December 17, 2023 16:11
@A-Costa A-Costa changed the title WIP Add use_glob to GCSObjectExistenceSensor Add use_glob to GCSObjectExistenceSensor Dec 17, 2023
@A-Costa A-Costa requested a review from potiuk as a code owner December 17, 2023 16:20
@A-Costa
Copy link
Contributor Author

A-Costa commented Dec 17, 2023

I've added the missing test for GCSBlobTrigger and removed WIP from the title, as this is ready to be reviewed.

You'll need also to bump gcloud-aio version in provider.yaml so it will include the fix in the upstream library

@eladkal regarding this, it seems to me that the version is not pinned in the yaml, so it should be fine already:

@potiuk
Copy link
Member

potiuk commented Dec 17, 2023

Is there a minimum version of gcloud-aio needed for that ? if so we should set >= MIN

@A-Costa
Copy link
Contributor Author

A-Costa commented Dec 17, 2023

Yes, the match_glob parameter is first exposed in the 9.0.0 release:
https://github.com/talkiq/gcloud-aio/releases/tag/storage-9.0.0

I'll add the minimum version

@potiuk potiuk merged commit 9233541 into apache:main Dec 17, 2023
52 checks passed
Copy link

boring-cyborg bot commented Dec 17, 2023

Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.

@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Jan 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) provider:google Google (including GCP) related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants