Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better support for Boto Waiters #28236

Merged
merged 3 commits into from
Dec 12, 2022
Merged

Conversation

ferruzzi
Copy link
Contributor

@ferruzzi ferruzzi commented Dec 8, 2022

This PR cleans up a bit of tech debt in the EKS operators. When I wrote them I didn't know what boto waiters were. A Boto waiter is similar to an Ariflow Sensor but the delay, backoff, and max-retries is all handled server-side by the service. I've replaced a bunch of my custom waiter code with official boto waiters, added the infrastructure to allow us to write custom waiters, along with documentation and an example.

This should make future AWS sensors much easier and cleaner to write. We may want to go through the existing sensors and convert them over at some point. This also opens up an interesting idea of using a custom waiter in a TaskFlow method in place of a sensor. Who knows.

Rendered version of the README is here for easier review.

@ferruzzi ferruzzi requested a review from eladkal as a code owner December 8, 2022 20:11
@boring-cyborg boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Dec 8, 2022
airflow/providers/amazon/aws/waiters/base_waiter.py Outdated Show resolved Hide resolved
@@ -0,0 +1,37 @@
# Licensed to the Apache Software Foundation (ASF) under one
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought BatchClient already implements some kind of waiters (I personally never use it in this hook)

  • airflow/providers/amazon/aws/hooks/batch_waiters.json
  • airflow/providers/amazon/aws/hooks/batch_waiters.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes? Looks like it. I've never used that one and it's not something you could inherit and use for other services.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean a good point one day also migrate batch waiters to generic solution

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this would lay the framework for that and adding whatever other custom waiters that folks want to make. The more I look at it, the more I think it'll either simplify or replace a lot of Sensors as well, which wasn't the intended purpose but a very nice side effect.

Comment on lines 601 to 602
def get_waiter(self, waiter_name):
return EksBotoWaiter(client=self.conn).waiter(waiter_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought: "Is it possible to make it part of base AWS hook?"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but then you'd have to pass it the waiter model every time too, right?

Copy link
Contributor

@Taragolis Taragolis Dec 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just an idea.

It could be some optional argument of hook and if it None than use predefined model for hook (e.g. from waiters directory) if defined than use this model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand a lot of boto3 clients doesn't have waiters so it might be add additional complexity if it included in base hook

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe possible to work some magic.... if every service's custom waiter model config is in ~/aws/waiters/{service}.json then the service name could be extracted from the hook

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one additional thing by default there are limited files which included in resulting wheel/sdist package, and json files do not add by default.

So we also need updates in this file:

{% if PROVIDER_PACKAGE_ID == 'amazon' %}
include airflow/providers/amazon/aws/hooks/batch_waiters.json
{% elif PROVIDER_PACKAGE_ID == 'google' %}
include airflow/providers/google/cloud/example_dags/*.yaml
include airflow/providers/google/cloud/example_dags/*.sql

For avoid the situation which happen with k8s provider: #27451

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhhh, thank you. I was unaware of that. I had a chat with Vincent yesterday and he suggested another change so I should have this updated today. Sorry for the delay.

Copy link
Contributor Author

@ferruzzi ferruzzi Dec 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file you linked at dev/provider_packages/MANIFEST_TEMPLATE.in.jinja2 has a message in it saying

# NOTE! THIS FILE IS AUTOMATICALLY GENERATED AND WILL BE
# OVERWRITTEN WHEN PREPARING PACKAGES.

# IF YOU WANT TO MODIFY IT, YOU SHOULD MODIFY THE TEMPLATE
# `MANIFEST_TEMPLATE.py.jinja2` IN the `provider_packages` DIRECTORY

but there is no file by that name. Do you know if that's an old warning that should be removed/ignored?

ferruzzi:~/workplace/airflow$ find . -name MANIFEST_TEMPLATE*
./dev/provider_packages/MANIFEST_TEMPLATE.in.jinja2

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment added to generated MANIFEST.in into the provider package.

I thought it some mistake and it should be MANIFEST_TEMPLATE.in.jinja2 instead of MANIFEST_TEMPLATE.py.jinja2.

cc: @potiuk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed MANIFEST_TEMPLATE.in.jinja2. If that's not right, I can fix it. I'll need to rebase before this gets merged anyway.

@ferruzzi ferruzzi force-pushed the ferruzzi/eks-waiters branch 3 times, most recently from 391e883 to 0d25b1e Compare December 11, 2022 08:24
Copy link
Contributor

@Taragolis Taragolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good

@ferruzzi
Copy link
Contributor Author

Looks like it is all passing but wants a rebase again. I'll give it a nudge.

@ferruzzi ferruzzi force-pushed the ferruzzi/eks-waiters branch from 0d25b1e to 617665d Compare December 12, 2022 02:59
@ferruzzi ferruzzi closed this Dec 12, 2022
@ferruzzi ferruzzi reopened this Dec 12, 2022
@ferruzzi
Copy link
Contributor Author

@Taragolis - You have the power now! Do eeeetttt! :P

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers provider:amazon AWS/Amazon - related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants