GlueJobOperator failing with Invalid type for parameter RoleName after updating provider version. #29960
Comments
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise a PR to address this issue, please do so; no need to wait for approval.
Looks like this is a bug introduced in #27893. We now try to generate the config for the Glue job before we know whether we're creating a new job or updating an existing one. If the job already exists, the user doesn't need to provide as much info, but the new method for generating the config assumes all the info for creating a new job is present (in this case, the role). So it breaks: this user hasn't provided a role because their job already exists.
Worse than that, the whole job-update logic makes all sorts of assumptions that don't hold if the job was created outside of the operator (as in this case, where the user created the job with Terraform). The new logic will detect that the defaults in the operator-generated config differ from whatever was used outside of Airflow, and it will try to clobber the job. I didn't get a chance to review #27893, but it has many back-compat issues...
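The ordering problem described above can be sketched without Airflow or boto3. Everything here (`FakeGlue`, `build_create_config`, `run_job_eager`, `run_job_lazy`) is a hypothetical stand-in for illustration, not the provider's actual code; the only point is where the "does the job exist?" check sits relative to building the creation config.

```python
from typing import Optional


class FakeGlue:
    """Hypothetical stand-in for a Glue client; holds jobs created elsewhere (e.g. Terraform)."""

    def __init__(self, existing_jobs):
        self.jobs = set(existing_jobs)
        self.runs = []

    def job_exists(self, name: str) -> bool:
        return name in self.jobs

    def start_job_run(self, name: str) -> str:
        self.runs.append(name)
        return f"run-{len(self.runs)}"


def build_create_config(job_name: str, role_name: Optional[str]) -> dict:
    # Mimics strict parameter validation: creating a job requires a role name.
    if not isinstance(role_name, str):
        raise TypeError(f"Invalid type for parameter RoleName, value: {role_name!r}")
    return {"Name": job_name, "Role": role_name}


def run_job_eager(glue: FakeGlue, job_name: str, role_name: Optional[str] = None) -> str:
    # Problematic ordering: the creation config is built before we know
    # whether the job already exists, so a missing role always fails.
    config = build_create_config(job_name, role_name)
    if not glue.job_exists(job_name):
        glue.jobs.add(config["Name"])
    return glue.start_job_run(job_name)


def run_job_lazy(glue: FakeGlue, job_name: str, role_name: Optional[str] = None) -> str:
    # Safer ordering: only build the creation config when a job must be created.
    if not glue.job_exists(job_name):
        config = build_create_config(job_name, role_name)
        glue.jobs.add(config["Name"])
    return glue.start_job_run(job_name)
```

With a job that already exists and no role supplied, `run_job_eager` raises while `run_job_lazy` simply starts a run, which matches the behaviour the reporter expected.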
Oops! I did review this PR and did not see this. @romibuzi Any chance you can take a look at this issue and come up with a fix?
Oh damn, indeed the operator will now try to update the configuration :( Or, as @Taragolis advised, we split this operator in 2 and create another one
This would be a straightforward solution for sure! But it would be an API change, and we can't put the genie back in the bottle: your change is already out, and folks could already be using it with the expectation that it updates the job without the need for a flag. However, I think that's a pretty small contingent of folks, and we could honestly consider this behaviour a bug, so fixing it this way should probably not count as a back-compat break. I'd be interested to hear what others think.
Yes. An API change is sometimes necessary to fix bugs. We are not (and should not be) extremely strict with "any API change is backwards incompatible": if, for example, the interface is unusable or behaves in unpredictable or illogical ways, it's fine to introduce such changes as bugfixes.
@o-nikolas Yeah, I can definitely work on it and submit a PR this week!
I have proposed a PR adding this new parameter here: #30162
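For readers following along, here is a minimal sketch of how an opt-in flag could gate updates to an existing job. The parameter name `update_config` and the helper `sync_job` are assumptions made for illustration; the thread does not spell out the name, and this is not the provider's implementation.

```python
from typing import Optional, Tuple


def sync_job(
    existing: Optional[dict],
    desired: dict,
    update_config: bool = False,  # assumed flag name; off by default
) -> Tuple[str, dict]:
    """Decide what to do with a job and return (action, resulting config)."""
    if existing is None:
        # No job yet: create it from the operator-generated config.
        return "create", dict(desired)
    if update_config and existing != desired:
        # The user explicitly opted in to overwriting the existing job.
        return "update", dict(desired)
    # Default: leave a job managed elsewhere (e.g. Terraform) untouched.
    return "keep", dict(existing)
```

Defaulting the flag to off means a job created outside the operator is never overwritten unless the DAG author asks for it.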
Apache Airflow Provider(s)
amazon
Versions of Apache Airflow Providers
apache-airflow-providers-amazon = "7.3.0"
Apache Airflow version
2.5.1
Operating System
Debian GNU/Linux
Deployment
Official Apache Airflow Helm Chart
Deployment details
No response
What happened
After updating the provider version to 7.3.0 from 6.0.0, our glue jobs started failing. We currently use the GlueJobOperator to run existing Glue jobs that we manage in Terraform. The full traceback is below:
What you think should happen instead
The operator creates a new job run for a glue job without additional configuration.
How to reproduce
Create a DAG with a GlueJobOperator without using iam_role_name. Example:
Anything else
No response
Are you willing to submit PR?
Code of Conduct