This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
DAG serialization broken because of pendulum #20055
Comments
I believe this is deliberate. The error tells you what to do: change the timezone to use pendulum. If you look here: https://airflow.apache.org/docs/apache-airflow/stable/timezone.html#time-zone-aware-dags, you are supposed to use pendulum timezones (and it has been there for quite a while, since we expected pendulum, not datetime). While it could previously work with datetime timezones by accident, that was, I believe, not intended. So please convert your DAGs to use pendulum timezones and it should be fixed. @uranusjr - am I right? Should we close it?
As far as I am concerned, if this is deliberate, it is not a great choice. As mentioned, this date comes from the output of parsing a YAML file. For now we have implemented a workaround: intercept the output of the parsing before passing it to Airflow. But, to me, using pendulum is a choice internal to Airflow. At its interface, Airflow should be capable of accepting anything that comes from the Python standard library, and should not require anything extra to deal with something as simple as dates. At the very least, the DAG class should properly validate its input, since you are putting new constraints on the input it accepts. Also this change is not really documented (still states here
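The workaround described above (intercepting the parsed YAML before it reaches Airflow) could look roughly like the sketch below. It is not the commenter's actual code: the function names and the sample data are made up for illustration. The default converter turns aware datetimes into naive UTC values, which assumes Airflow's default_timezone setting is left at UTC. To follow the advice given in this thread instead, a callable such as pendulum.instance could be passed as convert.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Callable


def to_naive_utc(value: datetime) -> datetime:
    """Convert an aware datetime to UTC and drop its tzinfo; naive values pass through."""
    if value.tzinfo is None:
        return value
    return value.astimezone(timezone.utc).replace(tzinfo=None)


def normalize_datetimes(
    obj: Any, convert: Callable[[datetime], datetime] = to_naive_utc
) -> Any:
    """Recursively apply `convert` to every datetime inside parsed YAML data."""
    if isinstance(obj, datetime):
        return convert(obj)
    if isinstance(obj, dict):
        return {key: normalize_datetimes(val, convert) for key, val in obj.items()}
    if isinstance(obj, list):
        return [normalize_datetimes(item, convert) for item in obj]
    return obj


# Hypothetical structure, standing in for the output of a YAML parser.
parsed = {
    "dag_id": "example",
    "default_args": {"start_date": datetime(2020, 11, 1, tzinfo=timezone.utc)},
    "milestones": [datetime(2020, 11, 1, 2, 0, tzinfo=timezone(timedelta(hours=2)))],
}
cleaned = normalize_datetimes(parsed)
```

The original structure is left untouched; only the returned copy holds the converted values, so the parsed data can still be inspected when debugging.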
For those who might wonder, there is this documentation section. Maybe the documentation of the DAG class should be updated to reference this section so people are more aware of it.
That's a very simplistic view. Sometimes things work accidentally, especially if the implementation doesn't follow the documentation. So this is by far not a universal definition of a bug that anyone should follow; it's not a "0-1" definition. But I am not sure how it was in this case, so I will defer to others to comment. If I were sure, I'd have closed this or moved it to a discussion straight away.
Airflow has a utility for it that you can use to make sure you don't need to use anything else.
from airflow.utils import timezone
now = timezone.utcnow()
a_date = timezone.datetime(2017, 1, 1)
https://airflow.apache.org/docs/apache-airflow/2.2.2/timezone.html#naive-and-aware-datetime-objects
Airflow has relied on Pendulum for as long as I can remember, and we also make it clear on https://airflow.apache.org/docs/apache-airflow/2.2.2/timezone.html why we picked pendulum instead of pytz. However, if you have any suggestions on how we can make it clearer, we are happy to hear your thoughts.
Yeah. Concur with @kaxil - maybe there are some ways we can make it clearer. I will convert this into a discussion now, but maybe we can continue discussing there what can be done to improve it (and @cansjt, maybe you can even directly create a PR to clear up the confusion in the examples you pointed out? Airflow has > 1800 contributors, so that seems like an easy way to become one as well).
Apache Airflow version
2.2.2 (latest released)
Operating System
Debian
Versions of Apache Airflow Providers
N/A
Deployment
Other 3rd-party Helm chart
Deployment details
N/A
What happened
After upgrading from 2.1.3 to 2.2.2, Airflow is no longer capable of serializing some dags. It fails with the exception:
The start date was provided to the DAG initializer as datetime.datetime(2020, 11, 1, 0, 0, tzinfo=datetime.timezone.utc), which, FYI, is the result of parsing a YAML file.
What you expected to happen
Airflow should be capable of serializing the DAG without any error.
How to reproduce
Create a DAG with a start date as follows:
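A minimal sketch of such a start date, assuming the value reported under "What happened". Here datetime.fromisoformat stands in for the YAML parser, and the DAG declaration is left as a comment because it needs an Airflow installation; the dag_id shown there is made up for illustration.

```python
from datetime import datetime, timezone

# Stand-in for the YAML parser: an aware datetime carrying a
# standard-library tzinfo rather than a pendulum timezone.
start_date = datetime.fromisoformat("2020-11-01T00:00:00+00:00")

# Show which module and class the tzinfo object comes from.
tz_class = type(start_date.tzinfo)
print(tz_class.__module__, tz_class.__name__)

# With Airflow installed, the DAG would then be declared along these lines
# (hypothetical dag_id):
#     from airflow import DAG
#     dag = DAG(dag_id="example_stdlib_tz", start_date=start_date)
```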
And enable DAG serialization in Airflow's configuration (though this is the default now; there is no more store_serialized_dags option in the configuration reference).
Anything else
This is similar but yet slightly different from #16613.
May also be related to #19450.
Are you willing to submit PR?
Code of Conduct