-
I believe this is deliberate. The error tells you what to do: change the timezone to use pendulum. If you look here: https://airflow.apache.org/docs/apache-airflow/stable/timezone.html#time-zone-aware-dags, you are supposed to use pendulum timezones (and the docs have said so for quite a while, since we expected pendulum rather than datetime timezones). While it could previously work by accident with datetime timezones, that was, I believe, not intended. So please convert your DAGs to use pendulum timezones and it should be fixed. @uranusjr - am I right? Should we close it?
-
As far as I am concerned, if this is deliberate, it is not a great choice. As mentioned, this date comes from the output of parsing a YAML file. For now we have implemented a workaround: intercept the output of the parsing before passing it to Airflow. But, to me, using pendulum is a choice internal to Airflow. At its interface, Airflow should be capable of accepting anything that comes from the Python standard library, and should not require anything extra to deal with something as simple as dates. At the very least, the DAG class should properly validate its input, since you are putting new constraints on the input it accepts. Also, this change is not really documented (still states here
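A workaround of the kind described above could look roughly like the following. This is only a sketch, not the commenter's actual code: the helper name `convert_aware_datetimes`, the shape of `parsed`, and the placeholder callback are all hypothetical. With pendulum installed, `pendulum.instance` might be a candidate for the callback, but that has not been checked against the Airflow versions discussed in this thread.

```python
from datetime import datetime, timezone
from typing import Any, Callable


def convert_aware_datetimes(obj: Any, convert: Callable[[datetime], Any]) -> Any:
    """Recursively apply `convert` to every timezone-aware datetime in `obj`."""
    if isinstance(obj, datetime):
        # Naive datetimes (no UTC offset) are returned unchanged.
        return convert(obj) if obj.utcoffset() is not None else obj
    if isinstance(obj, dict):
        return {key: convert_aware_datetimes(value, convert) for key, value in obj.items()}
    if isinstance(obj, list):
        return [convert_aware_datetimes(item, convert) for item in obj]
    return obj


# Hypothetical stand-in for the output of a YAML parser.
parsed = {
    "dag_id": "example",
    "default_args": {"start_date": datetime(2020, 11, 1, tzinfo=timezone.utc)},
}

# Placeholder callback (an assumption for illustration): it only tags the value,
# so the sketch runs with the standard library alone. A real workaround would
# pass a callback that returns a datetime Airflow can serialize.
converted = convert_aware_datetimes(parsed, lambda dt: ("converted", dt.isoformat()))
print(converted["default_args"]["start_date"])
```

Keeping the conversion in a callback means the traversal does not depend on any particular timezone library, so the conversion logic stays in one place between the parser and Airflow.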
-
For those who might wonder, there is this documentation section. Maybe the documentation of the DAG class should be updated to reference this section so people are more aware of it.
-
That's a very simplistic view. Sometimes things work accidentally, especially if the implementation doesn't follow the documentation. So this is by far not a universal definition of a bug that anyone should follow; it's not a "0-1" definition. But I am not sure how it was in this case, so I will defer to others to comment. Had I been sure, I'd have closed this or moved it to a discussion straight away.
-
Airflow has a utility for this, so you don't need to use anything else:
from airflow.utils import timezone
now = timezone.utcnow()
a_date = timezone.datetime(2017, 1, 1)
https://airflow.apache.org/docs/apache-airflow/2.2.2/timezone.html#naive-and-aware-datetime-objects
Airflow has relied on Pendulum for as long as I can remember, and we explain on https://airflow.apache.org/docs/apache-airflow/2.2.2/timezone.html why we picked pendulum instead of pytz. However, if you have any suggestions on how we can make it clearer, we are happy to hear your thoughts.
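For readers unsure of the naive/aware distinction that the linked section relies on, here is a standard-library-only sketch. The helper name `is_aware` is hypothetical and this is not Airflow's own implementation. It follows the definition in the Python `datetime` documentation: a datetime is aware when it has a `tzinfo` whose `utcoffset` is not `None`.

```python
from datetime import datetime, timezone


def is_aware(value: datetime) -> bool:
    """Return True if `value` carries a usable UTC offset (the stdlib definition of aware)."""
    return value.tzinfo is not None and value.tzinfo.utcoffset(value) is not None


naive = datetime(2017, 1, 1)                       # no tzinfo attached
aware = datetime(2017, 1, 1, tzinfo=timezone.utc)  # stdlib UTC timezone

print(is_aware(naive))  # False
print(is_aware(aware))  # True
```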
-
Yeah. Concur with @kaxil - maybe there are some ways we can make it clearer. I will convert this into a discussion now, but maybe we can continue discussing there what can be done to improve it (and @cansjt -> maybe you can even directly create a PR to clear up the confusion in the examples you pointed out? Airflow has > 1800 contributors, so that seems like an easy way to become one as well).
-
To clarify, #19450 is not strongly related.
-
I ran into this after upgrading from 2.1.4 to 2.2.3 using Google's Cloud Composer (basically a managed Airflow installation). I found this issue pretty quickly based on the error on some of our DAGs. I wanted to provide my thoughts.
Much of the documentation and examples I've seen use naive
As mentioned in a previous comment in this thread, the DAG documentation states the
It's also odd (to me) to have a specific timezone implementation, but to continue to allow naive datetimes. If Airflow always enforced tz-aware datetimes, documentation and examples would need to get updated to continue working. It would likely be a much harder transition than the one from Airflow
There's a FAQ page, which includes the question "What is the deal with start_time?" and it doesn't mention
I know it's not directly related to Airflow, but Google's Cloud Composer is also driving new users to Airflow. Their documentation also uses naive datetimes in their examples, and I haven't seen mention of pendulum in their documentation. I plan on opening a support case with Google and mentioning this issue in the hopes that they take some time to augment their documentation.
-
Apache Airflow version
2.2.2 (latest released)
Operating System
Debian
Versions of Apache Airflow Providers
N/A
Deployment
Other 3rd-party Helm chart
Deployment details
N/A
What happened
After upgrading from 2.1.3 to 2.2.2, Airflow is no longer capable of serializing some dags. It fails with the exception:
The start date was provided to the DAG initializer as datetime.datetime(2020, 11, 1, 0, 0, tzinfo=datetime.timezone.utc), which, FYI, is the result of parsing a YAML file.
What you expected to happen
Airflow should be capable of serializing the DAG without any error.
How to reproduce
Create a DAG with a start date as follows:
And enable DAG serialization in Airflow's configuration (though this is the default now; there is no longer a store_serialized_dags option in the configuration reference).
Anything else
This is similar to, yet slightly different from, #16613.
May also be related to #19450.