-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switch enable_cftimeindex to True by default #2516
Conversation
Hello @spencerkclark! Thanks for updating the PR.
Comment last updated on October 27, 2018 at 18:13 Hours UTC |
Should we add nice error messages to resample and plot telling the user what to do if they try those two with cftimeindex? |
doc/time-series.rst
Outdated
@@ -223,16 +224,26 @@ Through the standalone ``cftime`` library and a custom subclass of | |||
functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for | |||
dates from non-standard calendars or dates using a standard calendar, but |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we clarify "non-standard calendars" -> "non-standard calendars commonly used in climate science"
This paragraph is missing a bit of context for non-climate users :)
doc/time-series.rst
Outdated
|
||
- Resampling along the time dimension for data indexed by a | ||
:py:class:`~xarray.CFTimeIndex` | ||
- Built-in plotting of data with :py:class:`cftime.datetime` coordinate axes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to link to the GitHub issues for these.
doc/time-series.rst
Outdated
|
||
If at any time you would like to restore the old default behavior, which was | ||
to attempt to decode datetimes into ``np.datetime64[ns]`` objects whenever | ||
possible (regardless of calendar type), you can set ``enable_cftimeindex`` to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather encourage people to convert their data to a standard calendar via normal xarray/cftime APIs. That's a cleaner solution than setting global configuration.
Can we add an example of this with existing APIs? Or maybe we should augment CFTimeIndex with a new to_datetime64
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather encourage people to convert their data to a standard calendar via normal xarray/cftime APIs. That's a cleaner solution than setting global configuration.
This is a really good point and would make things a lot easier to understand. Currently we use xarray.coding.times.cftime_to_nptime
internally; I can convert it to a better-documented public function (perhaps with friendlier error handling too).
doc/whats-new.rst
Outdated
@@ -33,6 +33,12 @@ v0.11.0 (unreleased) | |||
Breaking changes | |||
~~~~~~~~~~~~~~~~ | |||
|
|||
- The option ``enable_cftimeindex`` has now been set to ``True`` by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should think about eventually removing this option entirely, and making it simply a "no-op" for now (raising FutureWarning). In the long term I think we would prefer to avoid keeping around extraneous global options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense -- that will clean up the datetime decoding logic (and tests) quite a bit.
@@ -291,7 +279,12 @@ def cftime_to_nptime(times): | |||
times = np.asarray(times) | |||
new = np.empty(times.shape, dtype='M8[ns]') | |||
for i, t in np.ndenumerate(times): | |||
dt = datetime(t.year, t.month, t.day, t.hour, t.minute, t.second) | |||
try: | |||
dt = pd.Timestamp(t.year, t.month, t.day, t.hour, t.minute, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I switched to using pd.Timestamp
here, because directly converting from datetime.datetime
to np.datetime64[ns]
did not raise an error for dates outside 1678 to 2262:
In [1]: from datetime import datetime; import numpy as np
In [2]: np.datetime64(datetime(1, 1, 1), 'ns')
Out[2]: numpy.datetime64('1754-08-30T22:43:41.128654848')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good -- please note this in a comment in the code
The usual rule is that if you have to comment on your pull request to explain why you did something, you should just comment the code instead! :)
@@ -291,7 +279,12 @@ def cftime_to_nptime(times): | |||
times = np.asarray(times) | |||
new = np.empty(times.shape, dtype='M8[ns]') | |||
for i, t in np.ndenumerate(times): | |||
dt = datetime(t.year, t.month, t.day, t.hour, t.minute, t.second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was there a reason we did not include the microsecond
attribute of the datetime here previously?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly microsecond
is only a recent addition to cftime.datetime
?
'calendar, {!r}, to a pandas.DatetimeIndex, which uses dates ' | ||
'from the standard calendar. This may lead to subtle errors ' | ||
'in operations that depend on the length of time between ' | ||
'dates.'.format(calendar), RuntimeWarning) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, but can we add a keyword argument for silencing the warning? Maybe index.to_datetimeindex(unsafe=True)
could silence the warning?
@@ -291,7 +279,12 @@ def cftime_to_nptime(times): | |||
times = np.asarray(times) | |||
new = np.empty(times.shape, dtype='M8[ns]') | |||
for i, t in np.ndenumerate(times): | |||
dt = datetime(t.year, t.month, t.day, t.hour, t.minute, t.second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly microsecond
is only a recent addition to cftime.datetime
?
@@ -291,7 +279,12 @@ def cftime_to_nptime(times): | |||
times = np.asarray(times) | |||
new = np.empty(times.shape, dtype='M8[ns]') | |||
for i, t in np.ndenumerate(times): | |||
dt = datetime(t.year, t.month, t.day, t.hour, t.minute, t.second) | |||
try: | |||
dt = pd.Timestamp(t.year, t.month, t.day, t.hour, t.minute, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good -- please note this in a comment in the code
The usual rule is that if you have to comment on your pull request to explain why you did something, you should just comment the code instead! :)
xarray/core/common.py
Outdated
@@ -683,6 +684,20 @@ def resample(self, freq=None, dim=None, how=None, skipna=None, | |||
else: | |||
raise TypeError("Dimension name should be a string; " | |||
"was passed %r" % dim) | |||
|
|||
if isinstance(self.indexes[dim_name], CFTimeIndex): | |||
raise TypeError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about NotImplementedError
for now? We do plan to support this eventually
xarray/core/options.py
Outdated
@@ -83,6 +85,11 @@ class set_options(object): | |||
|
|||
def __init__(self, **kwargs): | |||
self.old = OPTIONS.copy() | |||
if ENABLE_CFTIMEINDEX in kwargs: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add this as a function in SETTERS
instead? That lets us keep the option specific logic outside of this class.
xarray/core/options.py
Outdated
@@ -52,7 +54,7 @@ class set_options(object): | |||
Default: ``'inner'``. | |||
- ``enable_cftimeindex``: flag to enable using a ``CFTimeIndex`` | |||
for time indexes with non-standard calendars or dates outside the | |||
Timestamp-valid range. Default: ``False``. | |||
Timestamp-valid range. Default: ``True``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove this now -- no need to document a deprecated/unused option.
Conflicts: xarray/tests/test_coding_times.py
This would have failed before including the microsecond attribute of each date in cftime_to_nptime in eaa4a44.
I don't know why this failed in one of the Appveyor builds:
|
Thanks @shoyer -- that was a bit puzzling but I think I sorted it out. I haven't done a careful diagnosis, but my hunch is that the previous build was running into this issue: pytest-dev/pytest#2917. A fix was merged a few weeks ago, but the Python 2.7 32-bit build is/was using a relatively old version of pytest (3.5 versus the most recent 3.9.3). In the previous build, the FutureWarning for setting the
In 6d08d3b I made changes to |
This looks like a reasonable fix to me |
* upstream/master: (122 commits) add missing , and article in error message (pydata#2557) Add libnetcdf, libhdf5, pydap and cfgrib to xarray.show_versions() (pydata#2555) revert to dev version for 0.11.1 Release xarray v0.11 DOC: update whatsnew for xarray 0.11 release (pydata#2548) Drop the hack needed to use CachingFileManager as we don't use it anymore. (pydata#2544) add full test env for py37 ci env (pydata#2545) Remove old-style resample example in documentation (pydata#2543) Stop loading tutorial data by default (pydata#2538) Remove the old syntax for resample. (pydata#2541) Remove use of deprecated, unused keyword. (pydata#2540) Deprecate inplace (pydata#2524) Zarr chunking (GH2300) (pydata#2487) Include multidimensional stacking groupby in docs (pydata#2493) (pydata#2536) Switch enable_cftimeindex to True by default (pydata#2516) Raise more informative error when converting tuples to Variable. (pydata#2523) Global option to always keep/discard attrs on operations (pydata#2482) Remove tests where answers change in cftime 1.0.2.1 (pydata#2522) Finish deprecation cycle for DataArray.__contains__ checking array values (pydata#2520) Fix bug where OverflowError is not being raised (pydata#2519) ...
As discussed in #2437 and #2505, this sets the option
enable_cftimeindex
toTrue
by default.whats-new.rst
for all changes.