Continue to use nanosecond-precision Timestamps in precision-sensitive areas #7731

spencerkclark · 2023-04-06T13:06:50Z

This addresses the remaining cftime-related test failures in #7707 by introducing a function that always returns a nanosecond-precision Timestamp object. Despite no corresponding test failures, for safety I grepped and went ahead and replaced the pd.Timestamp constructor with this function in a few other areas. I also updated our documentation to replace any mentions of the "Timestamp-valid range" with "nanosecond-precision range" since Timestamps are now more flexible, and included a note that we have an issue open for relaxing this nanosecond-precision assumption in xarray eventually.

While in principle I think it would be fine if CFTimeIndex.to_datetimeindex returned a DatetimeIndex with non-nanosecond-precision values, since we don't use to_datetimeindex anywhere outside of tests, in its current state it was returning nonsense values:

>>> import pandas as pd
>>> import xarray as xr
>>> times = xr.cftime_range("0001", periods=5)
>>> times.to_datetimeindex()
DatetimeIndex(['1754-08-30 22:43:41.128654848',
               '1754-08-31 22:43:41.128654848',
               '1754-09-01 22:43:41.128654848',
               '1754-09-02 22:43:41.128654848',
               '1754-09-03 22:43:41.128654848'],
              dtype='datetime64[ns]', freq=None)

This is due to the assumption in cftime_to_nptime that the resulting array will have nanosecond-precision values. We can (and should) address this eventually, but for the sake of quickly supporting pandas version two I decided to be conservative and punt this off to be part of #7493. cftime_to_nptime is used in places other than to_datetimeindex, so modifying it has other impacts downstream.

…estamp

keewis · 2023-04-06T13:26:22Z

to properly test this, I guess we'd need to merge #7724 first?

jsignell · 2023-04-06T13:38:38Z

to properly test this, I guess we'd need to merge #7724 first?

Otherwise the env in the upstream test will never solve right?

keewis · 2023-04-06T13:48:06Z

sorry, I forgot about being able to use the upstream-dev environment for testing. That should solve just fine since by design it ignores all pinned dependencies, but we currently get occasional segfaults and unrelated failing tests, and we only test on a single OS / python version.

spencerkclark · 2023-04-06T13:57:56Z

I'm fine waiting until #7724 is merged to let our main CI cover this. Indeed the upstream tests are flaky. Locally I just installed pandas 2 via pip to do testing during development.

xarray/core/pdcompat.py

dcherian · 2023-04-12T14:41:02Z

RTD failures are real:

WARNING: [autosummary] failed to import xarray.CFTimeIndex.is_all_dates.
Possible hints:
* ImportError: 
* AttributeError: type object 'CFTimeIndex' has no attribute 'is_all_dates'
* ModuleNotFoundError: No module named 'xarray.CFTimeIndex'
WARNING: [autosummary] failed to import xarray.CFTimeIndex.is_mixed.
Possible hints:
* ImportError: 
* AttributeError: type object 'CFTimeIndex' has no attribute 'is_mixed'
* ModuleNotFoundError: No module named 'xarray.CFTimeIndex'
WARNING: [autosummary] failed to import xarray.CFTimeIndex.is_monotonic.
Possible hints:
* ImportError: 
* AttributeError: type object 'CFTimeIndex' has no attribute 'is_monotonic'
* ModuleNotFoundError: No module named 'xarray.CFTimeIndex'
WARNING: [autosummary] failed to import xarray.CFTimeIndex.is_type_compatible.
Possible hints:
* AttributeError: type object 'CFTimeIndex' has no attribute 'is_type_compatible'
* ImportError: 
* ModuleNotFoundError: No module named 'xarray.CFTimeIndex'
WARNING: [autosummary] failed to import xarray.CFTimeIndex.set_value.
Possible hints:
* ImportError: 
* AttributeError: type object 'CFTimeIndex' has no attribute 'set_value'
* ModuleNotFoundError: No module named 'xarray.CFTimeIndex'
WARNING: [autosummary] failed to import xarray.CFTimeIndex.to_native_types.
Possible hints:
* ImportError: 
* AttributeError: type object 'CFTimeIndex' has no attribute 'to_native_types'
* ModuleNotFoundError: No module named 'xarray.CFTimeIndex'

Co-authored-by: Illviljan <[email protected]>

spencerkclark · 2023-04-12T22:42:26Z

Thanks all for the help! Fingers crossed things should be all green now. Happy to address any more review comments.

dcherian · 2023-04-13T03:35:01Z

There are a bunch of warnings in the tests that could be silenced:

D:\a\xarray\xarray\xarray\tests\test_dataset.py:516: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.

But we can also just merge quickly to get a release out

spencerkclark · 2023-04-13T14:50:15Z

Thanks for noting that @dcherian -- I think I got to all of them now.

dcherian · 2023-04-13T14:58:20Z

Thanks for patiently working through this Spencer. I'll merge now, and then we can release tomorrow.

* main: (34 commits) Update whats-new.rst Fix binning by unsorted array (pydata#7762) Bump codecov/codecov-action from 3.1.1 to 3.1.2 (pydata#7760) Fix typing errors using mypy 1.2 (pydata#7752) [skip-ci] dev whats-new Add whats-new for v2023.04.0 (pydata#7757) remove the `black` hook (pydata#7756) reword the what's new entry for the `pandas` 2.0 dtype changes (pydata#7755) restructure the contributing guide (pydata#7681) Continue to use nanosecond-precision Timestamps in precision-sensitive areas (pydata#7731) minor doc updates to clarify extensions using accessors (pydata#7751) align: Avoid reindexing when join="exact" (pydata#7736) `pandas=2.0` support (pydata#7724) Clarify vectorized indexing documentation (pydata#7747) Avoid recasting a CFTimeIndex (pydata#7735) fix typo (pydata#7746) [pre-commit.ci] pre-commit autoupdate (pydata#7745) Bump pypa/gh-action-pypi-publish from 1.8.4 to 1.8.5 (pydata#7743) preserve boolean dtype in encoding (pydata#7720) [skip-ci] Add alignment benchmarks (pydata#7738) ...

[test-upstream] use nanosecond-precision timestamps for now

6868618

github-actions bot added the topic-cftime label Apr 6, 2023

[test-upstream] allow kwargs to be passed to nanosecond_precision_tim…

dbe7b85

…estamp

keewis mentioned this pull request Apr 7, 2023

pandas=2.0 support #7724

Merged

dcherian and others added 2 commits April 12, 2023 07:04

Merge branch 'main' into enforce-nanosecond-precision

9ef7322

unpin pandas

058354b

github-actions bot added CI Continuous Integration tools dependencies Pull requests that update a dependency file labels Apr 12, 2023

Illviljan reviewed Apr 12, 2023

View reviewed changes

xarray/core/pdcompat.py Outdated Show resolved Hide resolved

spencerkclark and others added 5 commits April 12, 2023 18:18

Add type hint for nanosecond_precision_timestamp

0e9d83a

Co-authored-by: Illviljan <[email protected]>

Add one more TODO comment

1520629

Remove deleted attributes in CFTimeIndex documentation

7c38022

Merge branch 'main' into enforce-nanosecond-precision

633a5c1

Add a what's new entry

06029d7

Remove one more attribute from api-hidden.rst

602b154

spencerkclark added 2 commits April 13, 2023 06:53

Silence conversion warnings in tests

7ee1a97

Silence more conversion warnings

4e24ca8

dcherian merged commit c9c1c6d into pydata:main Apr 13, 2023

spencerkclark deleted the enforce-nanosecond-precision branch April 13, 2023 15:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Continue to use nanosecond-precision Timestamps in precision-sensitive areas #7731

Continue to use nanosecond-precision Timestamps in precision-sensitive areas #7731

spencerkclark commented Apr 6, 2023

keewis commented Apr 6, 2023

jsignell commented Apr 6, 2023

keewis commented Apr 6, 2023

spencerkclark commented Apr 6, 2023

dcherian commented Apr 12, 2023 •

edited

Loading

spencerkclark commented Apr 12, 2023

dcherian commented Apr 13, 2023

spencerkclark commented Apr 13, 2023

dcherian commented Apr 13, 2023

Continue to use nanosecond-precision Timestamps in precision-sensitive areas #7731

Continue to use nanosecond-precision Timestamps in precision-sensitive areas #7731

Conversation

spencerkclark commented Apr 6, 2023

keewis commented Apr 6, 2023

jsignell commented Apr 6, 2023

keewis commented Apr 6, 2023

spencerkclark commented Apr 6, 2023

dcherian commented Apr 12, 2023 • edited Loading

spencerkclark commented Apr 12, 2023

dcherian commented Apr 13, 2023

spencerkclark commented Apr 13, 2023

dcherian commented Apr 13, 2023

dcherian commented Apr 12, 2023 •

edited

Loading