-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Address latest pandas-related upstream test failures #9081
Address latest pandas-related upstream test failures #9081
Conversation
ed23adc
to
ad164b7
Compare
ad164b7
to
bd875d3
Compare
(datetime(2000, 1, 1), has_pandas_3), | ||
(np.array([datetime(2000, 1, 1)]), has_pandas_3), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With pandas 3, pd.Series(datetime.datetime(...))
will produce a Series
with np.datetime64[us]
values instead of np.datetime64[ns]
values, so this conversion now warns.
dd = times.to_pydatetime() | ||
reference_dates = [dd[0], dd[2]] | ||
reference_dates = [times[0], times[2]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, whether reference_dates
started as datetime.datetime
objects or np.datetime64[ns]
values was not material to this test, so I removed the conversion to datetime.datetime
to avoid the conversion warning under pandas 3 (the times would previously get converted back to datetime64[ns]
values in the DataArray
constructor).
xarray/tests/test_dataarray.py
Outdated
roundtripped = DataArray.from_dict(da.to_dict()) | ||
with warnings.catch_warnings(): | ||
warnings.filterwarnings("ignore", message="Converting non-nanosecond") | ||
roundtripped = DataArray.from_dict(da.to_dict()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
da.to_dict()
produces datetime.datetime
objects, which under pandas 3 lead to a conversion warning in the DataArray
constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we have this pattern in multiple modules, it might be worth adding the code as a special context manager to xarray.tests.__init__
. Something like this might work (I didn't check):
from contextlib import contextmanager
import warnings
@contextmanager
def ignore_warnings(category=None, pattern=None):
if category is None and pattern is None:
raise ValueError("need at least one of category and pattern")
try:
with warnings.catch_warnings():
warnings.filterwarnings("ignore", message=pattern, category=category)
yield
finally:
pass
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks—I ended up switching to marking these tests with @pytest.mark.filterwarnings("ignore:Converting non-nanosecond")
, since that is a pattern we use elsewhere in the tests already.
xarray/tests/test_plot.py
Outdated
darray = DataArray(data, dims=["time"]) | ||
darray.coords["time"] = np.array([datetime(2017, m, 1) for m in month]) | ||
times = pd.date_range(start="2017-01-01", freq="ME", periods=12) | ||
darray = DataArray(data, dims=["time"], coords=[times]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use of datetime.datetime
objects was immaterial to this test, so we use pd.date_range
to produce the dates instead to avoid the non-nanosecond conversion warning.
xarray/tests/test_variable.py
Outdated
with warnings.catch_warnings(): | ||
warnings.filterwarnings("ignore", message="Converting non-nanosecond") | ||
expected = self.cls("t", dates) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed since dates
sometimes consists of datetime.datetime
objects, which leads to a conversion warning under pandas 3.
@@ -529,7 +529,7 @@ def test_roundtrip_string_encoded_characters(self) -> None: | |||
assert actual["x"].encoding["_Encoding"] == "ascii" | |||
|
|||
def test_roundtrip_numpy_datetime_data(self) -> None: | |||
times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"]) | |||
times = pd.to_datetime(["2000-01-01", "2000-01-02", "NaT"], unit="ns") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas.to_datetime
will infer the precision from the input in pandas 3, so we explicitly specify the desired precision now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick fixes! This looks good to me.
I know I have not been doing that either for the numpy>=2
changes, but I wonder if we should add a whats-new
entry (internal changes)?
Thanks @spencerkclark |
* Address pandas-related upstream test failures * Address more warnings * Don't lose coverage for pandas < 3 * Address one more warning * Fix accidental change from MS to ME * Use datetime64[ns] arrays * Switch to @pytest.mark.filterwarnings
* don't remove `netcdf4` from the upstream-dev environment * also stop removing `h5py` and `hdf5` * hard-code the precision (I believe this was missed in #9081) * don't remove `h5py` either * use on-diks _FillValue as standrd expects, use view instead of cast to prevent OverflowError. * whats-new * unpin `numpy` * rework UnsignedCoder * add test * Update xarray/coding/variables.py Co-authored-by: Justus Magin <[email protected]> --------- Co-authored-by: Kai Mühlbauer <[email protected]> Co-authored-by: Kai Mühlbauer <[email protected]> Co-authored-by: Deepak Cherian <[email protected]>
This PR addresses the upstream failures described in #8844 (comment) with a few minor changes to ensure that, for the time being, nanosecond precision times continue to be used in xarray. These failures stem from pandas-dev/pandas#55901, which causes
pandas.to_datetime
to infer the precision to use based its input instead of always using nanosecond precision.See the review comments for an explanation of the changes.