Backport PR #44828 on branch 1.3.x (REGR: resampling DataFrame with DateTimeIndex with empty groups and uint8, uint16 or uint32 columns incorrectly raising RuntimeError) (#44831)

Co-authored-by: Simon Hawkins <[email protected]>
meeseeksmachine and simonjayhawkins authored Dec 10, 2021
1 parent f3bcf95 commit 1804780
Showing 3 changed files with 29 additions and 3 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.3.5.rst
@@ -16,6 +16,7 @@ Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Fixed regression in :meth:`Series.equals` when comparing floats with dtype object to None (:issue:`44190`)
 - Fixed regression in :func:`merge_asof` raising error when array was supplied as join key (:issue:`42844`)
+- Fixed regression when resampling :class:`DataFrame` with :class:`DateTimeIndex` with empty groups and ``uint8``, ``uint16`` or ``uint32`` columns incorrectly raising ``RuntimeError`` (:issue:`43329`)
 - Fixed regression in creating a :class:`DataFrame` from a timezone-aware :class:`Timestamp` scalar near a Daylight Savings Time transition (:issue:`42505`)
 - Fixed performance regression in :func:`read_csv` (:issue:`44106`)
 - Fixed regression in :meth:`Series.duplicated` and :meth:`Series.drop_duplicates` when Series has :class:`Categorical` dtype with boolean categories (:issue:`44351`)
7 changes: 4 additions & 3 deletions pandas/core/groupby/ops.py
@@ -546,9 +546,10 @@ def _call_cython_op(
         elif is_bool_dtype(dtype):
             values = values.astype("int64")
         elif is_integer_dtype(dtype):
-            # e.g. uint8 -> uint64, int16 -> int64
-            dtype_str = dtype.kind + "8"
-            values = values.astype(dtype_str, copy=False)
+            # GH#43329 If the dtype is explicitly of type uint64 the type is not
+            # changed to prevent overflow.
+            if dtype != np.uint64:
+                values = values.astype(np.int64, copy=False)
         elif is_numeric:
             if not is_complex_dtype(dtype):
                 values = ensure_float64(values)
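
The effect of the changed integer branch in `_call_cython_op` can be illustrated outside pandas. The following is a minimal sketch using only NumPy, not the pandas implementation: `old_cast_dtype` and `new_cast` are hypothetical helper names that mirror the removed and added lines respectively. It shows that the old rule widened every unsigned dtype to `uint64`, while the new rule sends everything except `uint64` to `int64`, which can hold any `uint8`, `uint16` or `uint32` value without loss.

```python
import numpy as np


def old_cast_dtype(dtype: np.dtype) -> np.dtype:
    # Removed rule: keep the dtype kind and widen to 8 bytes,
    # so uint8 -> "u8" (uint64) and int16 -> "i8" (int64).
    return np.dtype(dtype.kind + "8")


def new_cast(values: np.ndarray) -> np.ndarray:
    # Added rule: cast every integer dtype to int64, except uint64,
    # which is left unchanged because its largest values overflow int64.
    if values.dtype != np.uint64:
        return values.astype(np.int64, copy=False)
    return values


for name in ("uint8", "uint16", "uint32", "uint64"):
    arr = np.array([0, 1, 0], dtype=name)
    print(name, "old ->", old_cast_dtype(arr.dtype), "new ->", new_cast(arr).dtype)

# Safe-cast check: uint32 always fits in int64, uint64 does not.
print(np.can_cast(np.uint32, np.int64), np.can_cast(np.uint64, np.int64))
```

Under the old rule a `uint8` column reached the `uint64` code path, which is the path the accompanying test expects to raise `RuntimeError` on an empty group.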
24 changes: 24 additions & 0 deletions pandas/tests/resample/test_datetime_index.py
@@ -1827,3 +1827,27 @@ def test_resample_aggregate_functions_min_count(func):
         index=DatetimeIndex(["2020-03-31"], dtype="datetime64[ns]", freq="Q-DEC"),
     )
     tm.assert_series_equal(result, expected)
+
+
+def test_resample_unsigned_int(uint_dtype):
+    # gh-43329
+    df = DataFrame(
+        index=date_range(start="2000-01-01", end="2000-01-03 23", freq="12H"),
+        columns=["x"],
+        data=[0, 1, 0] * 2,
+        dtype=uint_dtype,
+    )
+    df = df.loc[(df.index < "2000-01-02") | (df.index > "2000-01-03"), :]
+
+    if uint_dtype == "uint64":
+        with pytest.raises(RuntimeError, match="empty group with uint64_t"):
+            result = df.resample("D").max()
+    else:
+        result = df.resample("D").max()
+
+        expected = DataFrame(
+            [1, np.nan, 0],
+            columns=["x"],
+            index=date_range(start="2000-01-01", end="2000-01-03 23", freq="D"),
+        )
+        tm.assert_frame_equal(result, expected)
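
To see why the test expects `[1, np.nan, 0]`, it may help to trace the fixture by hand. The sketch below uses only the standard library and does not call pandas; it assumes that the string bounds in the filter are interpreted as midnight timestamps and that daily resampling groups rows by calendar day. The variable names are illustrative only.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Rebuild the 12-hourly index from 2000-01-01 00:00 up to 2000-01-03 23:00.
start = datetime(2000, 1, 1)
end = datetime(2000, 1, 3, 23)
index = []
t = start
while t <= end:
    index.append(t)
    t += timedelta(hours=12)
data = [0, 1, 0] * 2

# Same filter as the test: keep rows before Jan 2 or strictly after Jan 3 00:00.
kept = [
    (ts, v)
    for ts, v in zip(index, data)
    if ts < datetime(2000, 1, 2) or ts > datetime(2000, 1, 3)
]

# Group the surviving rows by calendar day.
buckets = defaultdict(list)
for ts, v in kept:
    buckets[ts.date()].append(v)

# Daily maximum; a day with no rows becomes NaN, like an empty resample group.
days = [start.date() + timedelta(days=i) for i in range(3)]
daily_max = [float(max(buckets[d])) if buckets[d] else float("nan") for d in days]
print(daily_max)  # [1.0, nan, 0.0]
```

The filter removes every row on 2000-01-02 and the midnight row on 2000-01-03, so the middle day is an empty group. That empty group is what forces a NaN into the result.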
