BUG: Fix float32 precision issues in pd.to_datetime (#60510)
* BUG: Fix float32 precision issues in pd.to_datetime

* BUG: Add note to whatsnew
snitish authored Dec 9, 2024
1 parent b667fdf commit 6cbe941
Showing 3 changed files with 18 additions and 0 deletions.
1 change: 1 addition & 0 deletions doc/source/whatsnew/v3.0.0.rst
@@ -626,6 +626,7 @@ Datetimelike
- Bug in :meth:`DatetimeIndex.union` and :meth:`DatetimeIndex.intersection` when ``unit`` was non-nanosecond (:issue:`59036`)
- Bug in :meth:`Series.dt.microsecond` producing incorrect results for pyarrow backed :class:`Series`. (:issue:`59154`)
- Bug in :meth:`to_datetime` not respecting dayfirst if an uncommon date string was passed. (:issue:`58859`)
- Bug in :meth:`to_datetime` on a float32 :class:`DataFrame` with ``year``, ``month``, ``day``, etc. columns causing precision issues and an incorrect result. (:issue:`60506`)
- Bug in :meth:`to_datetime` reports incorrect index in case of any failure scenario. (:issue:`58298`)
- Bug in :meth:`to_datetime` wrongly converts when ``arg`` is a ``np.datetime64`` object with unit of ``ps``. (:issue:`60341`)
- Bug in setting scalar values with mismatched resolution into arrays with non-nanosecond ``datetime64``, ``timedelta64`` or :class:`DatetimeTZDtype` incorrectly truncating those scalars (:issue:`56410`)
5 changes: 5 additions & 0 deletions pandas/core/tools/datetimes.py
@@ -44,6 +44,7 @@
from pandas.core.dtypes.common import (
    ensure_object,
    is_float,
    is_float_dtype,
    is_integer,
    is_integer_dtype,
    is_list_like,
@@ -1153,6 +1154,10 @@ def coerce(values):
        # we allow coercion to if errors allows
        values = to_numeric(values, errors=errors)

        # prevent precision issues in case of float32 # GH#60506
        if is_float_dtype(values.dtype):
            values = values.astype("float64")

        # prevent overflow in case of int8 or int16
        if is_integer_dtype(values.dtype):
            values = values.astype("int64")
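
Why widening to float64 matters can be sketched with the standard library alone. The example below assumes that the date columns are combined into one number of the form YYYYMMDD (year * 10000 + month * 100 + day); that step is not part of this diff, so treat it as an assumption about the surrounding code. The helpers `as_float32`, `assemble_float32` and `assemble_float64` are hypothetical names used only for illustration. float32 has a 24-bit significand, so above 2**24 = 16777216 it can only represent every second integer, and an odd value such as 20160305 cannot be stored exactly.

```python
import struct


def as_float32(x: float) -> float:
    """Round x to the nearest float32 value (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def assemble_float32(year: int, month: int, day: int) -> float:
    """Build YYYYMMDD while rounding every intermediate result to float32."""
    total = as_float32(as_float32(year) * as_float32(10000))
    total = as_float32(total + as_float32(as_float32(month) * as_float32(100)))
    return as_float32(total + as_float32(day))


def assemble_float64(year: int, month: int, day: int) -> float:
    """Build YYYYMMDD in float64, the default Python float."""
    return float(year) * 10000 + float(month) * 100 + float(day)


f32 = assemble_float32(2016, 3, 5)
f64 = assemble_float64(2016, 3, 5)

# The float64 value keeps the day; the float32 value is off by one.
print(int(f64), int(f32))
```

In this sketch the float64 result is exactly 20160305, while the float32 result lands on a neighbouring even integer, so the assembled date would carry the wrong day. Casting to float64 before the arithmetic avoids the rounding.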
12 changes: 12 additions & 0 deletions pandas/tests/tools/test_to_datetime.py
@@ -2084,6 +2084,18 @@ def test_dataframe_str_dtype(self, df, cache):
        )
        tm.assert_series_equal(result, expected)

    def test_dataframe_float32_dtype(self, df, cache):
        # GH#60506
        # coerce to float64
        result = to_datetime(df.astype(np.float32), cache=cache)
        expected = Series(
            [
                Timestamp("20150204 06:58:10.001002003"),
                Timestamp("20160305 07:59:11.001002003"),
            ]
        )
        tm.assert_series_equal(result, expected)

    def test_dataframe_coerce(self, cache):
        # passing coerce
        df2 = DataFrame({"year": [2015, 2016], "month": [2, 20], "day": [4, 5]})