REGR: Series.__mod__ behaves different with numexpr #36552

Merged 9 commits on Sep 30, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.1.3.rst
@@ -34,6 +34,7 @@ Fixed regressions
 - Fixed regression when adding a :meth:`timedelta_range` to a :class:`Timestamp` raised a ``ValueError`` (:issue:`35897`)
 - Fixed regression in :meth:`Series.__getitem__` incorrectly raising when the input was a tuple (:issue:`35534`)
 - Fixed regression in :meth:`Series.__getitem__` incorrectly raising when the input was a frozenset (:issue:`35747`)
+- Fixed regression in modulo of :class:`Index`, :class:`Series` and :class:`DataFrame` using ``numexpr`` using C not Python semantics (:issue:`36047`, :issue:`36526`)
 - Fixed regression in :meth:`read_excel` with ``engine="odf"`` caused ``UnboundLocalError`` in some cases where cells had nested child nodes (:issue:`36122`, :issue:`35802`)
 - Fixed regression in :meth:`DataFrame.replace` inconsistent replace when using a float in the replace method (:issue:`35376`)
 - Fixed regression in :class:`DataFrame` and :class:`Series` comparisons between numeric arrays and strings (:issue:`35700`, :issue:`36377`)
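
The modulo entry in this hunk contrasts C and Python semantics. As a minimal sketch using only the standard library, `math.fmod` can stand in for the C convention (the result takes the sign of the dividend), while Python's `%` takes the sign of the divisor. The helper names `c_style_mod` and `python_style_mod` are illustrative only and are not part of pandas or numexpr.

```python
import math


def c_style_mod(a, b):
    # math.fmod follows C's fmod: the result has the sign of the dividend `a`.
    return math.fmod(a, b)


def python_style_mod(a, b):
    # Python's % operator: the result has the sign of the divisor `b`.
    return a % b


# The two conventions only disagree when the operands have different signs.
for a, b in [(7, 5), (-7, 5), (7, -5), (-7, -5)]:
    print(a, b, python_style_mod(a, b), c_style_mod(a, b))
```

For the pair `(-7, 5)`, Python's `%` gives `3` while `math.fmod` gives `-2.0`; for `(7, 5)` and `(-7, -5)` the two agree in value.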
5 changes: 4 additions & 1 deletion pandas/core/computation/expressions.py
@@ -133,7 +133,10 @@ def _evaluate_numexpr(op, op_str, a, b):
     roperator.rtruediv: "/",
     operator.floordiv: "//",
     roperator.rfloordiv: "//",
-    operator.mod: "%",
+    # we require Python semantics for mod of negative for backwards compatibility
+    # see https://github.com/pydata/numexpr/issues/365
+    # so sticking with unaccelerated for now
+    operator.mod: None,
     roperator.rmod: "%",
     operator.pow: "**",
     roperator.rpow: "**",
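
The `None` entry appears to act as an opt-out: an operator with no engine string is evaluated on the ordinary, unaccelerated path. The sketch below illustrates that idea under stated assumptions. `OP_STRINGS`, `accelerated` and `evaluate` are hypothetical stand-ins rather than the actual pandas internals, and the accelerated engine is simulated with C-style `math.fmod`.

```python
import math
import operator

# Hypothetical mapping in the spirit of the diff: an operator maps to an
# expression string for the accelerated engine, or to None to opt out.
OP_STRINGS = {
    operator.add: "+",
    operator.mod: None,  # opt out so that Python semantics are kept
}


def accelerated(op_str, a, b):
    # Stand-in for an accelerated engine that uses C-style semantics.
    if op_str == "+":
        return a + b
    if op_str == "%":
        return math.fmod(a, b)
    raise NotImplementedError(op_str)


def evaluate(op, a, b):
    op_str = OP_STRINGS.get(op)
    if op_str is None:
        # No engine string registered: fall back to the plain operator.
        return op(a, b)
    return accelerated(op_str, a, b)


print(evaluate(operator.mod, -7, 5))  # plain Python path
print(evaluate(operator.add, 2, 3))   # simulated accelerated path
```

With this layout, restoring acceleration later would only require changing the mapping entry back to a string, which may be why the change is expressed as data rather than as a special case in the evaluation function.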
2 changes: 0 additions & 2 deletions pandas/core/ops/methods.py
@@ -171,8 +171,6 @@ def _create_methods(cls, arith_method, comp_method, bool_method, special):
         mul=arith_method(cls, operator.mul, special),
         truediv=arith_method(cls, operator.truediv, special),
         floordiv=arith_method(cls, operator.floordiv, special),
-        # Causes a floating point exception in the tests when numexpr enabled,
-        # so for now no speedup
Member Author

This message should have been moved in #19649

I think removing it is OK, provided we have not had reports of floating point exceptions since this was enabled.

         mod=arith_method(cls, operator.mod, special),
         pow=arith_method(cls, operator.pow, special),
         # not entirely sure why this is necessary, but previously was included
40 changes: 39 additions & 1 deletion pandas/tests/test_expressions.py
@@ -6,7 +6,7 @@
 import pytest

 import pandas._testing as tm
-from pandas.core.api import DataFrame
+from pandas.core.api import DataFrame, Index, Series
 from pandas.core.computation import expressions as expr

 _frame = DataFrame(randn(10000, 4), columns=list("ABCD"), dtype="float64")
@@ -380,3 +380,41 @@ def test_frame_series_axis(self, axis, arith):

         result = op_func(other, axis=axis)
         tm.assert_frame_equal(expected, result)
+
+    @pytest.mark.parametrize(
+        "op",
+        [
+            "__mod__",
+            pytest.param("__rmod__", marks=pytest.mark.xfail(reason="GH-36552")),
+            "__floordiv__",
+            "__rfloordiv__",
+        ],
+    )
+    @pytest.mark.parametrize("box", [DataFrame, Series, Index])
+    @pytest.mark.parametrize("scalar", [-5, 5])
+    def test_python_semantics_with_numexpr_installed(self, op, box, scalar):
+        # https://github.com/pandas-dev/pandas/issues/36047
+        expr._MIN_ELEMENTS = 0
Member

I see it is done like this in other tests as well (so it doesn't need to be fixed here), but it doesn't seem a very clean way to patch this, as it will influence other tests too.

Member Author

This is a class with a teardown_method that resets _MIN_ELEMENTS.

Member Author

But yes, we should replace this with a fixture.

Member

Ah, I didn't see the teardown/setup, only the import at the top.
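
One way to address the concern raised in this thread is to scope the override so that it is always undone. The sketch below uses a plain context manager, with a `SimpleNamespace` standing in for the `expr` module. Both are illustrative assumptions, and the starting value of `_MIN_ELEMENTS` is a placeholder. In a pytest suite the same effect would typically come from a fixture built on `monkeypatch.setattr`.

```python
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for pandas.core.computation.expressions; the value is a placeholder.
expr = SimpleNamespace(_MIN_ELEMENTS=10000)


@contextmanager
def min_elements(module, value):
    """Temporarily override module._MIN_ELEMENTS and always restore it."""
    original = module._MIN_ELEMENTS
    module._MIN_ELEMENTS = value
    try:
        yield module
    finally:
        # Runs even if the body raises, so later tests see the original value.
        module._MIN_ELEMENTS = original


with min_elements(expr, 0):
    inside = expr._MIN_ELEMENTS  # value seen by the code under test

after = expr._MIN_ELEMENTS  # value seen by everything that runs afterwards
print(inside, after)
```

Compared with assigning to the module attribute directly and relying on a class-level teardown, this keeps the override and its cleanup in one place.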

+        data = np.arange(-50, 50)
+        obj = box(data)
+        method = getattr(obj, op)
+        result = method(scalar)
+
+        # compare result with numpy
+        expr.set_use_numexpr(False)
+        expected = method(scalar)
+        expr.set_use_numexpr(True)
+        tm.assert_equal(result, expected)
+
+        # compare result element-wise with Python
+        for i, elem in enumerate(data):
+            if box == DataFrame:
+                scalar_result = result.iloc[i, 0]
+            else:
+                scalar_result = result[i]
+            try:
+                expected = getattr(int(elem), op)(scalar)
+            except ZeroDivisionError:
+                pass
+            else:
+                assert scalar_result == expected
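
The element-wise loop relies on two facts: calling a dunder method on a Python `int` gives the reference answer, and the reflected operators divide by the element, so an element of 0 raises `ZeroDivisionError`. The standard-library sketch below shows that reference computation in isolation. `python_reference` is a hypothetical helper and is not part of the pandas test suite.

```python
def python_reference(elem, op, scalar):
    """Return getattr(int(elem), op)(scalar), or None if it divides by zero."""
    try:
        return getattr(int(elem), op)(scalar)
    except ZeroDivisionError:
        return None


# Forward operators compute elem <op> scalar.
print(python_reference(-7, "__mod__", 5))
print(python_reference(-7, "__floordiv__", 5))

# Reflected operators compute scalar <op> elem, so elem == 0 divides by zero.
print(python_reference(-7, "__rmod__", 5))
print(python_reference(0, "__rmod__", 5))
```

Here `python_reference(-7, "__mod__", 5)` evaluates `-7 % 5`, while `python_reference(-7, "__rmod__", 5)` evaluates `5 % -7`, which is why only the reflected operators can hit the zero element in `np.arange(-50, 50)`.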