REGR: Series.__mod__ behaves different with numexpr #36552
Conversation
Will add tests. Reverting the change in #30147 produces no new test failures, see #30147 (comment).
@@ -171,8 +171,6 @@ def _create_methods(cls, arith_method, comp_method, bool_method, special):
     mul=arith_method(cls, operator.mul, special),
     truediv=arith_method(cls, operator.truediv, special),
     floordiv=arith_method(cls, operator.floordiv, special),
-    # Causes a floating point exception in the tests when numexpr enabled,
-    # so for now no speedup
This message should have been moved in #19649.
I think it's OK if we have not had reports of floating point exceptions since this was enabled.
@simonjayhawkins example with rmod:
But I'm not sure how it comes that this also regressed compared to 1.0, as we already had ...
(sorry, I see now you only asked about (r)floordiv)
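(The original rmod snippet isn't captured above; below is a reconstructed illustration of the kind of discrepancy being discussed — the values and the numexpr behaviour described in the comments are assumptions, not the original example.)

```python
import numpy as np
import pandas as pd

ser = pd.Series([-3, -2, 2, 3])

# Python/numpy modulo semantics: the result takes the sign of the divisor.
print(5 % np.array([-3, -2, 2, 3]))  # [-1 -1  1  2]

# With numexpr enabled (and the object large enough to take the numexpr
# path), 5 % ser was reported to follow C-style sign rules instead, where
# the result takes the sign of the dividend (e.g. 2 instead of -1).
print(5 % ser)
```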
Can you add some tests for those cases?
rmod was already failing, so not a regression. I couldn't find any issues, so it's probably not used so much, considering we have had 2 reports on the mod issue in a short space of time.
I don't see any failure here:
and both are failing on master
For regressions, I check against 1.0.5.
I'll try to find the cause of that regression. It was indeed working on 0.25.3 also.
Whoops, I simply did not have numexpr installed in my 1.0 environment ..
And I didn't in my 0.25.3! 😄
Optional dependencies used in core operations ... :-) Anyway, rmod is less used, so the most important thing is fixing mod itself, which is undoubtedly a regression.
My plan for this PR was simply to revert the change that caused the regression. I have not yet looked, but there may be special handling, which is why floordiv and rfloordiv are not affected. I will add tests for all 4 ops and xfail the rmod one, and can then look at a proper fix (which could also be backportable of course).
Sounds good!
unrelated failure (also on #34918)
@pytest.mark.parametrize("scalar", [-5, 5])
def test_python_semantics_with_numexpr_installed(self, op, box, tester, scalar):
    # https://github.com/pandas-dev/pandas/issues/36047
    expr._MIN_ELEMENTS = 0
I see it is done like that in other tests as well (so doesn't need to be fixed here), but that doesn't seem a very clean way to patch this, as it will influence other tests as well
this is a class with a teardown_method that resets _MIN_ELEMENTS
But yes, we should replace this with a fixture.
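A minimal sketch of such a fixture, assuming `expr` is `pandas.core.computation.expressions` as imported in this test module (the fixture name is hypothetical):

```python
import pytest

from pandas.core.computation import expressions as expr


@pytest.fixture
def zero_min_elements():
    # Hypothetical fixture: lower the numexpr size threshold so small test
    # objects still take the numexpr code path, then restore it afterwards.
    saved = expr._MIN_ELEMENTS
    expr._MIN_ELEMENTS = 0
    yield
    expr._MIN_ELEMENTS = saved
```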
Ah, didn't see the teardown/setup, only the import at the top ..
pandas/tests/test_expressions.py
expr.set_use_numexpr(False)
expected = method(scalar)
expr.set_use_numexpr(True)
tester(result, expected)
Can you also test against a manually constructed expected result?
(or do you know if we have elsewhere a test that covers this behaviour with negative numbers for modulo?)
I'll look through the tests again to see what we already have.
The method I used here is inspired by run_arithmetic, run_binary and also run_frame.
I kept the object sizes small here so I could do an element-wise comparison similar to the code in #36552 (comment). wdyt?
It's certainly good to test the equivalency of numpy vs numexpr behaviour. But here we discovered a specific case where they are not equal, and I think it is useful to also test this corner case of negative values with modulo (in addition; it can also be a separate test).
We might change the implementation in the future (e.g. assume we would take on numexpr as a required dependency and always use it), and then we wouldn't necessarily catch this case.
The object size shouldn't matter since you did expr._MIN_ELEMENTS = 0? (so it's fine to use a small example for which it is easy to manually create the expected result?)
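For example, a hypothetical test along these lines (the test name and values are illustrative, not taken from the PR):

```python
import pandas as pd
import pandas._testing as tm


def test_mod_negative_divisor_manual():
    # Python semantics: the result of % takes the sign of the divisor,
    # so these hand-computed values differ from C-style (truncated) mod.
    ser = pd.Series([-3, -2, -1, 1, 2, 3])
    result = ser % -2
    expected = pd.Series([-1, 0, -1, -1, 0, -1])
    tm.assert_series_equal(result, expected)
```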
> The object size shouldn't matter

I meant iterating through in Python was not a resource issue.
I have added an element-wise comparison, but tests are failing at the moment:
>>> import numpy as np
>>> import pandas as pd
>>> pd.__version__
'1.2.0.dev0+499.g3b12293cea'
>>>
>>> 5 // 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
>>>
>>> 5 // np.array([0, 0])[0]
<stdin>:1: RuntimeWarning: divide by zero encountered in long_scalars
0
>>>
>>> 5 // np.array(0)
<stdin>:1: RuntimeWarning: divide by zero encountered in floor_divide
0
>>>
>>> 5 // np.array([-1, 0, 1])
array([-5, 0, 5], dtype=int32)
>>>
>>> 5 // pd.Series([-1, 0, 1])
0 -5.0
1 inf
2 5.0
dtype: float64
>>>
>>> -5 // np.array([-1, 0, 1])
array([ 5, 0, -5], dtype=int32)
>>>
>>> -5 // pd.Series([-1, 0, 1])
0 5.0
1 -inf
2 -5.0
dtype: float64
>>>
Fixed the test. The discrepancy that has surfaced is not related to the regression issue; will investigate further.
I think pandas is doing the right thing here? Other than returning a non-integer result for floordiv; maybe it should return IntegerArray in the future.
>>> 5 / np.array([-1, 0, 1])
<stdin>:1: RuntimeWarning: divide by zero encountered in true_divide
array([-5., inf, 5.])
>>>
>>> 5 / pd.Series([-1, 0, 1])
0 -5.0
1 inf
2 5.0
dtype: float64
>>>
> Can you also test against a manually constructed expected result?

still outstanding
This looks ready @simonjayhawkins (notwithstanding your comment above, which looks separate). Not worried about the single failure on Travis, which is old.
Not quite ready; review request #36552 (comment) remains outstanding, and I can't block by requesting changes on my own PR. Moving to 1.1.4, so the release note needs to be moved. Closing for now instead; will re-open after the 1.1.3 release. I've started the final release readiness checks, and merging now would require going back to square one.
@simonjayhawkins since this is basically ready, and fixing a serious regression (silently wrong result), I would still include it if possible. I think the only outstanding request was to have a more manual test checking correctness? That can certainly be left for a follow-up PR and shouldn't block 1.1.3 (we also didn't have such a test before, so I mainly see this as a nice-to-have).
Yeah, there have been a couple of reports about this, so getting this in would be good. (I closed this since we often never seem to get round to nice-to-haves!) Reopening to check CI status.
no objection to merging for 1.1.3 (minor comment but nbd)
pandas/tests/test_expressions.py
"box, tester", | ||
[ | ||
(DataFrame, tm.assert_frame_equal), | ||
(Series, tm.assert_series_equal), |
we usually don't need to add the tester, as assert_equal handles this
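i.e., something along these lines should work, assuming `tm` is `pandas._testing` (`tm.assert_equal` dispatches on the type of its arguments):

```python
import pandas as pd
import pandas._testing as tm

# assert_equal picks the right comparison based on the object type,
# so the same call covers both the Series and DataFrame boxes.
tm.assert_equal(pd.Series([1, 2]), pd.Series([1, 2]))
tm.assert_equal(pd.DataFrame({"a": [1]}), pd.DataFrame({"a": [1]}))
```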
Good point; a bit belt and braces. Will change so we re-run CI before merge.
LGTM, merge when ready (ignore the ARM failure)
@meeseeksdev backport 1.1.x
…pr (#36750) Co-authored-by: Simon Hawkins <[email protected]>