implement arith ops on pd.Categorical #21213
@@ -1203,6 +1219,24 @@ def map(self, mapper):
    __le__ = _cat_compare_op('__le__')
    __ge__ = _cat_compare_op('__ge__')

    __add__ = _cat_arithmetic_op(operator.add)
So my ops PR does almost exactly this (for extension arrays). I think we should maybe put it in a mixin, so that we can capture the ops definitions generically (but of course that can be done later).
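The mixin idea mentioned above could look roughly like the sketch below. This is only an illustration under stated assumptions: `_make_arithmetic_method`, `ArithmeticOpsMixin`, and `ToyArray` are hypothetical names invented here, not the code from either PR, and only scalar operands are handled.

```python
import operator


def _make_arithmetic_method(op):
    """Build a dunder method that applies `op` elementwise (hypothetical helper)."""
    def method(self, other):
        # Only plain numbers are handled in this sketch; anything else is
        # declined so that Python can try the other operand.
        if not isinstance(other, (int, float)):
            return NotImplemented
        return type(self)([op(x, other) for x in self.values])

    method.__name__ = "__{}__".format(op.__name__)
    return method


class ArithmeticOpsMixin:
    """Hypothetical mixin: the arithmetic dunders are defined once, here."""
    __add__ = _make_arithmetic_method(operator.add)
    __sub__ = _make_arithmetic_method(operator.sub)
    __mul__ = _make_arithmetic_method(operator.mul)


class ToyArray(ArithmeticOpsMixin):
    """Stand-in for an extension array; not a pandas class."""

    def __init__(self, values):
        self.values = list(values)


result = ToyArray([1, 2, 3]) + 1
print(result.values)
```

Any array class that inherits from the mixin picks up the same set of operators, which is the "capture ops definitions generically" part of the suggestion.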
Codecov Report
@@            Coverage Diff             @@
##           master   #21213      +/-   ##
==========================================
- Coverage   91.84%   91.83%   -0.01%
==========================================
  Files         153      153
  Lines       49505    49534      +29
==========================================
+ Hits        45466    45492      +26
- Misses       4039     4042       +3
Why is it actually necessary to add those?
If they are not defined, Python already gives almost the same error automatically.
In [20]: pd.Categorical([1, 2, 3]) + 1
...
TypeError: unsupported operand type(s) for +: 'Categorical' and 'int'
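The point can be reproduced without pandas. The sketch below uses two toy classes (`NoOps` and `ExplicitOps`, both hypothetical and unrelated to `Categorical`) to compare the message Python generates on its own with the message from an explicitly defined method that raises; the custom message text is made up for illustration.

```python
class NoOps:
    """Toy class that defines no __add__ at all."""


class ExplicitOps:
    """Toy class whose __add__ raises with a custom (made-up) message."""

    def __add__(self, other):
        raise TypeError("ExplicitOps cannot perform the operation add")


def error_message(obj):
    """Return the TypeError message produced by `obj + 1`, or None."""
    try:
        obj + 1
    except TypeError as err:
        return str(err)
    return None


implicit_msg = error_message(NoOps())
explicit_msg = error_message(ExplicitOps())
print(implicit_msg)
print(explicit_msg)
```

Both paths raise `TypeError`; the only difference is who writes the message.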
At the moment the behavior is implicitly defined in
I am not talking about the change in ops.py, which is good; I am just asking about the reasoning for adding those methods to Categorical. Without them, Python will also raise a TypeError (unless the
Again, that's for the change in ops.py? (which I completely agree with)
But on the other hand, not defining the method, or returning NotImplemented, is, although implicit, the standard Python idiom for handling this. This path is already taken when, e.g., the categorical series is the
To be clear, there might be other good reasons to do this, e.g. because it would change behaviour compared to what we currently have.
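As a reminder of how the `NotImplemented` idiom behaves, here is a small sketch with hypothetical toy classes (`Inner` and `Wrapper` are invented for this example and do not correspond to pandas types). When the left operand declines, Python tries the right operand's reflected method, and raises `TypeError` only if that also fails.

```python
class Inner:
    """Toy left operand that declines every addition."""

    def __add__(self, other):
        # Returning NotImplemented (not raising) lets Python try other.__radd__.
        return NotImplemented


class Wrapper:
    """Toy right operand that knows how to handle the reflected addition."""

    def __radd__(self, other):
        return "handled by Wrapper.__radd__"


# The right operand gets a chance because Inner.__add__ declined.
reflected_result = Inner() + Wrapper()
print(reflected_result)

# With no operand able to handle it, Python raises the generic TypeError.
try:
    Inner() + 1
    fallback_msg = None
except TypeError as err:
    fallback_msg = str(err)
print(fallback_msg)
```

An explicit `raise TypeError` inside `__add__` would skip the reflected lookup entirely, which is one behavioural difference between the two approaches.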
No, it's for the methods themselves on the EA subclasses. #21160 already does this, so the "precedent" is more for #20889.
The PR will keep the current exception messages unchanged. Not a big deal. Really, if #21160 goes through it will set the relevant precedent for #20889 and other authors, at which point I don't especially care about this.
Yes, I think this will be much simpler and better after #21160. A first PR should factor out the ops into a mixin (I may do that), and then we can integrate.
Sounds good, closing. If you get a chance to look at #19959 after it goes through, we can add tests for IntEA ± datetime64.
@jreback wrote:
I was going to move what I was doing with the ops in #20889 into a mixin. Should I work on this, or let you create the mixin as you said you may do above? Ref @jorisvandenbossche discussion at #20889 (comment)
IMO you can already do this. Some of the changes to ops.py will be duplicated (you can check how Jeff did it in his PR; it is only a small change), but the mixin being talked about here will not have this "scalar-ops-fallback", which is the point of the mixin you want to create, I think?
git diff upstream/master -u -- "*.py" | flake8 --diff
Then in core.ops we dispatch to pd.Categorical instead of special-casing.
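A toy illustration of dispatching instead of special-casing, under the assumption that the array type owns the decision about which operations it supports. All names here (`ToyCategorical`, `ToyIntArray`, `wrapper_op`) are hypothetical and are not the functions in core.ops.

```python
import operator


class ToyCategorical:
    """Stand-in for a categorical array; not pandas.Categorical."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        # The array type itself decides that addition is unsupported.
        raise TypeError("ToyCategorical cannot perform the operation add")


class ToyIntArray:
    """Stand-in for a numeric array that does support addition."""

    def __init__(self, values):
        self.values = list(values)

    def __add__(self, other):
        return ToyIntArray([v + other for v in self.values])


def wrapper_op(array, other, op):
    """Hypothetical Series-level op: no isinstance checks, just dispatch."""
    return op(array, other)


print(wrapper_op(ToyIntArray([1, 2]), 1, operator.add).values)

try:
    wrapper_op(ToyCategorical(["a", "b"]), 1, operator.add)
    dispatch_error = None
except TypeError as err:
    dispatch_error = str(err)
print(dispatch_error)
```

The wrapper contains no type-specific branch; the error for the categorical case comes from the array class itself.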