_reduce needs to be defined on non-NumPy ExtensionArrays #21920

Closed
xhochy opened this issue Jul 15, 2018 · 2 comments
Labels
ExtensionArray Extending pandas with custom dtypes or arrays.

Comments

@xhochy
Contributor

xhochy commented Jul 15, 2018

Calling a reduction such as max() on a Series that is not backed by a NumPy array requires the ExtensionArray to implement _reduce, because Series._reduce delegates to it. The following snippet reproduces the exception:

import fletcher as fr
import pandas as pd

series = pd.Series(fr.FletcherArray([1, 2]))
series.max()

With the respective exception:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-f77c98c78d5c> in <module>()
      3 
      4 series = pd.Series(fr.FletcherArray([1, 2]))
----> 5 series.max()

~/Development/arrow-repos-2/pandas/pandas/core/generic.py in stat_func(self, axis, skipna, level, numeric_only, **kwargs)
   9777                                       skipna=skipna)
   9778         return self._reduce(f, name, axis=axis, skipna=skipna,
-> 9779                             numeric_only=numeric_only)
   9780 
   9781     return set_function_name(stat_func, name, cls)

~/Development/arrow-repos-2/pandas/pandas/core/series.py in _reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   3251 
-> 3252         return delegate._reduce(op=op, name=name, axis=axis, skipna=skipna,
   3253                                 numeric_only=numeric_only,
   3254                                 filter_type=filter_type, **kwds)

AttributeError: 'FletcherArray' object has no attribute '_reduce'

To fix this bug, I would:

  1. Move Categorical._reduce to the ExtensionArray interface.
  2. Define a new fixture to specify the supported operations.
  3. Add a new class pandas.tests.extension.base.BaseReductionOpsTests to verify the output of these tests.
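
As a rough illustration of the delegation contract described above, here is a minimal, self-contained sketch. It does not use pandas or fletcher: `ToyArray` and `series_reduce` are hypothetical stand-ins, and the only detail taken from the report is the set of keyword arguments that the traceback shows being passed to `delegate._reduce`. The set of supported reductions is an assumption for the example, not a proposal for the final interface.

```python
class ToyArray:
    """Hypothetical non-NumPy-backed array; not fletcher's or pandas' API."""

    def __init__(self, values):
        self._values = list(values)

    def _reduce(self, op=None, name=None, axis=0, skipna=True,
                numeric_only=None, filter_type=None, **kwds):
        # Keyword names mirror the delegate._reduce(...) call in the traceback.
        # Assumption: None marks a missing value in this toy container.
        values = self._values
        if skipna:
            values = [v for v in values if v is not None]

        # Assumption: the array declares which reductions it supports.
        supported = {"min": min, "max": max, "sum": sum}
        if name not in supported:
            raise TypeError(
                f"cannot perform {name} with type {type(self).__name__}"
            )
        return supported[name](values)


def series_reduce(delegate, name, skipna=True):
    """Mimic the delegation step that fails in the traceback."""
    return delegate._reduce(op=None, name=name, axis=0, skipna=skipna,
                            numeric_only=None, filter_type=None)
```

With such a method in place, `series_reduce(ToyArray([1, 2]), "max")` reaches the array's own implementation instead of raising AttributeError, and an unsupported name fails with an explicit TypeError. A shared test class along the lines of step 3 could then check both the supported and the unsupported case for each array type.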
@jorisvandenbossche jorisvandenbossche added the ExtensionArray Extending pandas with custom dtypes or arrays. label Jul 15, 2018
@jorisvandenbossche
Member

Yes, we haven't gotten to this part of the interface yet (the initial use cases did not need reducing functions), but it is certainly something we need to add (we will also need it for internal extension arrays such as integer NA).
See also #21789, so let's use that issue to discuss.

@jorisvandenbossche
Member

(you can move the example and questions if you want)
