QST: is the new behavior of GroupByRolling for MultiIndex in v1.1.1 intended? #36018

Closed
itholic opened this issue Sep 1, 2020 · 4 comments · Fixed by #36152
Labels: Bug, Groupby, MultiIndex, Regression (functionality that used to work in a prior pandas version), Window (rolling, ewma, expanding)

@itholic

itholic commented Sep 1, 2020

Question about pandas

Let's say we have a Series with MultiIndex like the below.

>>> pser = pd.Series(
...     [1, 2, 3, 2],
...     index=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y"), ("b", "z"), ("c", "z")]),
...     name="a",
... )

>>> pser
a  x    1
   y    2
b  z    3
c  z    2
Name: a, dtype: int64

Then, when I use GroupByRolling, pandas <= 1.0.5 shows the result below.
(Let's focus on the level of MultiIndex in the examples below)

>>> pser.groupby(pser).rolling(2).max()
a
1  a  x    NaN
2  a  y    NaN
   c  z    2.0
3  b  z    NaN
Name: a, dtype: float64

>>> pser.groupby(pser).rolling(2).max().index
MultiIndex([(1, 'a', 'x'),
            (2, 'a', 'y'),
            (2, 'c', 'z'),
            (3, 'b', 'z')],
           names=['a', None, None])

>>> pser.groupby(pser).rolling(2).max().index.nlevels  # It keeps the index level.
3

However, in pandas 1.1.0, the result differs from the previous version, as shown below.
(The number of MultiIndex levels decreases after the operation.)

>>> pser.groupby(pser).rolling(2).max()
a
1  (a, x)    NaN
2  (a, y)    NaN
   (c, z)    2.0
3  (b, z)    NaN
Name: a, dtype: float64

>>> pser.groupby(pser).rolling(2).max().index
MultiIndex([(1, ('a', 'x')),
            (2, ('a', 'y')),
            (2, ('c', 'z')),
            (3, ('b', 'z'))],
           names=['a', None])

>>> pser.groupby(pser).rolling(2).max().index.nlevels  # index level changed from 3 to 2
2

Is this the intended behavior in pandas 1.1.1?

Thanks :)
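
The comparison above can be condensed into one self-contained check. This is only a sketch: the names `expected_index` and `matches` are introduced here for illustration, and the expected index is simply the pandas <= 1.0.5 output shown above written out by hand.

```python
import pandas as pd

pser = pd.Series(
    [1, 2, 3, 2],
    index=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y"), ("b", "z"), ("c", "z")]),
    name="a",
)

# Index shape reported above for pandas <= 1.0.5: the group key first,
# followed by both levels of the original MultiIndex.
expected_index = pd.MultiIndex.from_tuples(
    [(1, "a", "x"), (2, "a", "y"), (2, "c", "z"), (3, "b", "z")],
    names=["a", None, None],
)

result = pser.groupby(pser).rolling(2).max()

# True when the installed pandas keeps the original levels; False when
# they collapse into a single level of tuples as in the 1.1.0 output.
matches = bool(result.index.equals(expected_index))
print(pd.__version__, result.index.nlevels, matches)
```

Printing the pandas version next to the level count makes it easy to paste results from several environments side by side.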

itholic added the Needs Triage and Usage Question labels Sep 1, 2020
@TomAugspurger
Contributor

Hard to say, but probably not intentional. Are you able to bisect 1.1.0 to 1.1.1 to see what changed?

TomAugspurger added the Groupby, MultiIndex, and Window labels and removed the Needs Triage and Usage Question labels Sep 4, 2020
@phofl
Member

phofl commented Sep 5, 2020

I looked into this and found that the index names also get lost now:

a        
1  (a, x)    NaN
2  (a, y)    NaN
   (c, z)    2.0
3  (b, z)    NaN
Name: a, dtype: float64

while with 1.0.5:

a  1  2
1  a  x    NaN
2  a  y    NaN
   c  z    2.0
3  b  z    NaN
Name: a, dtype: float64

Input was:

pser = pd.Series(
    [1, 2, 3, 2],
    index=pd.MultiIndex.from_tuples(
        [("a", "x"), ("a", "y"), ("b", "z"), ("c", "z")], names=["1", "2"]
    ),
    name="a",
)

The method that constructs the index is completely new, so there is no single change to point to.
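
Until a fix lands, a user-side workaround could rebuild the lost levels and names from the original index. The following is a minimal sketch, not part of pandas: `flatten_key` and `restore_index_levels` are hypothetical helper names, and the approach assumes the original index values are not themselves tuples (otherwise they would be flattened too).

```python
import pandas as pd


def flatten_key(key, n_group_keys=1):
    """Expand nested tuples after the group keys, e.g. (1, ('a', 'x')) -> (1, 'a', 'x')."""
    flat = list(key[:n_group_keys])
    for part in key[n_group_keys:]:
        if isinstance(part, tuple):
            flat.extend(part)  # collapsed original index entry
        else:
            flat.append(part)  # level was already preserved
    return tuple(flat)


def restore_index_levels(result, original_index, n_group_keys=1):
    """Return a copy of `result` whose index has one level per group key
    plus every level (and name) of `original_index`."""
    names = list(result.index.names[:n_group_keys]) + list(original_index.names)
    out = result.copy()
    out.index = pd.MultiIndex.from_tuples(
        [flatten_key(key, n_group_keys) for key in result.index], names=names
    )
    return out


pser = pd.Series(
    [1, 2, 3, 2],
    index=pd.MultiIndex.from_tuples(
        [("a", "x"), ("a", "y"), ("b", "z"), ("c", "z")], names=["1", "2"]
    ),
    name="a",
)
fixed = restore_index_levels(pser.groupby(pser).rolling(2).max(), pser.index)
print(fixed.index.names)
```

Because `flatten_key` leaves already-flat keys untouched, the helper is intended to be harmless on pandas versions that keep the levels.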

@rhshadrach
Member

@mroeschke Looks like this is a regression and so should be tagged 1.1.2, is that right?

jreback changed the milestone from 1.2 to 1.1.2 Sep 6, 2020
simonjayhawkins added the Regression label Sep 6, 2020
@simonjayhawkins
Member

simonjayhawkins commented Sep 6, 2020

first bad commit: [bad52a9] PERF: Use Indexers to implement groupby rolling (#34052)

https://github.com/simonjayhawkins/pandas/runs/1078593349?check_suite_focus=true
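
A bisect like this is typically driven by a small check that reports good or bad for each commit. The sketch below is an assumption about how such a check could look, not the script used in the run linked above; `index_levels_preserved` and `bisect_exit_code` are hypothetical names.

```python
import pandas as pd


def index_levels_preserved():
    """True when groupby-rolling keeps every level of the original MultiIndex."""
    pser = pd.Series(
        [1, 2, 3, 2],
        index=pd.MultiIndex.from_tuples(
            [("a", "x"), ("a", "y"), ("b", "z"), ("c", "z")]
        ),
        name="a",
    )
    result = pser.groupby(pser).rolling(2).max()
    # One level for the group key plus the levels of the original index.
    return result.index.nlevels == 1 + pser.index.nlevels


def bisect_exit_code():
    # `git bisect run` treats exit status 0 as good and 1 as bad.
    return 0 if index_levels_preserved() else 1


print(bisect_exit_code())
```

In a standalone script the last line would instead be `raise SystemExit(bisect_exit_code())`, so that `git bisect run` can read the status. pandas also needs to be rebuilt at each step, since it contains compiled extensions.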
