BUG: groupby.rolling.corr incorrect/fails when grouping columns contain tuples #44078

Closed
3 tasks done
Tracked by #7
rhshadrach opened this issue Oct 18, 2021 · 5 comments · Fixed by #54499
Assignees
Labels
cov/corr Groupby Needs Tests Unit test(s) needed to prevent regressions Window rolling, ewma, expanding
Milestone

Comments

@rhshadrach
Member

rhshadrach commented Oct 18, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

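import pandas as pd

# Single-element tuple keys: the first index level comes back as 1 instead of (1,)
df = pd.DataFrame({'a': [(1,), (1,), (1,)], 'b': [4, 5, 6]})
gb = df.groupby(['a'])
print(gb.rolling(2).corr(other=df))

# Two-element tuple keys: raises ValueError
df = pd.DataFrame({'a': [(1, 2), (1, 2), (1, 2)], 'b': [4, 5, 6]})
gb = df.groupby(['a'])
print(gb.rolling(2).corr(other=df))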

Issue Description

The first block produces

      a    b
a           
1 0 NaN  NaN
  1 NaN  1.0
  2 NaN  1.0

Here the first index level should be (1,) rather than 1. The second block raises a ValueError. Both problems come from the use of maybe_make_list in _apply_pairwise (related: #44056). Instead of calling this function, we can wrap each index key in a list when self._grouper.nkeys is 1 and leave the keys as-is otherwise, i.e.

if self._grouper.nkeys == 1:
    gb_pairs = (
        [pair] for pair in self._grouper.indices.keys()
    )
else:
    gb_pairs = (
        pair for pair in self._grouper.indices.keys()
    )
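To make the difference concrete, here is a minimal pure-Python sketch of the two packing strategies. It does not call pandas: `maybe_make_list_like` is a hypothetical stand-in that assumes the helper wraps anything other than a tuple or list in a list, and `pack_keys` is a hypothetical name for the logic proposed above.

```python
def maybe_make_list_like(obj):
    # Assumed behavior of the helper named in the issue (hypothetical stand-in):
    # tuples and lists pass through untouched, everything else is wrapped.
    if obj is not None and not isinstance(obj, (tuple, list)):
        return [obj]
    return obj


def pack_keys(keys, nkeys):
    # Proposed packing: wrap each key in a list only when there is a single
    # grouping key; with several grouping keys each key is already a tuple
    # holding one value per level.
    if nkeys == 1:
        return [[key] for key in keys]
    return list(keys)


# Tuple-valued group keys coming from ONE grouping column.
single_column_keys = [(1,), (1, 2)]

old = [maybe_make_list_like(key) for key in single_column_keys]
new = pack_keys(single_column_keys, nkeys=1)

# Old: the tuple itself is read as the sequence of level values, so (1,)
# contributes the level value 1 and (1, 2) looks like two levels.
print(old)  # [(1,), (1, 2)]

# New: every key is exactly one level value, and the tuple survives intact.
print(new)  # [[(1,)], [(1, 2)]]

# With two grouping columns the keys are left as-is.
print(pack_keys([("x", 1), ("y", 2)], nkeys=2))  # [('x', 1), ('y', 2)]
```

Under these assumptions, the old path treats `(1,)` as one level with value `1` and `(1, 2)` as two levels, which is consistent with the wrong index label in the first block and the ValueError in the second.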

Expected Behavior

         a    b
a           
(1,) 0 NaN  NaN
     1 NaN  1.0
     2 NaN  1.0

           a    b
a           
(1, 2) 0 NaN  NaN
       1 NaN  1.0
       2 NaN  1.0

Installed Versions

Replace this line with the output of pd.show_versions()

@rhshadrach rhshadrach added Bug Groupby Window rolling, ewma, expanding labels Oct 18, 2021
@rhshadrach rhshadrach added this to the Contributions Welcome milestone Oct 18, 2021
@BarkotBeyene
Contributor

Hello, I ran the code on the latest pandas version and it works fine.

@rhshadrach
Member Author

Thanks @BarkotBeyene - this could use tests. Would you be interested in putting up a PR?

@rhshadrach rhshadrach added Needs Tests Unit test(s) needed to prevent regressions and removed Bug labels Aug 18, 2022
@BarkotBeyene
Contributor

Hey @rhshadrach, I'm currently working on another issue, but I'll take a look at it once I'm done. Thanks

@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@oluwatonio

take

@oluwatonio oluwatonio removed their assignment Aug 11, 2023
@omar-elbaz
Contributor

take

7 participants