ENH: Support rolling over one level of a MultiIndex #34642
Comments
Probably, there is a related issue #34617. By the way, if the data frame is indexed with an integer index, then one can use

import pandas as pd

idx = pd.MultiIndex.from_product(
    [range(7), ["a", "b"]], names=["date", "obs"],
)
df = pd.DataFrame(index=idx)
df['c1'] = range(len(df))
print(df)
df_r = df.groupby(by="obs", group_keys=False).rolling(
    7, level=1
).mean().sort_index()
print(df_r)
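For an integer window, a different route that avoids groupby altogether is to pivot the obs level into columns, roll along the remaining date index, and stack back. The following is only a sketch under assumptions: the window of 3 and min_periods=1 are illustrative choices (not taken from the comment above), and it presumes every obs value appears for every date so that unstacking does not introduce gaps.

```python
import pandas as pd

# Same toy frame as in the comment above: integer "date" level, two "obs" values.
idx = pd.MultiIndex.from_product([range(7), ["a", "b"]], names=["date", "obs"])
df = pd.DataFrame({"c1": range(len(idx))}, index=idx)

# Move "obs" into the columns: one column per observation, indexed by date only.
wide = df["c1"].unstack("obs")

# Roll over rows (dates); each column is handled independently.
# window=3 and min_periods=1 are illustrative assumptions.
rolled_wide = wide.rolling(3, min_periods=1).mean()

# Back to the original (date, obs) layout.
df_r = rolled_wide.stack().rename("c1")
print(df_r)
```

Because each obs becomes its own column, the window never mixes rows from different observations. The trade-off is that the wide frame can become large or sparse when there are many obs values or ragged dates.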
In @daskol example the
I'm still very confused.
A brief update. After updating pandas to v1.1.1, the above workaround fails with SEGFAULT!
but
It seems to me that this issue should be labeled as BUG.
pandas version info
SEGFAULT possibly linked to #36018
Hi, I wonder if there has been any update on this issue. In my situation, I am also unable to do rolling with non-integer windows when there is a MultiIndex in the dataframe. With the simple example used in this thread, I get a different error, like the following:

df.groupby(by="obs", group_keys=False).rolling("7d", on=df.index.levels[0]).mean()

File "C:\Users\c_yy\AppData\Local\Temp/ipykernel_15476/3165508401.py", line 1, in
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 1855, in mean
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 1309, in mean
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 594, in _apply
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 545, in _apply
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 441, in _apply_blockwise
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\internals\managers.py", line 325, in apply
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\internals\blocks.py", line 381, in apply
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 431, in hfunc
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 535, in homogeneous_func
File "<array_function internals>", line 5, in apply_along_axis
File "C:\Users\c_yy\anaconda3\lib\site-packages\numpy\lib\shape_base.py", line 379, in apply_along_axis
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\rolling.py", line 521, in calc
File "C:\Users\c_yy\anaconda3\lib\site-packages\pandas\core\window\indexers.py", line 333, in get_window_bounds
IndexError: index 10 is out of bounds for axis 0 with size 10

When using my own real data set (see below), I get a different error:

rxx.loc['2015-12-01':'2017-12-29',:]
[104 rows x 112 columns]

rxx.groupby(level='order_book_id',group_keys=False).rolling('63d',on=rxx.index.levels[0]).mean()

pd.show_versions()
INSTALLED VERSIONS
commit : 945c9ed
pandas : 1.3.4
I have to concur. This seems like a useful thing to have. Or, if there is an alternative way to handle this use case, it could be described in the docs.
I've run into this issue as well. The documentation for
I have a similar issue, which again involves performing rolling operations on a dataframe with duplicate datetimes. I've gotten a complicated workaround going using groupby and some other hacks, but more native support would be greatly appreciated.
The groupby transform method has this exact functionality. The docs explain it pretty well. That said, I agree that the windowing API should definitely have this functionality natively.
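For readers landing here, the groupby transform route could look roughly like the sketch below. It is not code from this thread: the toy frame mirrors the issue's date/obs layout with assumed daily dates, and rolling_mean_7d is a hypothetical helper name. The idea is to drop the obs level inside each group so that rolling sees a plain DatetimeIndex, then put the original index back so transform can align the result.

```python
import pandas as pd

# Assumed toy data: 7 daily dates x 2 observations, indexed by (date, obs).
dates = pd.date_range("2020-01-01", periods=7, freq="D")
idx = pd.MultiIndex.from_product([dates, ["a", "b"]], names=["date", "obs"])
df = pd.DataFrame({"c1": range(len(idx))}, index=idx)


def rolling_mean_7d(s):
    """Hypothetical helper: 7-day rolling mean for one group's Series."""
    # Within a group, "obs" is constant, so dropping it leaves a DatetimeIndex.
    by_date = s.droplevel("obs")
    out = by_date.rolling("7D").mean()
    # Restore the group's original (date, obs) index so transform can align it.
    out.index = s.index
    return out


# transform returns a Series aligned with df's original MultiIndex.
result = df.groupby(level="obs")["c1"].transform(rolling_mean_7d)
print(result)
```

This assumes the dates are sorted within each obs group, since a time-based window needs a monotonic index.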
This is the solution I wanted.
The doc still mentions this functionality, but it doesn't work. Source: Rolling doc
This is still an issue
This is still an issue; however, there is a hacky(?) workaround. Consider the following multi-indexed series
While this is a tad more verbose than pandas just acknowledging the closest index, this is a workable replacement. The descriptive statistics for the series line up for this method and the non-indexed method.
I have searched the [pandas] tag on StackOverflow for similar questions.
I have asked my usage related question on StackOverflow.
I had a hard time understanding how df.rolling works when df is indexed by a MultiIndex.
This is an example data frame:
which outputs
Now I want to apply a rolling window on the date level, keeping the obs level separate.
I tried, with no success, obvious and simple (least surprise) commands like df.rolling("7d", index="date") or df.rolling("7d", on="date"), but finally the desired result is obtained by
which gives me the correct result:
It seems to me that this should be quite a common situation, so I was wondering if there is a simpler way to obtain the same results. By the way, my solution is not very robust, because there are hidden assumptions on how the objects returned by groupby are indexed, which do not necessarily hold for a generic data frame.
Moreover, the doc of the on parameter in rolling was almost incomprehensible to me: I'm still wondering if my usage rolling("7d", on=df.index.levels[0]) is the intended one or not.
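The original example frame and command did not survive in the text above, so the following is only a sketch of one way to get a per-obs 7-day rolling mean. The toy data (7 daily dates, two obs values, a c1 column) is an assumption made to match the names used in this issue. Rather than passing on=df.index.levels[0], it moves obs out of the index so that the frame has a plain DatetimeIndex, which a time-based window can use directly.

```python
import pandas as pd

# Assumed toy data mirroring the issue's layout: MultiIndex (date, obs).
dates = pd.date_range("2020-01-01", periods=7, freq="D")
idx = pd.MultiIndex.from_product([dates, ["a", "b"]], names=["date", "obs"])
df = pd.DataFrame({"c1": range(len(idx))}, index=idx)

# Turn "obs" into an ordinary column; the index is now a DatetimeIndex "date".
flat = df.reset_index(level="obs")

# Roll over a 7-day window separately within each obs group.
rolled = flat.groupby("obs")["c1"].rolling("7D").mean()

# The result is indexed by (obs, date); restore the original (date, obs) order.
result = rolled.swaplevel().sort_index()
print(result)
```

This still relies on groupby, but it does not depend on how the grouped objects are indexed internally, only on dates being sorted within each obs.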