ERR: too strict validation on groupby.rolling with time-aware freq #15130

Closed
jreback opened this issue Jan 13, 2017 · 0 comments
Labels
Bug Error Reporting Incorrect or improved errors from pandas Groupby Reshaping Concat, Merge/Join, Stack/Unstack, Explode
jreback commented Jan 13, 2017

http://stackoverflow.com/questions/41642320/efficient-pandas-rolling-aggregation-over-date-range-by-group-python-2-7-windo/41643179?noredirect=1#comment70486923_41643179

In [1]: import pandas as pd
   ...: 
   ...: data = [
   ...: ['David', '1/1/2015', 100], ['David', '1/5/2015', 500], ['David', '5/30/2015', 50], ['David', '7/25/2015', 50],
   ...: ['Ryan', '1/4/2014', 100], ['Ryan', '1/19/2015', 500], ['Ryan', '3/31/2016', 50],
   ...: ['Joe', '7/1/2015', 100], ['Joe', '9/9/2015', 500], ['Joe', '10/15/2015', 50]
   ...: ]
   ...: 
   ...: dates_df = pd.DataFrame(data=data, columns=['name', 'date', 'amount'], index=None)
   ...: dates_df['date'] = pd.to_datetime(dates_df['date'])

The monotonicity check on the date column doesn't need to occur when we are grouping; it currently raises:

In [7]: dates_df.groupby('name').rolling('180D', on='date')['amount'].sum()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-8896cb99a66a> in <module>()
----> 1 dates_df.groupby('name').rolling('180D', on='date')['amount'].sum()

/Users/jreback/pandas/pandas/core/groupby.py in rolling(self, *args, **kwargs)
   1148         """
   1149         from pandas.core.window import RollingGroupby
-> 1150         return RollingGroupby(self, *args, **kwargs)
   1151 
   1152     @Substitution(name='groupby')

/Users/jreback/pandas/pandas/core/window.py in __init__(self, obj, *args, **kwargs)
    635         self._groupby.mutated = True
    636         self._groupby.grouper.mutated = True
--> 637         super(GroupByMixin, self).__init__(obj, *args, **kwargs)
    638 
    639     count = GroupByMixin._dispatch('count')

/Users/jreback/pandas/pandas/core/window.py in __init__(self, obj, window, min_periods, freq, center, win_type, axis, on, **kwargs)
     76         self.win_type = win_type
     77         self.axis = obj._get_axis_number(axis) if axis is not None else None
---> 78         self.validate()
     79 
     80     @property

/Users/jreback/pandas/pandas/core/window.py in validate(self)
   1030                 formatted = self.on or 'index'
   1031                 raise ValueError("{0} must be "
-> 1032                                  "monotonic".format(formatted))
   1033 
   1034             from pandas.tseries.frequencies import to_offset

ValueError: date must be monotonic
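
A small sketch, not part of the original report, of why the check is stricter than necessary for this data: the date column is not monotonic over the whole frame, but it is monotonic inside every name group. It relies only on the public is_monotonic_increasing property of a Series; the variable names globally_monotonic and per_group_monotonic are made up for the illustration.

```python
import pandas as pd

data = [
    ['David', '1/1/2015', 100], ['David', '1/5/2015', 500],
    ['David', '5/30/2015', 50], ['David', '7/25/2015', 50],
    ['Ryan', '1/4/2014', 100], ['Ryan', '1/19/2015', 500], ['Ryan', '3/31/2016', 50],
    ['Joe', '7/1/2015', 100], ['Joe', '9/9/2015', 500], ['Joe', '10/15/2015', 50],
]
dates_df = pd.DataFrame(data, columns=['name', 'date', 'amount'])
dates_df['date'] = pd.to_datetime(dates_df['date'], format='%m/%d/%Y')

# Over the whole frame the dates jump backwards where the name changes
# (David's 2015-07-25 is followed by Ryan's 2014-01-04).
globally_monotonic = dates_df['date'].is_monotonic_increasing

# Inside each group the dates are already sorted.
per_group_monotonic = {
    name: group['date'].is_monotonic_increasing
    for name, group in dates_df.groupby('name')
}

print(globally_monotonic)
print(per_group_monotonic)
```

The frame-wide check is what the traceback above trips on, even though each group taken alone would pass it.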

Rolling within each group via apply is OK:

In [9]: dates_df.groupby('name').apply(lambda x: x.rolling('180D', on='date')['amount'].sum())
Out[9]: 
name    
David  0    100.0
       1    600.0
       2    650.0
       3    100.0
Joe    7    100.0
       8    600.0
       9    650.0
Ryan   4    100.0
       5    500.0
       6     50.0
Name: amount, dtype: float64
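
As a possible workaround that avoids groupby.apply altogether, the same per-group result can be sketched with an explicit loop over the groups followed by pd.concat. This is only an illustration under the assumption that each group's dates are already sorted; the names pieces and rolled are hypothetical and do not come from the issue.

```python
import pandas as pd

data = [
    ['David', '1/1/2015', 100], ['David', '1/5/2015', 500],
    ['David', '5/30/2015', 50], ['David', '7/25/2015', 50],
    ['Ryan', '1/4/2014', 100], ['Ryan', '1/19/2015', 500], ['Ryan', '3/31/2016', 50],
    ['Joe', '7/1/2015', 100], ['Joe', '9/9/2015', 500], ['Joe', '10/15/2015', 50],
]
dates_df = pd.DataFrame(data, columns=['name', 'date', 'amount'])
dates_df['date'] = pd.to_datetime(dates_df['date'], format='%m/%d/%Y')

# Roll each group's sub-frame separately; the monotonicity validation
# then only ever sees one group's dates at a time.
pieces = {
    name: group.rolling('180D', on='date')['amount'].sum()
    for name, group in dates_df.groupby('name')
}

# Stitch the pieces together; the dict keys become the outer index level
# and the original row labels stay as the inner level.
rolled = pd.concat(pieces)
print(rolled)
```

The values should line up with the apply-based output above, since both compute a 180-day trailing sum within each name.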
@jreback jreback added Bug Difficulty Intermediate Error Reporting Incorrect or improved errors from pandas Groupby Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jan 13, 2017
@jreback jreback added this to the 0.20.0 milestone Jan 13, 2017
jreback added a commit to jreback/pandas that referenced this issue Jan 20, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue Mar 21, 2017
closes pandas-dev#15130

Author: Jeff Reback

Closes pandas-dev#15175 from jreback/groupby_rolling and squashes the following commits:

5831b8e [Jeff Reback] BUG: no need to validate monotonicity when groupby-rolling
jreback added a commit to jreback/pandas that referenced this issue Apr 22, 2017
jreback added a commit that referenced this issue Apr 22, 2017
pcluo pushed a commit to pcluo/pandas that referenced this issue May 22, 2017