Commit
test for 8669
jreback committed Apr 9, 2018
1 parent 2dcdabb commit 582da12
Showing 2 changed files with 17 additions and 2 deletions.
4 changes: 2 additions & 2 deletions doc/source/whatsnew/v0.23.0.txt
@@ -407,7 +407,7 @@ Other Enhancements
- Updated ``to_gbq`` and ``read_gbq`` signature and documentation to reflect changes from
  the Pandas-GBQ library version 0.4.0. Adds intersphinx mapping to Pandas-GBQ
  library. (:issue:`20564`)

.. _whatsnew_0230.api_breaking:

Backwards incompatible API changes
@@ -487,7 +487,7 @@ If you wish to retain the old behavior while using Python >= 3.6, you can use
Categorical Grouping no longer expands to all possible groupers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In previous versions, grouping by 1 or more categorical columns would result in an index that was the cartesian product of all of the categories for each grouper, not just the observed values. This is inconsistent with output for other dtypes, can potentially cast to different dtypes (as missing values are introduced), and could cause a huge frame to be generated. Pandas will now return only the observed values, regardless if grouping on a categorical column; note that the categorical dtype is *still* preserved. You will still have a categorical columns (:issue:`14942`). This reverts issues (:issue:`10132`, :issue:`8138`, :issue:`14942`, :issue:`15217`, :issue:`17594`)
In previous versions, grouping by 1 or more categorical columns would result in an index that was the cartesian product of all of the categories for each grouper, not just the observed values. This is inconsistent with output for other dtypes, can potentially cast to different dtypes (as missing values are introduced), and could cause a huge frame to be generated. Pandas will now return only the observed values, regardless if grouping on a categorical column; note that the categorical dtype is *still* preserved. You will still have a categorical columns (:issue:`14942`, :issue:`8138`, :issue:`15217`, :issue:`17594`, :issue:`8669`)
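
The difference between the two outputs described above can be sketched as follows. This is a minimal illustration, not part of this commit: the data is hypothetical, and it assumes a pandas version in which ``groupby`` accepts the ``observed`` keyword to choose between the observed-only result and the full cartesian product.

```python
import pandas as pd

# Hypothetical data: two categorical groupers, each declaring one
# category ("z" and "y") that never occurs in the data.
cat1 = pd.Categorical(["a", "a", "b", "b"],
                      categories=["a", "b", "z"], ordered=True)
cat2 = pd.Categorical(["c", "d", "c", "d"],
                      categories=["c", "d", "y"], ordered=True)
df = pd.DataFrame({"A": cat1, "B": cat2, "values": [1, 2, 3, 4]})

# Assumption: the installed pandas supports `observed`; the keyword is
# not introduced by this commit.
# observed=True keeps only the (A, B) combinations present in the data.
observed_only = df.groupby(["A", "B"], observed=True)["values"].sum()

# observed=False expands to the cartesian product of all categories.
expanded = df.groupby(["A", "B"], observed=False)["values"].sum()

print(len(observed_only))  # 4 observed combinations
print(len(expanded))       # 3 * 3 = 9 combinations
```

With more groupers or more categories, the expanded result grows multiplicatively, which is the "huge frame" concern raised in the note.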



15 changes: 15 additions & 0 deletions pandas/tests/groupby/test_categorical.py
@@ -226,6 +226,7 @@ def test_with_multiple_groupers_no_expand():
    # gh-14942 (implement)
    # gh-10132 (revert)
    # gh-8138 (revert)
    # gh-8869

    cat1 = Categorical(["a", "a", "b", "b"],
                       categories=["a", "b", "z"], ordered=True)
@@ -316,6 +317,20 @@ def test_with_multiple_groupers_no_expand():
                     "C3": [10, 100, 200, 34]}, index=idx)
    tm.assert_frame_equal(res, exp)

    # gh-8869
    # with as_index
    d = {'foo': [10, 8, 4, 8, 4, 1, 1], 'bar': [10, 20, 30, 40, 50, 60, 70],
         'baz': ['d', 'c', 'e', 'a', 'a', 'd', 'c']}
    df = pd.DataFrame(d)
    cat = pd.cut(df['foo'], np.linspace(0, 10, 3))
    df['range'] = cat
    groups = df.groupby(['range', 'baz'], as_index=False)
    result = groups.agg('mean')

    groups2 = df.groupby(['range', 'baz'], as_index=True)
    expected = groups2.agg('mean').reset_index()
    tm.assert_frame_equal(result, expected)


def test_datetime():
    # GH9049: ensure backward compatibility
