DataFrame.unstack #886

Closed
ueshin opened this issue Oct 3, 2019 · 4 comments · Fixed by #1295
Labels
enhancement, help wanted

Comments

@ueshin
Collaborator

ueshin commented Oct 3, 2019

Implement DataFrame.unstack.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.unstack.html

ueshin added the enhancement and help wanted labels Oct 3, 2019
@ag-og

ag-og commented Nov 22, 2019

Any update on this issue?

@HyukjinKwon
Member

I investigated this for a whole day, and it seems we shouldn't implement this API. It can create up to rows * columns output columns, one per inner index value for each original column:

>>> df = pd.DataFrame({'A': ['a', 'b', 'c', 'd'], 'B': [1, 3, 5, 7], 'C': [2, 4, 6, 7]}, index=pd.MultiIndex.from_tuples([('X', 'A'), ('X', 'B'), ('Y', 'C'), ('Y', 'D')]))
>>> df
     A  B  C
X A  a  1  2
  B  b  3  4
Y C  c  5  6
  D  d  7  7
>>> df.unstack()
     A                   B                   C
     A    B    C    D    A    B    C    D    A    B    C    D
X    a    b  NaN  NaN  1.0  3.0  NaN  NaN  2.0  4.0  NaN  NaN
Y  NaN  NaN    c    d  NaN  NaN  5.0  7.0  NaN  NaN  6.0  7.0

I don't think this works in practice, since Spark partitions data by row. This is much worse than transpose, which we already had to think over multiple times. Let's not do this for now.
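
To put a number on the concern above, here is a minimal pure-Python sketch (no pandas or Spark involved) that counts the columns produced by unstacking the innermost index level. The helper name `unstacked_width` and the one-million-row frame are hypothetical, chosen only for illustration.

```python
# Pure-Python sketch: count the output columns after unstacking the
# innermost index level. This is an illustration, not library code.

def unstacked_width(index, columns):
    """Return the number of output columns after unstacking the last level.

    index   -- list of (outer, inner) tuples, one per row
    columns -- list of column names
    """
    # Each distinct inner label becomes one output column per original column.
    inner_labels = {inner for _outer, inner in index}
    return len(inner_labels) * len(columns)


# The frame from the comment above: 4 rows, 3 columns.
index = [("X", "A"), ("X", "B"), ("Y", "C"), ("Y", "D")]
columns = ["A", "B", "C"]
small = unstacked_width(index, columns)

# Hypothetical larger frame: one million rows, each with a distinct inner label.
big_index = [(i % 10, i) for i in range(1_000_000)]
big = unstacked_width(big_index, columns)

print(small, big)
```

For the four-row example this gives 12 columns, matching the output shown above. When every row carries a distinct inner label, the width grows linearly with the row count, which is the case a row-partitioned engine handles poorly.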

@ueshin
Collaborator Author

ueshin commented Feb 13, 2020

Makes sense. Shall we move it to "won't support" list?

@HyukjinKwon
Member

Sure, will do.

HyukjinKwon added a commit that referenced this issue Feb 25, 2020
This PR implements `DataFrame.unstack()`. This PR intentionally does not implement `level` with multi-index. See the discussion at #886
When a multi-index is not used, it is okay to implement.

Resolves #886
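
For context on why the single-index case is safe: pandas documents that `DataFrame.unstack` returns a Series when the index is not a MultiIndex, keyed by (column, index label). The result grows in rows, not columns. The sketch below mimics that shape in plain Python; `unstack_single_index` is a hypothetical helper, not the Koalas implementation.

```python
# Sketch only: reproduces the shape of the single-index result in plain
# Python. It is not the code from the referenced PR.

def unstack_single_index(index, data):
    """Flatten a single-indexed table into {(column, index_label): value}.

    index -- list of row labels
    data  -- dict mapping column name -> list of values (one per row)
    """
    return {
        (column, label): value
        for column, values in data.items()
        for label, value in zip(index, values)
    }


long_form = unstack_single_index(["x", "y"], {"B": [1, 3], "C": [2, 4]})
for key, value in long_form.items():
    print(key, value)
```

The output has rows * columns entries and a fixed, small number of fields per entry, so it stays compatible with row-based partitioning.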
HyukjinKwon pushed a commit that referenced this issue Mar 4, 2020
We can support unstack with MultiIndex using pivot_table.
But as discussed at #886 (comment), we should still be careful when using it.
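
As a rough illustration of the pivot idea, the pure-Python sketch below spreads the inner index level into column keys, using the frame from the earlier comment. The function name `unstack_by_pivot` and the row representation are assumptions made for this example; the actual commit relies on `pivot_table`, whose internals are not shown in this thread.

```python
# Sketch only: a hand-rolled pivot that moves the inner index level into the
# column keys. Not the Koalas implementation.

def unstack_by_pivot(rows, columns):
    """Pivot rows keyed by (outer, inner) into one wide row per outer label.

    rows    -- list of ((outer, inner), {column: value}) pairs
    columns -- list of column names
    Returns {outer: {(column, inner): value}}, with None for missing cells.
    """
    # The distinct inner labels must be known before the layout is fixed,
    # which costs a full pass over the data.
    inner_labels = sorted({inner for (_outer, inner), _values in rows})
    wide = {}
    for (outer, inner), values in rows:
        out_row = wide.setdefault(
            outer, {(c, i): None for c in columns for i in inner_labels}
        )
        for c in columns:
            out_row[(c, inner)] = values[c]
    return wide


rows = [
    (("X", "A"), {"A": "a", "B": 1, "C": 2}),
    (("X", "B"), {"A": "b", "B": 3, "C": 4}),
    (("Y", "C"), {"A": "c", "B": 5, "C": 6}),
    (("Y", "D"), {"A": "d", "B": 7, "C": 7}),
]
wide = unstack_by_pivot(rows, ["A", "B", "C"])
print(len(wide), len(wide["X"]))
```

Each output row still carries one cell per (column, inner label) pair, so the width concern raised earlier in the thread applies to this approach as well.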
rising-star92 added a commit to rising-star92/databricks-koalas that referenced this issue Jan 27, 2023
This PR implements `DataFrame.unstack()`. This PR intentionally does not implement `level` with multi-index. See the discussion at databricks/koalas#886
When a multi-index is not used, it is okay to implement.

Resolves #886
rising-star92 added a commit to rising-star92/databricks-koalas that referenced this issue Jan 27, 2023
We can support unstack with MultiIndex using pivot_table.
But as discussed at databricks/koalas#886 (comment), we should still be careful when using it.