crosstab in 0.23.4 does not respect categorical variables #22453
Comments
Can you try on master? The examples from the docs worked fine for me, and I think we have CI to check that as well.
Example run on master:
It seems to be a regression in 0.23.0: on 0.22 I still get the correct result (on pandas 0.20 it only partially works, with only the index correct).
@andrewcassidy if you have time, you're very welcome to look into what could be the cause.
Hi there, a handy bisect led me to this commit: b020891, which references PR #20583 (in short: there is no reindexing of categoricals now, so empty groups are omitted). This appears to be intentional behaviour wherein empty groups are dropped in a groupby, which affects other operations that use groupby under the hood. They don't mention crosstab specifically, but they do deal with a very similar issue in
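The behaviour described in the bisect comment above can be sketched with a plain groupby on a categorical key. This is only an illustration, assuming a pandas version in which `groupby` accepts the `observed` keyword (0.23.0 or later); the frame and the names `key`, `val`, `kept` and `dropped` are made up for the example and do not come from the issue.

```python
import pandas as pd

# Hypothetical data: category "c" is declared but never observed.
cat = pd.Categorical(["a", "b"], categories=["a", "b", "c"])
df = pd.DataFrame({"key": cat, "val": [1, 2]})

# observed=False keeps every declared category, including the empty group "c".
kept = df.groupby("key", observed=False)["val"].sum()

# observed=True keeps only the categories that actually occur in the data.
dropped = df.groupby("key", observed=True)["val"].sum()

print(kept)
print(dropped)
```

Any operation that groups on observed values only will lose the empty category in the same way, which is consistent with the missing `c` row and `f` column reported for crosstab.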
By the way, not only does the example not work, the documentation preceding it does not reflect the current functionality:
This was already pointed out in a related (open!) issue, #16367.
Pretty sure this is the same as #16367, so closing in favour of that one (though please do let me know if I've misunderstood).
Code Sample, a copy-pastable example if possible
col_0  d  e
row_0
a      1  0
b      0  1
Problem description
https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.crosstab.html
The example in the docs DOES NOT work: the categories with no observations (row c and column f) are missing from the result.
Expected Output
col_0  d  e  f
row_0
a      1  0  0
b      0  1  0
c      0  0  0
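The code for this report is not preserved above, so the following is a sketch of a reproduction plus one possible workaround, not the reporter's own snippet. It assumes the `foo` and `bar` categoricals from the pandas crosstab docs example, which match the row and column labels in the outputs shown; reindexing against the declared categories is a suggestion here, not something proposed in the thread.

```python
import pandas as pd

# Assumed inputs, modelled on the crosstab docs example:
# "c" and "f" are declared categories with no observations.
foo = pd.Categorical(["a", "b"], categories=["a", "b", "c"])
bar = pd.Categorical(["d", "e"], categories=["d", "e", "f"])

# On the affected versions this contains only the observed labels.
table = pd.crosstab(foo, bar)

# Workaround sketch: reindex against the declared categories and
# fill the combinations that were never observed with 0.
full = table.reindex(
    index=list(foo.categories),
    columns=list(bar.categories),
    fill_value=0,
)

print(full)
```

Because the reindex targets are the full category lists, the result has one row and one column per declared category regardless of whether crosstab itself dropped the empty ones.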
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: None
pip: 10.0.1
setuptools: 40.0.0
Cython: None
numpy: 1.15.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: 0.6.0
pandas_datareader: None