BUG: method .nunique on categorical series in v0.21 with only NaNs gives ValueError #18051
Comments
This individual issue can be fixed with a simple If this is something in cython or requires larger refactoring, this will be beyond my ability, I'm sorry.
no, this should be fixed in cython.
This is a regression from v0.20.3. Maybe this has something to do with the new CategoricalDtype, @TomAugspurger?
I think it's more likely to be the changes to
If `old_categories` is empty (all nan categories) then `_recode_for_categories` should return `codes.copy()` so that the writable flag is True.
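The writable-flag point in that suggestion can be illustrated with plain NumPy. The helper `recode_sketch` below is hypothetical and is not the pandas `_recode_for_categories` implementation; it is only a sketch, assuming the codes array arrives read-only, of why returning `codes.copy()` gives the caller a writable array.

```python
import numpy as np


def recode_sketch(codes, old_categories):
    """Hypothetical stand-in for illustration; not the pandas internals."""
    if len(old_categories) == 0:
        # All values are missing, so there is nothing to recode. Return a
        # fresh copy so the result is writable even if the input is not.
        return codes.copy()
    raise NotImplementedError("only the empty-categories branch is sketched")


codes = np.array([-1, -1], dtype="int8")
codes.setflags(write=False)  # simulate a read-only codes array

result = recode_sketch(codes, old_categories=[])
print(codes.flags.writeable, result.flags.writeable)
```

The copy owns its own data, so its `writeable` flag is set regardless of the flag on the input.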
@topper-123 @TomAugspurger @jreback please review #18279
Rebased to remove conflicts in whatsnew.
…-dev#18436) (cherry picked from commit b45325e)
Code Sample, a copy-pastable example if possible
Problem description
The above code gave 0 in v0.20.3 and is expected to give 0 in v0.21 as well. The problem is independent of whether I set any categories.
EDIT: Actually, this doesn't raise an error if I set categories, so it only happens when no categories are set. The use case for no categories, in my case, is programmatically reading in data where some columns are empty and of categorical dtype.
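The original snippet is not preserved in this copy of the report. A minimal sketch of the case described might look like the following, assuming the series is built by casting NaN-only data to `category` without passing any categories (this construction is an assumption, not necessarily the reporter's exact code):

```python
import numpy as np
import pandas as pd

# Assumed reproduction: NaN-only data cast to category, no explicit categories.
s = pd.Series([np.nan, np.nan]).astype("category")

# The report expects 0 here; pandas 0.21.0 reportedly raised ValueError instead.
n_unique = s.nunique()
print(n_unique)
```

On a pandas version that includes the fix, the call should return without raising.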
Expected Output
0 (zero)
Output of `pd.show_versions()`
INSTALLED VERSIONS
commit: 8137209
python: 3.5.4.final.0
python-bits: 32
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.21.0
pytest: 3.2.3
pip: 9.0.1
setuptools: 36.5.0.post20170922
Cython: None
numpy: 1.13.3
scipy: None
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.4.8
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None