Pandas groupby sum misbehaves when one of the columns has string objects. #24196
Labels
Bug
Groupby
Needs Tests: Unit test(s) needed to prevent regressions
Nuisance Columns: Identifying/Dropping nuisance columns in reductions, groupby.add, DataFrame.apply
Reduction Operations: sum, mean, min, max, etc.
Code Sample, a copy-pastable example if possible
Problem description
With min_count=1, the code above should return NaN for column 'B' of the DataFrame, as described in the DataFrame.sum documentation, but it returns 0 instead.
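For reference, here is a minimal sketch of the min_count behaviour this report relies on. It is not the original code sample; the frame and the names `df`, `default_sum`, `strict_sum`, and `grouped` are made up for illustration, and only numeric data is used so that the string-column problem does not come into play.

```python
import numpy as np
import pandas as pd

# Hypothetical data (not from the original report): 'B' is entirely NaN.
df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [np.nan] * 4})

# With the default min_count=0, NaN values are skipped and the sum
# of an all-NaN column is 0.0.
default_sum = df["B"].sum()

# With min_count=1, at least one non-NA value is required;
# otherwise the result is NaN.
strict_sum = df["B"].sum(min_count=1)

# The same rule is expected to hold per group in a groupby reduction.
grouped = df.groupby("A")["B"].sum(min_count=1)

print(default_sum, strict_sum)
print(grouped)
```

The expectation in this report is that column 'B' follows the same rule regardless of what other columns the DataFrame contains.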
Actual Output
Expected Output
When I remove the column containing strings, the result for column 'B' is NaN, as expected. However, the columns of a DataFrame should be independent of each other, so this looks like a bug.
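A rough sketch of that workaround, under assumed data: the string column 'C' and the names `df` and `numeric_only` are hypothetical and not taken from the original sample. Selecting the numeric column before reducing keeps the object column out of the aggregation entirely.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'B' is all NaN and 'C' holds string objects.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [np.nan] * 4,
    "C": ["x", "y", "z", "w"],  # assumed string column, for illustration only
})

# Select only the numeric column before summing, so the string column
# cannot influence how 'B' is reduced.
numeric_only = df.groupby("A")[["B"]].sum(min_count=1)

print(numeric_only)
```

Dropping the column up front with `df.drop(columns="C")` before calling `groupby` should be equivalent for this purpose.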
Output
Output of pd.show_versions()
pandas: 0.23.4
pytest: 4.0.1
pip: 8.1.1
setuptools: 40.6.2
Cython: 0.29.1
numpy: 1.15.4
scipy: 1.1.0
pyarrow: 0.11.1
xarray: 0.11.0
IPython: 7.2.0
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.7
blosc: 1.6.2
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.8
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.12
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml: 4.2.5
bs4: 4.6.3
html5lib: 0.999
sqlalchemy: 1.2.14
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: 0.2.0
pandas_gbq: None
pandas_datareader: None