BUG: DataFrame[Int64].mean().dtype is object, should be Float64 #42895
Comments
df.iloc[:, 1] = pd.NA # <-- incorrectly casts to object, let's cast back and ignore that for now
This is a whole separate issue, reported in #44199.
Also, I now know how to fix your example. Fixed:
arr = np.random.randn(4, 3).astype("int64")
df = pd.DataFrame(arr).astype(pd.Int64Dtype())
df.iloc[:, 1] = pd.NA # no more casting issues
# df = df.astype("Int64") # not needed anymore
res = df.mean()
>>> res
0   -0.50
1     NaN
2    0.25
dtype: float64
Not quite. The point of the example is that we get unwanted behavior when we have an all-NA Int64 column.
@jbrockmendel This is really not about
@Demetrio92 maybe it's not clear
@jreback So, for testing Issues get fixed when devs see what the problem is. We can create 1000 tickets all stating the same ".dtype is object, should be Float64" by using
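For reference, a minimal sketch of the all-NA Int64 case described above. It assumes deterministic data in place of the random values used in the thread, and it assigns the column with `pd.array(..., dtype="Int64")` instead of `df.iloc[:, 1] = pd.NA`, so the separate casting problem does not get in the way. The dtype of the result is printed, not asserted, because it depends on the pandas version.

```python
import numpy as np
import pandas as pd

# Assumption: fixed integers instead of np.random.randn, so the result is reproducible.
arr = np.arange(12, dtype="int64").reshape(4, 3)
df = pd.DataFrame(arr).astype("Int64")

# Replace column 1 with an all-NA nullable-integer column.
# Assigning a typed array keeps the column dtype as Int64.
df[1] = pd.array([pd.NA] * len(df), dtype="Int64")

print(df.dtypes)   # every column should still be Int64
res = df.mean()
print(res)
print(res.dtype)   # the thread reports object here; Float64 is the desired dtype
```

If `res.dtype` prints `object` on your version, you are seeing the behavior this issue is about.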
They look the same to me. If you think there is something different, please open a separate issue for that to avoid derailing this one.
Incorrect. In both your example and the OP, after setting
I was digging into the related issue and figured out that, despite that casting problem:
arr = np.random.randn(4, 3).astype("int64")
df = pd.DataFrame(arr).astype("Int64")
df.iloc[:, 1] = pd.NA # <-- incorrectly casts to object, let's cast back and ignore that for now
# df = df.astype("Int64")
res = df.mean()
Not sure why, but the above code works as expected.
Tested using pandas
@Demetrio92 the underlying cause here is the lack of 2D support for IntegerArray. Further investigation is not a good use of your time. A place where investigative eyeballs would be very helpful is https://github.com/pandas-dev/pandas/blob/master/pandas/core/internals/blocks.py#L1265 "# TODO: in all tests we have mask.all(); can we rely on that?"
Closed by #52788
This breaks the last corner case I want to test for #33036.
Trivial to fix with 2D EAs, not sure how/if to fix it without.