BUG: pd.NA, when it replaces a value in a column, changes its type to "object" #44199
Comments
I think this is expected, pd.NA is not a float or an int. If you define your column as
@phofl yes, I see that. I used Still, the presence of null values should not change the type of a column. This is an overarching principle, and if pandas breaks it, it will set a precedent. No SQL-compatible database behaves this way, and neither do R data.frames. So, I see two solutions here:
1 is not feasible right now, and it has not been decided whether this will be the case in the future. 2: We have open issues discussing the behavior of setting incompatible values into a DataFrame column. One option would be raising here
Nonetheless, can you reference the issue? I feel like maybe this one is a duplicate. I did a bit of searching before reporting, but couldn't find whether this had been discussed already.
1 was discussed on the mailing list, I think. No conclusion has been reached yet. I would have to look up 2 myself; it is somewhere under Indexing and is a bit older.
There has been discussion about this. At the moment these dtypes are a) much buggier and less stable than numpy dtypes and b) much slower when dealing with many columns. Both of these are improving, but these dtypes are going to stay opt-in at least until they are reasonably stable. For FloatingDtypes in particular #32265 is a tough nut to crack.
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
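The reporter's original snippet is not preserved on this page. What follows is only a minimal sketch of the behavior described in the title, not the original example. The helper name dtype_after_na and the sample values are made up for illustration. Depending on the pandas version, the float64 result may be object (as reported here) or float64, so the sketch prints it rather than assuming either.

```python
import pandas as pd


def dtype_after_na(dtype):
    """Return the column dtype after overwriting one value with pd.NA.

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame({"a": pd.Series([1.0, 2.0, 3.0], dtype=dtype)})
    df.loc[0, "a"] = pd.NA  # replace a single value with the NA scalar
    return df["a"].dtype


# Default numpy-backed float column: the result depends on the pandas version.
numpy_result = dtype_after_na("float64")
print("float64 ->", numpy_result)

# Opt-in nullable float column: pd.NA is a native value, so the dtype is kept.
nullable_result = dtype_after_na("Float64")
print("Float64 ->", nullable_result)
```

Declaring the column with the nullable "Float64" dtype up front is the opt-in route mentioned in the comments. The default float64 column is the case this report is about.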
Issue Description
Replacing one value in a column with NA/NULL should not change the column's data type. That seems reasonable. Also, it seems the functionality is already there. I am not entirely sure if this is a bug or a feature.
Essentially it is due to the default assignment of a column type, which is float64, not pd.Float64Dtype(). I am not sure if the migration is on the roadmap, but this bug could be an argument in its favor.
Expected Behavior
Installed Versions