ENH: Infer column dtype when constructing object with pd.NA #58366

Open
1 of 3 tasks
amanlai opened this issue Apr 22, 2024 · 2 comments
Labels
Dtype Conversions (unexpected or buggy dtype conversions), Enhancement, NA - MaskedArrays (related to pd.NA and nullable extension arrays), Needs Discussion (requires discussion from core team before further action)

Comments

@amanlai
Contributor

amanlai commented Apr 22, 2024

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

When a list containing integers and pd.NA values is passed to a pandas constructor such as pd.Series, the resulting column has dtype=object. An example illustrating the issue:

Feature Description

pd.Series([pd.NA, 1]).dtype            # object

Alternative Solutions

Would it be possible for the constructor to infer an extension dtype such as 'Int64', or at least a numeric dtype such as float64 or Float64?
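For reference, a minimal sketch of what is already possible today without any inference change: passing the dtype explicitly, or going through pd.array, which infers a nullable integer array from the values. The variable names are only for illustration.

```python
import pandas as pd

# Explicitly requesting the nullable integer dtype avoids the object fallback.
explicit = pd.Series([pd.NA, 1], dtype="Int64")
print(explicit.dtype)  # Int64

# pd.array infers a nullable extension array from the values,
# unlike the Series constructor.
inferred = pd.array([pd.NA, 1])
print(inferred.dtype)  # Int64
```

Both require the caller to opt in, which is the step this request would like to make unnecessary.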

Additional Context

No response

@amanlai amanlai added the Enhancement and Needs Triage (issue that has not been reviewed by a pandas team member) labels on Apr 22, 2024
@asishm
Contributor

asishm commented Apr 22, 2024

You can use pd.Series.convert_dtypes to achieve this.

In [39]: pd.Series([pd.NA, 1]).convert_dtypes().dtype
Out[39]: Int64Dtype()

That said, None does get converted to nan in pd.Series([None, 1]) and gets assigned a float dtype. pd.NA having similar behavior would make sense.
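A short side-by-side of the behaviors described above, as a sketch run against a recent pandas release (the variable names are only for illustration):

```python
import pandas as pd

with_none = pd.Series([None, 1])
with_na = pd.Series([pd.NA, 1])

# None is converted to NaN, so the result is a float column.
print(with_none.dtype)  # float64

# pd.NA is kept as-is, so the result falls back to object.
print(with_na.dtype)  # object

# convert_dtypes() then picks the nullable integer dtype.
print(with_na.convert_dtypes().dtype)  # Int64
```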

@mroeschke
Member

#58243 is discussing the path for nullable types being returned by default.

Having a single sentinel change the returned dtype leads to value-dependent behavior, which pandas is trying to avoid, so I think this change would be better suited for the migration in #58243.
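To make the value-dependence concern concrete, here is a rough sketch. `construct_with_na_inference` is a hypothetical helper written only for this illustration; it is not a pandas API and not necessarily how the request would be implemented. It mimics the requested inference by calling convert_dtypes() when a pd.NA is present.

```python
import pandas as pd


def construct_with_na_inference(values):
    """Hypothetical helper (not part of pandas) mimicking the requested inference."""
    ser = pd.Series(values)
    # Switch to nullable dtypes only when the input contains pd.NA.
    if any(v is pd.NA for v in values):
        ser = ser.convert_dtypes()
    return ser


no_na = construct_with_na_inference([1, 2])
with_na = construct_with_na_inference([1, pd.NA])

# Same constructor, same kind of input, but the dtype family differs
# depending on whether one value happens to be pd.NA.
print(no_na.dtype)    # int64
print(with_na.dtype)  # Int64
```

Code downstream of such a constructor would see NumPy-backed or masked-array-backed columns depending on the data alone.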

@mroeschke mroeschke added the Dtype Conversions (unexpected or buggy dtype conversions), Needs Discussion (requires discussion from core team before further action), and NA - MaskedArrays (related to pd.NA and nullable extension arrays) labels and removed the Needs Triage (issue that has not been reviewed by a pandas team member) label on Apr 24, 2024