BUG: DataFrame construction with dictionary ArrowDtype columns #53654
    import pyarrow as pa

    if pa.types.is_dictionary(dtype.pyarrow_dtype):
        other = other.astype(ArrowDtype(dtype.pyarrow_dtype.value_type))
This is going to have implications beyond the constructors.
Can you add test(s) that directly test the affected indexing methods?
Added some tests for get_indexer, get_indexer_non_unique that seem related. LMK if there are other methods you also had in mind.
New tests look good, thanks.
I think we can make this a lot cheaper though, since the places that call _unpack_nested_dtype only need the result's dtype. So with a little tinkering we can just return that and avoid doing a cast.
Makes sense. Might be better as a followup since this change is being backported?
sure
Looks to be greenish so merging, but can follow up if needed.
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon! Remember to remove the If these instructions are inaccurate, feel free to suggest an improvement. |
…nary ArrowDtype columns
…s-dev#53654) * BUG: DataFrame construction with dictionary ArrowDtype columns * Add tests for get_indexer * Windows
doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.