DEPR: flags #52165

Open
jbrockmendel opened this issue Mar 24, 2023 · 4 comments
Labels
Deprecate (Functionality to remove in pandas), Needs Discussion (Requires discussion from core team before further action)

Comments

@jbrockmendel
Member

jbrockmendel commented Mar 24, 2023

Splitting discussion off from #51280
PR #52153

The checking and propagation of flags in __finalize__ means a small-but-everywhere performance hit for all users that we should deprecate.

Flags only has allows_duplicate_labels, and duplicate labels can instead be disallowed by a 3rd-party validation library.
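To illustrate the alternative mentioned above, here is a minimal sketch of a duplicate-label check done outside of pandas' flags machinery. The helper name `require_unique_labels` and the sample frames are hypothetical, made up for this example; they are not taken from any particular validation library. The check runs once, where the caller asks for it, rather than on every operation.

```python
import pandas as pd


def require_unique_labels(obj):
    """Hypothetical helper: raise if any axis of `obj` has duplicate labels."""
    for axis_number, axis in enumerate(obj.axes):
        if not axis.is_unique:
            # Collect each duplicated label once for the error message.
            dupes = axis[axis.duplicated()].unique().tolist()
            raise ValueError(f"axis {axis_number} has duplicate labels: {dupes}")
    return obj


# Unique labels pass through unchanged.
ok = require_unique_labels(pd.DataFrame({"a": [1, 2]}, index=["x", "y"]))

# Duplicate row labels are rejected at the point of validation.
try:
    require_unique_labels(pd.DataFrame({"a": [1, 2]}, index=["x", "x"]))
    message = None
except ValueError as exc:
    message = str(exc)
```

The trade-off is that nothing propagates: a later operation can reintroduce duplicates unless the check is called again.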

@jbrockmendel added the Bug, Needs Triage, Deprecate, and Needs Discussion labels and removed the Bug and Needs Triage labels on Mar 24, 2023
@mroeschke
Member

mroeschke commented Mar 24, 2023

@jorisvandenbossche
Member

For context, .flags / set_flags was a new feature added in pandas 1.2. It was designed as a general mechanism, but at the time it was added specifically for the "optionally disallow duplicate labels" option (https://pandas.pydata.org/docs/whatsnew/v1.2.0.html#optionally-disallow-duplicate-labels). See #27108 / #28394 (cc @TomAugspurger)
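For readers unfamiliar with the feature, here is a minimal sketch of the opt-in behavior described in that whatsnew entry. It uses only the public `set_flags` / `.flags` API and `pandas.errors.DuplicateLabelError`; the sample Series is made up for illustration.

```python
import pandas as pd
from pandas.errors import DuplicateLabelError

# Opt in: this Series may not carry duplicate labels.
s = pd.Series([0, 1], index=["a", "b"]).set_flags(allows_duplicate_labels=False)

# The flag is carried over to results derived from `s`.
derived = s.add(1)
flag_propagated = not derived.flags.allows_duplicate_labels

# An operation that would introduce a duplicate label is rejected.
try:
    s.rename({"a": "b"})
    raised = False
except DuplicateLabelError:
    raised = True
```

Objects that never call `set_flags` keep the default `allows_duplicate_labels=True`.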

@TomAugspurger
Contributor

TomAugspurger commented Mar 27, 2023

> The checking and propagation of flags in __finalize__ means a small-but-everywhere performance hit for all users that we should deprecate.

Is that specific to the flags mechanism, or is it something to do with calling __finalize__ in the first place? I'd be fine with a dedicated boolean to propagate the duplicate labels information.

@jbrockmendel
Member Author

> Is [the performance penalty of flags] specific to the flags mechanism, or is it something to do with calling __finalize__ in the first place?

I think of it as being __finalize__ holistically.
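As a rough illustration of why the cost is described as "small-but-everywhere", here is a toy model in plain Python. It is not pandas code: the `Labeled` class, its `_finalize` hook, and the call counter are all hypothetical, and the real `__finalize__` does more than this. The sketch only shows the shape of the problem: a propagation hook that runs after every derived result, including for users who never opted in to the duplicate-label check.

```python
class Labeled:
    """Toy labeled container (hypothetical, not part of pandas)."""

    finalize_calls = 0  # counts how often the hook runs

    def __init__(self, labels, values, allows_duplicate_labels=True):
        self.labels = list(labels)
        self.values = list(values)
        self.allows_duplicate_labels = allows_duplicate_labels

    def _finalize(self, other):
        # Runs for every derived object, whether or not the flag was ever set.
        type(self).finalize_calls += 1
        self.allows_duplicate_labels = other.allows_duplicate_labels
        if not self.allows_duplicate_labels and len(set(self.labels)) != len(self.labels):
            raise ValueError("duplicate labels")
        return self

    def add(self, n):
        return Labeled(self.labels, [v + n for v in self.values])._finalize(self)

    def relabel(self, mapping):
        new_labels = [mapping.get(label, label) for label in self.labels]
        return Labeled(new_labels, self.values)._finalize(self)


# A user who never opts in still pays for the hook on each operation.
plain = Labeled(["a", "b"], [1, 2])
result = plain.add(1).add(1).relabel({"a": "c"})
calls_for_plain_user = Labeled.finalize_calls

# A user who opts in gets the check wherever a result is derived.
strict = Labeled(["a", "b"], [1, 2], allows_duplicate_labels=False)
try:
    strict.relabel({"a": "b"})
    rejected = False
except ValueError:
    rejected = True
```

In this toy, three chained operations trigger three hook calls for the user who never set the flag.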
