PERF: copy cached attributes on index shallow_copy #32568
pandas/core/indexes/base.py
Outdated
@@ -499,10 +501,12 @@ def _shallow_copy(self, values=None, name: Label = no_default):
        """
        name = self.name if name is no_default else name

        if values is None:
            values = self.values
        cache = self._cache if values is None else {}
Can you keep the "if values is None:" check? I like it for inspecting coverage results.
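To illustrate the request above, here is a minimal sketch of a shallow copy that keeps the explicit branch, so a coverage report shows separately whether the "same values" and "new values" paths were exercised. The class TinyIndex and its attributes are hypothetical stand-ins, not the pandas implementation.

```python
class TinyIndex:
    """Hypothetical stand-in for an index; not pandas' real implementation."""

    def __init__(self, values, name=None):
        self.values = list(values)
        self.name = name
        self._cache = {}  # lazily computed attributes are stored here

    @property
    def is_unique(self):
        # Compute once, then serve the result from the cache.
        if "is_unique" not in self._cache:
            self._cache["is_unique"] = len(set(self.values)) == len(self.values)
        return self._cache["is_unique"]

    def _shallow_copy(self, values=None, name=None):
        name = self.name if name is None else name
        if values is None:
            # Same data: cached results stay valid, so carry them over.
            values = self.values
            cache = dict(self._cache)
        else:
            # New data: cached results may be stale, so start empty.
            cache = {}
        result = TinyIndex(values, name=name)
        result._cache = cache
        return result


idx = TinyIndex([1, 2, 3], name="a")
idx.is_unique                             # populates idx._cache
same = idx._shallow_copy()                # takes the "values is None" branch
fresh = idx._shallow_copy(values=[1, 1])  # takes the "else" branch
```

With the branch written out, each path is a distinct set of lines, which a one-line conditional expression would hide from line-based coverage.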
LGTM pending green.
Should the engine be shared? If so, maybe access it to ensure it is in _cache before copying _cache?
If the
My hunch is that it's mostly not that important, because either the cache has already been populated, or the original index will not be used again (e.g. in a pipe, often). But in some given cases we'd prefer either 1 or 2, depending on context, but we'd have to choose. I prefer 2., because in that case
Makes sense, thanks for explaining your thought process.
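The question about the engine can be sketched as follows. This is only an illustration under assumed names (LazyIndex, build_log, and both copy methods are hypothetical, and the "engine" is a plain dict lookup table): if the engine has not been built when the cache is copied, the original and the copy each build their own, whereas touching it first lets both share one.

```python
build_log = []  # records every engine build, for illustration only


class LazyIndex:
    """Hypothetical index with a lazily built, cached lookup engine."""

    def __init__(self, values, cache=None):
        self.values = list(values)
        self._cache = {} if cache is None else cache

    @property
    def _engine(self):
        if "_engine" not in self._cache:
            build_log.append(id(self))
            # Stand-in for an expensive engine: value -> position mapping.
            self._cache["_engine"] = {v: i for i, v in enumerate(self.values)}
        return self._cache["_engine"]

    def copy_cache_as_is(self):
        # Copies whatever happens to be cached right now.
        return LazyIndex(self.values, cache=dict(self._cache))

    def copy_after_touching_engine(self):
        self._engine  # force the engine into _cache before copying
        return LazyIndex(self.values, cache=dict(self._cache))


a = LazyIndex("abc")
b = a.copy_cache_as_is()   # engine not built yet, so nothing to share
b._engine
a._engine                  # a and b each build their own engine
builds_without_touch = len(build_log)

build_log.clear()
c = LazyIndex("abc")
d = c.copy_after_touching_engine()
d._engine
c._engine                  # both reuse the engine built before the copy
builds_with_touch = len(build_log)
```

The trade-off is that touching the engine pays the build cost up front, even when neither the original nor the copy ever needs it.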
Thanks @topper-123. If you think we need additional asv benchmarks, happy to have them (e.g. is_monotonic, is_unique, MI indexing, and .get_loc would all benefit). We might have enough; just checking in about this.
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
The performance of the example in #28584 is:
The issue still affects the extension indexes, e.g. CategoricalIndex._shallow_copy. I'd like to take them afterwards.