PERF: fix long string representation #36638
Seems more reasonable - how does the performance look?
Before refactoring:
After refactoring:
After this fix:
Thanks! Changes look good.
Would be nice to maybe further profile before refactor / after this PR to see where the additional slowdown is coming from (but doesn't necessarily need to be here)
I guess, I see some area for improvement. In When refactoring I was concerned with the statement |
With "upcoming changes", you mean code that is now not yet in formats.py, but you are wanting to do in future PRs? (and if so, can you give some examples of what you have in mind?) Or is there now already some code that mutates |
By the "upcoming changes" I meant changes in Regarding the future PRs on the topic. Right now I am working on restructuring formatters, in an attempt to have them more aligned with each other. PR in progress #36510.
Thanks for the explanation!
Actually, with python assignment/reference semantics, the |
yeah i would remove the .copy as it is not necessary (you could also add a test to assert that we don't mutate the input), but doesn't need to be in this PR
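A test along the lines suggested above might look roughly like the sketch below. It is not code from this PR: the helper name `check_repr_does_not_mutate` is hypothetical, and it only relies on the public `DataFrame.to_string` and `pd.testing.assert_frame_equal` calls, taking a deep copy beforehand to compare against.

```python
import pandas as pd


def check_repr_does_not_mutate(frame, max_rows=4, max_cols=4):
    """Hypothetical check: rendering a truncated repr leaves `frame` unchanged."""
    # Snapshot the input so there is something independent to compare against.
    expected = frame.copy(deep=True)

    # Render with truncation in both directions; the returned string is ignored.
    frame.to_string(max_rows=max_rows, max_cols=max_cols)

    # Raises AssertionError if values, index, columns or dtypes changed.
    pd.testing.assert_frame_equal(frame, expected)


# A frame large enough that both rows and columns get truncated.
frame = pd.DataFrame({f"c{i}": range(20) for i in range(10)})
check_repr_does_not_mutate(frame)
```

If the formatter ever mutated the frame it was handed, the final comparison would fail, which is the property the `.copy` was guarding.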
@ivanovmg thanks for the quick follow-up!
- closes PERF: large perf regression in DataFrame repr #36636
- tests added / passed
- passes `black pandas`
- passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
- whatsnew entry
Fix long string representation for large dataframes.
Eliminate for loop, which was filtering out the proper rows/columns to be displayed.
Revert to the original implementation with concat-ing head+tail and left+right parts.
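
The head+tail and left+right approach described above can be sketched as follows. This is a simplified illustration rather than the actual formatter code: the function name `truncate_frame` is hypothetical, and it assumes `max_rows` and `max_cols` are both at least 2.

```python
import pandas as pd


def truncate_frame(frame, max_rows, max_cols):
    """Hypothetical sketch: keep only the outer rows/columns of `frame`."""
    if max_rows < 2 or max_cols < 2:
        raise ValueError("this sketch assumes max_rows >= 2 and max_cols >= 2")

    if len(frame.columns) > max_cols:
        col_num = max_cols // 2
        left = frame.iloc[:, :col_num]
        right = frame.iloc[:, -col_num:]
        # Two slices and one concat instead of looping over every column.
        frame = pd.concat([left, right], axis=1)

    if len(frame) > max_rows:
        row_num = max_rows // 2
        head = frame.iloc[:row_num]
        tail = frame.iloc[-row_num:]
        # Same idea along the row axis.
        frame = pd.concat([head, tail])

    return frame


wide_and_long = pd.DataFrame({f"c{i}": range(100) for i in range(10)})
truncated = truncate_frame(wide_and_long, max_rows=4, max_cols=4)
```

Slicing touches only the rows and columns that will actually be displayed, so the cost does not grow with the size of the frame the way an element-by-element filter does.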