Improve url_decode performance for long strings #7353
Conversation
Using the benchmark, here are the before-and-after performance numbers on a V100. The two numbers varying in the benchmark name are the number of rows and the number of characters per row, respectively.
Before/After: [benchmark result tables omitted]
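For a rough picture of what such a benchmark looks like, here is a hedged sketch using Google Benchmark (which cudf's benchmark suite is built on) and cudf's public `cudf::strings::url_decode` API. It is not the benchmark added by this PR; `BM_url_decode`, its argument scheme, and the escape-sequence frequency are all hypothetical.

```cpp
#include <benchmark/benchmark.h>
#include <cudf/strings/convert/convert_url.hpp>
#include <cudf/strings/strings_column_view.hpp>
#include <cudf_test/column_wrapper.hpp>

#include <cuda_runtime.h>

#include <string>
#include <vector>

static void BM_url_decode(benchmark::State& state)
{
  auto const num_rows      = static_cast<int>(state.range(0));
  auto const chars_per_row = static_cast<std::size_t>(state.range(1));

  // Build rows of mostly plain characters with an occasional %XX escape.
  // (An escape occupies three characters, so row lengths are approximate.)
  std::string row;
  while (row.size() < chars_per_row) {
    row += (row.size() % 10 == 0) ? "%20" : "a";
  }
  std::vector<std::string> rows(num_rows, row);
  cudf::test::strings_column_wrapper input(rows.begin(), rows.end());
  cudf::strings_column_view view{static_cast<cudf::column_view>(input)};

  for (auto _ : state) {
    auto result = cudf::strings::url_decode(view);
    cudaDeviceSynchronize();  // ensure the GPU work lands inside the timing
  }
}

// The two arguments are the number of rows and the characters per row,
// matching the naming convention described above.
BENCHMARK(BM_url_decode)->Args({100000, 10})->Args({100, 100000});
BENCHMARK_MAIN();
```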
This looks like a good solution. I don't think it is handling sliced columns correctly, though.
A few minor comments. Will take another look once the sliced-input tests are in place.
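Slicing matters here because a sliced strings column shares its parent's offsets and character buffers with a nonzero row offset, so a kernel cannot assume the data starts at character index zero. Below is a hedged sketch of the kind of sliced-input test being requested, written against cudf's test utilities; the tests actually added by the PR may differ.

```cpp
#include <cudf/copying.hpp>
#include <cudf/strings/convert/convert_url.hpp>
#include <cudf/strings/strings_column_view.hpp>
#include <cudf_test/column_utilities.hpp>
#include <cudf_test/column_wrapper.hpp>

void sliced_url_decode_example()
{
  cudf::test::strings_column_wrapper input({"a%20b", "c%2Fd", "e%3Ff", "gh"});

  // Take rows 1..2 only; the slice keeps the parent's buffers and a nonzero
  // row offset, which is exactly what a kernel can mishandle.
  auto const sliced = cudf::slice(input, {1, 3}).front();

  auto const result =
    cudf::strings::url_decode(cudf::strings_column_view(sliced));

  cudf::test::strings_column_wrapper expected({"c/d", "e?f"});
  CUDF_TEST_EXPECT_COLUMNS_EQUAL(*result, expected);
}
```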
Codecov Report

@@            Coverage Diff             @@
##           branch-0.19    #7353   +/-   ##
==============================================
  Coverage            ?     82.22%
==============================================
  Files               ?        100
  Lines               ?      16969
  Branches            ?          0
==============================================
  Hits                ?      13953
  Misses              ?       3016
  Partials            ?          0
==============================================

Continue to review full report at Codecov.
Thanks a ton for the quick reviews! I have yet to address the excellent feedback, but I wanted to post a slightly modified version of the algorithm that I found to be significantly faster in practice, to see if there are any suggestions or feedback on the approach. Rather than perform the expensive binary search from each character index three times (during […]).

This bears out in the new performance numbers. I added a benchmark variant with a 50% chance of an escape sequence, to compare behavior as escape sequences become significantly more common.
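For concreteness, here is a minimal sketch of the character-parallel escape detection under discussion. It is not the PR's actual kernel, and every name in it is hypothetical: each thread owns one character index, tests whether it begins a valid `%XX` escape, and runs a binary search over the row offsets to find the row containing that character, the per-character cost the comment above refers to.

```cuda
__device__ bool is_hex_digit(char ch)
{
  return (ch >= '0' && ch <= '9') || (ch >= 'A' && ch <= 'F') ||
         (ch >= 'a' && ch <= 'f');
}

// Binary search: return the row whose [offsets[row], offsets[row+1]) range
// contains character index idx. offsets has num_rows + 1 entries.
__device__ int find_row(int const* offsets, int num_rows, int idx)
{
  int lo = 0;
  int hi = num_rows;
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    if (offsets[mid + 1] <= idx) { lo = mid + 1; } else { hi = mid; }
  }
  return lo;
}

// Pass 1 of a two-pass decode: flag each character index that begins a valid
// "%XX" escape contained entirely within one row. A prefix sum over the flags
// then yields output sizes and positions for the second (copy) pass.
__global__ void flag_escapes(char const* chars, int const* offsets,
                             int num_rows, int num_chars, int* escape_flags)
{
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx >= num_chars) { return; }

  bool escape = false;
  if (chars[idx] == '%' && idx + 2 < num_chars &&
      is_hex_digit(chars[idx + 1]) && is_hex_digit(chars[idx + 2])) {
    int row = find_row(offsets, num_rows, idx);  // the costly per-char search
    escape  = (idx + 2 < offsets[row + 1]);      // escape must fit in the row
  }
  escape_flags[idx] = escape ? 1 : 0;
}
```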
Looks great. Just a couple of very minor comments.
CMake approval
@gpucibot merge
Fixes #7348.
This changes the `url_decode` algorithm from row-level parallelism to character-level parallelism, which improves performance when operating on string columns that have longer average string lengths. A benchmark for `url_decode` has also been added.
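To make the change concrete, here is a hedged sketch of the row-per-thread pattern being moved away from (hypothetical names, not the library's actual kernel). One thread walks an entire string, so a column with a few very long rows leaves most of the GPU idle; the character-level approach instead assigns roughly one unit of work per byte.

```cuda
__device__ inline int hex_value(char ch)
{
  if (ch >= '0' && ch <= '9') { return ch - '0'; }
  ch |= 0x20;  // fold to lowercase
  if (ch >= 'a' && ch <= 'f') { return ch - 'a' + 10; }
  return -1;   // not a hex digit
}

// Row-level parallelism: one thread decodes one whole string, so runtime is
// bound by the longest row rather than by the total character count.
__global__ void url_decode_per_row(char const* in_chars, int const* in_offsets,
                                   char* out_chars, int const* out_offsets,
                                   int num_rows)
{
  int row = blockIdx.x * blockDim.x + threadIdx.x;
  if (row >= num_rows) { return; }

  char* out     = out_chars + out_offsets[row];
  int const end = in_offsets[row + 1];
  for (int i = in_offsets[row]; i < end; ++i) {
    if (in_chars[i] == '%' && i + 2 < end &&
        hex_value(in_chars[i + 1]) >= 0 && hex_value(in_chars[i + 2]) >= 0) {
      *out++ = static_cast<char>(hex_value(in_chars[i + 1]) * 16 +
                                 hex_value(in_chars[i + 2]));
      i += 2;  // consume the "%XX" sequence
    } else {
      *out++ = in_chars[i];
    }
  }
}
```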