Improve scalar string replace performance for long strings #7415
Performance comparisons from the scalar benchmark.
After:
This is a lot of work and will take some time to go through. It looks good to me so far. These are my first-pass comments.
Codecov Report
@@ Coverage Diff @@
## branch-0.19 #7415 +/- ##
==============================================
Coverage ? 82.26%
==============================================
Files ? 101
Lines ? 17072
Branches ? 0
==============================================
Hits ? 14045
Misses ? 3027
Partials ? 0
Looks good. It would probably be worth exploring this for replace-multiple as well, but not in this PR.
@gpucibot merge
Fixes #7370.
This adds a scalar string replace algorithm with character-level parallelism, which significantly improves the performance of scalar string replacement on longer strings. It can involve many more kernel launches than the row-based algorithm and does not always outperform it on short strings. Therefore, a heuristic based on the average character length of valid string rows automatically selects which algorithm to use.
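For readers who want a concrete picture of the selection step, here is a minimal host-side sketch of such a heuristic. It is not the code from this PR: the names `replace_algorithm` and `choose_replace_algorithm` and the 256-byte cutoff are hypothetical, and the sketch measures length in bytes as an assumption. A real cutoff would be tuned against the benchmark.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration only; not the libcudf API.
enum class replace_algorithm { ROW_PARALLEL, CHAR_PARALLEL };

// Assumed cutoff in bytes per valid row. The real value is not stated in
// this PR description and would be chosen from benchmark results.
constexpr double assumed_threshold_bytes = 256.0;

// Pick an algorithm from the average length of the valid (non-null) rows.
// total_bytes: total size of the character data in the column
// valid_rows:  number of non-null rows
replace_algorithm choose_replace_algorithm(std::size_t total_bytes,
                                           std::size_t valid_rows,
                                           double threshold_bytes = assumed_threshold_bytes)
{
  assert(threshold_bytes > 0.0);

  // With no valid rows there is nothing to scan, so take the path
  // that needs fewer kernel launches.
  if (valid_rows == 0) { return replace_algorithm::ROW_PARALLEL; }

  // Null rows contribute no bytes, so dividing by the valid row count
  // keeps a sparse column of long strings from looking "short".
  double const avg_bytes =
    static_cast<double>(total_bytes) / static_cast<double>(valid_rows);

  return avg_bytes > threshold_bytes ? replace_algorithm::CHAR_PARALLEL
                                     : replace_algorithm::ROW_PARALLEL;
}
```

Dividing by the number of valid rows rather than the total row count matters for columns with many nulls: a column with a few very long strings and many null rows would otherwise be routed to the row-based path.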