Remove unnecessary null checks in `GroupColumn`s
#12944
Comments
I am working on a PoC, #12947, to check how much this idea improves performance.
I found it is actually not a trivial change. The challenge is that we should first refactor the code to support vectorized append. (Otherwise, we need to add a branch that decides which append function to call for each row, and as a result we remove one row-level branch only to add another...) I am trying it.
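To illustrate the trade-off described in this comment, here is a minimal sketch. It does not use DataFusion's or Arrow's actual types: `Int64Column`, `GroupBuilder`, `append_row`, and `vectorized_append` are hypothetical stand-ins, and the validity bitmap is modeled as a plain `Vec<bool>`. The row-level path checks for null on every row, while the batch-level path decides once per batch and then runs a loop with no null check.

```rust
/// Hypothetical, simplified stand-in for a nullable Arrow array of i64.
struct Int64Column {
    values: Vec<i64>,
    /// `true` = valid. `None` means there is no validity bitmap, i.e. no nulls.
    validity: Option<Vec<bool>>,
}

impl Int64Column {
    /// Counted on demand here to keep the sketch short; assume a real
    /// array type would have this number cheaply available.
    fn null_count(&self) -> usize {
        self.validity
            .as_ref()
            .map_or(0, |v| v.iter().filter(|valid| !**valid).count())
    }
}

/// Hypothetical builder that accumulates group values for one column.
#[derive(Default)]
struct GroupBuilder {
    values: Vec<i64>,
    /// `true` = the stored group value is null.
    nulls: Vec<bool>,
}

impl GroupBuilder {
    /// Row-level append: the null check runs once per row.
    fn append_row(&mut self, col: &Int64Column, row: usize) {
        let is_null = col.validity.as_ref().map_or(false, |v| !v[row]);
        self.nulls.push(is_null);
        // Store a placeholder value for null rows.
        self.values.push(if is_null { 0 } else { col.values[row] });
    }

    /// Batch-level append: the null decision is made once per batch.
    fn vectorized_append(&mut self, col: &Int64Column, rows: &[usize]) {
        if col.null_count() == 0 {
            // Fast path: no per-row null check at all.
            self.values.extend(rows.iter().map(|&r| col.values[r]));
            self.nulls.extend(std::iter::repeat(false).take(rows.len()));
        } else {
            // Slow path: fall back to the row-level check.
            for &r in rows {
                self.append_row(col, r);
            }
        }
    }
}
```

With only a row-level API, the `null_count() == 0` test would have to sit inside the per-row call, which replaces one per-row branch with another; the batch-level entry point is what lets the branch move out of the loop.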
These are the numbers I got locally:
Makes sense to vectorize appends at the same time; I think this might have even more impact than just omitting null checks.
I wonder whether we should vectorize eq as well. I implemented a simple version that only vectorizes the append operation in #12996, but it introduces another branch in the
Is your feature request related to a problem or challenge?
As mentioned by @Dandandan in #12809 (comment), some null checks are actually unnecessary for arrays containing no nulls (basically, we can just use `null_count` to check for this).
Describe the solution you'd like
As mentioned above, the simple way is to use the array's `null_count`.
Furthermore, I found that we already check which rows are null in `create_hashes`. Maybe we can reuse this result?
Describe alternatives you've considered
No response
Additional context
No response
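As a rough illustration of the `null_count` idea from the description, the sketch below applies it to the equality check between stored group values and input rows. Everything here is hypothetical and simplified: `InputColumn`, `get`, and `equal_to_batch` are not DataFusion or Arrow APIs, group values are modeled as `Option<i64>`, and the validity bitmap is a plain `Vec<bool>`.

```rust
/// Hypothetical, simplified stand-in for a nullable Arrow array of i64.
struct InputColumn {
    values: Vec<i64>,
    /// `true` = valid. `None` means there is no validity bitmap, i.e. no nulls.
    validity: Option<Vec<bool>>,
}

impl InputColumn {
    /// Counted on demand to keep the sketch short; assume a real array
    /// type would have this number cheaply available.
    fn null_count(&self) -> usize {
        match &self.validity {
            Some(bits) => bits.iter().filter(|valid| !**valid).count(),
            None => 0,
        }
    }

    /// Null-aware read of one row.
    fn get(&self, row: usize) -> Option<i64> {
        match &self.validity {
            Some(bits) if !bits[row] => None,
            _ => Some(self.values[row]),
        }
    }
}

/// Compare stored group values with input rows, pair by pair.
/// The input-side null check is skipped entirely when the input has no nulls.
fn equal_to_batch(
    groups: &[Option<i64>],
    group_rows: &[usize],
    input: &InputColumn,
    input_rows: &[usize],
) -> Vec<bool> {
    let pairs = group_rows.iter().zip(input_rows.iter());
    if input.null_count() == 0 {
        // Fast path: read input values directly, no validity lookup per row.
        pairs
            .map(|(&g, &r)| groups[g] == Some(input.values[r]))
            .collect()
    } else {
        // Slow path: null-aware read for every input row.
        pairs.map(|(&g, &r)| groups[g] == input.get(r)).collect()
    }
}
```

Both paths return the same result for an input without nulls; the fast path only avoids the per-row validity lookup. Reusing the null information gathered in `create_hashes` would be a further step that this sketch does not attempt.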