Optimize `left_semi_join` by materializing the gather mask #10511
Can one of the admins verify this patch? |
Up to 20x faster. Separated hash table lookup from copy_if because increased register usage significantly limited occupancy of this kernel.
@cheinger so to be clear, the performance improvement didn't come from using |
@jrhemstad correct. I updated the gitlab issue with a more detailed explanation |
@cheinger could you update the PR description to provide a short summary? The PR description goes into the CHANGELOG. |
ok to test |
Nice work. Can you please update the PR title accordingly? It would be useful to also include your performance analysis (here) in the PR description. Did you notice any performance changes in semi join benchmarks?
ok to test |
ok to test |
@PointKernel can you re-review/approve? |
add to whitelist |
add to allowlist |
rerun tests |
@gpucibot merge |
Thank you @cheinger for adding this optimization! I'm seeing a 15-30% reduction in compute time for our |
@GregoryKimball Sweet! Happy to help! |
Closes #10464
Updates `left_semi_join` to materialize the gather mask instead of generating it via a transform iterator. Including the `map.contains` lookup in the `gather` call increased register usage, which reduced occupancy. As a result, explicitly materializing the gather mask is faster.
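
The two-pass structure described above can be sketched on the host side. This is a minimal CPU analogue in standard C++, not the cudf implementation: `left_semi_join_indices` is a hypothetical name, `std::unordered_set` stands in for the device hash map, and the register-pressure and occupancy effects that motivate the change only apply to the GPU kernel. It only shows the shape of the change: the lookup runs in its own pass and writes a mask, and the compaction pass reads that mask instead of performing the lookup inline.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not a cudf API): returns the indices of rows in
// `left_keys` whose key also appears in `right_keys`.
std::vector<std::size_t> left_semi_join_indices(const std::vector<int>& left_keys,
                                                const std::vector<int>& right_keys)
{
  // Stand-in for the hash map built from the right table.
  std::unordered_set<int> right_set(right_keys.begin(), right_keys.end());

  // Pass 1: materialize the mask. Only the lookup happens here.
  std::vector<std::uint8_t> mask(left_keys.size());
  for (std::size_t i = 0; i < left_keys.size(); ++i) {
    mask[i] = right_set.count(left_keys[i]) != 0 ? 1 : 0;
  }

  // Pass 2: compaction that reads the precomputed mask and does no lookups.
  std::vector<std::size_t> matched;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i] != 0) { matched.push_back(i); }
  }
  return matched;
}
```

The cost of this layout is one extra buffer with one entry per left row. According to the PR description, that trade was worthwhile on the GPU because the compaction step no longer carries the registers needed for the hash lookup.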