Is your feature request related to a problem? Please describe.
The performance of GpuBroadcastNestedLoopJoinExec is not ideal. If I am not mistaken, the major bottleneck is innerLikeJoin, which first materializes the full Cartesian product of the two input tables and then filters the product table with the bound condition. In the Spark runtime, these two steps are chained in a streaming fashion, which avoids the huge cost of materializing the full product table.
Describe the solution you'd like
A naive solution is to split the streaming table into several sub-tables, run innerLikeJoin on each of them, and then concatenate the sub-results to get the final result.
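To make the idea concrete, here is a minimal sketch in plain Python, not the plugin's actual code. Tables are modeled as lists of tuples, and the names `cross_then_filter`, `split_rows`, `chunked_inner_join`, and `chunk_rows` are hypothetical, chosen only for illustration. `cross_then_filter` stands in for the current behavior (materialize the product, then filter); the chunked variant applies it to one slice of the streaming side at a time.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Row = Tuple[int, ...]
Pair = Tuple[Row, Row]


def cross_then_filter(stream: Sequence[Row], build: Sequence[Row],
                      cond: Callable[[Row, Row], bool]) -> List[Pair]:
    """Stand-in for the current approach: build the full Cartesian
    product of both inputs, then filter it with the join condition."""
    product = [(s, b) for s in stream for b in build]  # len(stream) * len(build) pairs
    return [pair for pair in product if cond(*pair)]


def split_rows(rows: Sequence[Row], chunk_rows: int) -> Iterator[Sequence[Row]]:
    """Yield consecutive slices of at most chunk_rows rows."""
    if chunk_rows <= 0:
        raise ValueError("chunk_rows must be positive")
    for start in range(0, len(rows), chunk_rows):
        yield rows[start:start + chunk_rows]


def chunked_inner_join(stream: Sequence[Row], build: Sequence[Row],
                       cond: Callable[[Row, Row], bool],
                       chunk_rows: int) -> List[Pair]:
    """Proposed approach: join each sub-table of the streaming side
    separately and concatenate the sub-results."""
    result: List[Pair] = []
    for chunk in split_rows(stream, chunk_rows):
        # The intermediate product is at most chunk_rows * len(build) pairs.
        result.extend(cross_then_filter(chunk, build, cond))
    return result
```

With this split, the largest intermediate product shrinks from `len(stream) * len(build)` pairs to `chunk_rows * len(build)` pairs, while the output stays the same because each streaming row is still compared against every build row. How to pick the chunk size on the GPU (for example, from a target batch size in bytes) is left open here.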