Implement physical execution of uncorrelated scalar subqueries #3781
Comments
We can also remove the restriction that the subquery must contain an aggregate before it is converted, and instead check that the result is scalar in the physical node.
Related issue: #3725
We can just change it to a left join and add logic to the left join that checks whether more than one row is returned.
I don't think that's possible. A cross join is used because it doesn't require a join condition. The cross join is less efficient, however: it repeats the scalar value once for every row on the left side and then filters on that repeated value, which is slower than using a single scalar in the filter.
Oh, my mistake. The title says uncorrelated scalar subquery. For a correlated scalar subquery, we can change it to a left join.
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
We currently support uncorrelated scalar subqueries by translating them into a cross-join. It would likely be more efficient to execute the subquery and update the original plan with the scalar value.
We also need to do this so that we can throw an error if more than one row is returned, since there is no way to perform that check in the logical plan.
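The cardinality check described above could look roughly like the following. This is only a sketch: `Scalar`, `ScalarSubqueryError`, and `scalar_from_rows` are hypothetical names invented for illustration and are not part of DataFusion. It assumes the subquery output has already been materialized as rows, and it follows the usual SQL convention that an empty result yields NULL.

```rust
/// Hypothetical scalar value type (an assumption for this sketch);
/// a real engine would use its own typed scalar representation.
#[derive(Debug, Clone, PartialEq)]
pub enum Scalar {
    Null,
    Int64(i64),
    Utf8(String),
}

/// Hypothetical error type for results that are not scalar.
#[derive(Debug, PartialEq)]
pub enum ScalarSubqueryError {
    /// The subquery returned more than one row (carries the row count).
    TooManyRows(usize),
    /// The single returned row did not have exactly one column.
    WrongColumnCount(usize),
}

/// Collapse the materialized output of a subquery into one scalar value.
/// This check can only run at execution time, once the row count is known.
pub fn scalar_from_rows(rows: &[Vec<Scalar>]) -> Result<Scalar, ScalarSubqueryError> {
    match rows {
        // No rows: a scalar subquery evaluates to NULL.
        [] => Ok(Scalar::Null),
        // Exactly one row: it must hold exactly one column.
        [row] => match row.as_slice() {
            [value] => Ok(value.clone()),
            other => Err(ScalarSubqueryError::WrongColumnCount(other.len())),
        },
        // More than one row: this is the error case the issue asks for.
        many => Err(ScalarSubqueryError::TooManyRows(many.len())),
    }
}
```

Calling `scalar_from_rows` on a two-row result returns `Err(ScalarSubqueryError::TooManyRows(2))`, which the physical node could surface as a query error.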
Describe the solution you'd like
The optimizer will need some kind of trait that execution engines (DataFusion, Ballista, Dask SQL, etc.) can implement for resolving scalar values.
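One possible shape for such a trait is sketched below. Everything here is an assumption made for illustration: `ScalarSubqueryResolver`, `Expr`, `Literal`, `inline_scalar_subqueries`, and `PrecomputedResolver` are hypothetical and do not correspond to existing DataFusion types. The idea is that the optimizer walks an expression, asks the engine to resolve each uncorrelated scalar subquery, and substitutes the returned literal into the plan.

```rust
use std::collections::HashMap;

/// Hypothetical literal value; stands in for an engine's scalar type.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Null,
    Int64(i64),
}

/// Hypothetical, heavily reduced expression tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(Literal),
    /// An uncorrelated scalar subquery, identified here by an opaque id.
    ScalarSubquery(u32),
    /// Greater-than comparison of two sub-expressions.
    Gt(Box<Expr>, Box<Expr>),
}

/// Hypothetical trait that an execution engine would implement.
/// The engine runs the subquery and returns its single value,
/// or an error (for example, when more than one row comes back).
pub trait ScalarSubqueryResolver {
    fn resolve(&self, subquery_id: u32) -> Result<Literal, String>;
}

/// Replace every scalar subquery in `expr` with the literal from `resolver`.
pub fn inline_scalar_subqueries(
    expr: &Expr,
    resolver: &dyn ScalarSubqueryResolver,
) -> Result<Expr, String> {
    Ok(match expr {
        Expr::ScalarSubquery(id) => Expr::Literal(resolver.resolve(*id)?),
        Expr::Gt(left, right) => Expr::Gt(
            Box::new(inline_scalar_subqueries(left, resolver)?),
            Box::new(inline_scalar_subqueries(right, resolver)?),
        ),
        // Columns and literals contain no subqueries; copy them unchanged.
        other => other.clone(),
    })
}

/// Toy resolver backed by precomputed results, standing in for a real engine.
pub struct PrecomputedResolver(pub HashMap<u32, Literal>);

impl ScalarSubqueryResolver for PrecomputedResolver {
    fn resolve(&self, subquery_id: u32) -> Result<Literal, String> {
        self.0
            .get(&subquery_id)
            .cloned()
            .ok_or_else(|| format!("unknown subquery {}", subquery_id))
    }
}
```

With this shape, a filter such as `a > (subquery 1)` becomes `a > 42` once the resolver reports 42 for subquery 1, so the filter compares against a plain scalar instead of a cross-joined column. Keeping resolution behind a trait leaves the optimizer independent of how each engine actually runs the subquery.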
Describe alternatives you've considered
Additional context