sql: internal error: apply join found an unexpected re-optimized right-hand-side with non-zero subqueries #35594
Comments
@jordanlewis, will you take first look at this?
Yes, I already started to take a look this morning. Justin poked over and noticed that it's weird that the produced plan has 2 exact duplicate groups. I haven't discovered anything else yet. Fundamentally though, the apply join code expects that there will not be any RHS expressions with actual subqueries - if that assumption is violated then I think it means that the optimizer is doing something unexpected up front.
But the optimizer never hoists uncorrelated subqueries; only correlated subqueries.
Right, but I thought that all uncorrelated subqueries (no matter how deep within the query tree) get transformed during the very first execbuild phase into top-level exec subqueries. If that were the case, then we wouldn't see any more top-level exec subqueries during the re-optimization phases.
I think Jordan is right that this isn't an execution thing: the fact that there are two identical groups (not just semantically, even the column ids are the same) indicates something has gone wrong in the optimizer. I suspect there's an incorrect transformation rule somewhere. I can investigate.
Smaller repro:
Jordan and I talked it over and we think this actually happens whenever the RHS of an apply join has an uncorrelated subquery. Subsequent execbuilds for the apply-join case need to know not to re-raise the subquery to the top-level and instead return the prior reference.
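To make the failure mode described above concrete, here is a minimal sketch, not CockroachDB's actual code. All names (`Expr`, `Plan`, `build`, `buildApplyRHS`) are hypothetical stand-ins. It assumes a build step that raises every subquery it finds into a top-level list, and an apply-join path that rebuilds its right-hand side and rejects any plan whose subquery list is non-empty.

```go
package main

import (
	"errors"
	"fmt"
)

// Expr is a hypothetical, heavily simplified expression tree node.
type Expr struct {
	Name       string
	IsSubquery bool
	Children   []*Expr
}

// Plan is a hypothetical build result: the root plus any subqueries
// that the build step raised to the top level.
type Plan struct {
	Root       string
	Subqueries []string
}

// build walks the tree and raises every subquery node into
// Plan.Subqueries, no matter how deep it sits in the tree.
func build(e *Expr) Plan {
	p := Plan{Root: e.Name}
	var walk func(n *Expr)
	walk = func(n *Expr) {
		if n.IsSubquery {
			p.Subqueries = append(p.Subqueries, n.Name)
		}
		for _, c := range n.Children {
			walk(c)
		}
	}
	walk(e)
	return p
}

// buildApplyRHS models the old assumption: a rebuilt right-hand side
// must never produce top-level subqueries of its own.
func buildApplyRHS(rhs *Expr) (Plan, error) {
	p := build(rhs)
	if len(p.Subqueries) != 0 {
		return Plan{}, errors.New("unexpected re-optimized right-hand side with non-zero subqueries")
	}
	return p, nil
}

func main() {
	rhs := &Expr{
		Name:     "select",
		Children: []*Expr{{Name: "uncorrelated-subquery", IsSubquery: true}},
	}
	_, err := buildApplyRHS(rhs)
	fmt.Println("error:", err)
}
```

In this toy model, any subquery under the right-hand side trips the check, because the rebuild raises it again instead of reusing a reference from the first build.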
@jordanlewis and I talked about this some more today. I think for 19.1 it's OK to re-execute any uncorrelated subquery on each iteration of the apply loop, rather than trying to execute it once. In a later release, we will most likely have the optimizer hoist uncorrelated subqueries to the top level. Jordan, have you thought more about this? Would it be relatively easy to just re-execute each time, or are there other problems with that?
It'll be easy to re-execute each time. I went down a rabbit hole of trying to make the top-level subqueries get executed just once. It was almost easy, but not quite. The solution I wanted was to associate a unique ID with each subquery and pass it along during the CopyAndReplace step, so that when we see a subquery during execbuild, we can replace it with the old subquery reference based on that unique ID instead of execbuilding a new one. But that was a little bit challenging because of this unique ID generation stuff. I worked with Justin on it a bit and gave up. He said that it'll be easy to solve this particular issue when WITH support is available in opt. So once that's done, we'll be able to more easily remove the duplicate subquery evaluation, but for now we'll just do it on every row.
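The unique-ID idea described above can be sketched roughly as follows. This is an illustration only, not the abandoned implementation: `SubqueryID`, `subqueryCache`, and `eval` are hypothetical names, and the sketch assumes each subquery keeps a stable ID across copies of the expression tree, so that a later build can look up the earlier result instead of running the subquery again.

```go
package main

import "fmt"

// SubqueryID is a hypothetical stable identifier that would survive
// copying the expression tree.
type SubqueryID int

// subqueryCache remembers the result of each subquery by ID and counts
// how many times a subquery was actually executed.
type subqueryCache struct {
	results map[SubqueryID]int
	runs    int
}

func newSubqueryCache() *subqueryCache {
	return &subqueryCache{results: make(map[SubqueryID]int)}
}

// eval returns the cached result for id if one exists; otherwise it
// runs the subquery once and stores the result.
func (c *subqueryCache) eval(id SubqueryID, run func() int) int {
	if v, ok := c.results[id]; ok {
		return v
	}
	c.runs++
	v := run()
	c.results[id] = v
	return v
}

func main() {
	cache := newSubqueryCache()
	// Simulate three iterations of the apply loop that all reference
	// the same uncorrelated subquery.
	for row := 0; row < 3; row++ {
		v := cache.eval(1, func() int { return 42 })
		fmt.Println("row", row, "subquery value", v)
	}
	fmt.Println("subquery executions:", cache.runs)
}
```

The difficulty mentioned in the comment is not the lookup itself but generating IDs that stay stable through the copy step, which this sketch simply assumes.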
Sounds right to me.
35711: sql: permit subqueries within RHS of applyjoin r=jordanlewis a=jordanlewis

Previously, the code assumed that subqueries would have already been promoted to top-level subqueries and run exactly once by the re-optimization phase of apply join. However, that promotion happens in execbuild, not in the optimizer, and therefore won't have already happened by the time that we go to execbuild the re-optimized RHS in apply join.

For now, the solution is to re-run the subqueries in the RHS every time the RHS is run. This is suboptimal because subqueries need only be run once per query regardless of their position within the apply join tree. However, setting this up is currently difficult and would have required a more invasive change. When WITH support is available, the optimizer will promote subqueries into WITH clauses up front, at which point we will be able to safely remove the requirement that apply join rerun subqueries on the RHS.

Closes #35594.

Release note: None

Co-authored-by: Jordan Lewis <[email protected]>
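As a rough illustration of the trade-off this change accepts, the sketch below re-runs an uncorrelated subquery once per left-hand row. It is a toy model, not the code from the pull request: `applyJoin` and `runSubquery` are hypothetical names, and rows are plain integers.

```go
package main

import "fmt"

// applyJoin is a toy apply loop: for every left row it re-runs the
// subquery that lives in the right-hand side, then combines the two.
// It returns the output rows and the number of subquery executions.
func applyJoin(leftRows []int, runSubquery func() int) (out []int, runs int) {
	for _, l := range leftRows {
		v := runSubquery() // re-executed on every iteration
		runs++
		out = append(out, l+v)
	}
	return out, runs
}

func main() {
	out, runs := applyJoin([]int{1, 2, 3}, func() int { return 10 })
	fmt.Println("output rows:", out)
	fmt.Println("subquery executions:", runs)
}
```

The result is correct, but the subquery runs once per left row rather than once per query, which is the inefficiency the commit message says can be removed once subqueries are promoted into WITH clauses.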
Setup:
SQL:
Error:
Found with sqlsmith.