You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Here's a thought dump on SQL duplication. There's no definite proposal yet below, this requires more thinking; ultimately, this may be more about how we translate GroupBy and handle pending selectors.
The following LINQ query (see #32234 (comment) for the full repro):
SELECT [i].[IterationId] AS [Key], (
SELECTMAX([i5].[Timestamp])
FROM [IterationValues] AS [i4]
INNER JOIN [Iteration] AS [i5] ON [i4].[IterationId] = [i5].[Id]
WHERE [i4].[IterationId] IN (
SELECT [i6].[value]
FROM OPENJSON(@__ids_0) WITH ([value] int'$') AS [i6]
) AND [i].[IterationId] = [i4].[IterationId]) AS [MaxTimestamp]
FROM [IterationValues] AS [i]
WHERE [i].[IterationId] IN (
SELECT [i0].[value]
FROM OPENJSON(@__ids_0) WITH ([value] int'$') AS [i0]
)
GROUP BY [i].[IterationId]
ORDER BY (
SELECTMAX([i5].[Timestamp])
FROM [IterationValues] AS [i4]
INNER JOIN [Iteration] AS [i5] ON [i4].[IterationId] = [i5].[Id]
WHERE [i4].[IterationId] IN (
SELECT [i3].[value]
FROM OPENJSON(@__ids_0) WITH ([value] int'$') AS [i3]
) AND [i].[IterationId] = [i4].[IterationId])
There is considerable subquery duplication in the above SQL: the ordering and projection contains the exact same subquery, and the OPENJSON subquery contained within it is also duplicate in the WHERE clause. This is quite inefficient, causing the database to evaluate the subquery multiple times.
One eager approach here would be to perform a pushdown in translation, when an operator is being applied that would cause duplication. For example, with the above query, when the OrderBy is applied, we'd detect that the lambda references a SelectExpression that's also in the projection, do a pushdown, and then apply the ordering on top of that. This has the disadvantage of being probably too eager: if a further Select is then applied which removes the subquery from the projection, the pushdown is no longer necessary, but we already did it too early. Also, finding out whether the OrderBy references the projection isn't trivial (either an additional visitation on all lambdas just to detect this, or to change the way we translate lambdas, passing them more context).
Another approach would be to add a phase in post-processing, where we find all SelectExpressions which are referenced multiple times, and extract those out to CTEs; this would add one pass for discovery, and another one to apply the transformation if needed (to optimize this we may gather the discovery information as part of translation and pass it forward to post-processing). There are probably some other CTE-related post-processing needs, so this visitor probably makes sense as a new post-processing step.
For the ordering and projection duplication above, a CTE may be slightly less great than a simple subquery pushdown (even if perf-wise they should be the same). We could detect the possibility to push down instead of introduce a CTE (because the duplication is between the projection and another clause, i.e. ordering), but that may be too much trouble for what its worth.
Note that this second approach needs to happen right after translation, before type mapping inference visitation happens - or any other visit which may rewrite the tree. This is because such a visitor may cause the SelectExpression to get duplicated, and we lose the multiple referencing.
However, note that the WHERE clause itself also gets duplicated in the above ordering and projection. The ideal way to rewrite this query is to perform a pushdown right after the WHERE clause, so that the IN operator is only evaluated once. That does point towards approach (1), i.e. detecting that the projection contains a complex, potentially expensive expression which is better off being pushed down.
The text was updated successfully, but these errors were encountered:
Here's a thought dump on SQL duplication. There's no definite proposal yet below, this requires more thinking; ultimately, this may be more about how we translate GroupBy and handle pending selectors.
The following LINQ query (see #32234 (comment) for the full repro):
... produces the following SQL:
There is considerable subquery duplication in the above SQL: the ordering and projection contains the exact same subquery, and the OPENJSON subquery contained within it is also duplicate in the WHERE clause. This is quite inefficient, causing the database to evaluate the subquery multiple times.
However, note that the WHERE clause itself also gets duplicated in the above ordering and projection. The ideal way to rewrite this query is to perform a pushdown right after the WHERE clause, so that the IN operator is only evaluated once. That does point towards approach (1), i.e. detecting that the projection contains a complex, potentially expensive expression which is better off being pushed down.
The text was updated successfully, but these errors were encountered: