Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo SQL tree pruning #32672

Merged
merged 2 commits into from
Dec 29, 2023
Merged

Redo SQL tree pruning #32672

merged 2 commits into from
Dec 29, 2023

Conversation

roji
Copy link
Member

@roji roji commented Dec 25, 2023

  • Notes on the previous pruning design:
    • The design wasn't visitor based, but rather manual recursing into the tables of the selects. This had several disadvantages, e.g. not pruning in subqueries which aren't tables, and not allowing the pruning mechanism to be extended by providers.
    • It was also based on a separate pass for finding which columns are referenced on which table (ColumnExpressionFindingExpressionVisitor); this was done for each select as we recursed down, revisiting the same nested nodes again and again.
  • This rewrite does the following:
    • Replace SelectExpressionPruningExpressionVisitor with SqlTreePruner, which is a real visitor that visits everywhere, pruning places we didn't prune before.
    • This also means it can be extended by providers (e.g. OPENJSON pruning, #32668).
    • The visitor is implemented as a single pass, both discovering column references and performing the actual pruning (better performance).
  • Note that pruning still mutates the SelectExpression rather than creating an immutable copy, as before. I did this mainly because that's how things currently work, and changing that would mean wider changes.
    • Specifically, if there are multi-referenced selects, TableAliasVerifyingExpressionVisitor (which comes right afterwards) has logic for handling that; but if pruning duplicates the multi-referenced select, we end up with two different select instances in the tree with the same alias, tripping up assertions (see similarity with this problem).
    • We should revisit all of this after removing multi-referenced selects, which would be part of #32277.

@ajcvickers you may be interested in taking a look, this is query stuff but relatively scoped to a specific problem etc.

Two additional pruning-related PRs also coming up based on this one.

Closes #31083
Fixes #31407

@roji roji requested a review from a team December 25, 2023 18:26
@@ -21,6 +21,8 @@ public class SqlServerQuerySqlGenerator : QuerySqlGenerator
private readonly ISqlGenerationHelper _sqlGenerationHelper;
private readonly int _sqlServerCompatibilityLevel;

private bool _withinTable;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes for SQL Server generation are unfortunate and tricky...

SQL Server is very strict and does not allow a projection without an alias within a table subquery (SELECT ... FROM (SELECT 1 ...))); we already had special logic for generating 1 AS empty to work around this, but the logic for when to do this was incomplete. The pruning introduced in this PR caused selects within set operations to also possibly get their projection completely trimmed away, but our logic didn't catch that (this was an already-existing, possibly latent bug).

As a result, we need to do visitor state hacks to detect when the empty projection is directly in a subquery table, to generate AS empty only when needed.

public override Task Update_non_main_table_in_entity_with_entity_splitting(bool async)
=> Assert.ThrowsAnyAsync<Exception>(
() => base.Update_non_main_table_in_entity_with_entity_splitting(async));
public override async Task Update_non_main_table_in_entity_with_entity_splitting(bool async)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix for #31407

@@ -42,9 +42,6 @@ public override async Task Delete_where_using_hierarchy(bool async)
WHERE (
SELECT COUNT(*)
FROM [Animals] AS [a]
LEFT JOIN [Birds] AS [b] ON [a].[Id] = [b].[Id]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We weren't doing any pruning in subqueries that weren't table subqueries (this one is a scalar subquery inside the predicate).

SELECT JSON_VALUE([o].[value], '$.OwnedReferenceLeaf.SomethingSomething') AS [c], [o].[key], CAST(JSON_VALUE([o].[value], '$.Date') AS datetime2) AS [c0]
FROM OPENJSON([j].[OwnedReferenceRoot], '$.OwnedCollectionBranch') AS [o]
ORDER BY CAST(JSON_VALUE([o].[value], '$.Date') AS datetime2) DESC
SELECT JSON_VALUE([o].[OwnedReferenceLeaf], '$.SomethingSomething') AS [c], [o].[Date] AS [c0]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "expansion" from WITH-less OPENJSON to WITH-ful is actually an improvement; an upcoming PR will also prune unreferenced columns from the WITH column list.

@@ -331,7 +331,7 @@ public virtual async Task Ordered_array_of_string()
WHERE (
SELECT COUNT(*)
FROM (
SELECT [s].[value], [s].[key], CAST([s].[key] AS int) AS [c]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maumar we discussed pruning these - so here we go :) I'm not sure any more if we opened an issue to track this (couldn't find one in a quick search)

@@ -2556,21 +2556,21 @@ public override async Task Concat_anonymous_with_count(bool async)
"""
SELECT COUNT(*)
FROM (
SELECT [t].[Nickname], [t].[SquadId], [t].[AssignedCityName], [t].[CityOfBirthName], [t].[FullName], [t].[HasSoulPatch], [t].[LeaderNickname], [t].[LeaderSquadId], [t].[Rank], [t].[Discriminator], [t].[Nickname] AS [Name]
SELECT 1 AS empty
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice example where having COUNT(*) at the top causes all projections inside to get pruned

@roji roji merged commit 651d5b9 into dotnet:main Dec 29, 2023
6 of 7 checks passed
@roji roji deleted the Pruning branch December 29, 2023 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve pruning to properly support ExecuteUpdate Redo our pruning implementation
2 participants