Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Simplify case blocks in SQL tree #16092

Closed
maumar opened this issue Jun 14, 2019 · 21 comments · Fixed by #20966
Closed

Query: Simplify case blocks in SQL tree #16092

maumar opened this issue Jun 14, 2019 · 21 comments · Fixed by #20966
Assignees
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Milestone

Comments

@maumar
Copy link
Contributor

maumar commented Jun 14, 2019

Currently we translate CompareTo naively, using case statements with 3 options, however those can be significantly optimized for cases where CompareTo itself is compared to 1, 0 -1

foo.CompareTo(bar) == 0 -> foo == bar

example test:
Double_order_by_on_string_compare

example current sql:

SELECT [w].[Id], [w].[AmmunitionType], [w].[IsAutomatic], [w].[Name], [w].[OwnerFullName], [w].[SynergyWithId]
FROM [Weapons] AS [w]
ORDER BY CASE
    WHEN (CASE
        WHEN ([w].[Name] = N'Marcus'' Lancer') AND [w].[Name] IS NOT NULL THEN 0
        WHEN [w].[Name] > N'Marcus'' Lancer' THEN 1
        WHEN [w].[Name] < N'Marcus'' Lancer' THEN -1
    END = 0) AND CASE
        WHEN ([w].[Name] = N'Marcus'' Lancer') AND [w].[Name] IS NOT NULL THEN 0
        WHEN [w].[Name] > N'Marcus'' Lancer' THEN 1
        WHEN [w].[Name] < N'Marcus'' Lancer' THEN -1
    END IS NOT NULL THEN CAST(1 AS bit)
    ELSE CAST(0 AS bit)
END, [w].[Id]

example improved sql:

SELECT [w].[Id], [w].[AmmunitionType], [w].[IsAutomatic], [w].[Name], [w].[OwnerFullName], [w].[SynergyWithId]
FROM [Weapons] AS [w]
ORDER BY CASE
    WHEN [w].[Name] = N'Marcus'' Lancer'
    THEN CAST(1 AS bit) ELSE CAST(0 AS bit)
END, [w].[Id]
@smitpatel
Copy link
Member

Post translation optimization.

@ajcvickers ajcvickers added this to the Backlog milestone Jun 14, 2019
@smitpatel smitpatel changed the title Query: improve translation of CompareTo Query: Simplify case blocks in SQL tree Sep 5, 2019
@smitpatel
Copy link
Member

This should not change translation pipeline. It must be done in post translation SQL optimization.

@bricelam
Copy link
Contributor

bricelam commented Sep 6, 2019

Out of curiosity, do we ever reduce CASE to COALESCE, ISNULL/IFNULL, or NULLIF?

@smitpatel
Copy link
Member

We could but we never did.

@roji
Copy link
Member

roji commented Jan 29, 2020

Note: simplification of String.Compare(a, b) == 0 (with constant 0) to a == b was done as part of #19681, but not other simplifications:

  • Comparison to 1 or -1
  • Greater than/less than 0
  • Greater than/less than or equal than 0
  • All of the above with string.CompareTo()

Also note that the above was done in a preprocessing visitor, but @smitpatel is saying it should be done post-translation. Let's wait for @smitpatel and decide what to do.

@smitpatel
Copy link
Member

Post-translation. The code added in preprocessing visitor should be removed.

@roji
Copy link
Member

roji commented Feb 5, 2020

Can you explain why it's important this happens in postprocessing? Also, these special VB method calls will cause failure during translation because we don't know them..

@smitpatel
Copy link
Member

Because translation pipeline should be oblivious to it. Hence the simplification of case block must happen in post processing.

This issue is not about VB method calls. VB method calls should be converted to their equivalent representation of C# in pretranslation phase. Though

Operators.CompareString(
Left: s.supplierID,
Right: __$VB$Local_filter_0,
TextCompare: False)

Should get converted to string.Compare in C#

@roji
Copy link
Member

roji commented Feb 5, 2020

Because translation pipeline should be oblivious to it. Hence the simplification of case block must happen in post processing.

Am trying to understand why, according to what principles... Specifically, the rewriting of string.Compare() == 0 to equality (or string.Compare) is universal, and not provider-dependent. Moving it to post-processing means it would have to be repeated for relational, cosmos, in-memory... Is there a compelling reason to do so?

@smitpatel
Copy link
Member

Yes, for the providers which requires usage of function to do string comparison. It is same as why we cannot rewrite in preprocessor string.compare() > 0 to a > b because not all providers support string > string.

@roji
Copy link
Member

roji commented Feb 5, 2020

Yes, for the providers which requires usage of function to do string comparison

But those providers must take care of string equality without function in any case, since that's the expression the compiler generates for C# s1 == s2, so they'd have to provide a translation from that into whatever function they need...

On the other hand, AFAIK an expression with NodeType=GreaterThan and string operands is never produced in C# (or even in VB.NET), which is why it makes sense for us not to support it...

@smitpatel
Copy link
Member

Those providers will take care of it in translation. Why convert to one thing to another which in turn get converted back to main thing again. We already have this issue which required to be fixed either way due to >. The simplification is just fragmenting the code by doing same thing in multiple places without providing any additional value.

On the other hand, AFAIK an expression with NodeType=GreaterThan and string operands is never produced in C# (or even in VB.NET), which is why it makes sense for us not to support it...

Incorrect conclusion. string.Compare is a way to write a > b since you cannot write directly. We need to support it.

@roji
Copy link
Member

roji commented Feb 7, 2020

On the other hand, AFAIK an expression with NodeType=GreaterThan and string operands is never produced in C# (or even in VB.NET), which is why it makes sense for us not to support it...

Incorrect conclusion. string.Compare is a way to write a > b since you cannot write directly. We need to support it.

My main point here is that the simplification from string.Compare(a, b) == 0 to a == b (or if you prefer, for some reason, string.Equals(a,b)) can/should be universal to all providers (relational or not), which is why I think it belongs in pre-processing and not in post-processing - why repeat that for each provider type.

Re greater/less than, it seems odd to internally transform string.Compare(a,b) > 0 to a > b simply because that construct is never produced AFAIK in C# or VB. But if we want to do that, then like equality why not do this transformation in preprocessing (because again, it's universal)?

Maybe a quick call would help clear this up.

@smitpatel
Copy link
Member

it seems odd to internally transform string.Compare(a,b) > 0 to a > b simply because that construct is never produced AFAIK in C# or VB.

There are many constructs which are not produced in C# but still possible in SQL, we internally transform it because SQL supports it. And we don't do it because it's not possible/universal.

Repeat for each provider because each provider supports different things.

@roji
Copy link
Member

roji commented Feb 7, 2020

Regardless of the greater-than/less-than question:

My main point here is that the simplification from string.Compare(a, b) == 0 to a == b (or if you prefer, for some reason, string.Equals(a,b)) can/should be universal to all providers (relational or not), which is why I think it belongs in pre-processing and not in post-processing - why repeat that for each provider type.

Repeat for each provider because each provider supports different things.

The fact that string.Compare(a, b) == 0 is equivalent to a == b is not provider dependent... The former is just a more complicated way to express the latter. So why not normalize that in preprocessing rather than repeating for each provider?

@roji
Copy link
Member

roji commented Feb 7, 2020

Discussed and accepting @smitpatel's argument that we should preserve the LINQ expression tree as much as possible rather than apply optimizing/normalizing transformations in preprocessing.

@ajcvickers
Copy link
Member

@roji Don't want to poke a sleeping lion, but here goes. Surely normalization of VB generated expression trees to what C# would have generated for the same intention is good to have in the core stack, not for each providers?

@roji
Copy link
Member

roji commented Feb 10, 2020

That was my original motivation here, I'll let @smitpatel answer (you two can discuss offline too, I've made my peace with either possibility 😄).

@smitpatel
Copy link
Member

Surely normalization of VB generated expression trees to what C# would have generated for the same intention is good to have in the core stack, not for each providers?

It is in core stack not in provider code.

@roji
Copy link
Member

roji commented Feb 11, 2020

To clarify, there are two separate things here... For s1 == s2, VB produces Operators.CompareString(s1, s2, false) == 0. So:

  1. We convert Operators.CompareString to string.Compare. This was committed and is happening in preprocessing (so core stack, not provider-specific).
  2. I originally also convertged string.Compare(s1, s2) == 0 to s1 == s2, since they're mathematically equivalent - this is the "controversial" part that @smitpatel doesn't want in preprocessing/core stack.

@smitpatel
Copy link
Member

@maumar - check OP to see if this is fixed.

@maumar maumar modified the milestones: Backlog, 5.0.0 May 15, 2020
maumar added a commit that referenced this issue May 15, 2020
Adding optimization to during post processing (null semantics)
Trying to match CASE block that corresponds to CompareTo translation. If that case block is compared to 0, 1 or -1 we can simplify it to simple comparison.
Also added generic CASE block optimization, when constant is compared to CASE block, and that constant is one of the results

Fixes #16092
maumar added a commit that referenced this issue May 15, 2020
Adding optimization to during post processing (null semantics)
Trying to match CASE block that corresponds to CompareTo translation. If that case block is compared to 0, 1 or -1 we can simplify it to simple comparison.
Also added generic CASE block optimization, when constant is compared to CASE block, and that constant is one of the results

Fixes #16092
@maumar maumar added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label May 15, 2020
maumar added a commit that referenced this issue May 15, 2020
Adding optimization to during post processing
Trying to match CASE block that corresponds to CompareTo translation. If that case block is compared to 0, 1 or -1 we can simplify it to simple comparison.
Also added generic CASE block optimization, when constant is compared to CASE block, and that constant is one of the results

Fixes #16092
maumar added a commit that referenced this issue May 15, 2020
Adding optimization to during post processing
Trying to match CASE block that corresponds to CompareTo translation. If that case block is compared to 0, 1 or -1 we can simplify it to simple comparison.
Also added generic CASE block optimization, when constant is compared to CASE block, and that constant is one of the results

Fixes #16092
maumar added a commit that referenced this issue May 15, 2020
Adding optimization to during post processing
Trying to match CASE block that corresponds to CompareTo translation. If that case block is compared to 0, 1 or -1 we can simplify it to simple comparison.
Also added generic CASE block optimization, when constant is compared to CASE block, and that constant is one of the results

Fixes #16092
@maumar maumar closed this as completed in f7d5168 May 19, 2020
@ajcvickers ajcvickers modified the milestones: 5.0.0, 5.0.0-preview6 Jun 1, 2020
@ajcvickers ajcvickers modified the milestones: 5.0.0-preview6, 5.0.0 Nov 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants