Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query: Add caching of generated RelationalCommand based on nullability of parameters #15892

Closed
smitpatel opened this issue Jun 1, 2019 · 12 comments · Fixed by #18035
Closed
Assignees
Labels
area-perf closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Milestone

Comments

@smitpatel
Copy link
Contributor

No description provided.

@roji
Copy link
Member

roji commented Jul 19, 2019

Here's a requirement from the PostgreSQL side...

Thanks to PostgreSQL's fully support for arrays, we can translate the following:

var ids = new[] { 1, 2, 3 };
var customers = ctx.Customers.Where(c => ids.Contains(c.Id)).ToList();

Into the following:

SELECT * FROM customers WHERE id = ANY (@ids);

This obviates doing expansion of parameters into constants at runtime. The one place where this fails is if there's a null somewhere inside ids; to preserve null semantics, we'd need to translate to:

SELECT * FROM customers WHERE id = ANY (@ids) OR id IS null;

In other words, I'd ideally be able to cache to commands based on whether the array parameter's value contains a null anywhere.

@smitpatel
Copy link
Contributor Author

I don't think it is relevant here in that context.
parameter nullability is @ids being null. Sniffing if @ids array contains a null value is no different than caching based on parameter values. And later has significantly low value.

@smitpatel
Copy link
Contributor Author

Also modifying parameter values during Sql generation is not possible (not it should be made possible). So while there is possibility of adding additional term (just like how Inexpression expands) but @ids will still contain null values, unless you expand it out.

@roji
Copy link
Member

roji commented Jul 20, 2019

Sniffing if @ids array contains a null value is no different than caching based on parameter values

Depending on how you look at it, it's another form of that caching. Looking at the nullability of a parameter isn't exactly the same as looking whether an array parameter contains null.

And later has significantly low value.

Why do you say that? In the non-PostgreSQL case, expanding a parameterized array to constant can yield potentially endless SQL based on the array's contents. As an optimization, instead of expanding the array to a constant we could expand to a set of parameterized scalar parameters, e.g. x IN (@p1, @p2) for two parameters, but that still leaves a different query for each number of parameters in the array.

PostgreSQL allows you to sidestep all that by having only two SQLs - one for when the array contains null, and one for when it does not. Note that different queries can have a pretty important perf impact (plan caching at the server, possibility to prepare statements...). That is why I'm looking into this.

Also modifying parameter values during Sql generation is not possible

I wasn't asking for that anywhere, only to add the additional nullability check in SQL.

@divega
Copy link
Contributor

divega commented Jul 26, 2019

In the non-PostgreSQL case, expanding a parameterized array to constant can yield potentially endless SQL based on the array's contents. As an optimization, instead of expanding the array to a constant we could expand to a set of parameterized scalar parameters, e.g. x IN (@p1, @p2) for two parameters, but that still leaves a different query for each number of parameters in the array.

We have discussed doing this (see parameter rewrite in #13617 (comment)) and even repeating the value of the last element of the array to fill-in to fixed array sizes.

I see that, what @roji is describing, using a TVPs (which incidentally would be very similar to what @roji is describing), and a string we would need to build to pass to STRING_SPLIT() as possible cases of specialized parameter binding in which constants in the source expression tree end up not mapping 1:1 with constants in the generated SQL.

In all those cases there is some processing required to get from the input values to the actual parameters for the query, so I don't think we should reject the idea of sniffing into the individual parameter values to check if any of them is null, either to generate different SQL or to producing an extra bool parameter to short-circuit the null check, e.g.:

SELECT * FROM customers WHERE id = ANY (@ids) OR (@any_id_is_null AND id IS null);

Of course, the simpler and more performant the solution, the better.

FWIW, in the original example this extra sniffing shouldn't be needed because the array is of type int [] and not int? [] :trollface:

From what I remember, we have seen other cases in which matching values other than null when we sniff parameters could lead to simpler SQL. I am not convinced the value of this is low, but it hasn't been high enough so far.

@smitpatel
Copy link
Contributor Author

This issue does not track caching based on parameter value sniffing. While it may look tempting but wrt implementation cost, it may not cross value-cost bar.

@divega
Copy link
Contributor

divega commented Jul 26, 2019

I agree we should split this part of the discussion into a separate issue.

@roji
Copy link
Member

roji commented Jul 26, 2019

@smitpatel is there already another issue for caching based on parameter sniffing? Did a quick search was surprised not to find it yet.

While it may look tempting but wrt implementation cost, it may not cross value-cost bar.

The point for me is that if I understand correctly, we intend to implement parameter sniffing and caching in any way, to able able to get do tighter null semantics (i.e. eliminate current unneeded null checks). So what I'm asking for can hopefully be a relatively small incremental addition over that (sniff values instead parameter arrays as opposed to only null parameters).

FWIW, in the original example this extra sniffing shouldn't be needed because the array is of type int [] and not int? [] :trollface:

Good point :)

@smitpatel
Copy link
Contributor Author

we intend to implement parameter sniffing and caching in any way, to able able to get do tighter null semantics

  • For caching we intend to only check if parameter is null or not null. No more than 2 states. We don't care what is the value if it is not null.
  • If we actually look into non-null value then it won't be cached.

@roji
Copy link
Member

roji commented Jul 26, 2019

@smitpatel I posted on the above precisely to discuss this, am assuming design/decisions haven't been locked down yet... It should a function of added complexity/effort no?

@smitpatel
Copy link
Contributor Author

smitpatel commented Jul 26, 2019

This issue specifically track the second level caching of select expression we had in previous pipeline. Hence my very first comment that it is not relevant in this issue.
We had one working & effective system in past. We can re-implement it. It does not mean that we cannot extend it. But as of now, we do not have design for such caching which sniff parameter values. I see negative value trying to add a functionality for which we don't have design blocking addition of working component from past. Please file a new issue, and present a design on how parameter sniffing based caching is supposed to work.

@roji
Copy link
Member

roji commented Sep 3, 2019

Proposal for a more general parameter-sniffing caching mechanism: #17598

@ajcvickers ajcvickers modified the milestones: 3.1.0, 3.1.0-preview2 Oct 24, 2019
@ajcvickers ajcvickers modified the milestones: 3.1.0-preview2, 3.1.0 Dec 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. type-bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants