Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow DbFunction to return scalar value from aggregates #11850

Open
SteffenMangold opened this issue Apr 29, 2018 · 27 comments
Open

Allow DbFunction to return scalar value from aggregates #11850

SteffenMangold opened this issue Apr 29, 2018 · 27 comments

Comments

@SteffenMangold
Copy link

There is no solution for calling STDEV aggregation function from Entity Core.

@smitpatel
Copy link
Contributor

There is no LINQ operator for that. But it would be easy to write a user defined DbFunction to call & translate to STDEV on server.

@bricelam
Copy link
Contributor

Should we provide one?

Dim stdev = Aggregate b In db.Benchmarks Into EF.Functions.Stdev(b.Result)

@SteffenMangold
Copy link
Author

That would be great 🤗

@bricelam
Copy link
Contributor

Notes for triage: PostgreSQL has stddev; SQLite has none.

@smitpatel smitpatel changed the title STDEV aggregation with MSSQL Allow DbFunction to return scalar value from aggregates Apr 30, 2018
@smitpatel
Copy link
Contributor

Re-purposing the issue. Currently there is no support to write a DbFunction which translates to aggregate function over server. Same restriction goes for our method call translators. We need to lift this restriction before we can translate aggregate functions on server.

@smitpatel smitpatel removed their assignment Apr 30, 2018
@pmiddleton
Copy link
Contributor

The heart of the issue is that ReLinq won't recognize the function as a ResultOperator by default. We will have to integrate to get the methods translated correctly, and then deal with the functions during query creation.

@SteffenMangold
Copy link
Author

Is it some kind of a big deal? I try to investigate a realistic timeframe for this. Just to know how to go on with our project. Thanks for your interest so long! ,😊

@bricelam
Copy link
Contributor

bricelam commented Apr 30, 2018

It's not too bad to work around using query types and FromSql:

class ScalarResult<T>
{
    public T Value { get; set; }
}

class MyContext : DbContext
{
    public DbSet<Benchmark> Benchmarks { get; set; }
    public DbQuery<ScalarResult<double>> Doubles { get; set; }
}
var db = new MyContext();
var stdev = Enumerable.Single(
    from r in db.Doubles.FromSql("SELECT STDEV(Result) AS Value FROM Benchmarks")
    select r.Value);

@roji
Copy link
Member

roji commented May 1, 2018

I know this issue has been repurposed, but there seems to definitely be value in having a EF.Functions.Stdev() which would be translated to SQL by providers which support it (we have at least SQL Server and PostgreSQL). We can open a different issue to track this.

On a related note, how easy is it for providers to add translation for additional aggregate functions? Regular functions are quite simple (via IMethodCallTranslator) but methods such as Sum and Average seem to receive some pretty special handling in the core. Is it possible (or is it somehow planned) to add support for translating aggregate functions with the same ease as regular ones?

@pmiddleton
Copy link
Contributor

@roji

Aggregates are treated special by both ReLinq and Core so supporting them will require some extra changes in the backend.

I will take a look at this after we finish with merging in TVFs.

@divega - Should we use this item to track or open another?

@smitpatel
Copy link
Contributor

Aggregates are considered ResultOperators in ReLinq and for supporting a single aggregate function means, adding corresponding CustomResultOperator. Registering it in query parser and then translating it in VisitResultOperator method. It is neither easy nor scalable for multiple operators.

It is true that adding function for Stdev in EF has value but due to lack of good underlying support, I re-purposed this issue to add back-end support first. Once that is done, I will open another issue to track specific work for Stdev. (or it could be added as proof of concept that pipeline works). My main motivation for re-using same issue was, avoid having multiple tracking issues which talks about adding more functions which would have been essentially blocked on this one.

@pmiddleton
Copy link
Contributor

@smitpatel - I was thinking that it might be possible to add a single ResultOperator that can deal with all custom aggregates. That way we could deal with custom clr aggregate functions in Sql Server also.

If that isn't going to work then just supporting the big hitters should still be possible.

@roji - does postgres support STDEVP, VAR, VARP? Those are the ones from Sql Server which are still missing.

@divega divega added this to the Backlog milestone May 1, 2018
@divega divega added the help wanted This issue involves technologies where we are not experts. Expert help would be appreciated. label May 1, 2018
@roji
Copy link
Member

roji commented May 2, 2018

Makes sense - it does seem more important to add infrastructure for provider-defined (and even user-defined?) aggregate functions than to add ad-hoc support for STDDEV. At the same time if it's really trivial to add stuff like STDDEV alongside sum and average, why not...

Here are the PostgreSQL docs for aggregate statistical functions (note that the page lists all aggregate functions, not just statistical, so scroll around). STDEVP is supported (it's called stddev_pop) as well as variance and var_pop, and there are some other interesting ones in there as well.

@SteffenMangold
Copy link
Author

Can someone tell me what the process is after "up-for-grabs"? Only to know about what time horizon we are talking. :)

@divega
Copy link
Contributor

divega commented May 3, 2018

@SteffenMangold This is the backlog. It means it is definitively not going into 2.1. It may go into the next release if someone (e.g. @pmiddleton has done a lot of great contributions around function support in EF Core) picks it up. Otherwise we may choose to do it anyway in the next release, but given other important features in the backlog or already assigned to the 2.2 milestone, I think the chances are small.

In the meanwhile my recommendation would be to write the SQL in FromSql(). With the improvements included in 2.1, you should be able to declare a query type in your model that contains the result of the aggregate and that you can use for context.Query().FromSql("...").

@pmiddleton
Copy link
Contributor

I will be able to look at this for the 2.2 milestone.

@roji
Copy link
Member

roji commented May 4, 2018

@pmiddleton just to be sure, this would include provider-facing infrastructure (similar to today's existing method and member translators), and not just user-facing DbFunction support, right?

@pmiddleton
Copy link
Contributor

@roji

It's all up in the air right now, as I have not started, so we can shoot for whatever we want.

Are you thinking of an interface for providers to preregister known aggregates so users don't have to. Similar to how base functions are added in Relational via IMethodProvider?

@roji
Copy link
Member

roji commented May 4, 2018

That's right. For stuff like STDDEV and others, which exist on some providers but not on others, we shouldn't require to use DbFunction but rather have the provider translate something like EF.Functions.Stddev(), if at all possible... Ideally something like the existing SqlServerCompositeMethodCallTranslator.

@SteffenMangold
Copy link
Author

Did somebody already picked that task?

@pmiddleton
Copy link
Contributor

The query pipeline is being rewritten in 3.0 so this is currently blocked by that work.

@SteffenMangold
Copy link
Author

Is 3.0 the next major release or is there some other 2.x coming before? Just want to estimate if 1 or 5 years. ^^

@ajcvickers
Copy link
Contributor

ajcvickers commented Dec 4, 2018

@SteffenMangold 3.0 is the next major release, that is after 2.2.

@divega divega added good first issue This issue should be relatively straightforward to fix. help wanted This issue involves technologies where we are not experts. Expert help would be appreciated. and removed help wanted This issue involves technologies where we are not experts. Expert help would be appreciated. good first issue This issue should be relatively straightforward to fix. labels May 31, 2019
@ajcvickers ajcvickers added good first issue This issue should be relatively straightforward to fix. and removed help wanted This issue involves technologies where we are not experts. Expert help would be appreciated. labels Aug 5, 2019
@ajcvickers ajcvickers removed the good first issue This issue should be relatively straightforward to fix. label Aug 30, 2019
@smitpatel
Copy link
Contributor

See #20052

@roji
Copy link
Member

roji commented Jun 26, 2020

Just some extra details on aggregate functions...

Some aggregate functions accept some additional non-set arguments. For example, PG string_agg concatenates a set of strings, and also accepts a delimiter (SELECT string_agg(name, ',') FROM users). Pretty trivial, just mentioning it.

More interesting: there are some cases of functions accepting multiple sets; a good example is jsonb_object_agg, which reduces two arbitrary sets of values into a single json document:

postgres=# select jsonb_object_agg('key_' || x::text, x) from
generate_series(1,4) as x;
                 jsonb_object_agg
--------------------------------------------------
 {"key_1": 1, "key_2": 2, "key_3": 3, "key_4": 4}
(1 row)

@MarcAndreJean
Copy link

Is this being proposed for 6.0? Any updates? :)

@roji
Copy link
Member

roji commented Oct 27, 2021

@MarcAndreJean this is not in the 6.0 release - see the milestone above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants