-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Pivot LINQ extension #11963
Comments
@izzycoding - What do you envision the linq method would look like? I actually started working on this a few weeks ago. I'm still debating what the actual Linq method should look like. That is why I'm curious to see how you think it should work. Currently I am using QueryTypes to define the pivot results which has lead to pivot calls in this form. public class PivotSchool
{
public double? Wisconsin { get; set; }
public double? Minnesota { get; set; }
public double? Oklahoma { get; set; }
public double? Alabama { get; set; }
}
var results = context.Schools.Pivot(s => s.Name,
sl => sl.Sum(s => s.Attendence),
s => new PivotSchool()).ToList(); In order to pivot you need to specify the pivot column and aggregate. In order to then compose further operators onto the query you need a result type. This syntax hits all of those points, but I'm still not completely sold on it. cc: @divega |
@pmiddleton Does Pivot() in your example have two generic arguments for the source element type (School) and the output element type (PivotSchool)? I am undecided, but perhaps this is better, even if you have to repeat the source element type: var results = context.Schools.Pivot<School, PivotSchool>(
s => s.Name,
sl => sl.Sum(s => s.Attendence)).ToList(); |
@pmiddleton I would actually expect the composure to be done via .Select() or .Cast(). Something like: var results = context.EntityData.Where(e => e.Id == ...Entity Id I Want...)
.Pivot(k => k.Name, v => v.Value)
.Select(ed => new EntityType {
propA = ed.propA,
propB = ed.propB
});
// or;
var results = context.EntityData.Where(e => e.Id == ...Entity Id I Want...)
.Pivot(k => k.Name, v => v.Value)
.Cast<EntityData>(); This to me would feel cleaner and more readable from code. |
@divega - I would expect the extension method would already know about School in @pmiddleton example. |
I do have the overload you mentioned. If you are pivoting on an entity type or query type then you can use that overload and avoid the However if the shape of the query feeding into the pivot is an anonymous type (say from a select or join) then you need the compiler to infer all the generic types. Therefore the I really don't like the syntax, but I don't see anyway around it. I'm all ears if anyone has an idea. |
@pmiddleton |
@izzycoding Re these examples: var results = context.EntityData.Where(e => e.Id == ...Entity Id I Want...)
.Pivot(k => k.Name, v => v.Value)
.Select(ed => new EntityType {
propA = ed.propA,
propB = ed.propB
});
// or;
var results = context.EntityData.Where(e => e.Id == ...Entity Id I Want...)
.Pivot(k => k.Name, v => v.Value)
.Cast<EntityData>(); The main question is what type Pivot returns. It seems you are assuming that it would be able to dynamically make up a type that contains PropA and PropB based on the data returned by the query. Also Cast works differently. You seem to be after something more akin AutoMapper's ProjectTo method. One interesting option your example suggest that I think could work is that we could have a version of Pivot that returns a property bag. I am not sure if that would be possible for EF Core currently, but I suspect we could make it work. But I believe if we could make this work it would have a couple of nice characteristics:
So the example above would become something like this: // manual biding:
var results = context.EntityData.Where(predicate)
.Pivot(k => k.Name, v => v.Value)
.Select(ed => new EntityType {
propA = ed["propA"],
propB = ed["propB"]
});
// automatic binding:
var results = context.EntityData.Where(predicate)
.Pivot(k => k.Name, v => v.Value)
.ProjectTo<EntityData>(); cc @pmiddleton have you considered this as an option? |
@divega - I like the idea of having a PropertyBag as the output. However, as a first attempt I would guess an IEnumerable<KeyValuePair<string, string>> return value (as the most generic form), or just use reflection and then improve the implementation later. It would be easy then to provide an explicit conversion operator in MyCustomResultType to enable the Cast() to work. I have not really looked into the source of EFC as yet so unsure how the rest works with regards to mapping values to class properties. However, I would suspect some of the 2.1 converters might be useful to enable DBType to PivotType conversions. |
@izzycoding - In your example I am confused as to what the parameters are in the pivot call .Pivot(k => k.Name, v => v.Value) In order to pivot you need a pivot column and and an aggregate. I don't follow what you expect k and v to be. |
@divega - I had considered a client side implementation, but I wasn't sure if it was a good idea. Given that pivot is typically used for statistics/reporting a client side implementation could require pulling a lot of data into memory from the database in order to perform the pivot. If we do want to add a client implementation I think adding an overload which returns a property bag should be doable. One thing which I think would be extremely hard to do is trying to get a pivot to work server side without knowing the result type. You need that result type in order to know what the valid pivot values are. To work without it we would have to first run a query to get the valid pivot values and then build a second query with those values. I don't think things are built today to deal with that scenario. |
@pmiddleton - My example was probably not clear enough. Please view it as: .Pivot(col => col.RequiredKeyColumn, agr => agr.AgregateFunctionOrColumn); In my example the Agregate was based on an ID column instead of an agregate function like Sum() |
@pmiddleton fair enough. I had completely forgotten that the column names were mandatory in the PIVOT syntax. |
@smitpatel @ajcvickers @anpete I have run into an issue that I need help with. Take the following pivot query where PivotSchool is a Query Type and School is an Entity Type. public class School
{
public int Id { get; set; }
public string Name { get; set; }
public string Conference { get; set; }
public double Attendence { get; set; }
}
public class PivotSchool
{
public double? Wisconsin { get; set; }
public double? Minnesota { get; set; }
public double? Oklahoma { get; set; }
public double? Alabama { get; set; }
}
context.Schools.Pivot<School, PivotSchool>(s => s.Name, sl => sl.Sum(s => s.Attendence)).ToList(); The Pivot feature is implemented as a ReLinq ResultOperator and handled by EF inside of RelationalResultOperatorHandler. The issue I'm having with this query is that it directly selects the results of the pivot operation (in this case PivotSchool). However the shape of this query is set to School by RelationalEntityQueryableExpressionVisitor when the MainFromClause of the query model is visited. I need a way to change the shape of the query in RelationalResultOperatorHandler, but there is no way to create a new Shaper/Materializer at this point in the code. Does anyone have any ideas on how I can reshape the query, short of re-implementing RelationalResultOperatorHandler.CreateShaper inside of RelationalResultOperatorHandler? |
@pmiddleton - Presently there are only 2 kind of shaper changes happen in ResultOperatorHandler.
Since this is new kind of requirement (which happens on server side) there is no way to do it right now. |
@smitpatel - let's pretend for a moment I'm dumb... yea pretend :) What exactly are you looking for? Just the results of calling .toString() on the model and expression, or is there a helper class which gives me something prettier? |
@smitpatel - Oh and if it is just that simple then this is what I have For the model (I don't have PivotResultOperator printing anything pretty right now) value(Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryable`1[Microsoft.EntityFrameworkCore.Query.PivotSqlServerTests+School]) => Pivot The Expression going in is. _ShapedQuery(queryContext, value(Microsoft.EntityFrameworkCore.Query.Internal.ShaperCommandContext), UnbufferedEntityShaper) I believe I just need this coming out (won't know for sure until I can get this far) _ShapedQuery(queryContext, value(Microsoft.EntityFrameworkCore.Query.Internal.ShaperCommandContext), UnbufferedEntityShaper) RIght now I'm thinking of trying to refactor the CreateShape method out of RelationalEntityQueryableExpressionVisitor and into it's own class so that it can be shared with RelationalResultOperatorHandler. I have already tried creating a fake EntityQueryable constant I could send through |
output of following things inside your HandlePivot method in RelationalResultOperatorHandler And once you see output of the last statement, what is your expectation, how it should look like once pivot result operator is applied. |
(from School s in DbSet
IEnumerable ShapedQuery( Here is what is currently outputting at the end of the PivotMethod. IEnumerable ShapedQuery( What I need is is that UnbufferedEntityShaper changed to PivotSchool. The "Select 1" outer query Sql gets cleaned up after the HandlePivot method so you can ignore that here. |
Copy over code for CreateShaper from here. |
@smitpatel - Ok that worked, but I don't like the fact that there is now duplicate code. I am going to refactor this into a shared class. Any thoughts on what folder it should live in? I'm leaning toward Relational/Query/Internal. |
@pmiddleton - I suggest keep it duplicated and file an issue to clean up. We may or may not need it in 3.0 depending on outcome of #12795 |
Sorry to dump more on you but I have run into a few issues which need some design decisions :) I need a way to identify the values from the pivot column which go into the Here is an example of what I am talking about. Take these two entity types which represent players on a school American Football team. Soccer teams will be supported in the unit tests 2.5 upgrade 😁 public class Player
{
public int Id { get; set; }
public string Name { get; set; }
public string Position { get; set; }
public int Weight { get; set; }
public int SchoolId { get; set; }
}
public class School
{
public int Id { get; set; }
public string Name { get; set; }
} Now take a pivot query against those two tables which produces the following output Query Type. public class SchoolPositionWeight
{
public int? Quarterback { get; set; }
public int? WideReceiver { get; set; }
public string SchoolName { get; set; }
} context.Players.Select(p =>
new {
ShoolName = p.Name,
p.Position,
p.Weight
}).Pivot(b => b.Position,
bl => bl.Max(b => b.Weight),
b => new SchoolPositionWeight()) In order to build the query correctly I need to run the following configuration during model creation. var psp = modelBuilder.Query(typeof(SchoolPositionWeight));
psp.Property("Quarterback").IsPivot();
psp.Property("WideReceiver").IsPivot(); Ok now for the questions.
|
@divega - I have another one for yea :) PostgreSQL does not support the Pivot operator as defined by Sql Server and Oracle. However it does have something very similar, the Crosstab function. This function performs a pivot on the source however it does not support an aggregate. This means our Linq Pivot function needs some design work. How do we want to deal with some providers who support the aggregate and some who don't? We could go with two Pivot methods one with and one without the aggregate. We could make the aggregate the last parameter and optional. We could define a PostgreSql specific Crosstab function. Any thoughts on the best way to go to keep in line with existing EF design. @roji - What are your thoughts? Do you see anything in this thread which would be an issue for PostgreSql? |
@pmiddleton I've tried to find some time to dive into this but haven't been very successful :( In any case you seem to know at least as much as I do about pivoting in PostgreSQL. From a quick look, according to this issue it seems like aggregating in PostgreSQL is supported by using an aggregate function inside the Is pivoting something that's generally supported in other databases (Oracle, MySQL..)? Is support more or less uniform (or is it fragmented, like the lack of aggregate for PostgreSQL)? Would client evaluation occur on databases which don't support pivoting? |
@roji - Thanks for that link. Based on that it looks like PostgreSQL can support the same type of pivot that Sql and Oracle do. It will just require more because the base query needs to have the aggregate manually applied by the provider. From my research Oracle, Sql, and PostgreSQL are the only databases with a pivot operation built in. Maybe @divega has more knowledge in this area for other databases. I'm not sure about adding client pivoting support. @divega thoughts? This feature as a whole is really dependent on the outcome of #12795 as it is currently tightly coupled to ReLinq. |
Can add this to efcore 6 roadmap? |
@mojtabakaviani at the moment, since this feature has received few votes and is very non-trivial, and EF Core 6 already has a lot of features planned, there's probably little chance we'll be able to include it in that release. As always, that could change based on user feedback and requests. |
I am going to try take another look at this again this winter/spring. Assuming I can find some time between work and helping with virtual school for my daughter. No guarantees. |
I have various types in my app which are sudo-dynamic or generalised.
In table form I have an entity table that just has ID and type columns and then a details table with key, value, FK(ID) for the properties (not actually this but is the simplest way to demo).
It would be nice to be able to generate a query that gets the properties so that the key field is the column name and the values are listed grouping them by the ID. This would allow me to query data and apply it as-if it is a view and map it to a Poco.
This would also be useful for various statistics queries we need to run (which we have to do in C# currently due to mixed DB implementations of Pivot functionality).
It would be good to have this such that if the DB supports it then it’s done in SQL and if not it is done in-memory after executing the initial query.
The text was updated successfully, but these errors were encountered: