-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Concurrent insert can cause AsSplitQuery to fail to populate unrelated child collections #33826
Comments
Sorry for taking so long to answer this. As you noted, the moment you're doing split queries without serializable/snapshot isolation, you're exposed to all kinds of inconsistent data scenarios; this is one of the reasons split queries aren't the default loading mode. But I do agree that the behavior you're describing sounds odd and could indicate a bug in EF's materialization logic... I'll place this in the backlog for further investigation, but it's unlikely we'll get to this any time soon, as similar such inconsistencies may appear in any case (even if a bit less unexpected). |
Hello! Another reproduction, based on issue's author reproduction. using Microsoft.EntityFrameworkCore;
namespace EFTest;
public class Widget
{
public Widget(int id, IEnumerable<Child> children)
{
Id = id;
Children = children.ToHashSet();
}
// EF calls this when materialising the Widget
private Widget(int id)
{
Id = id;
// Strategically update Widget 2 before loading it's children
if (id == 1)
{
using var dbContext = new MyDbContext();
var widget = new Widget(2, Enumerable.Empty<Child>());
var menu = dbContext.Menus.First();
dbContext.Set<Widget>().Attach(widget);
widget.Menu = menu;
dbContext.SaveChanges();
}
}
public int Id { get; private init; }
public HashSet<Child> Children { get; private init; } = new();
public Menu? Menu { get; set; }
public override string ToString() =>
$"Widget {Id} with children: [{string.Join(',', Children.Select(c => c.Name))}]";
}
public record Menu(int Id);
public record Child(int Id, string Name)
{
public int WidgetId { get; private set; }
}
public class MyDbContext : DbContext
{
private const string ConnectionString = "";
protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
optionsBuilder.UseSqlServer(ConnectionString);
}
public DbSet<Widget> Widgets { get; private init; } = null!;
public DbSet<Menu> Menus { get; private init; } = null!;
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
modelBuilder.Entity<Widget>(e =>
{
e.Property(widget => widget.Id).ValueGeneratedNever();
e.HasMany<Child>(c => c.Children)
.WithOne();
e.HasOne<Menu>(widget => widget.Menu)
.WithMany();
});
modelBuilder.Entity<Menu>(e => { e.Property(c => c.Id).ValueGeneratedNever(); });
modelBuilder.Entity<Child>(e =>
{
e.Property(c => c.Id).ValueGeneratedNever();
e.Property(c => c.Name);
e.Property(c => c.WidgetId);
});
}
}
public static class Program
{
public static async Task Main()
{
await using (var dbContext = new MyDbContext())
{
await dbContext.Database.EnsureCreatedAsync();
await dbContext.Widgets.ExecuteDeleteAsync();
await dbContext.Menus.ExecuteDeleteAsync();
dbContext.Menus.Add(new Menu(1));
dbContext.Widgets.Add(new Widget(1, new[] {new Child(11, "A1"), new Child(12, "B1"), new Child(13, "C1")}));
dbContext.Widgets.Add(new Widget(2, new[] {new Child(21, "A2"), new Child(22, "B2"), new Child(23, "C3")}));
await dbContext.SaveChangesAsync();
}
await using (var dbContext = new MyDbContext())
{
var widgets = await dbContext.Widgets
.Include(w => w.Children)
.Include(w => w.Menu)
.AsSplitQuery()
.ToListAsync();
Console.WriteLine(string.Join(Environment.NewLine, widgets));
}
}
} Received output:
Expected output:
Adding transaction with isolation level = Snapshot or changing AsSplitQuery to AsSingleQuery helps to get expected output. In EF Core 2.1, which used split queries by default, there was not such problem. Provider and version information EF Core version: v8.0.8 P.S. full workaround is listed below await using (var dbContext = new MyDbContext())
{
// open connection manually that mean that it won't be return to connection pool unit we close it manually
await dbContext.Database.OpenConnectionAsync();
var tran1 = await dbContext.Database.BeginTransactionAsync(IsolationLevel.Snapshot);
var widgets = await dbContext.Widgets
.Include(w => w.Children)
.Include(w => w.Menu)
.AsSplitQuery()
.ToListAsync();
await tran1.CommitAsync();
// https://github.com/dotnet/SqlClient/issues/96 workaraound
var tran2 = await dbContext.Database.BeginTransactionAsync(IsolationLevel.Unspecified);
await tran2.CommitAsync();
await dbContext.Database.CloseConnectionAsync();
Console.WriteLine(string.Join(Environment.NewLine, widgets));
} |
I'm not sure at this point exactly how things worked in EF Core 2.1, but what you're reporting really is the expected behavior when doing split query; without a higher isolation level (e.g. Snapshot), there are no consistency guarantees, so this sort of error can occur. As this is a concurrency issue and therefore highly sensitive to timing, my guess is that EF Core 2.1 had exactly the same behavior, but the issue simply didn't manifest. Out of curiosity, can you please post the fully SQLs for the queries generated by both EF Core 2.1 and 8.0, so we can compare them? |
Still can't understand why it is expected behaviour, because Menu field is not somehow related to loading children of Widget. When I read this statement from documentation
I expect that i will get Widget with empty Menu field and with its children.
Maybe I should understand this statement in terms of ef core? I mean that ef core can do some strange things when use splitQueries. Returning to the queriesWe can see, that when 2.1 loading child collections, it does not include Menu id in select output as 8.0 do NET 8: SELECT [w].[Id], [w].[MenuId], [m].[Id]
FROM [Widgets] AS [w]
LEFT JOIN [Menus] AS [m] ON [w].[MenuId] = [m].[Id]
ORDER BY [w].[Id], [m].[Id]
SELECT TOP(1) [m].[Id]
FROM [Menus] AS [m]
exec sp_executesql N'SET IMPLICIT_TRANSACTIONS OFF;
SET NOCOUNT ON;
UPDATE [Widgets] SET [MenuId] = @p0
OUTPUT 1
WHERE [Id] = @p1;
',N'@p1 int,@p0 int',@p1=2,@p0=1
SELECT [c].[Id], [c].[Name], [c].[WidgetId], [w].[Id], [m].[Id]
FROM [Widgets] AS [w]
LEFT JOIN [Menus] AS [m] ON [w].[MenuId] = [m].[Id]
INNER JOIN [Child] AS [c] ON [w].[Id] = [c].[WidgetId]
ORDER BY [w].[Id], [m].[Id] NETCOREAPP2.1: SELECT [w].[Id], [w].[MenuId], [w.Menu].[Id]
FROM [Widgets] AS [w]
LEFT JOIN [Menus] AS [w.Menu] ON [w].[MenuId] = [w.Menu].[Id]
ORDER BY [w].[Id]
SELECT TOP(1) [m].[Id]
FROM [Menus] AS [m]
BEGIN TRANSACTION
exec sp_executesql N'SET NOCOUNT ON;
UPDATE [Widgets] SET [MenuId] = @p0
WHERE [Id] = @p1;
SELECT @@ROWCOUNT;
',N'@p1 int,@p0 int',@p1=2,@p0=1
COMMIT TRANSACTION
SELECT [w.Children].[Id], [w.Children].[Name], [w.Children].[WidgetId]
FROM [Child] AS [w.Children]
INNER JOIN (
SELECT DISTINCT [w0].[Id]
FROM [Widgets] AS [w0]
LEFT JOIN [Menus] AS [w.Menu0] ON [w0].[MenuId] = [w.Menu0].[Id]
) AS [t] ON [w.Children].[WidgetId] = [t].[Id]
ORDER BY [t].[Id] |
@adebelyi thanks for the added detail.
This is something that isn't clear to me (but as I said, I'm unfamiliar with how EF 2.1 split queries worked). I'm probably missing something, but the principal (Menu) key must be projected in the 2nd split query which returns the children, otherwise how is EF to know which child belongs to which principal? I do agree that a more thorough investigation is needed here at some point, and that there may be a possible bug in EF (at the very least the odd behavior originally signaled above). |
I think that you misunderstood the reproduction a bit. Tables:
Queries:
To bind child to widget EF should use WidgetId field not MenuId. In NET8 in child load query EF adds menu id in select but 2.0 not:
Okay, thanks. |
When materialising entities via a query with split querying enabled, it's possible to encounter missing child entities purely by concurrently inserting unrelated data into the same tables. This can happen even when there have been no changes to the affected parent/child rows and ordering of the returned rows is completely stable. EF runs the expected queries and it does appear to successfully retrieve the child entities from the database, but it discards them.
Minimal consistent repro that sneakily inserts an unrelated record in between executing the first and second split queries (.NET 8.0, EF Core v8.0.5, SQL Server):
Program.cs
:Add your connection string (
MyDbContext.ConnectionString
), generate an initial migration (dotnet ef migrations add Initial
), then run the program.Received output:
Expected output:
The repro no longer exhibits the issue if you remove the
OrderByDescending(w => w.Id)
, but you can make the issue reappear by simply giving "Widget 2" an ID of0
. It seems that the moment the "parentless" (from the split query's perspective, anyhow) child row is encountered during enumeration, EF stops stitching together child entities with parent entities entirely. This means that depending on the order of the results you may experience child entities missing for all, or some, or no parent entities.I acknowledge that the docs say that for split querying data consistency is not guaranteed without using serialisable/snapshotting transactions, but I found this scenario a little too unexpected and couldn't find any existing reports of this same scenario, so I thought I'd at least ask whether this is expected. The thing that makes this so unexpected is that the right data is being loaded by EF and it has enough information to stitch things together, but the presence of an additional row causes it not to do so. I'd expect these results if I was deleting "Widget 1" in between the first and second query such that EF sees the parent but no children, and I was fully willing to accept those consequences when not using serialisable/snapshotting transactions, but I didn't expect that I could make "Widget 1"s children disappear (for a single query) by merely inserting "Widget 2" at the wrong time even though "Widget 1" and all its related data was untouched the whole time.
Provider and version information
EF Core version: v8.0.5
Database provider: Microsoft.EntityFrameworkCore.SqlServer (I was able to reproduce the same behaviour with Npgsql.EntityFrameworkCore.PostgreSQL as well)
Target framework: .NET 8.0
Operating system: I've reproduced this behaviour on Windows, Linux & macOS
IDE: I've reproduced this behaviour in both Visual Studio and Rider
The text was updated successfully, but these errors were encountered: