Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap Multiple Eager Loaded Queries that leverage Include in Repeatable Read transactions #9014

Closed
caleblloyd opened this issue Jun 29, 2017 · 6 comments

Comments

@caleblloyd
Copy link

caleblloyd commented Jun 29, 2017

I am opening this issue against the behavior observed in EF Core 1.0. I have not had time to investigate changes in EF Core 2.0 so if by chance this has been added already please let me know.

Eager Loading with .Include generates multiple SQL queries. They are not inside of a transaction, however. This means that between the queries, new records could be inserted affecting ranges in the WHERE clause and causing incorrect data to get loaded.

Pomelo.EntityFrameworkCore.MySql solves this in 1.0 by wrapping queries that have in Include clause in a RepeatableRead transaction. I think that this logic should be moved into EFCore.Relational. It's pretty important for consistency's sake.

Here's an example of how we do it in 1.0:

The transaction isolation level should be RepeatableRead

@ajcvickers
Copy link
Contributor

@caleblloyd Can you post a specific example of how the wrong data is loaded? i.e. The SQL that gets generated and what inserts happen between the SQL statements that cause the data to be wrong?

@caleblloyd
Copy link
Author

Sure. Consider the following query that fetches a blog and it's associated posts. Assume that id is a randomly assigned GUID.

LINQ Query:

await _db.Blogs.Include(m => m.Posts).OrderByDescending(m => m.Id).Take(10).ToListAsync()

Generated SQL:

info: Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommandBuilderFactory[1]
      Executed DbCommand (1ms) [Parameters=[@__p_0='?'], CommandType='Text', CommandTimeout='0']
      SELECT `m`.`Id`, `m`.`Title`
      FROM `Blogs` AS `m`
      ORDER BY `m`.`Id` DESC
      LIMIT @__p_0
info: Microsoft.EntityFrameworkCore.Storage.Internal.RelationalCommandBuilderFactory[1]
      Executed DbCommand (5ms) [Parameters=[@__p_0='?'], CommandType='Text', CommandTimeout='0']
      SELECT `b`.`Id`, `b`.`BlogId`, `b`.`Content`, `b`.`Title`
      FROM `BlogPost` AS `b`
      INNER JOIN (
          SELECT DISTINCT `m`.`Id`
          FROM `Blogs` AS `m`
          ORDER BY `m`.`Id` DESC
          LIMIT @__p_0
      ) AS `m0` ON `b`.`BlogId` = `m0`.`Id`
      ORDER BY `m0`.`Id` DESC

Race Condition

Now say between the 1st query that retrieves the Blogs and the 2nd query that retrieves the Posts, a new Blog is created that lands in the middle of the LIMIT(10). It will cause the INNER JOIN in the 2nd post to not match up with the IDs in the first query, and the 10th Blog's posts will not be loaded.

Solution

Wrap both queries in a RepeatableRead transaction

@caleblloyd
Copy link
Author

@ajcvickers can the EF team get this on the schedule for 2.1? I believe that providing a method to fix this race condition in DBMS's that support Repeatable Read transactions is important for consistency.

@ajcvickers
Copy link
Contributor

@caleblloyd It's something we plan to do some thinking around, but we're not even sure what approach to take at this time. We're not sure yet whether always using repeatable read is the best approach for the default behavior, whether it should be opt-in, whether it should remain an application concern, or whether to do something different. @anpete and @divega will do some investigation and thinking, but it may not happen for 2.1

@smitpatel
Copy link
Contributor

Clearing milestone for this since it is similar to #14062

@ajcvickers
Copy link
Contributor

Closing this as a duplicate of #12098 with a reference here to consider this discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants