Transaction commit failure: avoid retry by default? #22904
Comments
@Timovzl Ideally, yes we shouldn't retry on commit failure, but if you look at https://github.com/dotnet/efcore/blob/b970bf29a46521f40862a01db9e276e6448d3cb0/src/EFCore/Storage/IExecutionStrategy.cs the |
@AndriySvyryd I completely forgot - it's the application itself that invokes the commit! Doing something only for the automatic transaction that Could the provider intercept TransactionStarted and TransactionCommitted/TransactionRolledBack/TransactionFailed, and suppress retries in between? For example, one implementation could be to catch [retryable] exceptions happening during this time, and rethrow them wrapped in an exception of a type that is not retried. |
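To make the wrapping idea concrete, here is a minimal, language-agnostic sketch in Python. It is not EF Core's API: `TransientError`, `CommitStateUnknownError`, `commit_guarded` and `execute_with_retries` are all hypothetical names invented for illustration. The assumption is that the retry loop only retries one exception type, so rethrowing a commit-time failure as a different type is enough to stop the retry.

```python
class TransientError(Exception):
    """Hypothetical stand-in for an error the retry strategy treats as retryable."""


class CommitStateUnknownError(Exception):
    """Deliberately NOT a TransientError, so the retry loop lets it propagate."""


def commit_guarded(commit):
    """Run commit(); rethrow a transient failure wrapped in a non-retried type."""
    try:
        commit()
    except TransientError as exc:
        # The commit may or may not have been applied; do not allow a retry.
        raise CommitStateUnknownError("commit outcome unknown") from exc


def execute_with_retries(operation, max_attempts=3):
    """Retry operation() on TransientError only; anything else bubbles up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise
```

With this shape, a transient failure before the commit is still retried, while a failure raised from inside `commit_guarded` reaches the caller after a single attempt.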
@Timovzl what kind of commit errors do you have in mind? As far as I know (coming from PostgreSQL), any server error raised out of commit should mean that the commit failed, rather than that its outcome is unknown - can you point to docs to the contrary? Loss of network connectivity could indeed generate a network error which leaves the commit status uncertain, but that's a different category of errors, and any retries will likely (but not certainly) fail. |
@roji I'm referring specifically to this (emphasis mine):
Indeed, if something happens to the connection while the commit is past the point of no return, then the application will not be aware of the successful commit. As such, with an exception on commit, the application cannot know the state of the transaction in the database. (Granted, certain exception types may indicate that the transaction is definitely not committed.) I can think of a few things that might break the connection. A severed cable. The firewall. A failing emergency power supply. Maybe a serverless database being moved onto another physical machine (although I strongly hope and suspect that the old instance is shut down cleanly).
That's... optimistic! :) If the application does not know whether the transaction was committed or not, retrying is dangerous. Doing so while the transaction was already committed has one of three potential results:
While the best case is nice, the worst case can be bad for business! It would be much safer if we always got the neutral case (in this very rare fail-on-commit scenario). By not retrying during commits, and instead letting exceptions bubble up, we always get the neutral case. I'm curious to know if you think suppressing retries between TransactionStarted and TransactionCommitted/TransactionRolledBack/TransactionFailed would do the trick. |
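As a rough illustration of why a blind retry is risky here, the following Python sketch simulates a commit that is applied on the server while the acknowledgement is lost. Everything in it is hypothetical (`FakeLedger`, `ConnectionLost`, `pay_with_blind_retry`), and it assumes the bad outcome in question is the operation being applied twice.

```python
class ConnectionLost(Exception):
    """Hypothetical transient error: the client never sees the commit result."""


class FakeLedger:
    """Toy 'database' whose commit succeeds but whose acknowledgement can be lost."""

    def __init__(self, lost_acks=1):
        self.payments = []
        self._lost_acks = lost_acks

    def commit_payment(self, amount):
        # The write is durable on the "server" first...
        self.payments.append(amount)
        # ...then the connection drops before the client is told.
        if self._lost_acks > 0:
            self._lost_acks -= 1
            raise ConnectionLost("acknowledgement never reached the client")


def pay_with_blind_retry(ledger, amount, max_attempts=3):
    """Retry on ConnectionLost without checking whether the commit was applied."""
    for attempt in range(1, max_attempts + 1):
        try:
            ledger.commit_payment(amount)
            return
        except ConnectionLost:
            if attempt == max_attempts:
                raise
```

Calling `pay_with_blind_retry` on a fresh `FakeLedger` records the payment twice, whereas calling `commit_payment` once and letting `ConnectionLost` bubble up records it once and leaves the caller knowing that the outcome is uncertain.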
The main point I was trying to make is the distinction between network errors between the driver (e.g. SqlClient) and the database (e.g. SQL Server) - which can indeed result in unknown commit state - and errors successfully reported from the database to the driver. To the best of my knowledge, if a database successfully reports an error when commit occurs, the commit failed rather than being in an unsure state (but I'm not a SQL Server expert). In other words, if we special-case commit here in any way, that may be for client-server networking errors only. |
@roji It's probably worth talking to @AndriySvyryd on this. This issue with SqlClient has been investigated extensively, and that's where this guidance comes from. |
That would cover the vast majority of transient errors. |
I think that depends: deadlocks are transient and can happen very frequently in some user scenarios (not sure those can happen on commit though); these can definitely be safely retried, no? |
I think non-connection transient errors on commit are rare enough that we don't have to worry about them. And the best way of handling any commit errors would still be supplying |
I'm guessing when As for deadlocks, they usually happen earlier than commit time, right? |
@Timovzl Yes and yes |
We should be careful about behavior we assume across databases (e.g. what errors can occur at commit time). Since we're concentrating on networking transient errors, what would we do if the |
Not retrying should still be a safer default for any database.
It's retried according to the current execution strategy. If a fatal exception is thrown it will not be handled. |
OK, thanks for the explanations @AndriySvyryd. |
With #10119 we can continue retrying safely |
On this point:
Would a simple way to do this currently be to implement a simple |
@AndriySvyryd I really wish that were true, but I don't believe it is. One (very proper) example of such user code is:
If the commit fails, even if |
@AndriySvyryd or @roji, this suggestion looks promising as a workaround. Unfortunately, Do you see any alternative workarounds? |
You are right. That enhancement would only help the cases where the
Assuming that the operation failed is dangerous (e.g. when inserting rows with generated keys); that's why we recommend going this route. |
With regards to failure on transaction commit, the docs state:
In contrast to other Entity Framework defaults, this seems like an unsafe default.
If a drastic operation results in a failure on commit, which would we prefer?
Let's say that a client attempts to make a payment. Performing it twice could have bad consequences.
However, if we were to return an internal server error (since we did not catch the exception of this unexpected and highly unlikely failure), then we have technically informed the client that the result is unsure. Experienced developers know this, and can act accordingly. (Sure, some clients may assume that the attempt has failed, but that is a mistake. At least we have been as clear as possible.)
I'm aware that for optimal results we could manually implement verifySucceeded for each method, but that adds a lot of work and code overhead. More importantly, it's no reason to keep us from choosing the best possible defaults!
Should we have retrying execution strategies not retry by default when a failure on commit occurs?
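For readers unfamiliar with the verifySucceeded approach, the general shape can be sketched as follows. This is a Python illustration, not EF Core's API: `TransientError`, `execute_in_transaction` and the `verify_succeeded` callback are hypothetical names. The assumption is that the caller can supply a check that tells whether the transaction's effects are already present.

```python
class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure, including a failed commit."""


def execute_in_transaction(operation, verify_succeeded, max_attempts=3):
    """Run operation(); on a transient failure, retry only if it did not take effect."""
    for attempt in range(1, max_attempts + 1):
        try:
            operation()
            return
        except TransientError:
            # The commit may have been applied even though an error surfaced.
            if verify_succeeded():
                return
            if attempt == max_attempts:
                raise
```

The cost pointed out above is visible here: each call site needs its own `verify_succeeded`, for example a lookup of a row that the operation is known to insert.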