After SqlException application state is corrupted #97
Comments
Connections that come out of the connection pool are bad and have a trashed transaction isolation level. I raised an issue on this almost ten years ago and eventually found out it's considered intended behavior. The solution for both is the same darn thing:
for (;;) {
    connection = new SqlConnection(...);
    try {
        connection.Open(); // may hand back a stale connection from the pool
        using (var cmd = connection.CreateCommand()) {
            // Default, or use whatever you think your connections should be at
            cmd.CommandText = "SET TRANSACTION ISOLATION LEVEL READ COMMITTED;";
            cmd.ExecuteNonQuery();
        }
        break;
    } catch (DbException) { /* bad connection from pool */ }
    connection.Dispose();
}
The bad connections don't get returned to the pool a second time, so the for loop drains all the bad ones out and you're ready to go again. |
It's odd behaviour. Do you remember the details of why it was intended? |
Back in the .NET Framework 1.0/2.0 days SqlConnection was a wrapper around a native library that did all the work, including the pooling, and had the "inherit transaction isolation level" behavior that they dared not fix in the native code for backwards compatibility. (I think a native caller could tell if it came out of the pool or not.) I filed the bug on MS Connect against .NET Framework 2.0/3.5 and was given this explanation. The other half is about bad connections: connections that were good when Close() was called might not be when you get them back out, which is a problem inherent with pooling unless active code is written to verify the state. Just about every pool API from that era had the same problem. This led to "Object Cesspool" appearing as an anti-pattern for a while before some other people obliterated it off the lists. Once you know about it (which the documentation doesn't tell you about) it's easy to fix. When #12072 was left unfixed too long I went through a bunch of reflection work to implement my own pool. My code depends on the exact implementation of SqlConnection now. I don't have to care anymore. |
@saurabh500, we got a memory dump we can share, hopefully you can get something out of it. The only change in the application is that it started to get more load than it used to. The issue is happening weekly, so if you need anything to help troubleshoot we could probably provide it. |
@afsanehr @keeratsingh to help on this issue. |
@epignosisx It would be good to provide the version of SqlClient that you are using as well, since SqlClient is a library that doesn't ship as part of the .NET Core SDK.
1- It seems like something in the Connection Pool is getting corrupted once this happens. Since a restart fixes it.
2- The error seems related to some SSL issue based on the exception details (Provider SSL). However, our connection strings do not specify the Encrypt attribute.
|
The System.Data.SqlClient version is 4.4.2.
Once this error occurs, subsequent requests start failing consistently; it goes away once the app pool is recycled. This behavior, in addition to the stack trace, is what makes me think something might be getting corrupted in the connection pool. There are 3 other instances of the app running on other physical servers, and they did not experience the issue. This helps rule out suspicions that SQL Server was acting up. The SQL operations that are going through this server are very simple. In fact, it is not our code, it's part of the Microsoft.AspNetCore.Caching.SqlServer package. SQL queries: I did find that when making the queries, the code was not disposing of the
Understood. Thanks for the explanation. Yes, subsequent connections are failing quickly. HTTP requests are taking 15 ms to complete (IIS logs). |
There are a couple of nuances of the connection pool that should be mentioned here.
It is possible that you are hitting a combination of the above two issues. Do you have a Min Pool Size configured for your connection pool? This setting makes the connection pool maintain at least Min Pool Size connections. |
Thanks @epignosisx for providing the information. Can you change your config to use the Min Pool Size so that you have ready TCP connections? Do you know how many concurrent connections are open to Sql Server when this issue happens? I am trying to rule out the possibility of Connection Pool max size being reached (viz 100 by default) and not getting a connection in a timely manner. |
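For readers who want to try the Min Pool Size suggestion, here is a minimal sketch of setting the pool bounds with SqlConnectionStringBuilder. The helper name PoolSettings.Build, the server and database names, and the use of integrated security are assumptions made for illustration; none of them come from this thread.

```csharp
using System;
using System.Data.SqlClient;

public static class PoolSettings
{
    // Builds a connection string with explicit pool bounds.
    // The server/database arguments are placeholders, not values from the thread.
    public static string Build(string server, string database, int minPool, int maxPool)
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = server,
            InitialCatalog = database,
            IntegratedSecurity = true, // assumption: Windows authentication
            MinPoolSize = minPool,     // minimum number of connections the pool maintains
            MaxPoolSize = maxPool      // 100 is the default mentioned above
        };
        return builder.ConnectionString;
    }
}
```

A call such as PoolSettings.Build("example-server", "ExampleDb", 5, 100) yields a string that can be passed to new SqlConnection(...). Whether a non-zero minimum actually helps with the failure described here is untested.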
At one time the behavior on reaching the pool size was it throwing InvalidOperationException. The last time I tested this was in .NET Framework 2.0 though. We thought we were hitting the pool size legitimately. Much later we discovered we had a bug that leaked connections. |
@saurabh500 I'm trying to get the concurrent connections to SQL Server when it happened. In the meantime, I can share the memory dump, but I checked your GitHub profile and it is not showing an email address. |
@epignosisx Something changed and my email was private. I have made the email available on the profile. |
@saurabh500 I emailed a link to the memory dump to you early today. |
Ack. I did reply back some time ago. I will take a look into the dump. |
Hi @saurabh500, the issue keeps happening at least weekly. Have you been able to take a look at the memory dumps I shared with you via email? |
I took another stab at the memory dumps with my limited Windbg knowledge and I found something curious:
This is the breakdown of the SqlConnection's innerConnection:
System.Data.ProviderBase.DbConnectionClosedPreviouslyOpened: 2629
@saurabh500 @Wraith2 Do these numbers say anything to you? |
We were getting the same exception for a console app.
Tried various things including:
In the end what worked (at least for the last 4 days) is to change the connection string to use IP Address instead of the server name. If this continues to hold, it would point to a DNS issue? Not sure how IP vs ServerName affects "pre-login handshake". |
@Code-DJ: You know connection pools are per-process, right? The trick is to replace your code that opens a connection with that loop. |
@jhudsoncedaron didn't know that. That makes sense on why it didn't work. If the tool fails again, will move your logic into the actual tool and try it. Thanks! |
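As a rough sketch of what "moving the logic into the actual tool" could look like, the loop from the first comment can be wrapped in a helper that every caller uses instead of opening connections directly. The class name ValidatedConnection and the maxAttempts cap are assumptions added here (the loop in the thread is unbounded), and this has not been verified against the failure discussed in this issue.

```csharp
using System;
using System.Data.SqlClient;

public static class ValidatedConnection
{
    // Opens a connection and runs a cheap statement to prove it is usable.
    // maxAttempts is a hypothetical safety cap; the original loop retries forever.
    public static SqlConnection Open(string connectionString, int maxAttempts = 5)
    {
        if (maxAttempts < 1)
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        SqlException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            var connection = new SqlConnection(connectionString);
            try
            {
                connection.Open();
                using (var cmd = connection.CreateCommand())
                {
                    // Resets the isolation level and doubles as a liveness check.
                    cmd.CommandText = "SET TRANSACTION ISOLATION LEVEL READ COMMITTED;";
                    cmd.ExecuteNonQuery();
                }
                return connection; // the caller is responsible for disposing it
            }
            catch (SqlException ex)
            {
                last = ex;
                connection.Dispose(); // discard this connection and try another
            }
        }
        throw new InvalidOperationException(
            "No usable connection after " + maxAttempts + " attempts.", last);
    }
}
```

Because pools live inside the process that created them, the helper only has an effect when it runs in the same process as the queries, for example using (var conn = ValidatedConnection.Open(connectionString)) { ... } at each call site.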
As recently announced in the .NET Blog, focus on new SqlClient features and improvements is moving to the new Microsoft.Data.SqlClient package. For this reason, we are moving this issue to the new repo at https://github.com/dotnet/SqlClient. We will still use https://github.com/dotnet/corefx to track issues on other providers like System.Data.Odbc and System.Data.OleDB, and general ADO.NET and .NET data access issues. |
Incorrect. Connection pools are per connection string. e.g., changing the Application Name in the pool will cause a second pool to be created.
What you're describing isn't a client bug, but a bug in The workaround you're describing could be made better by checking the @saurabh500 , are you saying that connections that wait too long to go through the queue can in turn re-trigger scenario 1 (If an error occurs while getting a connection from the pool, then subsequent attempts to establish new connections to SQL Server fail for a period of time), resulting in a possible meta-stable failure state when connections consistently take 5 seconds to handshake? |
"The transaction isolation level issue with sp_reset_connection was fixed/changed in SQL Server 2014." It wasn't. I reproduced it on SQL 2014 back when I used to run that. The workaround needs to remain even if that particular issue is fixed because it's still fixing the stale connection in pool problem. |
It appears there are two types of isolation level issues, and the changes to #96 and #1098 (duplicate) independently verify your statement on SQL Server 2017 or later. So, both statements are true - sp_reset_connection did improve in SQL Server 2014 to improve transaction isolation levels, but it did not address inner transaction blocks committing or rolling back with a different isolation level. |
Closing issue as stale. |
Environment:
OS - Windows Server 2012 R2
.NET Core: 2.0.5
SQL Server 2016
Exception:
This issue has come up in production on two separate occasions this week, and once the ASP.NET Core app enters this state only restarting the app resolves the problem. A few points I want to share:
1- It seems like something in the Connection Pool is getting corrupted once this happens. Since a restart fixes it.
2- The error seems related to some SSL issue based on the exception details (Provider SSL). However, our connection strings do not specify the Encrypt attribute.
3- There are a few issues reported with the same exception here, but in Linux/Mac environments. In this case this is an all Windows environment.