-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved general connection pool #1814
Improved general connection pool #1814
Conversation
…ansactionState#1675
Imagine a situation where you have an application which uses a large number of connections for a short time meaning that the pool has a relatively high population and then the app calms down to a steady state there it uses connections more slowly. In this case with a stack a small number of connections will be pushed and popped off the stack regularly and then the other items will get older and eventually be removed by the pool cleanup because they're inactive. In this case with a queue all connections are kept alive. You will retain all of the items in the pool not aging any out (depending on the number of items in the pool) so pools will be more highly populated. |
Codecov ReportBase: 71.38% // Head: 71.34% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #1814 +/- ##
==========================================
- Coverage 71.38% 71.34% -0.04%
==========================================
Files 290 290
Lines 61236 61234 -2
==========================================
- Hits 43712 43690 -22
- Misses 17524 17544 +20
Flags with carried forward coverage won't be shown. Click here to find out more.
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
I've been looking at CleanupCallback() trying think about your scenario and I don't think that's how it works. The pool pruning logic doesn't care when a connection was last used. If it is idle in the pool when the method runs, it is subject to cleanup regardless of when it was last used. Yes, using a stack (as-is) will favor keeping the last used connection as the most used and least likely to be cleaned up, but changing to a queue won't keep idle connections from being pruned.. It will just more evenly distribute connection use across all of the items in the pool. |
It has been some time since I tried to understand the connection pool logic and it's very possible that I misunderstood the structure. I thought this was how it worked. Since you've looked through the code with that scenario in mind and say it's fine then I'm ok with the change. I just wanted to flag up what I thought might be the reason for the stack. |
@DavoudEshtehari am on vacation at the moment, and it's once I'm back it's very unlikely I'll be able to look at this before .NET Conf (at least November 10th). I'll happy to take a look afterwards though, if that's not too far away. |
@roji thank you for the prompt reply. I wish you the best with .NET Conf, and I hope we could gain the benefit of your thoughts as well. |
/AzurePipelines run |
Azure Pipelines successfully started running 1 pipeline(s). |
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/ProviderBase/DbConnectionPool.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/ProviderBase/DbConnectionPool.cs
Outdated
Show resolved
Hide resolved
…erBase/DbConnectionPool.cs
@DavoudEshtehari have you measured performance by running parallel async testing with these changes? I tried but I saw the current implementation being slightly more performant. You may want to benchmark to have consistent results (incl non-pooled connections). Here's my repro (just a basic app): Test SqlConnection.OpenAsync performance. Feel free to modify and add output per connection attempt, and the difference would be visible when connecting to Azure SQL. This is an important use case I would recommend keeping close and not regressing. |
Looking at the changes, I don't think this should make any significant difference, negative or positive, so you may disregard my last comment. Slight movement in results is possible due to network jitter so that should be fine. Stacks should be preferred for pool when you're looking at high usability, such that when accessing a pool element, it need not move other items around unlike queue, so time taken should be less to fetch/add an element in a heavy pool, I think. Applications creating and pooling large number of connections would most likely notice that, so I would be considerate of that. I would like to understand why you think changing it to queue/bag would benefit a multi-threading app; in case I'm missing a logical reasoning? |
To compare the Stack and Queue, I don't have internal implementation knowledge of them. So, I put it for those who have such ability. Even though they don't have significant perf differences, I believe the benefit of using it on a multi-threaded app can be beneficial.
When I was running the repro, I noticed a reduction in the number of failures, plus a difference in the stack trace. You can find it by referring to the originally reported stacktrace and the result of this change. |
This is something we looked into in Npgsql a while ago as well. A long time ago, the pool was implemented as a stack to promote good pruning (for the same reasoning @Wraith2 explains in this comment). However, we ended up moving away from that, and not relying on the data structure for pruning. In other words, the current pruning algorithm samples the idle/busy state several times, and if it determines that pruning should occur, closes any idle connection (we don't care whether it's one that happened to get handed out recently or not). The current Npgsql implementation uses System.IO.Channels, which is a conceptual queue that always manages efficient handing off of elements (connections) across threads, as well as efficiently waiting when the queue is empty (i.e. Max Pool Size has been reached). In any case, I'd highly recommend measuring perf via benchmarking before making any actual changes. Would also be happy to continue talking about this topic (pooling implementation is very interesting and not at all trivial). |
While investigating issue #1675, I noticed that the underlying data structure used at
DBConnectionPool
for storing internal connection objects is aStack
, which won't be the best choice in a high-demand multi-thread application.