MySQL performance and connectivity improvements #444
@henrikno @IceCreamYou @Trundle @eirslett A cursory glance looks ok (although in general, this stuff should have tests). If I don't hear back otherwise, I'll merge by Friday.
@postwait you might also be interested in this.
How about using HikariCP?
If you really want it to scream on MySQL, I'd suggest checking out finagle-mysql. It's pretty basic, but it's implemented from the ground up to be asynchronous, so it plays more nicely with the existing zipkin abstractions.
Yes, HikariCP does look good. I'll try to run some comparison tests with it this afternoon.
Thanks for the feedback. Doesn't look quite as easy as HikariCP to switch over to, but I'll check it out if I have time.
Yeah, we wouldn't be able to just switch, either. finagle-mysql implements the MySQL protocol, so it only supports MySQL, unlike anorm, which also supports things like sqlite. It would have to be a totally separate MySQL-only implementation.
HikariCP was trivial to integrate [1], but its performance was essentially identical to the original selection of Apache Commons DBCP. Performance of DBCP was actually slightly better in our benchmark testing, although that's probably just natural variance. I think it's a safe conclusion that the pool overhead is negligible, dwarfed by the I/O of the database transaction.
[Benchmark result listings for Commons DBCP and HikariCP were not preserved.]
[1] https://gist.github.com/noslowerdna/9b975aed5c502a10efad
Interesting... So they're performing approximately the same? (@brettwooldridge is this a normal case for HikariCP? I haven't used it myself but I've heard its performance is quite a dramatic improvement over other connection pools.)
fyi we've two options on this:
Historically we've struggled with delayed change: it sometimes forces people into forks, or delays their move out of forks. Anecdotally, delay seems at odds with engagement, which is something we are trying to turn around. For this reason alone, I suggest we merge. @eirslett wdyt?
@eirslett @noslowerdna @adriancole Thanks for the Cc. 7pm here in Tokyo. I'm on the way home, and formulating a more detailed comment. I'll update in about 4 hours (after sending daughter to dreamland).

Thanks for taking a look at HikariCP. While HikariCP is probably best known for being fast, we have actually spent more effort on achieving that speed within the constraints of providing the highest reliability possible. We are confident that HikariCP is not slightly more reliable, but substantially more reliable than currently available pools. Many (most?) available pools (including DBCP) default to a mode of operation where performance is prioritized over reliability. In contrast, HikariCP has no "unsafe" operational modes -- no way to disable "correct" behavior.

The benchmark cited on our page is actually extremely generous to other pools. It is run against a JDBC stub driver in which every operation is an empty method. When a real driver is put into the loop instead, the difference in results beggars belief -- but believe it. We should probably beat this drum more loudly, but...

So, what is unfair about the typical comparison? I'm going to talk about HikariCP, Apache DBCP2, and Tomcat DBCP here, talking some about speed and then bringing in the reliability pieces. I ran HikariCP-benchmark with the three pools against a real database (MySQL) instead of a stub. First, all three pools in default configuration (+ autocommit=false):
DBCP2: What is not visible here is that DBCP2 is generating ~3MB/sec of traffic to the DB, because [...]

HikariCP: HikariCP is generating zero traffic to the DB. HikariCP also defaults to "rollback on return" (it can't be turned off because that is the correct behavior for a pool), but it additionally tracks transaction state and does not roll back if the SQL has already been committed (or no SQL was run). HikariCP also defaults to "test on borrow" (it can't be turned off...), but employs an optimization that says, "If a connection had activity within the past 1000ms, bypass connection validation."

Tomcat: Tomcat DBCP is also generating zero traffic to the DB, but for a different reason. It simply is not validating connections at all, nor is it rolling back on return.

Now, let's try to level the playing field a little. For Tomcat and DBCP, we need to enable connection validation.
DBCP2: DBCP2 took a hit here, because it does not have a validation optimization like HikariCP. It is still generating ~3MB/sec of traffic to the DB.

Tomcat: Tomcat DBCP does support a similar optimization to HikariCP; the config goes something like this:
But we forgot "rollback on return" for Tomcat:
And there goes the performance. Tomcat is now generating ~3MB/sec of traffic to the DB. It does not track transaction state and therefore must unconditionally roll back. I thought maybe enabling "ConnectionState tracking" might help, but it does not.

There is a lot more that HikariCP is doing that is not covered here: guarding against network partitions, checking SQLExceptions for vendor disconnect codes, resetting auto-commit, transaction isolation, catalog, and network timeout, tracking open Statements (and closing them), etc. All while keeping the performance levels you see above.
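The two traffic-saving behaviors described in the comment above (skipping validation after recent activity, and rolling back on return only when uncommitted work may exist) can be illustrated with a minimal sketch. This is not HikariCP's code: the class `PooledConnectionSketch`, the `Backend` interface, and all method names are hypothetical, and the real JDBC connection is replaced by a stand-in so the example runs without a database. Only the 1000ms window comes from the comment.

```java
import java.util.function.LongSupplier;

/** Hypothetical pooled-connection wrapper; names are illustrative, not HikariCP's. */
final class PooledConnectionSketch {

    /** Stand-in for a real JDBC connection, so the sketch needs no database. */
    interface Backend {
        boolean isValid();  // would cost a round trip with a real driver
        void rollback();    // would cost a round trip with a real driver
    }

    // Window quoted in the comment above: recent activity bypasses validation.
    private static final long VALIDATION_BYPASS_MS = 1000;

    private final Backend backend;
    private final LongSupplier clock;  // injected so time can be controlled
    private long lastActivityMs;
    private boolean dirty;             // SQL has run since the last commit/rollback

    PooledConnectionSketch(Backend backend, LongSupplier clock) {
        this.backend = backend;
        this.clock = clock;
        this.lastActivityMs = clock.getAsLong();
    }

    /** Record that a statement ran: the transaction may now hold uncommitted work. */
    void markStatementExecuted() {
        dirty = true;
        lastActivityMs = clock.getAsLong();
    }

    /** Record a commit: nothing is left to roll back. */
    void markCommitted() {
        dirty = false;
        lastActivityMs = clock.getAsLong();
    }

    /** "Test on borrow", skipped when the connection was active recently. */
    boolean validateOnBorrow() {
        if (clock.getAsLong() - lastActivityMs < VALIDATION_BYPASS_MS) {
            return true;  // assumed alive; no traffic to the database
        }
        boolean ok = backend.isValid();
        if (ok) {
            lastActivityMs = clock.getAsLong();
        }
        return ok;
    }

    /** "Rollback on return", issued only when uncommitted work may exist. */
    void resetOnReturn() {
        if (dirty) {
            backend.rollback();
            dirty = false;
            lastActivityMs = clock.getAsLong();
        }
    }
}
```

With a `Backend` that counts calls, a borrow/commit/return cycle on a recently used connection reaches the backend zero times, whereas a pool that validates and rolls back unconditionally would make two round trips per cycle.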
@brettwooldridge awesome. good luck with dadops!
fyi I can't merge this for technical reasons right now. Hopefully, they'll resolve by Monday.
technical issues resolved.. merging
 *
 */

import org.specs._
dependency not configured.. fixing before merge
closed via 2f5f5f5
Thanks for all the help, folks. Those particularly interested in performance should scroll up to read the update from @brettwooldridge on db pools (github doesn't notify on edit). This topic might be best pulled into a separate issue/pull request, which we could quickly address before releasing 1.2.
fwiw I think @brettwooldridge makes a strong case, but it's up to y'all to decide whether to raise a PR for the switcheroo or not.
We have identified several improvements for Zipkin deployments that use a MySQL database for storage,