-
Notifications
You must be signed in to change notification settings - Fork 832
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sqlplugins/mysql: default interpolateParams in DSN #5425
sqlplugins/mysql: default interpolateParams in DSN #5425
Conversation
By default, the `go-sql-driver/mysql` driver wraps every query in a prepared statement, and this extra prepared statement means every single query is effectively two round trips. A request of ComPrepare to prepare the query, then ComStmtExecute with the statement id and the parameters. In theory, this would be be fine if Temporal was explicitly aware of this and prepared their own statements explicitly and reused the prepared statements over and over through the lifetime. Short of that, wrapping every query in a prepared statement just effectively doubles the queries and double the latency. I had originally considered exposing this as a configuration knob, since this can be set through `connectAttributes` manually, but I don't think there's value in having this be configurable and we can provide a better default. See: https://github.com/go-sql-driver/mysql#interpolateparams
f010930
to
1a8e5be
Compare
I guess there are a few failing test cases, so this might be tricky for me to debug. :/ But if we can make this work, or at minimum exposed more easily as a config option, that'd be great. |
I'll take a look at it and see if I can help you resolve those. I saw in our history that we've had other issues with |
I'd be interested in hearing about it. For context, I'm sorta a middleman here trying to get a better experience for mutual users. :) I'm quite familiar with mysql protocol and the mysql driver side of things, and I know how this behavior works in PlanetScale and vitess. We have some mutual customers that would benefit quite drastically from either, using prepared statements explicitly so they are saved and reued for the lifetime, or avoid them and stick to client side interpolation. It's just a lot of wasted CPU cycles on the server to handle the extra Prepare + Execute for every query. My assumption is that Temporal didn't explicitly choose this behavior, since it's the default of the mysql driver, and I think from the mysql driver, they are erring on the side of caution. We've recommended for most applications to set I'm not actually sure what query patterns Temporal is testing and if it's a side effect of an explicit test, or if it's really a bug we'll need to work around and figure out how to do a query differently. I'm willing to offer up my help how I can, but not being a Temporal user puts me in a slightly challenging position here. This would definitely be mutually beneficial to avoid the overhead of 2 round trip per query if we can make it happen. |
I saw we have a comment at https://github.com/temporalio/temporal/pull/3886/files#diff-068f89497865ae54e0047289cd7c911f3afc746c9a920bd57b4aae4c1bdabac6R32 about this behavior but I'm not sure why that's the case. I'll let you know if I figure anything out. I'm assuming you haven't had anyone running mysql8 try this change yet, as it'd have broken otherwise |
It looks like only our visibility code is failing which is interesting, hrm. |
We run all mysql8, but we don't have a lot of experience with this yet. I'm just did some basic testing and we're working to solicit feedback from customers. The challenge is getting them to adopt this config when it's very hard to set with the default docker-compose setups and whatnot, it's not exposed as an option. As for the The client side interpolation is looking for exactly that: https://github.com/go-sql-driver/mysql/blob/v1.7.1/connection.go#L198-L295 |
Yep. I'll need to dig into our mysql8 visibility code to figure out which query is failing and trace it back |
From my understanding, I know there are 2 databases in play here. I think from my understanding, the visibility database is much less volume than the main temporal database? If that's correct, it might be a safer first step to apply this setting as a default only for temporal and not visibility as an intermediate step since that'd be arguably the biggest win. |
Aha, it's go-sql-driver/mysql#819. Our visibility schema embeds the memo as a These queries are hard-coded and can be trivially prepared; I'll hop on that this afternoon; it shouldn't be a big change and will let us enable that option for you |
Aha, yeah, doing an explicit prepare then for that sounds like a winner. :) |
The simplest approach is actually to track the Memo as As for setting this as our default: I'm curious as to whether there are downsides to using |
I think there's a risk with any client side interpolation, but every ORM and whatnot does it. Given that this mysql driver is the most popular in Go that I'm aware of, I would suspect most cases have been figured out by now, given nobody at large scale would want to run with their default behavior. We encourage setting this on for our customers, etc. They have one use case:
And it sounds like they guard against it by just disallowing it for those encodings. I'd be curious if you had any other cases that you're aware of that would cause an issue that the driver doesn't outright reject. One benefit of MySQL in this case is their types are quite a bit simpler and escaping is a bit easier than in Postgres. |
We're on the same page there. I'm asking around our team just in case someone can think of something but I'm testing a patch that'll allow this to work at the same time |
I put up a new PR at #5428 that contains your change plus the one other change required to support it (and some cleanup while I was here). I'll keep iterating on it as tests finish, and will discuss with the team whether this is hotfix-worthy |
tyty it makes sense that encoding JSON as binary would be incorrect in this context since JSON is explicitly UTF-8. I really appreciate carrying this through. Good open source response. 🫶 |
Yeah. I'm seeing reports in other MySQL drivers and forums about this happening with the utf8mb4 character set specifically (which our mysql8 plugin uses). It looks like there's still a few more issues I have to fix but I should have it all done early next week.
Of course! I'm happy to deliver an easy speedup for customers using our MySQL backend 😄 |
Enabling interpolateParams is causing more problems than I expected; I'll keep poking away at this next week but we have a long weekend this weekend. If you have customers running mysql8 with this set make sure they don't enable it for mysql8-based visibility stores while I figure out what's wrong |
Do you have benchmarks that measure the performance improvement with this set to true, by the way? |
I don't, other than it's objectively doing less than half the work and half the requests against Vitess. I can't easily quantify what the tangible savings is, but that's a lot of round trip and extra CPU on both sides being spent. |
As I mentioned earlier, I think keeping this off entirely from visibility store is more than fine. From my limited understanding of Temporal, it seems a vast majority of the workload is in the main temporal database, not the visibility one. I might be misinterpreting that though. |
I'm a customer of @mattrobenolt's and I wanted to add some voice here. Prepared statements are a scaling concern for my self-hosted Temporal v1.20.4 instance. Here are some quick measures:
I have been needing to scale my number of history services significantly recently, I believe in part due to the query multiplier that prepared statements adds. We have been seeing a lot of logs that indicate DB timeouts, but the DB is healthy, so we think the connection pool may be too small and that there's contention within the history service. The history service isn't under significant CPU or memory pressure. We have 32 instances each with 60 connections right now, which feels significant. @tdeebswihart it looks like #5428 should be close to landing? |
## What changed? We now specify `interpolateParams` by default when using MySQL for our main. While we don't currently support specifying `interpolateParams` for MySQL visibility databases, I changed the type returned when serializing `VisibilitySearchAttributes` as a first step towards doing so (MySQL 8 won't allow binary data within JSON columns, which makes sense). While we could also have explicitly prepared and reused our statements that requires far more work and it may not be the better approach anyways. ## Why? @mattrobenolt did a good job of explaining why in #5425 but I'll summarize it here anyways. When `interpolateParams` is false (the default) the driver will prepare parameterized statements before executing them, meaning we need two round-trips to the database for each query. By setting `interpolateParams` to true the DB driver will handle interpolation and send the query just once to the database, halving the number of round trips necessary. This should improve the performance of all clusters using MySQL. ## How did you test it? Existing tests. ## Potential risks No ## Is hotfix candidate? This is up for discussion
It's as close as "I need to hit the merge button" and I just did, thank you for the reminder! I'd become caught up in other work, pardon the lateness. |
@tdeebswihart that's the best kind of close! I appreciate the work that you and @mattrobenolt have done here, I'm hoping this is a massive win for self-hosted MySQL. |
As far as releasing this goes: we're happy to backport this to the 1.23 release (and make 1.23.1) but don't have the bandwidth to test this functionality ourselves right now. Would y'all (@mattrobenolt) be comfortable with testing it? |
I think @emmercm might be in a better position to test this with us. My knowledge of Temporal is only surface level. Reach out and let's see how it goes. :) |
No need to necessarily backport, I'm on an older Temporal version because I haven't invested the time to do the v1.21 and v1.22 schema upgrades yet - not for any specific reason. I need to spend the time upgrading before I'll be able to effectively test this. |
Alright, then this will be included as part of our next release |
By default, the
go-sql-driver/mysql
driver wraps every query in a prepared statement, and this extra prepared statement means every single query is effectively two round trips. A request of ComPrepare to prepare the query, then ComStmtExecute with the statement id and the parameters.In theory, this would be be fine if Temporal was explicitly aware of this and prepared their own statements explicitly and reused the prepared statements over and over through the lifetime.
Short of that, wrapping every query in a prepared statement just effectively doubles the queries and double the latency.
I had originally considered exposing this as a configuration knob, since this can be set through
connectAttributes
manually, but I don't think there's value in having this be configurable and we can provide a better default.See: https://github.com/go-sql-driver/mysql#interpolateparams
How did you test it?
I have tested locally, and we are starting to tell our customers (PlanetScale) to set this manually in their
connectAttributes
.Potential risks
There's a potential for some reason, the behavior to change and not work with client side interpolation, but it's unknown currently and I think would be considered a bug in the
go-sql-driver/mysql
library if anything is uncovered.Is hotfix candidate?
I'd say Yes, but I personally have little experience with the Temporal community. :)