How to ensure character set is set for database connection? #133
Comments
@seb303 Excellent question, thanks for posting! The charset can indeed be set using the However, as you've rightfully pointed out, the connection may be recreated at a later time when using a lazy connection. In this case, the charset would only apply to the original connection and would not be applied to any subsequent connections that are created automatically. This does not occur when using the I agree that we should provide better support for this for lazy connections. Afaict there are multiple options to make this work:
What do you think?
Hello @clue I think the problem with
As for how the library should implement support for character sets: I think it would make sense if It would also work to have some kind of generic hook to execute a query on connection, or fire a connect event where a "SET NAMES" query could be executed. But this is more cumbersome and I can't think of any uses beyond setting the charset. I think having a method to change the charset like I also do not like the idea of trying to parse "SET NAMES" statements from SQL queries to remember / automatically repeat.
I agree and it looks like the best way forward would be to accept an optional $factory->createConnection('user:secret@localhost/database?charset=utf8mb4'); Once this is added to the At the moment, this project is hard-wired to default to
charset query parameter to automatically build a SET NAMES utf8mb4 query as per https://dev.mysql.com/doc/refman/5.6/en/charset-connection.html. In the future, we may want to map the given charset query parameter to a charset/collation ID as per https://dev.mysql.com/doc/internals/en/character-set.html#packet-Protocol::CharacterSet. On top of this, we should add some tests with some help of https://dev.mysql.com/doc/refman/5.6/en/charset-introducer.html, https://dev.mysql.com/doc/refman/8.0/en/charset-binary-set.html and https://dev.mysql.com/doc/refman/5.6/en/charset-literal.html.
PRs are very much welcome 👍 (As much as I'd love to work on this myself, there are currently no immediate plans to build this from my end: no demand at the moment and more important outstanding issues currently. If you need this for a commercial project and you want to help sponsor this feature, feel free to reach out and I'm happy to take a look.)
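For anyone picking this up, the idea of turning a charset query parameter into a SET NAMES statement could be sketched roughly as follows. This is only a minimal illustration: `buildSetNamesQuery()` is a hypothetical helper, not part of this library, and the identifier check is an assumption about which charset names should be accepted.

```php
<?php

/**
 * Build a "SET NAMES" statement from the optional charset query parameter
 * of a connection URI. Returns null when no charset is given.
 *
 * Hypothetical helper for illustration only; not part of the library.
 */
function buildSetNamesQuery(string $uri): ?string
{
    // parse_url() needs a scheme to split credentials, host and path reliably.
    if (strpos($uri, '://') === false) {
        $uri = 'mysql://' . $uri;
    }

    $parts = parse_url($uri);
    if ($parts === false || !isset($parts['query'])) {
        return null;
    }

    // Decode the query string into an array of parameters.
    parse_str($parts['query'], $args);
    if (!isset($args['charset']) || !is_string($args['charset'])) {
        return null;
    }

    // Assumption: only plain identifiers are accepted, so the value
    // cannot smuggle extra SQL into the statement.
    if (!preg_match('/^[A-Za-z0-9_]+\z/', $args['charset'])) {
        throw new InvalidArgumentException('Invalid charset name');
    }

    return 'SET NAMES ' . $args['charset'];
}

// Prints: SET NAMES utf8mb4
echo buildSetNamesQuery('user:secret@localhost/database?charset=utf8mb4'), PHP_EOL;
```

For lazy connections, the resulting statement would have to be sent each time a new underlying connection is established, not just once, which is why this would need to live inside the library and not in user code.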
Unfortunately my programming skills are not up to the job, otherwise I would have a go at this myself. (I'm very much a newcomer to OO and Async PHP.) But I did a quick experiment of editing the value of So it would be necessary to query INFORMATION_SCHEMA.COLLATIONS in order to map a given charset string to its ID. More straightforward therefore (and probably less overhead) to just send a SET NAMES query on connection. Furthermore, it seems you cannot really say for sure that this project is hard-coded to default to utf8 since it's not guaranteed that charset id 0x21 will always be utf8.
As I understand it, this library doesn't support setting the character set automatically, so it's necessary to send a "SET NAMES ..." statement to do this. But I think it won't work to do this just once because one can't be sure when the database connection will be re-made.
e.g.
So it seems necessary to send SET NAMES every time a query is executed...
e.g.
This seems a bit inefficient, and there's still no guarantee that the database connection won't get dropped and re-made between the SET NAMES query and the next query.
Am I missing something here? Is there a better way (other than built-in support for character set, which would be nice of course).