-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
change collation in query to adapt to charset #337
Conversation
where can do you configure the encoding? would it be possible to have that place also configure a binary encoding to go with that, which by default is just $encoding.'_bin' but can be customized, then read that value in the client for that query? or alternatively, have the client check if the driver is on utf8 and if not not use any collate in the query? from what i understand, collate is only used as a performance improvement. or is it also about being case sensitive or a vs ä? |
I've been looking for the commit in which this change is, and I've found it: 5273f18 Looks like it is about case-sensitivity. So I guess it it is not possible to simply remove it in case of UTF-8, right? Seems there was a problem at some point. The current configuration is from the DoctrineBundle, so not really related to Jackalope. If we want to introduce a new configuration we would have to find a place for it first. I have just checked the available collations on my system, and there is a |
i guess that is about as save as the current code using hardcoded utf8_bin. maybe we can add a `setCaseSensitiveEncoding method on the client class, as a last resort for people that have a special case with some requirements? something like:
(and only call this in the mysql case.) |
@dbu But that's only useful if it is at least protected, right? If we make it private nobody can change it, so there would be no reason to introduce that. Or am I overseeing something now? 🙈 |
if ($this->getConnection()->getDriver() instanceof \Doctrine\DBAL\Driver\PDOMySql\Driver) { | ||
$query = 'SELECT id FROM phpcr_nodes WHERE path COLLATE utf8_bin = ? AND workspace_name = ?'; | ||
$params = $this->conn->getParams(); | ||
$charset = isset($params['charset']) ? $params['charset'] : 'utf8'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a problem with that solution... The charset
option and the default charset with which a table is created can be different... How should we handle that? Couldn't find another way to get that from the SchemaManager
in Doctrine for the specific table 😕
Just to be clear, I am talking about this situation in the config:
doctrine:
dbal:
charset: utf8mb4
default_table_options:
charset: utf8
collate: utf8_unicode_ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh well. but in that case lets expect people to set the case sensitive encoding manually?
i think we really must avoid doing a database call in every request to know this, so it must come from the connection / driver if thats possible at all. worst case we just add this new method and have a compiler pass in the phcpr-bundle that reads the dbal configuration and does something sensible?
@lsmith77 can you help us here on where to get the right information?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm would need to dig in here too .. but I agree we should do as little work as possible at run time ..
so what we can do at install time or at config compile time the better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear: The current state doesn't involve any database call at all, or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, currently not. i was just saying that we must not end with something that does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And you are also saying that the current solution is not satisfying, right? Would it be enough to add the setCaseSensitiveEncoding
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i would say so. have a setter, and a get method that does the code you added above and below, so that this code can just say if mysql and not binary, do collate with $this->getCaseSensitiveEncoding()
@dbu <https://github.com/dbu> But that's only useful if it is at least
protected, right? If we make it private nobody can change it, so there
would be no reason to introduce that. Or am I overseeing something now? 🙈
i missed to write the setter in my illustration...
i would have a
/**
* Set table encoding to use for case sensitive lookups in MySQL.
*/
public function setCaseSensitiveEncoding($encoding)
|
@dbu Added these methods, is it ok now? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool! added a few comments, otherwise looks good.
can you please also add an entry to the changelog to mention this fix and the method in case somebody runs into problems?
$params = $this->conn->getParams(); | ||
$charset = isset($params['charset']) ? $params['charset'] : 'utf8'; | ||
|
||
return $charset === 'binary' ? $charset : $charset . '_bin'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please do $this->caseSensitiveEncoding = $charset === 'binary' ? $charset : $charset.'_bin';
instead of return, so its only done once
@@ -439,6 +444,23 @@ public function setCheckLoginOnServer($bool) | |||
$this->checkLoginOnServer = $bool; | |||
} | |||
|
|||
public function setCaseSensitiveEncoding($encoding) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a docblock explaining that this is to control what collate should be used when querying for nodes. and mention that its only relevant for mysql and usually the autodetection of appending _bin to the charset is good enough.
@dbu Fixed your comments. |
@@ -1,6 +1,13 @@ | |||
Changelog | |||
========= | |||
|
|||
1.2.8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we merge this change into master, it will become 1.3, as master is for 1.3 and there are other changes in it. do you urgently need a fixed release? vondrasekpavel is busy doing lots of cleanups and fixes, we would release sometimes in february.
for an urgent fix, i would have to update the 1.2 branch to have everything that is in 1.2 (it currently looks massively outdated: 1.2...1.2.8). if you need a fix urgently, i can do that and then merge your branch into 1.2 and tag 1.2.9 instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I thought master is the next hotfix release... Actually @alexander-schranz has to tell how fast he needs that fix, for me the next minor release is also good 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have talked to @alexander-schranz, it's not that important. Changed it to 1.3
great, then it will go into 1.3. i guess we should release that soonish, quite a few fixes and improvements going on in the last weeks! thanks daniel! |
Cool, thank you! Now only doctrine/DoctrinePHPCRBundle#270 is missing in order for it to work in a symfony environment :-) |
thanks for the reminder, merged that as well.
|
This change is related to doctrine/DoctrinePHPCRBundle#270. If a connection to the RepositorySchema with a charset of
utf8mb4
is passed it is not allowed to useutf8_bin
in a select statement.I simply changed it in here to
utf8mb4_bin
, which will work then. However, I know that this can't be the final solution. The question is how that one should look like. We could use theconnection
to get the charset and use an if in here, or use the charset and simply postfix it with_bin
, maybe check before if this collation really exists (is that somehow possible? What should be done in case it isn't?).