MDEV-32336 deb default config - use collation-server = utf8mb4_uca1400_ai_ci #2775
The migration to utf-8 was done in Debian by having Debian defaults that override the upstream defaults (latin1) as they seem outdated and unsuitable for most users. According to https://mariadb.com/kb/en/server-system-variables/#character_set_server and https://mariadb.com/kb/en/server-system-variables/#collation_server this still seems to be the case.
Would it not be the best solution to simply remove these customizations in debian/ and default to upstream server defaults, and ensure upstream has modern and sensible values?
If changing the upstream default is not an option, the change in debian/ should be properly documented. The Jira https://jira.mariadb.org/browse/MDEV-32336 does not explain why the new default should be specifically utf8mb4_uca1400_ai_ci, nor does the commit message explain why this change should be done to this value:
MDEV-32336 deb default config - use collation-server = utf8mb4_uca1400_ai_ci
utf8mb4_general_ci has been outdated for a while. Let's use our modern standard.
See tips 2 & 3 on writing a good commit message. The commit message should say something about why utf8mb4_uca1400_ai_ci is the best value and why this change is done now in 11.3 and not, say, in 11.4 as https://jira.mariadb.org/browse/MDEV-25829 says.
Good point on the commit message. Being newer and actually a standard are the general criteria, but I'll find some better words. Looking back at why utf8mb4_general_ci was used: it was copied from a utf8 comment from 2012 in 438ed04, and in 7c2079f it was made non-commented and updated to utf8mb4. As for MDEV-25829, I'm not sure 11.4 is a genuine target version, but I hadn't looked for it either. I'd certainly welcome better upstream defaults too. 11.3 was chosen because only non-packaged releases of it have been made so far, so I assumed the change is still compatible and does not cause packaging regressions.
Force-pushed from f1a83e2 to 410f86b.
What is the real difference between |
Mainly from https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci - utf8mb4_general_ci isn't a standard; it was just a (slightly) quicker (and dirtier) implementation.
Force-pushed from 410f86b to 2194317.
You probably want to update the git commit message with the text you posted in the PR comments here, and use as the title something like "MDEV-32336: Use utf8mb4_uca1400_ai_ci as default collation in Debian".
utf8mb4_general_ci has been outdated for a while and contains only loosely standardized collation rules. UCA-14.0.0 is a better-defined collation with multiple benefits that new users may not immediately consider, or may assume to be the default. Setting the default collation for utf8mb4 to uca1400_ai_ci means newly created tables will have a modern, standard collation.
Force-pushed from 2194317 to ae122c7.
Acceptable?
Is that a correct new notation that I don't know?
Description
utf8mb4_general_ci has been outdated for a while. Let's use our modern standard.
How can this PR be tested?
my_print_defaults --mysqld on install in debian
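A minimal sketch of how that check could be scripted. The helper name `check_collation` is hypothetical and not part of the PR; it assumes `my_print_defaults --mysqld` prints one `--option=value` line per configured setting, and that the last occurrence of an option is the one that takes effect.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): reads option lines on stdin,
# as printed by `my_print_defaults --mysqld`, and succeeds only if the
# last collation-server entry equals the expected collation name.
check_collation() {
    expected="$1"
    # Accept both collation-server and collation_server spellings;
    # keep only the last occurrence.
    actual=$(grep -E -- '^--collation[-_]server=' | tail -n 1)
    # Strip everything up to and including the first '='.
    actual=${actual#*=}
    [ "$actual" = "$expected" ]
}

# Only run the real check when the tool is installed.
if command -v my_print_defaults >/dev/null 2>&1; then
    if my_print_defaults --mysqld | check_collation utf8mb4_uca1400_ai_ci; then
        echo "collation-server is utf8mb4_uca1400_ai_ci"
    else
        echo "collation-server is not utf8mb4_uca1400_ai_ci" >&2
    fi
fi
```

On a fresh Debian install of a package built from this branch, the expectation would be that the first message is printed; an upgraded system with a locally modified config may legitimately differ.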
Basing the PR against the correct MariaDB version
PR quality check