Expanded language support for regional variations #356
Comments
IMO you should change the enum from LanguageCode to LocaleCode. This would also serve as a mechanism to showcase different products in your store based on country. Happy to discuss more details on Slack. |
Prior Art
W3C Language tags in HTML and XML
https://www.w3.org/International/articles/language-tags/
Sylius
As far as I can tell, Sylius uses the Symfony Intl package, and on https://demo.sylius.com/admin they list 593 locales: Sylius locales
Saleor
Saleor is built on the Django framework, which comes with support for locales in the format
HTTP Accept-Language header
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language
Intl.getCanonicalLocales()
console.log(Intl.getCanonicalLocales('EN-US'));
// expected output: Array ["en-US"]
console.log(Intl.getCanonicalLocales(['EN-US', 'Fr']));
// expected output: Array ["en-US", "fr"]
try {
  Intl.getCanonicalLocales('EN_US');
} catch (err) {
  console.log(err);
  // expected output: RangeError: invalid language tag: EN_US
} |
That seems to cover most cases, but e.g. with Chinese we have 2 dimensions: the script and the location. So we can have:
so that all the Chinese variations end up as:
On top of that, a Chinese speaker in Hong Kong may want to translate into Cantonese ( I'm sure there are other important cases where the However, for practical purposes, the |
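To make the two dimensions concrete, here is a rough sketch of how a tag could be split into its language, script and region parts. `parseLanguageTag` and `ParsedTag` are hypothetical names, not anything in the codebase, and the pattern deliberately handles only the `language[-Script][-REGION]` shape in canonical casing. It is not a full BCP 47 parser.

```typescript
// Hypothetical helper: splits a tag of the form language[-Script][-REGION].
// Assumes canonical casing (e.g. "zh-Hant-HK"); variants and extensions are not handled.
interface ParsedTag {
  language: string;
  script?: string;
  region?: string;
}

function parseLanguageTag(tag: string): ParsedTag | undefined {
  // language: 2-3 lowercase letters; script: 4 letters in title case;
  // region: 2 uppercase letters or 3 digits.
  const match = /^([a-z]{2,3})(?:-([A-Z][a-z]{3}))?(?:-([A-Z]{2}|\d{3}))?$/.exec(tag);
  if (!match) {
    return undefined;
  }
  const [, language, script, region] = match;
  return { language, script, region };
}

// Script and region are independent: a tag may carry either, both, or neither.
console.log(parseLanguageTag('zh-Hant-HK'));
console.log(parseLanguageTag('zh-TW'));
console.log(parseLanguageTag('zh_TW')); // underscore form is rejected by this pattern
```

With this shape, `zh-TW` carries only a region and `zh-Hant` only a script, which is why the two cannot be treated as interchangeable without an explicit mapping.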
Solution
So deciding to use the Covering every possibility would involve (~175 ISO 639-1 codes) * (~250 ISO 3166 region codes) = over 43,000 combinations.
Such a pragmatic list could be obtained from this Unicode CLDR summary chart, taking all rows from the table labeled
This can be expressed as JavaScript that can be run directly on the linked page:
// Take the rows of the second table matched by '.body table'.
Array.from(document.querySelectorAll('.body table')[1].querySelectorAll('tr'))
  // Keep rows whose 3rd cell starts with "Languages ".
  .filter(row => {
    const pageCell = row.querySelector('td:nth-child(3)');
    return pageCell && pageCell.innerText.startsWith('Languages ');
  })
  // Keep rows whose 7th cell is a 2-letter code, optionally followed by "_" and letters.
  .filter(row => {
    const codeCell = row.querySelector('td:nth-child(7)');
    return codeCell && /^[a-z]{2}(_[A-Za-z]+)?$/.test(codeCell.innerText);
  })
  // Emit "code: name" using the 7th and 6th cells.
  .map(row => {
    const nameCell = row.querySelector('td:nth-child(6)');
    const codeCell = row.querySelector('td:nth-child(7)');
    return `${codeCell.innerText}: ${nameCell.innerText}`;
  })
  .join('\n')
which yields the following list of 157 languages:
|
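Note that the codes matched by the snippet above use an underscore separator, whereas IETF tags use hyphens (and `Intl.getCanonicalLocales()` rejects the underscore form). A minimal sketch of turning the `code: name` output into hyphenated tags could look like this. `parseCldrList` and `LanguageEntry` are hypothetical names, and the sample input is illustrative rather than copied from the chart.

```typescript
// Hypothetical helper: parses "code: name" lines and converts underscore codes to hyphenated tags.
interface LanguageEntry {
  tag: string;
  name: string;
}

function parseCldrList(text: string): LanguageEntry[] {
  return text
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0)
    .map(line => {
      const separator = line.indexOf(': ');
      if (separator === -1) {
        throw new Error(`Unexpected line: ${line}`);
      }
      const code = line.slice(0, separator);
      const name = line.slice(separator + 2);
      // Replace every underscore so multi-part codes are fully converted.
      return { tag: code.replace(/_/g, '-'), name };
    });
}

// Illustrative input only; the real list comes from running the snippet on the chart page.
const sample = 'en: English\nzh_Hant: Traditional Chinese';
console.log(parseCldrList(sample));
```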
Great solution! Just to complement the prior art: |
IMO, this is a good list of 125 languages to support. For Chinese, zh-CN and zh-TW are the two variants, for Simplified and Traditional Chinese respectively. |
A recent pull request #354 adds support for Traditional Chinese (as used in Hong Kong and Taiwan). We already have Simplified Chinese translations. This exposes a shortcoming in our language handling: we currently only list the ISO 639-1 codes for languages, without distinguishing between regional variations.
So, e.g. we have the LanguageCode enum, which is widely used throughout the core and which only has "zh" for all variations of Chinese.
We could add support for common IETF language tags (as used in this PR), which would then allow us to support the zh-CN and zh-TW UI translations, as well as in places such as product translations etc.
The question then is "which full IETF tags to support?"
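As a rough illustration of what adding IETF tags to the enum might look like, here is a trimmed-down sketch. The member names (`zh_CN`, `zh_TW`) and the `languageNames` map are assumptions made for this example. The real enum has many more members and may be shaped differently.

```typescript
// Hypothetical, trimmed-down version of the enum with two regional variants added.
enum LanguageCode {
  en = 'en',
  zh = 'zh',
  zh_CN = 'zh-CN',
  zh_TW = 'zh-TW',
}

// Hypothetical display-name map. Typing it as Record<LanguageCode, string> makes the
// compiler flag any enum member that has no entry here.
const languageNames: Record<LanguageCode, string> = {
  [LanguageCode.en]: 'English',
  [LanguageCode.zh]: 'Chinese',
  [LanguageCode.zh_CN]: 'Simplified Chinese',
  [LanguageCode.zh_TW]: 'Traditional Chinese',
};

console.log(languageNames[LanguageCode.zh_TW]);
```

The enum member names use underscores because hyphens are not valid in identifiers, while the string values keep the hyphenated IETF form.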
Add all possible IETF tags
Cases like Chinese are interesting because the actual writing system is distinct between Simplified and Traditional. Whereas an American can totally understand anything written in British English and vice-versa, does the same apply to Chinese speakers in Beijing and Taipei?
How about Madrid (es-ES) and Mexico City (es-MX)? I personally cannot answer these questions, since I am only very familiar with English (and to a lesser extent German).
Adding all variations seems like a bad idea, e.g. according to https://datahub.io/core/language-codes#resource-ietf-language-tags there are over 100 variations of English and over 40 variations of French. In practice, a single "English" version of anything would probably suffice (although I say this as an English speaker from England, so perhaps my perspective is too narrow). In any case, I'm pretty sure that listing 100 variations of English in UI menus is not a desirable solution.
Add tags on an ad-hoc basis
Another approach would be to default to the ISO 639-1 codes as we currently have, and only add regional variations as-and-when the need arises (e.g. a pull request like the one that triggered this issue comes in).
In this case, we'd then need to update the LanguageCode enum as well as the language-translation-strings.ts file to allow localization of these new variants.
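If regional variants are only added ad hoc, a requested tag will often have no exact match, so some fallback to the base language would probably be needed. Below is a minimal sketch of one way to do that, by dropping subtags from the right until a supported tag is found (similar in spirit to the lookup scheme described in RFC 4647). `resolveLanguage` is a hypothetical helper, not an existing function, and it assumes tags are already in canonical casing.

```typescript
// Hypothetical helper: returns the most specific supported tag for a request,
// e.g. "zh-Hant-HK" -> "zh-Hant" -> "zh", or undefined if nothing matches.
function resolveLanguage(requested: string, supported: readonly string[]): string | undefined {
  const subtags = requested.split('-');
  while (subtags.length > 0) {
    const candidate = subtags.join('-');
    if (supported.indexOf(candidate) !== -1) {
      return candidate;
    }
    // Drop the last subtag and try the more general tag.
    subtags.pop();
  }
  return undefined;
}

const supportedTags = ['en', 'zh', 'zh-TW'];
console.log(resolveLanguage('zh-TW', supportedTags)); // exact match
console.log(resolveLanguage('en-GB', supportedTags)); // falls back to the base language
```

The trade-off is that a request such as `zh-HK` would silently resolve to plain `zh`, which may or may not be the right variant for that reader.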