Handle browsers with no navigator.language better (#521) (#674)
The code for setting `config["lang"]`, in `isso/app/config.js`,
assumes that browsers will always provide a value for
`navigator.language` and/or `navigator.userLanguage`.  Per bug #521,
this is not a safe assumption.

While I was attempting to fix this, I noticed that regional variants
of a language (`zh-TW`, `pt-BR`) were being handled in an ad-hoc,
unreliable manner.  I also noticed a new user-agent language property
[`navigator.languages`][] which more closely matches the semantics of
[`Accept-Language`][]—it would be good to support that.

This patch addresses all the above problems, as follows:

1. Add a new configuration property `data-isso-default-lang` that
   specifies the language to use (instead of English) when the browser
   *doesn’t* have a preference.

2. Document that we expect the value of `data-isso-lang` and
   `data-isso-default-lang` to be a [BCP 47][] language tag, because
   this is what `navigator.language` etc contain.  (The practical
   upshot is that tags like `zh-TW` are officially allowed now.)

3. In `config.js`, compile an array of candidate language tags, in
   descending order of preference, from all available sources:
   `data-isso-lang`, `navigator.languages`, `navigator.language`,
   `navigator.userLanguage`, `data-isso-default-lang`, and finally
   `"en"` as a backstop.  Handle any or all of the above being
   null/undefined/empty.  This array goes into `config["langs"]`.
   `config["lang"]` is removed.

4. In `i18n.js`, select the first entry in `config["langs"]` for which
   we have both pluralforms and a translation, and make that the value
   of `i18n.lang`.  An English backstop is supplied here too for extra
   defensiveness.  Also, if we don’t have a translation for, say,
   `zh-HK`, try `zh` before moving on to the next entry in the array.

5. New function `utils.normalize_bcp47` ensures that we process
   language tags, wherever they came from, case-insensitively and
   treating `_` as equivalent to `-`.

6. Move `utils.ago` to `i18n.ago` to avoid a circular dependency
   between utils and i18n.

[`navigator.languages`]: https://developer.mozilla.org/en-US/docs/Web/API/NavigatorLanguage/languages
[`Accept-Language`]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language
[BCP 47]: https://tools.ietf.org/html/bcp47
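
Steps 3 and 4 above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the patch itself: `normalizeTag`, `candidateLangs`, and `pickLang` are hypothetical names, `normalizeTag` is a simplified stand-in for `utils.normalize_bcp47` that only handles `language` and `language-REGION` tags, `nav` stands in for the browser's `navigator` object, and the catalogue is an invented subset.

```javascript
"use strict";

// Simplified stand-in for utils.normalize_bcp47 (assumption: only
// "language" and "language-REGION" tags need to be handled here).
function normalizeTag(tag) {
    var parts = String(tag).toLowerCase().split(/[_-]/);
    if (parts.length > 1 && parts[1].length === 2) {
        parts[1] = parts[1].toUpperCase();
    }
    return parts.join("-");
}

// Build the candidate list in descending order of preference.
// `nav` is a stand-in for `navigator`; any of its fields may be missing.
function candidateLangs(lang, defaultLang, nav) {
    var out = [];
    if (lang) {
        out.push(normalizeTag(lang));
    }
    var fromBrowser = (nav.languages || []).filter(Boolean);
    if (fromBrowser.length === 0) {
        var single = nav.language || nav.userLanguage;
        if (single) {
            fromBrowser = [single];
        }
    }
    fromBrowser.forEach(function (tag) {
        out.push(normalizeTag(tag));
    });
    if (defaultLang) {
        out.push(normalizeTag(defaultLang));
    }
    out.push("en"); // backstop
    return out;
}

// Pick the first candidate we have a translation for, trying the
// primary subtag ("zh" for "zh-HK") before the next candidate.
function pickLang(candidates, catalogue) {
    var has = Object.prototype.hasOwnProperty;
    for (var i = 0; i < candidates.length; i++) {
        var tag = candidates[i];
        if (has.call(catalogue, tag)) {
            return tag;
        }
        var primary = tag.split("-", 1)[0];
        if (has.call(catalogue, primary)) {
            return primary;
        }
    }
    return "en";
}

var catalogue = { en: {}, de: {}, zh: {}, "zh-TW": {} }; // hypothetical subset
var langs = candidateLangs("", "de", { languages: ["zh_hk", "fr"] });
console.log(langs.join(", "));           // zh-HK, fr, de, en
console.log(pickLang(langs, catalogue)); // zh
```

In this example the browser asks for `zh-HK` first; there is no `zh-HK` entry, so the primary subtag `zh` is tried and wins before `fr`, the site default `de`, or the English backstop are considered.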
zackw authored Feb 4, 2022
1 parent 7d47825 commit 48a4736
Showing 6 changed files with 176 additions and 61 deletions.
12 changes: 11 additions & 1 deletion CHANGES.rst
@@ -8,15 +8,25 @@ Changelog for Isso

- Don't ignore missing configuration files.
(Jelmer Vernooij)

- Serve isso.css separately to avoid `style-src: unsafe-inline` CSP and allow
clients to override fetch location (#704, ix5):

data-isso-css-url="https://comments.example.org/css/isso.css"

- New "samesite" option in [server] section to override SameSite header for
cookies. (#700, ix5)

- Fallback for SameSite header depending on whether host is served over https
or http (#700, ix5)

- Improved detection of browser-supplied language preferences (#521)
Isso will now honor the newer `navigator.languages` global property
as well as `navigator.language` and `navigator.userLanguage`.
There is a new configuration property `data-isso-default-lang`
that specifies the language to use (instead of English) when none
of these is available. (The existing `data-isso-lang` *overrides*
browser-supplied language preferences.)

0.12.4 (2021-02-03)
-------------------

30 changes: 25 additions & 5 deletions docs/docs/configuration/client.rst
@@ -121,12 +121,32 @@ Defaults to `true`.
data-isso-lang
--------------

Override useragent's preferred language. Isso has been translated in over 12
languages. The language is configured by its `ISO 639-1
<https://en.wikipedia.org/wiki/ISO_639-1>`_ (two letter) code.
Always render the Isso UI in this language, ignoring what the
user-agent says is the preferred language. The default is to
honor the user-agent's preferred language, and this can be
specified explicitly by using ``data-isso-lang=""``.

The value of this property should be a `BCP 47 language tag
<https://tools.ietf.org/html/bcp47>`_, such as "en", "ru", or "pt-BR".
Language tags are processed case-insensitively, and may use
underscores as separators instead of dashes (e.g. "pt_br" is treated
the same as "pt-BR").
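
The normalization described above can be tried out with the sketch below. It is adapted from the ``utils.normalize_bcp47`` function added in this commit (``slice`` is used in place of ``substr``); the sample tags are illustrative only.

```javascript
"use strict";

// Lowercase every subtag, then uppercase two-letter subtags and
// titlecase four-letter subtags, except the first subtag and any
// subtag that directly follows a singleton such as "x".
// Underscores are accepted as separators and mapped to dashes.
var normalize_bcp47 = function (tag) {
    var subtags = tag.toLowerCase().split(/[_-]/);
    var afterSingleton = false;
    for (var i = 0; i < subtags.length; i++) {
        if (subtags[i].length === 1) {
            afterSingleton = true;
        } else if (afterSingleton || i === 0) {
            afterSingleton = false;
        } else if (subtags[i].length === 2) {
            subtags[i] = subtags[i].toUpperCase();
        } else if (subtags[i].length === 4) {
            subtags[i] = subtags[i].charAt(0).toUpperCase() +
                         subtags[i].slice(1);
        }
    }
    return subtags.join("-");
};

console.log(normalize_bcp47("pt_br"));      // pt-BR
console.log(normalize_bcp47("ZH-hant-tw")); // zh-Hant-TW
console.log(normalize_bcp47("en-ca-x-ca")); // en-CA-x-ca
```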

You can find a list of all supported languages by browsing the
`i18n directory
<https://github.com/posativ/isso/tree/master/isso/js/app/i18n>`_ of
the source tree.

data-isso-default-lang
----------------------

Render the Isso UI in this language when the user-agent does not
specify a preferred language, or if the language it specifies is not
supported. Like ``data-isso-lang``, the value of this property should
be a BCP 47 language tag. Defaults to "en".

You find a list of all supported languages on `GitHub
<https://github.com/posativ/isso/tree/master/isso/js/app/i18n>`_.
If you specify both ``data-isso-default-lang`` and ``data-isso-lang``,
``data-isso-lang`` takes precedence.

data-isso-reply-to-self
-----------------------
45 changes: 42 additions & 3 deletions isso/js/app/config.js
@@ -1,10 +1,11 @@
define(function() {
define(["app/utils"], function(utils) {
"use strict";

var config = {
"css": true,
"css-url": null,
"lang": (navigator.language || navigator.userLanguage).split("-")[0],
"lang": "",
"default-lang": "en",
"reply-to-self": false,
"require-email": false,
"require-author": false,
@@ -40,6 +41,44 @@ define(function() {
// split avatar-fg on whitespace
config["avatar-fg"] = config["avatar-fg"].split(" ");

return config;
// create an array of normalized language codes from:
// - config["lang"], if it is nonempty
// - the first of navigator.languages, navigator.language, and
// navigator.userLanguage that exists and has a nonempty value
// - config["default-lang"]
// - "en" as an ultimate fallback
// i18n.js will use the first code in this array for which we have
// a translation.
var languages = [];
var found_navlang = false;
if (config["lang"]) {
languages.push(utils.normalize_bcp47(config["lang"]));
}
if (navigator.languages) {
for (i = 0; i < navigator.languages.length; i++) {
if (navigator.languages[i]) {
found_navlang = true;
languages.push(utils.normalize_bcp47(navigator.languages[i]));
}
}
}
if (!found_navlang && navigator.language) {
found_navlang = true;
languages.push(utils.normalize_bcp47(navigator.language));
}
if (!found_navlang && navigator.userLanguage) {
found_navlang = true;
languages.push(utils.normalize_bcp47(navigator.userLanguage));
}
if (config["default-lang"]) {
languages.push(utils.normalize_bcp47(config["default-lang"]));
}
languages.push("en");

config["langs"] = languages;
// code outside this file should look only at langs
delete config["lang"];
delete config["default-lang"];

return config;
});
83 changes: 63 additions & 20 deletions isso/js/app/i18n.js
@@ -8,7 +8,9 @@ define(["app/config", "app/i18n/bg", "app/i18n/cs", "app/i18n/da",
"use strict";

var pluralforms = function(lang) {
switch (lang) {
// we currently only need to look at the primary language
// subtag.
switch (lang.split("-", 1)[0]) {
case "bg":
case "cs":
case "da":
@@ -23,14 +25,11 @@ define(["app/config", "app/i18n/bg", "app/i18n/cs", "app/i18n/da",
case "hu":
case "it":
case "ko":
case "pt_BR":
case "pt_PT":
case "pt":
case "sv":
case "nl":
case "vi":
case "zh":
case "zh_CN":
case "zh_TW":
return function(msgs, n) {
return msgs[n === 1 ? 0 : 1];
};
@@ -77,14 +76,6 @@ define(["app/config", "app/i18n/bg", "app/i18n/cs", "app/i18n/da",
}
};

// useragent's prefered language (or manually overridden)
var lang = config.lang;

// fall back to English
if (! pluralforms(lang)) {
lang = "en";
}

var catalogue = {
bg: bg,
cs: cs,
@@ -104,25 +95,50 @@ define(["app/config", "app/i18n/bg", "app/i18n/cs", "app/i18n/da",
oc: oc,
pl: pl,
pt: pt_BR,
pt_BR: pt_BR,
pt_PT: pt_PT,
"pt-BR": pt_BR,
"pt-PT": pt_PT,
ru: ru,
sk: sk,
sv: sv,
nl: nl,
vi: vi,
zh: zh_CN,
zh_CN: zh_CN,
zh_TW: zh_TW
"zh-CN": zh_CN,
"zh-TW": zh_TW
};

var plural = pluralforms(lang);
// for each entry in config.langs, see whether we have a catalogue
// entry and a pluralforms entry for it. if we don't, try chopping
// off everything but the primary language subtag, before moving
// on to the next one.
var lang, plural, translations;
for (var i = 0; i < config.langs.length; i++) {
lang = config.langs[i];
plural = pluralforms(lang);
translations = catalogue[lang];
if (plural && translations)
break;
if (/-/.test(lang)) {
lang = lang.split("-", 1)[0];
plural = pluralforms(lang);
translations = catalogue[lang];
if (plural && translations)
break;
}
}

// absolute backstop; if we get here there's a bug in config.js
if (!plural || !translations) {
lang = "en";
plural = pluralforms(lang);
translations = catalogue[lang];
}

var translate = function(msgid) {
return config[msgid + '-text-' + lang] ||
catalogue[lang][msgid] ||
translations[msgid] ||
en[msgid] ||
"???";
"[?" + msgid + "]";
};

var pluralize = function(msgid, n) {
@@ -136,7 +152,34 @@ define(["app/config", "app/i18n/bg", "app/i18n/cs", "app/i18n/da",
return msg ? msg.replace("{{ n }}", (+ n)) : msg;
};

var ago = function(localTime, date) {

var secs = ((localTime.getTime() - date.getTime()) / 1000);

if (isNaN(secs) || secs < 0 ) {
secs = 0;
}

var mins = Math.floor(secs / 60), hours = Math.floor(mins / 60),
days = Math.floor(hours / 24);

return secs <= 45 && translate("date-now") ||
secs <= 90 && pluralize("date-minute", 1) ||
mins <= 45 && pluralize("date-minute", mins) ||
mins <= 90 && pluralize("date-hour", 1) ||
hours <= 22 && pluralize("date-hour", hours) ||
hours <= 36 && pluralize("date-day", 1) ||
days <= 5 && pluralize("date-day", days) ||
days <= 8 && pluralize("date-week", 1) ||
days <= 21 && pluralize("date-week", Math.floor(days / 7)) ||
days <= 45 && pluralize("date-month", 1) ||
days <= 345 && pluralize("date-month", Math.floor(days / 30)) ||
days <= 547 && pluralize("date-year", 1) ||
pluralize("date-year", Math.floor(days / 365.25));
};

return {
ago: ago,
lang: lang,
translate: translate,
pluralize: pluralize
2 changes: 1 addition & 1 deletion isso/js/app/isso.js
@@ -165,7 +165,7 @@ define(["app/dom", "app/utils", "app/config", "app/api", "app/jade", "app/i18n",

// update datetime every 60 seconds
var refresh = function() {
$(".permalink > time", el).textContent = utils.ago(
$(".permalink > time", el).textContent = i18n.ago(
globals.offset.localTime(), new Date(parseInt(comment.created, 10) * 1000));
setTimeout(refresh, 60*1000);
};
65 changes: 34 additions & 31 deletions isso/js/app/utils.js
@@ -1,4 +1,4 @@
define(["app/i18n"], function(i18n) {
define(function() {
"use strict";

// return `cookie` string if set
@@ -12,32 +12,6 @@ define(["app/i18n"], function(i18n) {
return n.length >= width ? n : new Array(width - n.length + 1).join(z) + n;
};

var ago = function(localTime, date) {

var secs = ((localTime.getTime() - date.getTime()) / 1000);

if (isNaN(secs) || secs < 0 ) {
secs = 0;
}

var mins = Math.floor(secs / 60), hours = Math.floor(mins / 60),
days = Math.floor(hours / 24);

return secs <= 45 && i18n.translate("date-now") ||
secs <= 90 && i18n.pluralize("date-minute", 1) ||
mins <= 45 && i18n.pluralize("date-minute", mins) ||
mins <= 90 && i18n.pluralize("date-hour", 1) ||
hours <= 22 && i18n.pluralize("date-hour", hours) ||
hours <= 36 && i18n.pluralize("date-day", 1) ||
days <= 5 && i18n.pluralize("date-day", days) ||
days <= 8 && i18n.pluralize("date-week", 1) ||
days <= 21 && i18n.pluralize("date-week", Math.floor(days / 7)) ||
days <= 45 && i18n.pluralize("date-month", 1) ||
days <= 345 && i18n.pluralize("date-month", Math.floor(days / 30)) ||
days <= 547 && i18n.pluralize("date-year", 1) ||
i18n.pluralize("date-year", Math.floor(days / 365.25));
};

var HTMLEntity = {
"&": "&amp;",
"<": "&lt;",
@@ -68,6 +42,35 @@ define(["app/i18n"], function(i18n) {
.replace(/\n/gi, '<br>');
};

// Normalize a BCP47 language tag.
// Quoting https://tools.ietf.org/html/bcp47 :
// An implementation can reproduce this format without accessing
// the registry as follows. All subtags, including extension
// and private use subtags, use lowercase letters with two
// exceptions: two-letter and four-letter subtags that neither
// appear at the start of the tag nor occur after singletons.
// Such two-letter subtags are all uppercase (as in the tags
// "en-CA-x-ca" or "sgn-BE-FR") and four-letter subtags are
// titlecase (as in the tag "az-Latn-x-latn").
// We also map underscores to dashes.
var normalize_bcp47 = function(tag) {
var subtags = tag.toLowerCase().split(/[_-]/);
var afterSingleton = false;
for (var i = 0; i < subtags.length; i++) {
if (subtags[i].length === 1) {
afterSingleton = true;
} else if (afterSingleton || i === 0) {
afterSingleton = false;
} else if (subtags[i].length === 2) {
subtags[i] = subtags[i].toUpperCase();
} else if (subtags[i].length === 4) {
subtags[i] = subtags[i].charAt(0).toUpperCase()
+ subtags[i].substr(1);
}
}
return subtags.join("-");
};

// Safari private browsing mode supports localStorage, but throws QUOTA_EXCEEDED_ERR
var localStorageImpl;
try {
@@ -92,10 +95,10 @@ define(["app/i18n"], function(i18n) {

return {
cookie: cookie,
pad: pad,
ago: ago,
text: text,
detext: detext,
localStorageImpl: localStorageImpl
localStorageImpl: localStorageImpl,
normalize_bcp47: normalize_bcp47,
pad: pad,
text: text
};
});
