Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

i18n: introduce script to swap in new locale to LHR #8755

Merged
merged 17 commits into from
Jun 25, 2019
Merged

Conversation

paulirish
Copy link
Member

icuMessagePaths has been sitting there waiting for us for a while! Finally we can put it to use.

It's written as isomorphic code (lol), assuming we may want to do this clientside or whatever... But the initial motivation was to generate reports in all locales in our Now deployments. That PR will followup this one.

@paulirish paulirish requested a review from brendankenny as a code owner April 30, 2019 23:06
@paulirish paulirish changed the title i18n: introduce script to swap in new locale to LHR i18n: introduce script to swap in new locale to LHR Apr 30, 2019
Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huzzah!!!

it's happening ron paul gif

@@ -0,0 +1,28 @@
/**
* @license Copyright 2018 Google Inc. All Rights Reserved.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2019 :)

Object.entries(icuMessagePaths).forEach(([icuMessageId, messageInstancesInLHR]) => {
for (const instance of messageInstancesInLHR) {
// The path that _formatPathAsString() generated
const path = /** @type {LH.I18NMessageValuesEntry} */ (instance).path || /** @type {string} */ (instance);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we typeof check this instead of the type assertions?

for (const instance of messageInstancesInLHR) {
// The path that _formatPathAsString() generated
const path = /** @type {LH.I18NMessageValuesEntry} */ (instance).path || /** @type {string} */ (instance);
const values = /** @type {LH.I18NMessageValuesEntry} */ (instance).values || undefined;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth commenting that values are the string template values we need to replace

const icuMessageIdRegex = /(.* \| .*)$/;
if (!icuMessageIdRegex.test(icuMessageId)) throw new Error('This is not an ICU message ID');

const {formattedString} = _formatIcuMessage(locale, icuMessageId, '', values);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

empty fallback message concerns me a bit, maybe we should have _formatIcuMessage throw if we can't find the icuMessageId and the fallback message is undefined? Maybe we should use a placeholder string that's non-empty?

not sure what to do there 🤷‍♂

const i18n = require('./i18n.js');

/**
* Replaces all strings within an LHR with ones from a different locale
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this might be worth being the place where we explain how the icuMessagePaths property is setup and meant to be used?


/* eslint-disable no-console, max-len */

const _set = require('lodash.set');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍

// when using accented english, force the use of a different locale for number formatting
const localeForMessageFormat = locale === 'en-XA' ? 'de-DE' : locale;
// pre-process values for the message format like KB and milliseconds
const valuesForMessageFormat = _preprocessMessageValues(icuMessage, values);
const valuesForMessageFormat = _preprocessMessageValues(messageForMessageFormat, values);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patrickhulce note this change.. it was using the fallback message instead of the one pulled from the locales files. which seemed odd. right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was using the fallback message instead of the one pulled from the locales files. which seemed odd. right?

It is weird. As long as there isn't an old mismatched translation for the locale it shouldn't matter, but agreed that checking that the values will actually be able to go into the string we want them to (and preparing them to do so) is the right thing to do.

Mismatched translations could become a problem at some point. If we've updated a string in en-US.json and it has different values than the not-yet-updated strings in all the other locales, I'm pretty sure that will either throw in _preprocessMessageValues or below in the formatter.

Maybe we should have a check in string collection that deletes strings in other locales if the expected values don't match anymore.

@paulirish
Copy link
Member Author

fwiw if you check out https://lighthouse-git-arabic.googlechrome.now.sh/ you can see where i'm going with this.

image

@connorjclark
Copy link
Collaborator

we should add this:

<meta name="google" content="notranslate">

either just for these now deployments, or directly in the report.

Copy link
Member

@brendankenny brendankenny left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does seem like we have a missing abstraction that could unify both this and the regular translation process nicely. Basically to be able to pass icuMessagePaths to a method that populates _icuMessageInstanceMap from it, then run i18n.replaceIcuMessageInstanceIds() on the LHR exactly as we do in Runner.

But this is self contained enough (and refactorable if we ever feel compelled to do so) it seems fine to move forward with

} else {
path = /** @type {LH.I18NMessageValuesEntry} */ (instance).path;
// `values` are the string template values to be used. eg. `values: {wastedBytes: 9028}`
values = /** @type {LH.I18NMessageValuesEntry} */ (instance).values;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't need these casts anymore


/* eslint-env jest */

describe('swap-locale', () => {

This comment was marked as resolved.

lighthouse-core/lib/i18n/swap-locale.js Outdated Show resolved Hide resolved
lighthouse-core/lib/i18n/swap-locale.js Outdated Show resolved Hide resolved
// when using accented english, force the use of a different locale for number formatting
const localeForMessageFormat = locale === 'en-XA' ? 'de-DE' : locale;
// pre-process values for the message format like KB and milliseconds
const valuesForMessageFormat = _preprocessMessageValues(icuMessage, values);
const valuesForMessageFormat = _preprocessMessageValues(messageForMessageFormat, values);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was using the fallback message instead of the one pulled from the locales files. which seemed odd. right?

It is weird. As long as there isn't an old mismatched translation for the locale it shouldn't matter, but agreed that checking that the values will actually be able to go into the string we want them to (and preparing them to do so) is the right thing to do.

Mismatched translations could become a problem at some point. If we've updated a string in en-US.json and it has different values than the not-yet-updated strings in all the other locales, I'm pretty sure that will either throw in _preprocessMessageValues or below in the formatter.

Maybe we should have a check in string collection that deletes strings in other locales if the expected values don't match anymore.

const formattedStr = i18n.formatMessageFromIdWithValues(locale, icuMessageId, values);
// Write string back into the LHR
_set(lhr, path, formattedStr);
} catch (e) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unconditional catch here seems excessive since all problems would be silenced. Some wouldn't be interesting (e.g. string from removed audit isn't available in new locale), but some could be a real problem. Maybe at least log the issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe at least log the issue?

or log the ones that aren't 'No ICU message string to format' if that gets excessive

* @param {*} [values]
* @return {string}
*/
function formatMessageFromIdWithValues(locale, icuMessageId, values) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about

Suggested change
function formatMessageFromIdWithValues(locale, icuMessageId, values) {
function getFormattedFromIdAndValues(locale, icuMessageId, values) {

to match getFormatted (param order for getFormatted vs _formatIcuMessage is unfortunate but whatever)

// Renderer formatted strings
expect(lhrEn.i18n.rendererFormattedStrings.notApplicableAuditsGroupTitle).toEqual('Not applicable');
expect(lhrEs.i18n.rendererFormattedStrings.notApplicableAuditsGroupTitle).toEqual('No aplicable');
});

This comment was marked as outdated.

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great to me!

* which will be used in the replacement within `i18n._formatIcuMessage()`
*
* An example:
* "icuMessagePaths": {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unindent this 2 spaces so it's easy to tell its nested? :)

Object.entries(icuMessagePaths).forEach(([icuMessageId, messageInstancesInLHR]) => {
for (const instance of messageInstancesInLHR) {
// The path that _formatPathAsString() generated
let path;

This comment was marked as resolved.

/* eslint-disable max-len */
// Renderer formatted strings
expect(lhrEn.i18n.rendererFormattedStrings.notApplicableAuditsGroupTitle).toEqual('Not applicable');
expect(lhrEs.i18n.rendererFormattedStrings.notApplicableAuditsGroupTitle).toEqual('No aplicable');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't even notice this was translated 😆 is there a better example?

types/lhr.d.ts Outdated
@@ -8,7 +8,8 @@ import LHError = require('../lighthouse-core/lib/lh-error.js');

declare global {
module LH {
export type I18NMessageEntry = string | {path: string, values: any};
export type I18NMessageValuesEntry = {path: string, values: any};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should type values as Record<string, string | number>?

@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@paulirish
Copy link
Member Author

ready for another look

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks pretty good to me! just a few primetime cleanup nits

*/
'use strict';

/* eslint-disable no-console, max-len */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably shouldn't leave these on if this is for real now :)

maybe we have a script version that does the console.loging?


// via Spanish
const lhrEnEsRT = swapLocale(swapLocale(lhrEn, 'es'), 'en-US');
expect(lhrEnEsRT).toMatchObject(lhrEn);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should toEqual, right? or do things get deleted

});

if (missingIcuMessageIds.length) {
console.error(`No message in locale (${locale}) found for:\n`, missingIcuMessageIds);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe these get returned as warnings if we split the logging into a script?

// Write string back into the LHR
_set(lhr, path, formattedStr);
} catch (err) {
if (err.message === 'No ICU message string to format') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels like it will get out of date and we won't notice :)

any way you can think of to make this a little more foolproof? export the string in i18n maybe? a flag on the error? 🤷‍♂

beforeEach(() => {
// silence console.error spam about messages not found
// eslint-disable-next-line no-console
console.error = jest.fn();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully won't need this one :)

Copy link
Member

@exterkamp exterkamp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just make the CI happy!

const _set = require('lodash.set');

const i18n = require('./i18n.js');
const LHError = require('../lh-error.js');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appveyor is mad about this unused import.

@exterkamp exterkamp added cla: yes and removed cla: no labels Jun 7, 2019
@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants