Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strings in requirejs and build process should support nested objects #786

Closed
9 of 10 tasks
zepumph opened this issue Aug 6, 2019 · 8 comments
Closed
9 of 10 tasks

Comments

@zepumph
Copy link
Member

zepumph commented Aug 6, 2019

From phetsims/rosetta#193 and mentioned by @pixelzoom at the end of #619 (comment). It seems important to consolidate translatable string logic into the same file. @samreid and I brainstormed and tried to understand the scope of duplicated string logic (rosetta/TranslationUtil, chipper/getStringMap, chipper/string.js chipper/chipper-strings.js etc). We should have a single file that all can use.

Over in phetsims/rosetta#193 in the a11ystring-plugin branch I factored a lot of the chipper code out into ChipperStringUtils, but it was very much a prototype and could be done better. Now after discussing with @samreid, we understand the problem better and are able to understand more of the cases that this logic needs to cover.

In general we will need to support three cases:
requirejs
built mode
rosetta

Files to consider:

  • support chipper-strings.js (probably can delete this) (see Can phet.chipper.stings.get and string!getStringFromFileContents be simplified? #796)
  • string.js
  • ChipperStringUtils.js
  • support "value" in intermediate nested objects. assert that "value" keys are always strings
  • support getting the string, but also keep in mind the need to support getting the whole key's object.
  • make sure reportUnusedStrings uses this system. (support reportUnusedStrings for nested a11y strings #780)
  • Create a "Node" typedef or something that recognizes a "value" property as well as the possibility of being an intermediary
  • Audit ROSETTA/TranslationUtils.js and other files to support this.
  • Create a single function to create a map, this way we can have a single way to access the map (hopefully). We need to decide whether or not we should include REPO/ in the key, or pull that out into a parent object in getStringMap.
  • From Add flag to identify a11y strings in strings_*.json rosetta#193 we will support nesting in en json files, but everywhere else will have a flat structure.
@samreid
Copy link
Member

samreid commented Aug 9, 2019

We have the different formats to support:

Structured form in requirejs, written by developers

{
  "moveInSmallerSteps": {
    "value": "Move slower"
  },
  "a11y": {
    "chemistryBook": {
      "value": "Chemistry book"
    }
  }
}

Flat version in babel, written by rosetta

{
  "bookLabel": {
    "value": "el libro o <br> zoom en el libro",
    "history": [
      {
        "userId": 282036,
        "timestamp": 1563484834340,
        "oldValue": "",
        "newValue": "el libro o <br> zoom en el libro",
        "explanation": null
      }
    ]
  },
  "bookTitle": {
    "value": "Libro",
    "history": [
      {
        "userId": 282036,
        "timestamp": 1563484834340,
        "oldValue": "",
        "newValue": "Libro",
        "explanation": null
      }
    ]
  }
}

Flat version in the built sim, written by chipper:

window.phet.chipper.strings = {
  "en": {
    "SCENERY_PHET/key.esc": "‪Esc‬",
    "SCENERY_PHET/key.enter": "‪Enter‬",
    "FRICTION/bookLabel": "‪book or<br> zoomed-in book‬",
    "FRICTION/bookTitle": "‪Book‬",
    "FRICTION/moveBookHeader": "‪Move Book‬",
    "FRICTION/a11y.frictionIncreasingAtomsJigglingTemperaturePattern": "‪Jiggling {{jigglingAmount}}, {{temperature}}‬",
   },
  "ar_SA": {
    "SCENERY_PHET/key.esc": "‪Esc‬",
    "SCENERY_PHET/key.enter": "‪Enter‬",
    "FRICTION/bookLabel": "‪book or<br> zoomed-in book‬",
    "FRICTION/bookTitle": "‪Book‬",
    "FRICTION/moveBookHeader": "‪Move Book‬",
    "FRICTION/a11y.moveInSmallerStepsWith": "‪Move slower with Shift plus Arrow keys, or Shift plus letter keys W, A, S, or D.‬",
    "FRICTION/a11y.jiggle.aLot": "‪jiggle  a lot‬",
    "FRICTION/a11y.jiggle.aLittle": "‪jiggle a little‬"
    }
}

It seems the latter version combines fallback strings (country => language => english) into the compiled result?

Should we unify these by making the window.phet.chipper.strings more like this? Or is that too much work and too little benefit:

window.phet.chipper.strings = {
  "en": {
    "SCENERY_PHET": {
      "key.esc": "‪Esc‬",
      "key.enter": "‪Enter‬"
    },
    "FRICTION": {
      "bookLabel": "‪book or<br> zoomed-in book‬",
      "bookTitle": "‪Book‬",
      "moveBookHeader": "‪Move Book‬",
      "a11y.frictionIncreasingAtomsJigglingTemperaturePattern": "‪Jiggling {{jigglingAmount}}, {{temperature}}‬",
    }
  },
  "ar_SA": {
    "SCENERY_PHET": {
      "key.esc": "‪Esc‬",
      "key.enter": "‪Enter‬"
    },
    "FRICTION": {
      "bookLabel": "‪book or<br> zoomed-in book‬",
      "bookTitle": "‪Book‬",
      "moveBookHeader": "‪Move Book‬",
      "a11y.moveInSmallerStepsWith": "‪Move slower with Shift plus Arrow keys, or Shift plus letter keys W, A, S, or D.‬",
      "a11y.jiggle.aLot": "‪jiggle  a lot‬",
      "a11y.jiggle.aLittle": "‪jiggle a little‬"
    }
  }
}

@samreid
Copy link
Member

samreid commented Aug 9, 2019

TranslationUtils.extractStrings is difficult for me to understand, looks like it is hardcoded to use a specific search path, using heuristics like:

  const matches = data.match( /string!([\w\.\/-]+)/g );

  // if no matches are found, it probably means the sim url was not correct
  if ( matches === null ) {
    return null;
  }

  const simShaInfo = new RegExp( '"' + simName + '": {\\s*"sha": "(\\w*)",\\s*"branch": "(\\w*)"', 'g' ).exec( data );
  const sha = ( simShaInfo && simShaInfo.length > 1 ) ? simShaInfo[ 1 ] : 'master'; // default to master if no sha is found

  for ( let i = 0; i < matches.length; i++ ) {
    const projectAndString = matches[ i ].substring( 7 ).split( '/' );
    const projectName = projectAndString[ 0 ];
    const string = projectAndString[ 1 ];

    projects[ projectName ] = projects[ projectName ] || [];

    if ( !_.includes( projects[ projectName ], string ) ) {
      projects[ projectName ].push( string );
    }
  }

If we change the structure of the string representation in the sim file, we would need to add branching logic here to support both styles.

@zepumph
Copy link
Member Author

zepumph commented Aug 9, 2019

It could be nice to set a course for complete consolidation, and then keep an open mind about where to back track to optimize a cost benefit analysis. I also think that this has too much overlap with Rosetta to not loop in @jbphet for comment. I feel like it is quite likely that we will not want to change all the babel files.

On other side to look into is if there is any website code that relies on this string format. @mattpen do any website databases use strings from Sims?

@samreid
Copy link
Member

samreid commented Aug 9, 2019

It seems there are two parts of important logic to factor out.

  1. Looking up the correct locale-specific fallback string (only needs to happen during the build)
  2. Looking up the string at the correct position in the structure (only happens in requirejs mode--flat structure used everywhere else).

Now that I understand the system better, I'm not convinced that changing format will be a significant improvement. For instance, we don't really need a function that looks up a string at a given position which can be reused by rosetta, because once the data gets to rosetta it is a flat list (and hence lookup is trivial). We don't need a function that looks up the right fallback in a built sim because the fallbacks are already filled in (and hence lookup is trivial). Maybe our best use of time would be (a) incremental improvements on the proposed branch plus documentation about the string formats and pipeline rather than (b) sweeping changes that introduce potentially unnecessary format changes. I'd like to discuss with @zepumph more.

@zepumph
Copy link
Member Author

zepumph commented Aug 9, 2019 via email

@mattpen
Copy link
Contributor

mattpen commented Aug 12, 2019

Nothing on the website directly access babel strings. The titles are transferred from babel to the website database as part of the build process. The file that might need to change is perennial/js/build-server/createTranslationsXML.js.

@zepumph
Copy link
Member Author

zepumph commented Sep 10, 2019

I worked on this more today in the a11ystring-plugin branch today. Basically this was just cleaning up the implementation that was already there. I didn't take any larger steps that were outlined in the beginning of this issue.

Maybe our best use of time would be (a) incremental improvements on the proposed branch plus documentation about the string formats and pipeline rather than (b) sweeping changes that introduce potentially unnecessary format changes. I'd like to discuss with @zepumph more.

@samreid I totally agree with this. The commits above simplify and clean up the general way that we are accessing strings from a map, but most of the algorithm stayed. I like these changes, and hope that after review and discussion with you we may be able to move this to master.

Here is a summary of the work done in these branches so far.

Instead of supporting a new, nested pattern everywhere, only the english json files support this. Everywhere else is flat (sometimes with a namespace prefixed on the key like FRICTION/friction.title). ChipperStringUtils.getStringFromMap is the majority of the logic. Given a map and a key, it is smart enough to decide how to parse the map for the key. Here are the use cases for this file:

  • Requirejs mode for the english strings (which support nesting).
  • requirejs mode for other locales (which are all flat).
  • when building the simulation (running in the requirejs optimizer), creating the string map in getStringMap.js (which is the case that includes the namespace prefix, and, for english strings, nested objects)

A worry I have is that the flat nature of these maps integrates keys that include periods, and keys in which the period means it is a nested object. I get around this by checking the top level first, but (a) this could yield weird logic that we aren't searching for, and (b) there are perhaps other cases that will have to handle this, like in the future, having rosetta render a more tree-like view for translating nested strings. I am fine punting that problem solving for when we have more time to devote to rosetta.

@samreid over to you.

samreid added a commit that referenced this issue Sep 23, 2019
@zepumph zepumph changed the title Create TranslatableStringUtil.js requirejs and build process should support nested objects Sep 23, 2019
@zepumph zepumph changed the title requirejs and build process should support nested objects Strings in requirejs and build process should support nested objects Sep 23, 2019
@zepumph
Copy link
Member Author

zepumph commented Sep 23, 2019

@samreid and I co-reviewed and worked on this today. I think that this issue should be renamed since we aren't going to create a TranslatableStringUtil anymore.

We checked off boxes in the first issue as we went.

Audit ROSETTA/TranslationUtils.js and other files to support this.

No rosetta code should need be touched because the output to the sim is still the same flat map (same output from CHIPPER/getStringMap). See phetsims/rosetta#215 for the issue to double check.

I think that the goal of this issue was to:

  • make sure that en json files support nested strings
  • still support flat objects everywhere else, especially where ever rosetta interfaces with.
  • simplify the code as much as possible so that requirejs and build string related code are as aligned in implementation as possible.
  • provide support to move a11y strings from *A11yStrings.js files over to their en json counterparts.

Everything above has been done and reviewed with @samreid. We have created side issues for outstanding work, and will hopefully be able to move to master after the a11y check is added in phetsims/rosetta#214.

zepumph pushed a commit to phetsims/perennial that referenced this issue Oct 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants