Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hundreds of 404 errors for string files in Studio. #1308

Closed
pixelzoom opened this issue Aug 22, 2022 · 44 comments
Closed

Hundreds of 404 errors for string files in Studio. #1308

pixelzoom opened this issue Aug 22, 2022 · 44 comments

Comments

@pixelzoom
Copy link
Contributor

Related to #1302 ... There are hundreds of 404 errors when attempting to load any sim into Studio.

To reproduce, go to phetmarks, choose any PhET-iO-instrumented sim, choose mode "studio", press "Launch" button.

For example:

load-unbuilt-strings.js:112          GET http://localhost:8080/babel/geometric-optics/geometric-optics-strings_aa.json 404 (Not Found)
requestStringFile @ load-unbuilt-strings.js:112
(anonymous) @ load-unbuilt-strings.js:157
(anonymous) @ load-unbuilt-strings.js:156
(anonymous) @ load-unbuilt-strings.js:166
load-unbuilt-strings.js:112          GET http://localhost:8080/babel/geometric-optics/geometric-optics-strings_fy.json 404 (Not Found)
requestStringFile @ load-unbuilt-strings.js:112
(anonymous) @ load-unbuilt-strings.js:157
(anonymous) @ load-unbuilt-strings.js:156
(anonymous) @ load-unbuilt-strings.js:166
load-unbuilt-strings.js:112          GET http://localhost:8080/babel/geometric-optics/geometric-optics-strings_ga.json 404 (Not Found)
requestStringFile @ load-unbuilt-strings.js:112
(anonymous) @ load-unbuilt-strings.js:157
(anonymous) @ load-unbuilt-strings.js:156
(anonymous) @ load-unbuilt-strings.js:166

...

load-unbuilt-strings.js:112          GET http://localhost:8080/babel/scenery-phet/scenery-phet-strings_zu.json 404 (Not Found)
requestStringFile @ load-unbuilt-strings.js:112
(anonymous) @ load-unbuilt-strings.js:161
(anonymous) @ load-unbuilt-strings.js:158
(anonymous) @ load-unbuilt-strings.js:96
load (async)
requestStringFile @ load-unbuilt-strings.js:83
(anonymous) @ load-unbuilt-strings.js:157
(anonymous) @ load-unbuilt-strings.js:156
(anonymous) @ load-unbuilt-strings.js:166
load-unbuilt-strings.js:112          GET http://localhost:8080/babel/sun/sun-strings_zu.json 404 (Not Found)
requestStringFile @ load-unbuilt-strings.js:112
(anonymous) @ load-unbuilt-strings.js:161
(anonymous) @ load-unbuilt-strings.js:158
(anonymous) @ load-unbuilt-strings.js:96
load (async)
requestStringFile @ load-unbuilt-strings.js:83
(anonymous) @ load-unbuilt-strings.js:157
(anonymous) @ load-unbuilt-strings.js:156
(anonymous) @ load-unbuilt-strings.js:166
load-unbuilt-strings.js:112          GET http://localhost:8080/babel/tambo/tambo-strings_zu.json 404 (Not Found)
requestStringFile @ load-unbuilt-strings.js:112
(anonymous) @ load-unbuilt-strings.js:161
(anonymous) @ load-unbuilt-strings.js:158
(anonymous) @ load-unbuilt-strings.js:96
load (async)
requestStringFile @ load-unbuilt-strings.js:83
(anonymous) @ load-unbuilt-strings.js:157
(anonymous) @ load-unbuilt-strings.js:156
(anonymous) @ load-unbuilt-strings.js:166
load-unbuilt-strings.js:112          GET http://localhost:8080/babel/twixt/twixt-strings_zu.json 404 (Not Found)
@pixelzoom
Copy link
Contributor Author

In Slack, @samreid said:

A proposed plan for the 404s was that each repo would have a JSON list of which translations it has, and we would only request the translations in the registry. But it only affects unbuilt version and we didn’t feel it was a blocking problem. I guess the main question is making sure those json registries get updated at the appropriate times.

@jonathanolson
Copy link
Contributor

That would be the plan, and it would presumably need a rosetta change (and seems a bit fragile). Thoughts on whether that would be worth avoiding the 404s?

Alternatively, couldn't we just... create a bunch of empty files, one for every locale/repo? @samreid thoughts on this?

@pixelzoom
Copy link
Contributor Author

I think avoiding the 404 errors is important. They could result in someone missing important errors. I'm certainly not going to scroll back to the beginning of that console output to see if any important errors preceeded the 404 errors.

@samreid
Copy link
Member

samreid commented Aug 22, 2022

Alternatively, couldn't we just... create a bunch of empty files, one for every locale/repo?

Interesting plan, maybe we should check with @jbphet and @liammulh to see if that would have ramifications for rosetta?

@liammulh
Copy link
Member

I don't understand why creating empty files that Rosetta presumably wouldn't be fetching would negatively affect Rosetta.

I'd like to understand this further, but I'm not on PhET hours right now.

@samreid
Copy link
Member

samreid commented Aug 23, 2022

@zepumph also reported that loading some 1500 404s on the state wrapper was causing timeout errors, we speculated that loading that many non-404s may have the same problem? But we would have to solve that to support 100% i18n anyways, so maybe we should start there?

@liammulh
Copy link
Member

Can we discuss this in dev meeting?

@samreid
Copy link
Member

samreid commented Aug 24, 2022

I wrote this script to test creating all the locale files:

// Copyright 2022, University of Colorado Boulder

// Copied from load-unbuilt-strings.js
const locales = 'aa,ab,ae,af,ak,am,an,ar,ar_MA,ar_SA,as,av,ay,az,ba,be,bg,bh,bi,bm,bn,bo,br,bs,ca,ce,ch,co,cr,cs,cu,cv,cy,da,de,dv,dz,ee,el,en,en_CA,en_GB,eo,es,es_CO,es_CR,es_ES,es_MX,es_PE,et,eu,fa,ff,fi,fj,fo,fr,fu,fy,ga,gd,gl,gn,gu,gv,ha,hi,ho,hr,ht,hu,hy,hz,ia,ie,ig,ii,ik,in,io,is,it,iu,iw,ja,ji,jv,ka,kg,ki,kj,kk,kl,km,kn,ko,kr,ks,ku,ku_TR,kv,kw,ky,la,lb,lg,li,lk,ln,lo,lt,lu,lv,mg,mh,mi,mk,ml,mn,mo,mr,ms,mt,my,na,nb,nd,ne,ng,nl,nn,nr,nv,ny,oc,oj,om,or,os,pa,pi,pl,ps,pt,pt_BR,qu,rm,rn,ro,ru,rw,ry,sa,sc,sd,se,sg,sh,si,sk,sl,sm,sn,so,sq,sr,ss,st,su,sv,sw,ta,te,tg,th,ti,tk,tl,tn,to,tr,ts,tt,tw,ty,ug,uk,ur,uz,ve,vi,vo,wa,wo,xh,yo,za,zh_CN,zh_HK,zh_TW,zu'.split( ',' );

// Find all directories in a directory
const fs = require( 'fs' );

const dirs = fs.readdirSync( '../babel', { withFileTypes: true } )
  .filter( dirent => dirent.isDirectory() )
  .map( dirent => dirent.name ).filter( dir => dir !== '.git' );

dirs.push( 'sun' );
dirs.push( 'tambo' );
dirs.push( 'twixt' );

dirs.forEach( dir => {
  locales.forEach( locale => {

    if ( locale !== 'en' ) {

      const repo = dir;
      const filename = `../babel/${repo}/${repo}-strings_${locale}.json`;
      if ( !fs.existsSync( filename ) ) {
        try {
          fs.mkdirSync( `../babel/${repo}` );
        }
        catch( e ) {}
        fs.writeFileSync( filename, `{
}` );
      }
    }
  } );
} );

Running it, I observed the following:

  • It creates 16.762 files in babel.
  • Launching gravity and orbits with ?locales=* launches very fast with no 404s (I couldn't tell without measuring timing whether it was faster than with the 404s).
  • We would need to regenerate more files after (a) adding another locale or (b) adding another repo.
  • Does the rosetta UX distinguish between "no translation file exists" vs "a translation file exists but is empty"? Should it?
  • Also, I observed that after running this code, unbuilt mode thinks there is a translation for every locale, even though many are empty. So presumably (a) unbuilt sims and (b) built sims would need to distinguish between "no translation file exists" vs "a translation file exists but is empty".

@samreid
Copy link
Member

samreid commented Aug 24, 2022

My network panel for GAO with the strings:

image

and reverting to the 404s:

image

So maybe the 200s are a little faster than the 404s?

@samreid samreid removed their assignment Aug 24, 2022
@samreid
Copy link
Member

samreid commented Aug 24, 2022

  • I also noticed the list of strings is very different in built mode:

image

and unbuilt mode:

image

This will make testing more difficult. For instance, @KatieWoe discovered a layout bug in phetsims/qa#830 but I couldn't reproduce it locally due to this problem.

@samreid
Copy link
Member

samreid commented Aug 24, 2022

This problem is making it take 35 seconds to launch one sim on phettest:

image

Also discovered by @Nancy-Salpepi in phetsims/gravity-and-orbits#434 (comment).

Not sure if changing 404s to 200s would fix it or not. It may be that requesting too many locales would be too slow anyways.

@samreid
Copy link
Member

samreid commented Aug 25, 2022

From today's dev meeting:

@kathy-phet thinks creating these files eagerly is problematic.
@pixelzoom asks if you can detect the existence of a file before requesting it.
@zepumph and @jonathanolson report that does not exist. It cannot be try/caught.

@jonathanolson says: When rosetta makes a change. We just bundle them into one conglomerate string file per that has everything for that repo.
@zepumph says: We could load a file from babel that indicates which translations exist. Then we would request this file first.
@jonathanolson yes, but if phettest has performance problems requesting a ton of files, we still have N repos * M locales = N*M files to load. Why not bundle them into one dependency?
@pixelzoom but what about sims that don't want all the translations, why would they have to load all the translations.?

@jonathanolson one file per repo in babel that has just the strings (no history). Like joist-strings.json with all the locales. Then when loading a sim like friction, it wouldn't include CCK strings.

@zepumph asks: A script that runs a postprocess step, runs once a night. Doesn't need to hook into rosetta. Just a development tool.
@samreid: Run it on grunt update too?

@jbphet
Copy link
Contributor

jbphet commented Aug 31, 2022

@liammulh and I discussed this in our collaboration meeting today. We're in agreement that having a dummy locale file for every possible locale would be problematic, and I think it would get way worse if and when we expand the list of languages we use, which is probably inevitable, since we've already talked about moving from ISO 639-1 to ISO 639-3.

We came up with an alternative approach to the idea of creating a conglomerated file with all the strings: Create just a list of all locales for which a translation exists in each repo. Then the load-unbuilt-strings code would, when the '*' locale is specified, first request the list of supported locales, then use that list to decide which files to load. Here are some of the potential advantages to this approach:

  • We would not have redundant versions of the string information checked in to the babel repo
  • We could not have situations where that redundant information was out of sync
  • It would be much faster to create, since there would be no need to process the string files themselves, just their names
  • We could create this pretty quickly with a script, but after that Rosetta could maintain this information fairly easily, i.e. it would just update the list each time a new translation was created for a repo+locale

Example of what the string list file would look like:

greenhouse-effect-locales.json

{
  "locales": [ "ar", "ar_SA", "de", ... ]
}

Assigning back to @jonathanolson and @samreid to get their input.

@jbphet jbphet assigned jonathanolson and samreid and unassigned liammulh Aug 31, 2022
@samreid
Copy link
Member

samreid commented Sep 1, 2022

A locales file sounds reasonable to me and preferable to a conglomerate file. But I wonder if we can skip this step by relying on everyone's http-server to output a file listing? For instance, if I navigate to: http://localhost/main/babel/gravity-and-orbits/, I see

image

Would it be easy enough to xhr that page, and use a regex to detect locales? I'm using http-server, but also saw similar output with apache. This way it would always be up-to-date and we wouldn't need any scripts or management or maintenance for it.

@samreid samreid assigned jbphet and liammulh and unassigned samreid Sep 1, 2022
@liammulh
Copy link
Member

liammulh commented Sep 1, 2022

But I wonder if we can skip this step by relying on everyone's http-server to output a file listing?

That would only work if the local copy of babel is current. But I think it would be reasonable if every developer keeps their copy of babel current.

One of the reasons JB and I liked the idea of having a {repo}-locales.json file in babel for each repo is that it could be useful in other scenarios.

@samreid
Copy link
Member

samreid commented Sep 1, 2022

@liammulh and I discussed that it "may" work to use the directory listing, but if even one phet developer or server machine doesn't output a compatible directory listing, then we would need the JSON file.

I tested loading https://bayes.colorado.edu/continuous-testing/ct-snapshots/1661987918125/babel from a recent snapshot and was met with this error:

image

So I think the "directory listing" approach is doomed to fail and we should use the locale list.

But now that we are locked in to having to create a file, why not experiment with the conglomerate version? If it saves us even 0.5 seconds every time we launch a sim locally it will be well worth it.

@jbphet jbphet removed their assignment Sep 23, 2022
@samreid
Copy link
Member

samreid commented Sep 27, 2022

Summarizing recommended work from above:

  • Rename _generated to _generated_conglomerate or _generated_composite_strings. UPDATE: I went with _generated_development_strings to match the filename and the notion that these strings are only used during development.
  • What didn't seem to work well was changing the fallback locale, then changing the locale. This didn't seem to behave as I would expect. Do we really need to enabled our phet-io users to set a different fallback locale? UPDATE: I marked it as phetioReadOnly: true
  • There was no help text added. There probably should be. In grunt --help it just says, "Custom task."
  • In generateDevelopmentStrings, There is a catch block for the case where no translations are found that doesn't do anything - it just keeps going. While this would probably work, I think it would be better to warn the user and exit.
  • the task should include an npm update.
  • we could add it to the daily automatic update. If not, probably once a week should be enough.
  • I'm going to mark it as "Excluded" from my IDE, and I think everyone probably should. UPDATE: I sent a slack message to @channel

@samreid
Copy link
Member

samreid commented Sep 27, 2022

In generateDevelopmentStrings, There is a catch block for the case where no translations are found that doesn't do anything - it just keeps going. While this would probably work, I think it would be better to warn the user and exit.

There are actually several repos with no translations. For instance "twixt". This process creates an empty conglomerate file which can be loaded alongside the files that do have translations. I will document it as such.

samreid added a commit that referenced this issue Sep 27, 2022
samreid added a commit to phetsims/babel that referenced this issue Sep 27, 2022
@samreid
Copy link
Member

samreid commented Sep 27, 2022

I think I have addressed or commented on all the recommendations, @jbphet can you please review?

@samreid samreid assigned jbphet and unassigned samreid Sep 27, 2022
@jbphet
Copy link
Contributor

jbphet commented Oct 4, 2022

Mostly the responses seem fine. Two questions.

First, this is what fallbackLocalesProperty looks like in Studio. The documentation doesn't seem to match what I'm seeing. This might be because I'm in the unbuilt mode, and feel free to let me know if so, but the docs say, "The list MUST end in 'en'", but it doesn't - it's an empty list. This seems potentially confusing to users.

image

Second, there is a checkmark next to the item above that says, "the task should include an npm update", but I don't see that change in the code and when I run for-each.sh perennial/data/active-repos grunt generate-development-strings I have the same problem I had before, namely that a lot of the repos don't get strings generated because they haven't had an npm install run. Did you decide against adding an update/install?

@jbphet jbphet assigned samreid and unassigned jbphet Oct 4, 2022
@samreid
Copy link
Member

samreid commented Oct 7, 2022

For the npm install, I added that to the documentation in this commit: ca7595f. Is that sufficient? I don't think our grunt tasks should typically run npm operations (aside from perennial maintenance tasks).

Regarding fallbackLocalesProperty, I agree it looks problematic. @jonathanolson can you please comment on that aspect?

const fallbackLocalesProperty = new Property<string[]>( [], {
  tandem: Tandem.GENERAL_MODEL.createTandem( 'fallbackLocalesProperty' ),
  phetioDocumentation: 'an ordered list of locales to fall back on, for example: ["es", "de", "en"]. The list MUST end in "en".

says it must end in en but the initial value is [].

@samreid samreid assigned jonathanolson and jbphet and unassigned samreid Oct 7, 2022
@jbphet
Copy link
Contributor

jbphet commented Oct 10, 2022

@samreid said:

For the npm install, I added that to the documentation[.]

That's fine by me, thanks for the clarification. It makes sense to me that we shouldn't actively run npm install, since that could end up chewing up a lot of disk space.

Unassigning myself, leaving it assigned to @jonathanolson to respond to the question above.

@jbphet jbphet removed their assignment Oct 10, 2022
@jonathanolson
Copy link
Contributor

fallbackLocales definitely should NOT require containing the fallback locale. It can be empty and works great when empty. I suggest correcting the documentation added in phetsims/joist@9a80165.

@zepumph
Copy link
Member

zepumph commented Oct 11, 2022

  • I recommend reverting phetsims/joist@7b1411b, as the fallbackLocalesProperty is basically entirely a PhET-iO Property for specifying a particular fallback over the "phet" implementation of the fallback algorithm.

@zepumph
Copy link
Member

zepumph commented Oct 11, 2022

All is ready here, and final behavior of the fallbackLocalesProperty will be solidified in https://github.com/phetsims/phet-io/issues/1893. Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants