Proposal: Language Packs #39178
Comments
@egamma added Option 4 |
@dbaeumer Another thought for option 1: Let's call that index to key translation indexing. Could we index a Marketplace extension at the time the extension is installed/updated? Possibly also after Code itself is updated? Also, we wouldn't load the translation until that indexing has been done? That way, indexing only happens once and loading up the extension would be fast and behave exactly the same as the core languages. Code can do the indexing in its core at startup and cache it in the user data dir. The first launch would be slow, but consecutive launches would be fast, as long as the extension and Code itself are the same versions. |
@joaomoreno nice idea. |
Moved to December to continue discussion |
@joaomoreno I liked the idea of indexing at run time. Adding to it, can we make languages selectable like themes? Users could install multiple language extensions and select the language using a picker. We could do the indexing when the language is selected and then reload the window. |
That's trickier since UI labels are present in the main process too (menus). |
We do have some sort of picker: F1 > Configure Language. The picked language is a special setting since we need to read it very early in the startup phase to configure the nls plugin of the loader correctly. So it is not stored in the usual settings. |
How about prompting to quit and restart VS Code for the changes to apply?
Can we write the selected language into that file after selection? |
Yes, that is what is currently happening when you use F1 > Configure Language. |
Right, my thought is to show a picker of languages to select from. Once a language is selected, we can write it into the file and ask the user to restart. Not sure if this functionality already exists. |
Yes, that could directly write to the file and restart VS Code. I had an item for this but we closed it since users don't change language often. Usually only once. |
//cc @aeschli |
@aeschli and I discussed how best to structure the content of a language pack (at least for translations inside the VS Code core, not extensions). The conclusion was to pre-compute as much as possible and to have files as large as possible. For the code I therefore propose the following structure:
{
"data": {
"vs/code/electron-main/main": [
"Guten Tag",
"Gute Nacht"
]
},
"keys": {
"vs/code/electron-main/main": {
"goodMorning": 0,
"goodNight": 1
}
},
"hashes": {
"vs/code/electron-main/main": "Key sequence hash"
}
}
As seen, the file already contains the messages stored as an array. If the hash value matches the one we will store in I did some first performance testing and we should be able to fully regenerate a bundle in ~200ms. This is a price we only pay when a new release is installed. A second startup will not pay that price. And it will only happen for translations we don't ship in the box. A big unknown is currently translations for bundled extensions. The reason is that we don't do any bundling there, so there are quite a few files to write, which is not good performance-wise. To not block the startup for too long we could do this in parallel while the workbench already loads. We would only need to synchronize with the extension host startup. |
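A minimal sketch of how a file with this structure could be consumed. The field names come from the structure above; the functions `hashKeySequence`, `canReuseMessages` and `lookup` are hypothetical helpers, and the hash function is only a stand-in because the comment does not say how the "Key sequence hash" is computed.

```typescript
// Shape of the proposed language pack file (field names taken from the structure above).
interface LanguagePack {
  data: Record<string, string[]>;               // module id -> translated messages
  keys: Record<string, Record<string, number>>; // module id -> (key -> index into data)
  hashes: Record<string, string>;               // module id -> hash of the key sequence
}

// ASSUMPTION: stand-in hash over the ordered key list; the real algorithm is not specified.
function hashKeySequence(keys: string[]): string {
  const text = keys.join("\n");
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = ((h * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

// If the stored hash matches the current key sequence, the message array can be
// used by index as-is; otherwise the bundle would have to be regenerated.
function canReuseMessages(pack: LanguagePack, moduleId: string, currentKeys: string[]): boolean {
  return pack.hashes[moduleId] === hashKeySequence(currentKeys);
}

// Key-based lookup: key -> index -> message.
function lookup(pack: LanguagePack, moduleId: string, key: string): string | undefined {
  const index = pack.keys[moduleId]?.[key];
  return index === undefined ? undefined : pack.data[moduleId]?.[index];
}

const mainModule = "vs/code/electron-main/main";
const germanPack: LanguagePack = {
  data: { [mainModule]: ["Guten Tag", "Gute Nacht"] },
  keys: { [mainModule]: { goodMorning: 0, goodNight: 1 } },
  hashes: { [mainModule]: hashKeySequence(["goodMorning", "goodNight"]) },
};

console.log(lookup(germanPack, mainModule, "goodNight"));
console.log(canReuseMessages(germanPack, mainModule, ["goodMorning", "goodNight"]));
```

Keeping both the array and the key map in one file is what allows the fast path (direct index access when the hash matches) and the slow path (regenerate via keys) to share the same data.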
Actually with the work @joaomoreno is doing we could generate the files during the install process at least for Windows and Mac. |
I coded the language pack generation for the core and the extensions. Since the language pack
@aeschli let's discuss this in more detail when you are back in the office. |
Language Packs (under construction)
A while ago we opened VS Code's language set to contributions from the community by moving the translation database to Transifex. Since then quite a few languages have been added. However, there is currently no vehicle to install these languages with a stable version of VS Code. The stable version still only ships with the 9 core VS Code languages. Two extra languages (pt_br and hu) have been added to the insider build.
Instead of pre-bundling all new languages with VS Code we should come to a model where these languages can be installed later on, like users install additional features via extensions.
How is VS Code localized
I will first outline how VS Code is localized today. The localization consists of the following parts:
Tagging strings to be translated
VS Code uses a tagging approach to mark strings to be translated directly in the source code. It therefore provides a translation function nls.localize. Strings passed to that function as an argument are tagged for translation. Strings in single quotes are in general treated as 'technical' strings which don't require any translation. Strings in double quotes outside a localize function call are treated as strings that need translation but aren't tagged, and are flagged by a linter rule as untranslated. A typical nls.localize call looks like this:
We also maintain an npm module that allows for the same approach in extension code. The npm module is called vscode-nls.
During normal compile time the strings inside the nls.localize call stay as they are. This ensures quick turnaround cycles during development. Furthermore it is important to note that the truth of the strings is in source code (TypeScript and JavaScript). VS Code doesn't maintain resource bundles or property files in other formats.
Extracting strings
Strings to be translated are automatically extracted from the source code during build time. This extraction process does the following things:
- extracts the key and the value and puts them into a special meta data file (json format).
- replaces the key with an index
The meta data file contains all strings with their key / value pair that are used inside VS Code. It is named nls.metadata.json and it is produced during the build process and ships with VS Code.
The above example looks like this in a version we ship:
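As a rough sketch of the two extraction steps, and not the actual build code or shipped output, the following TypeScript collects key / value pairs for one module and assigns each key an index. The `TaggedString` and `ModuleMetadata` shapes, the function `extractModule`, and the English sample strings are assumptions made for illustration; the real nls.metadata.json layout may differ.

```typescript
// One tagged call as it would appear in source, e.g. nls.localize(key, message).
interface TaggedString {
  key: string;
  message: string;
}

// ASSUMPTION: hypothetical per-module metadata; keys[i] and messages[i] belong together.
interface ModuleMetadata {
  keys: string[];
  messages: string[];
}

// Step 1: record key and value in the metadata.
// Step 2: remember which index replaces each key in the generated code.
function extractModule(tagged: TaggedString[]): { metadata: ModuleMetadata; indexOf: Record<string, number> } {
  const metadata: ModuleMetadata = { keys: [], messages: [] };
  const indexOf: Record<string, number> = {};
  for (const entry of tagged) {
    if (indexOf[entry.key] !== undefined) {
      continue; // a key that was already seen keeps its first index
    }
    indexOf[entry.key] = metadata.keys.length;
    metadata.keys.push(entry.key);
    metadata.messages.push(entry.message);
  }
  return { metadata, indexOf };
}

const extracted = extractModule([
  { key: "goodMorning", message: "Good Morning" },
  { key: "goodNight", message: "Good Night" },
]);
console.log(JSON.stringify(extracted.metadata));
console.log(extracted.indexOf["goodNight"]);
```

After this step the generated code only needs to carry the index, while the metadata keeps the mapping back to the original key and English message.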
Pushing to Transifex
The content of the nls.metadata.json is then used to upload the strings to be translated to Transifex. Since VS Code has thousands of strings the translation is grouped into smaller projects to make them easier to handle in Transifex. The following source file describes how these strings are grouped into projects: https://github.com/Microsoft/vscode/blob/master/build\lib\i18n.resources.json#L1
Pulling translations from Transifex
Translations are pulled from Transifex and stored alongside the source in the VS Code GitHub repository. They are all under the i18n folder. Storing the translations together with the source code is necessary to be able to version source code and translations together. Otherwise it would, for example, be very hard to do a recovery build on an older version with exactly the same translations. The translated strings are stored under the i18n folder, where the first sub folder is the translated language. The structure underneath the language folder is isomorphic to the source code folder structure under the src folder. However, ts/js source files which don't contain any translatable strings will not have a corresponding i18n.json file. The files in the i18n folder are all machine generated and should never be edited by a developer.
Building translation Bundles
During build time (when we build a shippable version of VS Code) the build process will also generate translation bundles per supported language (currently the 9 core languages). These translation bundles have the same granularity as the source bundles. For example, there is a workbench.main.js (which bundles most of our workbench code), so there are corresponding workbench.main.nls.${lang}.js files which contain the translated strings.
These translation bundles are optimized for memory footprint and low CPU consumption when looking up strings. This is achieved using the following two techniques:
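Going by the description of the optimizations later in this proposal (key -> index replacement and default language lookup during build time), the idea can be sketched roughly as below. The functions `buildBundle` and `localizeByIndex` and the sample strings are hypothetical; this is an illustration under those assumptions, not the actual build tooling.

```typescript
// Build time: produce the message array for one module and one language.
// keys[i] / englishMessages[i] come from the extracted metadata; translations maps key -> text.
function buildBundle(
  keys: string[],
  englishMessages: string[],
  translations: Record<string, string>
): string[] {
  return keys.map((key, i) => {
    const translated = translations[key];
    // Default language lookup happens here, at build time, not at run time.
    return translated !== undefined ? translated : englishMessages[i];
  });
}

// Run time: the key has been replaced by an index, so a lookup is a plain array access.
function localizeByIndex(bundle: string[], index: number): string {
  return bundle[index];
}

const germanBundle = buildBundle(
  ["goodMorning", "goodNight"],
  ["Good Morning", "Good Night"],
  { goodMorning: "Guten Tag" } // "goodNight" is deliberately left untranslated
);
console.log(localizeByIndex(germanBundle, 0));
console.log(localizeByIndex(germanBundle, 1));
```

Because the array order is fixed by the key order of one specific build, a bundle produced this way only lines up with the VS Code version it was built for.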
A translation bundle is statically linked to a VS Code version and will very likely not function correctly with a different VS Code version.
This optimization happens for all VS Code core code and our built-in extensions. The mechanism and the necessary build tools are also available for outside extensions via the vscode-nls-dev npm module.
Language Packs
It is desirable that language packs come as extensions and are managed by the Marketplace. We don't want to add another channel to host language packs, nor do we want to ship all languages in the box (size, language deprecation, ...). To provide such language pack extensions we need to explore two things:
Building Language Packs
Especially the optimization we do when building translation bundles (key -> index replacement and default language lookup during build time) makes it harder for third parties to produce language pack extensions. In general we have three choices:
Option 1
Basically we would still replace the key by an index during build time. However, during runtime we would do the following for non core languages:
- load the nls.metadata.json file into memory
- translate the index back to its key using nls.metadata.json
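A minimal sketch of what such a runtime lookup could look like, assuming the language pack is a plain key -> translation map and the loaded metadata holds parallel key and message arrays per module. The names `RuntimeMetadata` and `localizeAtRuntime` are hypothetical; the fallback to English follows the behavior described for Option 1 in the Proposal section.

```typescript
// ASSUMPTION: simplified view of one module's entry in nls.metadata.json.
interface RuntimeMetadata {
  keys: string[];     // index -> key
  messages: string[]; // index -> English message
}

// The generated code only carries an index, so the key is recovered from the metadata first.
function localizeAtRuntime(
  metadata: RuntimeMetadata,
  languagePack: Record<string, string>,
  index: number
): string {
  const key = metadata.keys[index];
  const translated = languagePack[key];
  // A missing string falls back to English dynamically.
  return translated !== undefined ? translated : metadata.messages[index];
}

const loadedMetadata: RuntimeMetadata = {
  keys: ["goodMorning", "goodNight"],
  messages: ["Good Morning", "Good Night"],
};
const outdatedPack: Record<string, string> = { goodMorning: "Guten Tag" };

console.log(localizeAtRuntime(loadedMetadata, outdatedPack, 0));
console.log(localizeAtRuntime(loadedMetadata, outdatedPack, 1));
```

The extra indirection on every lookup, plus holding the metadata in memory, is the runtime cost this option trades for tolerating outdated language packs.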
We could think about giving up on the current solution later on and only use a dynamic runtime solution instead of the optimized statically linked build time solution. This would avoid loading the nls.metadata.json file but would leave the key in the generated JS file.
Option 2
We publish all build tools that are currently inside the VS Code repository as a standalone npm package so that third parties can run the same scripts to bundle translation files.
Option 3
We publish all language extensions to the Marketplace during build time. The translations themselves would still come from Transifex. However, the translations would also go into the i18n folder like our core languages do, and our build scripts would generate language extensions and publish them to the Marketplace.
Option 4
We either include all available languages in the normal VS Code build or we have two different VS Code builds: a first which includes the nine core languages, and a second called VS Code International which includes all languages currently available in Transifex. The advantage would be no changes to build scripts, loader / startup code or extension installation. However, we would need to create and manage these additional builds.
Proposal
I am favoring Option 3. Pros are:
Cons are:
I don't like Option 1 since it will add a second translation bundle runtime story with its own set of bugs (at least for a while; I am convinced that the optimizations we do are worth it, especially during startup time). If we stick with two different solutions (core / contributed languages), core languages and contributed languages will look different and we will always ship all core languages in the box. The only advantage I can see with option 1 is that it would allow starting VS Code with an old, outdated language pack, since missing strings will dynamically fall back to English during runtime.
Option 2 would be doable and would allow us to treat core and contributed languages the same. However, from the experience of maintaining a separate repository and npm module, it might not be worth it compared to option 3. In addition we would end up with more outdated language packs when we ship a new version of VS Code.
Language Pack Extensions
Since with option 3 (as with option 2) language packs are statically 'linked' against a VS Code version, we would need the following features for extensions (if not already present):