Reduce size impact of editor and class reference translations on editor binaries #3421
Comments
Some numbers on the actual impact of each resource on binary size, for three builds: editor l10n only, classref l10n only, and all included. (The table of sizes was not preserved.) |
Maybe we should remove all translations from the editor binary, distribute them as separate "language packs", and auto-download the required pack when the language is selected or on the first start. In this case, we can also move all extra editor fonts (about 7 MB) to the language packs as well (and use better CJK fonts, with better coverage and regional variants). Currently, we use |
I don't really think it's a problem; the editor gets bigger, but we could probably improve the translation compression code. One obvious thing that comes to mind is using an md5 or even a 64-bit hash rather than the English key. |
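A minimal sketch of the hashed-key idea, assuming a 64-bit digest is taken with `hashlib.blake2b` (a stand-in choice; the comment above only says "md5 or even a 64-bit hash"). The function names `key64`, `build_table` and `translate` are hypothetical and do not exist in Godot.

```python
import hashlib


def key64(msgid: str) -> int:
    """Return a 64-bit integer key derived from the source string."""
    digest = hashlib.blake2b(msgid.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def build_table(pairs):
    """Map hashed keys to translated strings, failing loudly on collisions."""
    table = {}
    for msgid, msgstr in pairs:
        key = key64(msgid)
        if key in table and table[key] != msgstr:
            raise ValueError(f"hash collision or duplicate msgid: {msgid!r}")
        table[key] = msgstr
    return table


def translate(table, msgid: str) -> str:
    """Look up a translation, falling back to the source string."""
    return table.get(key64(msgid), msgid)
```

With this layout, each locale stores 8 bytes per entry instead of the full English string. The trade-off is that the source text can no longer be recovered from the table, and collisions have to be detected at build time, as `build_table` does here.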
That could be interesting, but it would need significant changes to the way we package and distribute releases, and we would need to ensure that the process for fetching new translations and fonts is seamless (which is not trivial if e.g. fonts need to be imported in the project to be usable in the editor). It's also worth noting that translations and fonts for a given language don't necessarily go together. You do need the fonts for e.g. Arabic to use the Arabic editor translations, but you do not need the translations for all the languages that you want to support for game i18n (i.e. you may develop a game using the editor in pt-BR while you want to support game localization to Arabic and CJK - you don't need their editor translations, only their fonts). |
As for reducing editor binary size due to font embedding, loading system fonts is worth investigating too: #306 This will result in a slightly different appearance across platforms, but I think this is acceptable for non-Latin languages. |
Editor size isn't a huge problem to me, but if we can figure out a way to keep it as small as possible, that is obviously a bonus. A 20% increase in editor size due to a feature the majority of users won't need (and those who need it will only need a tiny fraction of the available translations anyway) stands in big contrast to how Godot normally avoids bloat. Is 90 MB an acceptable size? Yes, it is, but 70 MB is obviously a better size. So if I were to vote, I would vote for something similar to the way we handle Godot exports: you don't have all exports available until you need them, and then you just download the one you need, which is a nice solution. And exporting to mobile is likely used by more people than the German translation, for example. |
Started with this, which brings the size down to 75.12 MiB for the same build conditions as #3421 (comment), i.e. a relative increase of 1.11 MiB compared to pre-classref translation builds. That's a total of 5.73 MiB used for translations in the binary (compared to 30 MiB before). |
This reduces the size of the editor binaries significantly, as we otherwise embed all WIP translations, including ones with very low completion ratios, and end up paying for the size of all `msgid`s for each locale. Cf. godotengine/godot-proposals#3421 for details. The thresholds used are: - 30% for the editor interface (should already include most common strings while more obscure ones like UndoRedo action names might be untranslated). - 10% for the class reference: this is a HUGE resource and 10% is already a lot of useful content, especially if focused on the most used APIs. For 3.x, we also exclude languages that require complex text layout support to be displayed properly. This currently reduces the size of the editor binary by 17% on Linux. The list will be synced manually every now and then.
This reduces the size of the editor binaries significantly, as we otherwise embed all WIP translations, including ones with very low completion ratios, and end up paying for the size of all `msgid`s for each locale. Cf. godotengine/godot-proposals#3421 for details. The thresholds used are: - 30% for the editor interface (should already include most common strings while more obscure ones like UndoRedo action names might be untranslated). - 10% for the class reference: this is a HUGE resource and 10% is already a lot of useful content, especially if focused on the most used APIs. This currently reduces the size of the editor binary by 17% on Linux. The list will be synced manually every now and then. (cherry picked from commit 8425c58)
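The commit messages above say the inclusion list is hardcoded and synced manually. Purely as an illustration of what the thresholds measure, here is a sketch that computes a completion ratio from PO text. It is an assumption-laden simplification: plural entries are not handled (they count as untranslated), entries must be separated by blank lines, and the names `completion_ratio` and `select_locales` are hypothetical.

```python
import re

EDITOR_THRESHOLD = 0.30    # editor interface threshold from the commit message
CLASSREF_THRESHOLD = 0.10  # class reference threshold from the commit message


def _unquote(fragment: str) -> str:
    """Strip the surrounding double quotes from a PO string fragment."""
    fragment = fragment.strip()
    return fragment[1:-1] if len(fragment) >= 2 else ""


def completion_ratio(po_text: str) -> float:
    """Fraction of entries with a non-empty, non-fuzzy msgstr."""
    total = translated = 0
    for block in re.split(r"\n\s*\n", po_text.strip()):
        msgid, msgstr, current, fuzzy = [], [], None, False
        for line in block.splitlines():
            if line.startswith("#,"):
                fuzzy = fuzzy or "fuzzy" in line
            elif line.startswith("#"):
                continue  # other comments do not affect the ratio
            elif line.startswith("msgid "):
                current = msgid
                current.append(_unquote(line[len("msgid "):]))
            elif line.startswith("msgstr "):
                current = msgstr
                current.append(_unquote(line[len("msgstr "):]))
            elif line.startswith('"') and current is not None:
                current.append(_unquote(line))  # continuation line
        if not "".join(msgid):
            continue  # header entry or comment-only block
        total += 1
        if "".join(msgstr) and not fuzzy:
            translated += 1
    return translated / total if total else 0.0


def select_locales(po_by_locale: dict, threshold: float) -> list:
    """Return the locales whose completion ratio reaches the threshold."""
    return sorted(
        locale
        for locale, text in po_by_locale.items()
        if completion_ratio(text) >= threshold
    )
```

A hardcoded list avoids running any parsing at build time, which is presumably why it was preferred; a script like this could still be used offline to refresh that list.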
There's no need to add fonts to the project / import, using the same global config folder as export templates do is probably a better way.
Fonts and translations can be separate downloads ("download full language pack" or "download only fonts" options). It's also giving users more control over fonts in general (unlike the custom font editor setting, it allows more than one file in the font stack, e.g. adding emoji fonts). I'm currently experimenting with moving stuff out of the binary in this branch https://github.com/bruvzg/godot/tree/lang_packs_poc, right now it's done for the fonts and translations (no installation/download UI), and seems to work fine. |
@bruvzg's idea sounds interesting, although it risks becoming way too complicated IMO. One nice thing about Godot is that it's pretty much self-contained, with the exception of release templates (which you technically don't need). Anyway, modular language packs or not, @reduz's idea seems like the next logical step IMO. |
We could remove the Pirate locale to save some space in the editor binary. I know it's no fun, but it's one way to further reduce the impact on binary size 🙂 |
Some other potential ways to decrease editor size:
|
It's already excluded since godotengine/godot#54020 (same with all translations that don't go over the threshold for inclusion). |
Describe the project you are working on
Godot editor localization
Describe the problem or limitation you are having in your project
We discussed this on Rocket.Chat #translation, so I'm summarizing the findings here so that we can work on solving some or all of the inefficiencies in our current internationalization workflow.

We currently have two engine resources which can be localized using gettext PO files:

- `editor/translations/`
- `doc/translations/` (this one is new in 4.0 and 3.4 following "[3.x] i18n: Add support for translating the class reference", godot#53511).

The current process is that we embed the PO files directly in the editor binary by generating a header with zlib compressed contents of each file: https://github.com/godotengine/godot/blob/d742dcd3ceaa614d2688caed59ec0c75d4041985/editor/editor_builders.py#L75-L122
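As a rough illustration of that embedding step (not the actual code in `editor_builders.py`), the sketch below compresses a file's bytes with zlib and writes them out as a C byte array. The function name `make_header` and the generated symbol names are hypothetical.

```python
import zlib


def make_header(name: str, data: bytes) -> str:
    """Return C header text embedding `data` as a zlib-compressed byte array."""
    compressed = zlib.compress(data, 9)
    # Each byte becomes a decimal literal, so the header text needs up to
    # four characters ("255,") per compressed byte.
    body = ",".join(str(b) for b in compressed)
    return (
        f"static const int _{name}_compressed_size = {len(compressed)};\n"
        f"static const int _{name}_uncompressed_size = {len(data)};\n"
        f"static const unsigned char _{name}_compressed[] = {{{body}}};\n"
    )
```

Note that the size of the generated header text is not the size added to the binary: an `unsigned char` array is stored as one byte per element once compiled, whatever the length of its textual form.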
These embedded PO files are then loaded by the editor using the PO resource loader (as if they were external `.po` files included with the editor binary).

This worked OK while we had only the editor translations, but now with the much bigger class reference resource, we're starting to see a big impact on binary size:
Once compiled and optimized, the 90M `doc_translations.gen.h` leads to a 16M increase of the size of the Windows editor binary.

The 26M of `editor_translations.gen.h` must similarly account for a handful of MBs too, didn't check.

Some problems identified:

- `msgid`s (source strings) and PO metadata (comments), and that for every single PO file. So even languages which are only 1% translated add a whopping 2.5M to `editor/doc_translations.gen.h` for example, as it's the size of the `classes.pot` file once compressed and written as a byte array.

Describe the feature / enhancement and how it helps to overcome the problem or limitation
I don't have a full-fledged proposal to change this yet, finding one will be the aim of this proposal.
There are a few low-hanging fruits we can work on though:
- Add a filter in `editor/SCsub` to only include translations with a high enough completion ratio. This could be done using gettext to get a percentage, but that would add a dependency to our build system, so the simplest is probably to just hardcode a list based on the completion ratios on Weblate. This will significantly reduce the cost of embedding near empty translations (that we need to keep in the repo for Weblate itself so that they can be worked on by translators).
- Strip comments (aside from `#, fuzzy` as we need it in the PO loader to skip fuzzy `msgstr`s) in `editor/editor_builders.py` before compressing the file and writing to the header. This should save a significant amount of bytes as there are a lot of comments indicating the provenance of strings in the source code.
- Check if the compressed file contents could be saved in a more size-optimized format than an endless array of integers. It's weird that zlib compressed data ends up taking more space than the original files.
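The comment-stripping item could look roughly like the sketch below, which drops every `#` line except flag lines that mention `fuzzy`. The helper name `strip_po_comments` is hypothetical, and this is only an illustration of the idea, not the change that was eventually made in `editor/editor_builders.py`.

```python
def strip_po_comments(po_text: str) -> str:
    """Remove PO comments, keeping only flag lines that mark an entry fuzzy."""
    kept = []
    for line in po_text.splitlines():
        if line.startswith("#"):
            # "#, fuzzy" must survive so the loader can skip fuzzy msgstrs.
            if line.startswith("#,") and "fuzzy" in line:
                kept.append(line)
            continue  # drop translator, location and obsolete ("#~") lines
        kept.append(line)
    return "\n".join(kept) + "\n"
```

The function would be applied to each file's text before the zlib compression step, so both the source location references (`#:`) and obsolete entries (`#~`) stop contributing to the embedded size.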
Later on, we might want to rethink how we handle those translations so that we don't need to bundle the whole PO files (which is a source format, not meant for direct consumption, though that's the workflow we use also for game translations).
We could write a Python parser that converts the gettext PO file contents to data structures that Godot can consume readily. So instead of using the PO resource loader, we could write a new Translation format directly with the contents optimized for minimal size usage (e.g. in a `Map<StringName msgid, Map<StringName lang, StringName msgstr>>`, so we "pay" only once for `msgid`s, and not for each language). @reduz also suggested using a md5 hash of the source strings as keys.

To be discussed further, but I'd first start by doing some of the low hanging fruits above so we see how much size is left taken by translations. A few MBs is a small price to pay for internationalization, but if it's 30% of the binary it starts being more problematic.
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
To be determined based on discussion.
If this enhancement will not be used often, can it be worked around with a few lines of script?
No, it's about optimizing the editor binary size.
Is there a reason why this should be core and not an add-on in the asset library?
See above.