Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extracted translations include unused translations #24145

Closed
1 task
fischeversenker opened this issue Oct 26, 2022 · 12 comments
Closed
1 task

Extracted translations include unused translations #24145

fischeversenker opened this issue Oct 26, 2022 · 12 comments

Comments

@fischeversenker
Copy link
Contributor

fischeversenker commented Oct 26, 2022

Command

extract-i18n

Is this a regression?

  • Yes, this behavior used to work in the previous version

The previous version in which this bug was not present was

No response

Description

The extracted translations from the extract-i18n builder include trans-units from unused templates/components unless these components import and use the $localize function. This is the case for both app components as well as library components.

Minimal Reproduction

Reproduction repository: https://github.com/fischeversenker/ng-decorator-tree-shaking
(don't be fooled by the name of the repository. I re-used an existing repro-repo.)

In essence:

  • Import a library module that exports components with translatable strings but where the components don't use the $localize function
  • don't use any of these components in your app
  • run extract-i18n on your app and verify that the generated messages.xlf file contains unused translations from these components

Exception or Error

No response

Your Environment

Angular CLI: 14.2.6
Node: 16.16.0
Package Manager: yarn 1.22.19 
OS: linux x64

Angular: 14.2.7
... animations, common, compiler, compiler-cli, core, forms
... localize, platform-browser, platform-browser-dynamic, router

Package                         Version
---------------------------------------------------------
@angular-devkit/architect       0.1402.6
@angular-devkit/build-angular   14.2.6
@angular-devkit/core            14.2.6
@angular-devkit/schematics      14.2.6
@angular/cli                    14.2.6
@schematics/angular             14.2.6
ng-packagr                      14.2.2
rxjs                            7.5.7
typescript                      4.7.4

Anything else relevant?

No response

@alan-agius4 alan-agius4 added the needs: discussion On the agenda for team meeting to determine next steps label Oct 27, 2022
@alan-agius4
Copy link
Collaborator

This is currently working as intended as since tree-shaking and dead-code elimination are not done during extraction.

Let me bring this up with the rest of the team.

@bridzius
Copy link

bridzius commented Nov 9, 2022

@alan-agius4 - a related question.
If tree-shaking/dead-code elimination is not done, does that mean that in theory it should be possible to extract-i18n statically without running the build?

Asking this, because extract-i18n is an extremely heavy operation (basically a full build) in large applications with several hundreds of feature modules.

@fischeversenker
Copy link
Contributor Author

Thanks for your input, @bridzius.
Although I see your point and get your question, it is quite a different topic than what I was asking for. Would you mind opening a separate issue for that, please? 🙏

@alan-agius4, thanks for the feedback. Do you have any update on the team's decision?

@alan-agius4
Copy link
Collaborator

@fischeversenker, we still didn’t get to this issue. We should be able to discuss this the next or following week.

@dgp1130
Copy link
Collaborator

dgp1130 commented Dec 15, 2022

It is often desirable to extract messages from unused components to support a workflow such as:

  1. Write a new (unused) component and merge it into the repo.
  2. Extract messages, translate them, and merge them back into the repo.
  3. Connect the component into the application to display it.

This is useful because translations can often take a long time O(days) and the process is not easily started in an automated fashion without merging the new component first. Otherwise you need to add the new component to an existing one but then hide it via a flag so international users don't see untranslated content. It takes a certain scale before this workflow is useful, but we use it a lot in Google.

There's a separate question here about whether to extract libraries which has a lot more nuance. It's come up before in #17140 so we can continue the discussion there.

@dgp1130 dgp1130 closed this as completed Dec 15, 2022
@dgp1130 dgp1130 removed the needs: discussion On the agenda for team meeting to determine next steps label Dec 15, 2022
@fischeversenker
Copy link
Contributor Author

fischeversenker commented Dec 19, 2022

Thanks for the feedback, @dgp1130 👍

I understand your internal use case. I'll have a look at the linked issue and try to find a workaround for my use case.

What might not have been perfectly clear in the title of this issue is that there is a certain inconsistency in the extracted translations, which might be relevant to the workflow you are using at Google:

Unused components that use the global $localize function are excluded from the extracted translations.
Unused components that do not use the global $localize function are included in the extracted translations.

You might want to investigate this. Have a look at the reproduction repository (upgraded to Angular 15) that I provided in the issue description for a quick start 🚀

@dgp1130
Copy link
Collaborator

dgp1130 commented Dec 20, 2022

Ah sorry @fischeversenker, didn't see the difference with $localize. That's definitely weird given that I'm pretty sure <div i18n>Hello, World</div> just generates a $localize call under the hood, so technically both components should use $localize, regardless of how you import it. Chatting with @AndrewKushnir and looking at the compiled output for these libraries, they don't actually generate $localize, but I think that's because they use partial compilation, while i18n only applies to full compilations. So ng extract-i18n is probably doing a full compilation and generating $localize calls.

If we switch tsconfig.lib.prod.json to use compilationMode: 'full' we can see the actual output being processed:

import { Component } from '@angular/core'; // Carried over from user-space.
import { $localize } from '@angular/localize/init'; // Carried over from user-space.
import * as i0 from "@angular/core"; // Added by compiler.

// Carried over from user-space.
export class LibraryWithLocalizeComponent {
    constructor() {
        this.name = $localize `my name`;
    }
}

// ...

// Added by compiler (simplified).
LibraryWithLocalizeComponent.ɵcmp = /*@__PURE__*/ i0.ɵɵdefineComponent({
  // ...
  consts: () => {
    const = $localize `unused library template with localize`;
    return [i18n_0];
  },
});

Both of these use the imported $localize, however $localize is actually a global definition with special semantics to the Angular compiler. So I think this shadowing is actually confusing the compiler. If we use import '@angular/localize'; instead and rely on the global declaration, these two messages extract correctly. That's probably the workaround you need @fischeversenker.

Beyond that immediate problem, this does bring up a few rough edges we can hopefully improve.

  1. Users really shouldn't import @angular/localize or @angular/localize/init. The compiler should do this for you, though it's necessary right now to pass type checking. @alan-agius4 already has a PR for that, so hopefully you should be able to drop the @angular/localize import altogether.
  2. We should ban users shadowing built-in symbols like $localize or i0 since it can lead to situations like this. One nuance is that some users may want to define the $localize implementation at runtime. I have no idea if anyone actually does this, but even then I think you could probably do it without shadowing $localize (something like window.$localize = () => { /* ... */ }).
  3. We should remove the $localize export from @angular/localize/init, given that it's just misleading for users to import and then get confused when it doesn't extract. The global declaration should be the only way to use this symbol. Maybe we should remove it from @angular/localize as well?
  4. If we move forward with 3., we should also consider deleting (or at least stop documenting) @angular/localize and @angular/localize/init given that exposing $localize is their whole purpose. Since that's really an implementation detail of the compiler. The package might need to exist, but it should at least be internal to Angular. There were some discussions some time ago about possibly supporting $localize for non-Angular usages, not sure whatever happened to that or if anyone does it today.

I'll file some issues in the main repository for these and see if we can improve this particular footgun. Thanks for pointing this out, lots of interesting tidbits here. 😁

@jesusreal
Copy link

jesusreal commented Dec 21, 2022

@dgp1130, thanks for taking the time to create those new issues and improve the i18n behaviour 🙏

Coming back to the main issue, I would like to question the decision of not taking action on supporting the removal of unused translations in the i18n extraction.

The workflow you described in your comment above is completely valid in my eyes and I think it is sensible that Angular supports it. But I don't think this should exclude the use case described by @fischeversenker in this issue. I consider it at least as valid as the one you described, and should therefore be supported in addition to what already is in place.

I don't think the Angular experts that spend the day working with the framework are happy to invest a lot of effort on tree shaking their applications for better performance, and end up having translation files that include (many) unused translations. This is a ton of work in terms of maintenance of the translations and I think Angular should save the community this considerable amount of unnecessary effort, which can be much better invested on building great Angular apps.

I (my whole team and I am sure many Angular developers) would be very grateful if you would reconsider your position about this topic and provide a solution to the issue initially described by @fischeversenker. I don't think it should be much effort on your side, though I might be wrong since I am of course missing knowledge on the angular codebase.
My proposal would be to provide a boolean option for the @angular-devkit/build-angular:extract-i18n builder. This would allow to opt-in to extract only the translations used by the app. When this option is set to true, the builder will build the app with the optimization flags enabled (details about the required changes are for sure missing).
Edit: this seems to be the code that would need to be updated, based on the value of the option I mentioned above.

Would you consider providing this feature to the i18n extraction? 🙏

@dgp1130
Copy link
Collaborator

dgp1130 commented Dec 21, 2022

#17140 is intended to track figuring out the full story for using libraries with i18n. There's a lot more complexity here than just excluding library messages when extracting the app, given that we'd need to define if / how libraries ship their own translations, how those translations are consumed by the application, how to handle differences in language / locale, how to override library translations, etc. There's definitely a lot of rough edges there, and it's something we'd like to improve, it just hasn't been prioritized yet.

Tree-shaking the app before message extraction is also more narrow than extracting messages only from application code, given that unused messages would be dropped. So I don't think applying optimizations would be the right mechanism for limiting extracted messages to application code.

I don't think the Angular experts that spend the day working with the framework are happy to invest a lot of effort on tree shaking their applications for better performance, and end up having translation files that include (many) unused translations.

Just to be clear, Angular embeds translations directly in generated code for production builds, so there's no performance impact to having unused messages. We don't ship extracted .xlf files to end users.

@fischeversenker
Copy link
Contributor Author

Just to be clear, Angular embeds translations directly in generated code for production builds, so there's no performance impact to having unused messages. We don't ship extracted .xlf files to end users.

I'd like to pick up on this, to explain more of my motivation: We don't worry about shipping unused translations to our users. We worry about shipping unused translations to our translators.

In the project that I'm currently working on we are working with a lot of big component libraries and we create a lot of small applications that each uses >=1 of these libraries. The majority of the extracted translations from these applications are unused since they are coming from the unused components from these libraries. But we are not easily able to tell our translators which ones are actually needed and which ones aren't. This means that we either have to have them translate everything (separately for each of these apps), or we very carefully scan the app and figure out which of the trans-units are actually required to be translated.

I understand that a lot of this is covered in #17140, but I can imagine a solution that's somewhere in the middle. Not to move the translation files into the libraries, but to eliminate unused trans-units in apps. Something like the suggestion from @jesusreal, where we have a flag that determines whether the i18n-extraction happens before or after tree-shaking.
A quick google search has not revealed any community-built i18n-extraction builder that would allow doing this and I wasn't able to find a neat solution for this in #17140. When time permits, I'll play around with the code in node_modules to adjust the official extract-i18n builder. In the long run, it would still be very much appreciated if you could consider such an option.

@dgp1130
Copy link
Collaborator

dgp1130 commented Jan 9, 2023

Tree shaking messages is one approach we can consider as part of #17140, but ultimately this will require a lot more thought and investigation to determine the ideal solution to library localization and it just hasn't been prioritized yet.

@angular-automatic-lock-bot
Copy link

This issue has been automatically locked due to inactivity.
Please file a new issue if you are encountering a similar or related problem.

Read more about our automatic conversation locking policy.

This action has been performed automatically by a bot.

@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Feb 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants