Investigate use of semgrep to catch untranslated strings #6380

Open
eloquence opened this issue Mar 30, 2022 · 4 comments
Labels
goals: improve developer workflow
i18n: Anything related to translation or internationalization of SecureDrop

Comments

@eloquence
Member

freedomofpress/securedrop-client#1272 added a set of handy semgrep rules to the securedrop-client repo to catch untranslated GUI strings. It'd be good to investigate whether similar rules would be helpful in this repo, bearing in mind that the actual patterns would need to be different and must not generate too many false positives.

@cfm
Member

cfm commented May 23, 2022

#6368 and #6465 both offer evidence for the value of this linting.

cfm added the "i18n" and "goals: improve developer workflow" labels May 23, 2022
@cfm
Member

cfm commented May 23, 2022

Why are these omissions so difficult to catch during manual testing in the string-freeze process? At that point in the localization cycle, strings not (or incorrectly) marked for translation are indistinguishable from strings not yet translated.

@cfm
Member

cfm commented May 24, 2022

Time-boxed a cranky stab at this using 38c97bb as my tricky target case. As I expected, regex is Semgrep's only view into our .html Jinja templates, and it's a challenging multi-line match given the nesting of HTML → Jinja → Python → HTML.

Targeting c33cbe4 would be an easier first iteration, to catch the basic one-line {{ gettext('foo') }} case. Note that we'll need to match on both ['"].
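For illustration, the one-line case could be matched with a regex along these lines. This is only a sketch in plain Python, not an actual Semgrep rule: the names `ONE_LINE_GETTEXT` and `find_gettext_calls` are hypothetical, and it assumes a single string literal with no escaped quotes inside. The backreference makes sure the closing quote is the same character as the opening one, which covers both `'` and `"`.

```python
import re

# Hypothetical pattern: a one-line Jinja expression that calls gettext() on a
# single string literal, delimited by either ' or " (same quote on both ends).
ONE_LINE_GETTEXT = re.compile(
    r"""\{\{\s*gettext\(\s*(?P<quote>['"])(?P<msgid>.*?)(?P=quote)\s*\)\s*\}\}"""
)


def find_gettext_calls(template_source: str) -> list[tuple[int, str]]:
    """Return (line_number, msgid) for each one-line gettext() call found."""
    hits = []
    for lineno, line in enumerate(template_source.splitlines(), start=1):
        for match in ONE_LINE_GETTEXT.finditer(line):
            hits.append((lineno, match.group("msgid")))
    return hits
```

Because it works line by line, this deliberately ignores the multi-line nested case described above; that one would need a different approach.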

@cfm
Member

cfm commented Nov 4, 2022

#6380 (comment):

Why are these omissions so difficult to catch during manual testing in the string-freeze process? At that point in the localization cycle, strings not (or incorrectly) marked for translation are indistinguishable from strings not yet translated.

We could solve this problem at least for human eyes by turning on Weblate's "pseudolocale generation":

Pseudolocales are useful to find strings that are not prepared for localization. This is done by altering all translatable source strings to make it easy to spot unaltered strings when running the application in the pseudolocale language.

I'll bring this up next week when we revisit our localization roadmap for v2.6.0 and beyond.
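To make the pseudolocale idea concrete, here is a minimal sketch of the general technique. It is not Weblate's implementation: the function names, the accent mapping, and the bracket markers are all assumptions chosen for illustration. Every string that went through the translation machinery comes out visibly altered, so anything that renders unaltered was never marked for translation.

```python
import re

# Placeholders such as {name}, %(name)s, or %s must survive untouched,
# otherwise string formatting would break in the pseudolocale.
PLACEHOLDER = re.compile(r"(\{[^{}]*\}|%\([^)]+\)[a-zA-Z]|%[a-zA-Z])")

# Hypothetical alteration: swap plain vowels for accented ones.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudolocalize(msgid: str) -> str:
    """Return an obviously altered version of a translatable source string."""
    # re.split with one capturing group puts the placeholders at odd indices.
    parts = PLACEHOLDER.split(msgid)
    altered = [
        part if index % 2 else part.translate(ACCENTED)
        for index, part in enumerate(parts)
    ]
    return "[" + "".join(altered) + "]"


def looks_unmarked(rendered: str) -> bool:
    """True if a rendered string lacks the pseudolocale markers."""
    return not (rendered.startswith("[") and rendered.endswith("]"))
```

With the application running in such a pseudolocale, a reviewer (or a script using something like `looks_unmarked`) can tell "not marked for translation" apart from "not yet translated", which is exactly the distinction that is missing during string freeze.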

legoktm added a commit that referenced this issue Jul 6, 2023
This lints .format() calls being inside gettext(), which has caused us
problems in the past. This is not a complete solution to #6380 since it
doesn't look at HTML templates.

See <https://beta.ruff.rs/docs/rules/#flake8-gettext-int> for full
details.

Refs #6380.
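As a quick illustration of why `.format()` inside `gettext()` is a problem, here is a self-contained sketch. The `CATALOG` dict and this `gettext()` are stand-ins for a real message catalog, not SecureDrop code. Formatting first means the lookup key already contains the interpolated value, so it never matches the extracted msgid and the string silently stays untranslated.

```python
# Stand-in for a compiled message catalog (hypothetical contents).
CATALOG = {"Hello, {name}!": "Bonjour, {name} !"}


def gettext(msgid: str) -> str:
    """Toy gettext: return the translation, or the msgid if none exists."""
    return CATALOG.get(msgid, msgid)


def greet_wrong(name: str) -> str:
    # .format() runs first, so gettext() looks up "Hello, Ada!", which is
    # not in the catalog. This is the pattern the lint flags.
    return gettext("Hello, {name}!".format(name=name))


def greet_right(name: str) -> str:
    # gettext() sees the literal msgid; the placeholder is filled afterwards.
    return gettext("Hello, {name}!").format(name=name)
```

Here `greet_wrong("Ada")` falls back to the English source string, while `greet_right("Ada")` returns the translated one.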
legoktm added five more commits with the same message that referenced this issue: two on Jul 6, two on Jul 26, and one on Jul 27, 2023.