Inconsistent formatting of “%s” strings #876

Closed
huftis opened this issue Feb 18, 2018 · 32 comments

@huftis
Contributor

huftis commented Feb 18, 2018

Strings having the %s placeholder have inconsistent formatting. Some have quotation marks around the placeholder and some don’t. Example:

  • Is the location “%s” wheelchair accessible?
  • Does %s have a toilet?

Also, some of the strings with quotation marks use the ASCII (i.e. non-typographically correct) quotation marks (" instead of “ and ”):

  • Is "%s" road lit here?

Suggestion: Change all the strings to be consistent by having typographically correct quotation marks around the placeholder. The three strings would look like this:

  • Is the location “%s” wheelchair accessible? [No change.]
  • Does “%s” have a toilet?
  • Is “%s” road lit here? [BTW, isn’t there a “the” missing after “Is” here?]

Note that the inconsistent formatting applies to more than the strings I have listed here. They are easy to find by searching for %s in strings.xml.

@westnordost
Member

That's true, I noticed this as well. The usage of quotation marks should be made consistent once and for all, in all languages as well. English, as the source language, would be fixed here in the repo; the other languages probably in POEditor.

Is “%s” road lit here? [BTW, isn’t there a “the” missing after “Is” here?]

Yeah, it should even be "Is “%s” lit here", I would say.

So, if someone (you? :-)) wants to do this, go ahead! Though perhaps we should first discuss whether we really want the names to be in quotation marks everywhere, or whether we want to leave the quotation marks out everywhere. (I have not weighed the pros and cons yet, so I have no strong opinion.)

@huftis
Contributor Author

huftis commented Feb 18, 2018

Regarding the pros and cons of quotation marks:

In written English, we don’t normally use quotation marks for street names, restaurant names etc. So that’s one argument against including quotation marks. On the other hand, having quotation marks more clearly highlights the name, which is useful in the more technical setting that StreetComplete is. It’s easier to see that the question is about this particular node/POI/way, which has this exact name. This might especially be important for POIs spelled with lowercase letters (which is somewhat popular with trendy cafés and restaurants these days …). So I have a (weak) preference for including the quotation marks.

Regarding the Is “%s” road lit here? string, there already is a string called Is "%s" lit here?. The “road” string is used for primary, secondary, tertiary, unclassified, service, residential, living_street and pedestrian ways, while the string without “road” is used for other ways (cycleways, parking lots etc.). But I agree that Is “%s” lit here? would work for both cases. (And the “road” string wouldn’t really work for living streets.)

Yes, I can try to create a PR for this. But if I change the Is “%s” road lit here? string to Is “%s” lit here?, what should be done with the corresponding code:

		boolean isRoad = Arrays.asList(LIT_NON_RESIDENTIAL_ROADS).contains(type) ||
				Arrays.asList(LIT_RESIDENTIAL_ROADS).contains(type);
		if (isRoad)
		{
			if (hasName) return R.string.quest_way_lit_named_road_title;
			else         return R.string.quest_way_lit_road_title;
		}
		else
		{
			if (hasName) return R.string.quest_way_lit_named_title;
			else         return R.string.quest_way_lit_title;
		}

Leave it or simplify it (and remove two strings)? I guess having separate strings might be useful for some languages? On the other hand, having two identical strings might just be confusing for the translators …
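
For illustration, a minimal sketch of what the simplified variant could look like if the two “road” strings were dropped. The integer constants are hypothetical stand-ins for R.string.quest_way_lit_named_title and R.string.quest_way_lit_title, so that the snippet compiles outside an Android project; it is not the project's actual code.

```java
/** Sketch only: title selection once the separate "road" strings are gone. */
final class LitTitleSketch {
    // Hypothetical stand-ins for R.string.quest_way_lit_named_title / quest_way_lit_title.
    static final int QUEST_WAY_LIT_NAMED_TITLE = 1;
    static final int QUEST_WAY_LIT_TITLE = 2;

    /** The road/non-road distinction no longer matters; only the presence of a name does. */
    static int getTitle(boolean hasName) {
        return hasName ? QUEST_WAY_LIT_NAMED_TITLE : QUEST_WAY_LIT_TITLE;
    }

    public static void main(String[] args) {
        System.out.println(getTitle(true));
        System.out.println(getTitle(false));
    }
}
```

Whether this is desirable depends on the question raised above: languages that need different wording for roads and other ways would lose that distinction.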

@rugk
Contributor

rugk commented Feb 19, 2018

As for pro/con, I'd prefer the style without the quotation marks, one usually does not use them like that.

A good compromise could be to make the names italic, if that is possible. That resolves the problem of rarely used quotation marks, but italic text has the same meaning as quotation marks.

@huftis
Contributor Author

huftis commented Feb 19, 2018

@rugk Or perhaps bold? That provides greater contrast against the rest of the text, which might be needed on small screens.

@westnordost
Member

westnordost commented Feb 19, 2018

I will take care of the lit issue, let's focus on the %s.

Generally, it is possible to format the strings, but it requires a code-change as well. Though, currently the questions are already in bold. Only making the placeholders bold would be a design change.

Basically, the strings can be formatted using simple HTML, i.e. Does <b>%s</b> have a toilet?. Because of a technical limitation, it would need to be Does &lt;b>%s&lt;/b> have a toilet?. This looks quite messy, and I can guarantee that at least one translator will fuck up that markup. So, the other option is for the code to apply the same markup automatically to every placeholder in quest titles. Is this an option? If yes, then the solution could be to remove each and every quotation mark and, e.g., italicize the placeholder instead.

But again, making it neither bold nor italic would still be an option.
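
To make the second option concrete, here is a minimal sketch in plain Java, so that it runs without the Android SDK. The class and method names are hypothetical, and this is not what StreetComplete actually implemented. It wraps every %s (or positional %1$s) placeholder of a title template in <i>…</i> and HTML-escapes the inserted name, so translators never see any markup. On Android, the resulting string would presumably be handed to something like android.text.Html.fromHtml before being displayed.

```java
import java.util.regex.Pattern;

/** Sketch only: apply the same markup to every placeholder of a quest title. */
final class PlaceholderMarkup {
    // Matches %s and positional forms such as %1$s.
    private static final Pattern PLACEHOLDER = Pattern.compile("%(\\d+\\$)?s");

    /** Escapes the characters that are significant in HTML text. */
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps each string placeholder in <i>...</i>, then fills in the escaped names. */
    static String formatTitle(String template, Object... names) {
        String marked = PLACEHOLDER.matcher(escapeHtml(template)).replaceAll("<i>$0</i>");
        Object[] escaped = new Object[names.length];
        for (int i = 0; i < names.length; i++) {
            escaped[i] = escapeHtml(String.valueOf(names[i]));
        }
        return String.format(marked, escaped);
    }

    public static void main(String[] args) {
        System.out.println(formatTitle("Does %s have a toilet?", "Fish & Chips"));
    }
}
```

Swapping <i> for <b> or a colour span would be a one-line change, which is the main appeal of doing this in code rather than in each translated string.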

@huftis
Contributor Author

huftis commented Feb 19, 2018

Having the placeholder formatting in code instead of in the string sounds like the best solution, but I don’t think it will work for Asian and Middle Eastern scripts.

I’m not sure what will actually happen if the fonts don’t support italics/bold. If the text is just rendered using the normal font, I guess that might be OK. In that case, the translator can always add quotation marks (or something equivalent) to separate the node/way name and the rest of the string.

@westnordost
Member

but I don’t think it will work for Asian and Middle Eastern scripts.

Why?

@rugk
Contributor

rugk commented Feb 19, 2018

This looks quite messy, and I can guarantee that at least one translator will fuck up that markup.

Just note that I've actually seen many projects that handle it that way. Small HTML tags for italic/bold are usually okay and understood by good translators (who often have at least enough HTML experience to understand these tags). (Okay, the encoding is a bit weird, but couldn't you also do that encoding in the code?) To be sure, you can always put something into a contributing guide or so.

@huftis
Contributor Author

huftis commented Feb 19, 2018

but I don’t think it will work for Asian and Middle Eastern scripts.

Why?

Because many of these scripts don’t have italics. They have traditionally never had italics, and the fonts don’t contain italic glyphs.

@huftis
Contributor Author

huftis commented Feb 19, 2018

BTW, are you sure the strings will show up as &lt;b>%s&lt;/b> in POEditor? The &lt; is just an XML-encoding thing. The actual strings are <b>%s</b>, and any translation editor worth its salt should recognise this and present the string in unescaped form (just like they should recognise that \" means " (POEditor does at least correctly support this)).

@westnordost
Member

I did not try, but it is irrelevant anyway as the way to go would be to give this kind of tagging to any placeholder-string in quest titles.

雜 杂 雑
雜 杂 雑

westnordost added a commit that referenced this issue Feb 19, 2018
@huftis
Contributor Author

huftis commented Feb 20, 2018

As the italics (or slanted) formatting might be problematic for non-western scripts, I recommend you contact the translators of StreetComplete into Chinese (simplified and traditional), Japanese, Malayalam, and Persian and ask them how this should/could preferably be handled in their respective languages. If they find italic problematic, we need to find an alternative (at least for these languages). If not, everything is fine. :)

One option besides quotation marks and italics would be to use colour for the placeholders. Not necessarily a hugely contrasting colour (e.g. orange); a moderately different colour (e.g. dark purple) could perhaps work?

@RubenKelevra
Contributor

Well, I would prefer a non-bold question without quotation marks around the street name, but with the name itself in bold. It's easy to read and distinguish and has good accessibility. Italic and bold also add the least overhead on the programming side, since different languages may use different symbols for quotation marks that we are not aware of.

@westnordost
Member

westnordost commented Feb 24, 2018

So, this needs to be done as follows:

  • @huftis creates a PR with all the quotation marks removed from quest titles in the English strings.xml
  • I implement and experiment with highlighting the placeholders in another way (bold/italic/color/nothing) and contribute to that PR (@huftis could do this too, if he knows Android development well enough)
  • After the PR has been merged, all quotation marks in the quest title strings are removed for all other languages on POEditor

@huftis
Contributor Author

huftis commented Feb 24, 2018

@westnordost OK, I’ll create a PR for the string changes. Unfortunately, I don’t know very much about Android development; I have just downloaded Android Studio. :) (I might be able to contribute the odd patch or two in the future, but probably nothing very advanced.)

@westnordost
Member

No problem, then I will do step 2.

huftis added a commit to huftis/StreetComplete that referenced this issue Feb 25, 2018
Some references to names of streets and POIs used quotation
marks; others did not. For consistency, this change removes
the quotation marks. The street/place names will in the future
be formatted programmatically, as described in issue streetcomplete#876.
@rugk
Contributor

rugk commented Feb 26, 2018

I don’t know very much about Android development; I have just downloaded Android Studio. :) (I might be able to contribute the odd patch or two in the future, but probably nothing very advanced.)

That's no problem. If you already have some programming knowledge in another language, it is not that difficult to solve small tasks.
Note that this one, of course, does not look like a small task, but just FYI, there is even a quest generator that you can use to make simple quests.

westnordost pushed a commit that referenced this issue Feb 26, 2018
Some references to names of streets and POIs used quotation
marks; others did not. For consistency, this change removes
the quotation marks. The street/place names will in the future
be formatted programmatically, as described in issue #876.
@westnordost
Member

Okay, steps 1 and 2 are done. Now, you need to remove all quotes from these strings in POEditor. Will you do that? (You need to temporarily add yourself as a translator for every language.)

@huftis
Contributor Author

huftis commented Mar 1, 2018

Usually, one would leave changes like these to the translators. But I think this change is pretty safe to do for all languages, so I’ll do it.

@huftis
Contributor Author

huftis commented Mar 1, 2018

Hmm, @westnordost, it looks like the strings at POEditor haven’t been updated yet? The source (English) strings still contain the quotation marks.

@matkoniecz
Member

@huftis Usually POEditor is updated before release.

@huftis
Contributor Author

huftis commented Mar 1, 2018

Changing the translations before the source strings are updated doesn’t make much sense. The translations will automatically be fuzzied when the source strings are changed, and would need to be retranslated (really just unfuzzied) then.

@westnordost
Member

Sorry, did that now.

@huftis
Contributor Author

huftis commented Mar 6, 2018

@westnordost Thanks for updating the files. I have thought about this some more, had a look at the French strings, and have concluded that it’s not really safe to mass-update the translations. Only the translators have the knowledge needed to update the strings. It’s easy enough to actually remove the quotation marks, but we can’t be sure that the strings currently marked as ‘fuzzy’ are the correct translations when we remove the quotation marks. The strings may have been fuzzied for reasons other than just the removal of the quotation marks (this is especially true for older strings that are similar to newer strings).

Also, it looks like there previously(?) was a very serious bug in POEditor’s handling of the translation files. For the last update, things seem fine, but for earlier string updates, the English source strings have been changed without POEditor marking the corresponding translations as ‘fuzzy’. While the source strings changed, the translation strings remained unchanged and were still marked as ‘translated’. Thus the translators had no way of seeing that the English strings had been changed, or of adjusting their translations accordingly. This is the case at least for strings in #883 (clarifying definitions of vegetarian/vegan) and for the change of ‘Open store’ to ‘Visit app store’.

Do you know anything about POEditor’s buggy behaviour here, @westnordost? Was this a known bug that now has been fixed, or do we risk it reappearing when updating source strings in the future too?

In general, it looks like POEditor’s handling of Android XML files is buggy. For example, when exporting the translations to the PO or POT format, the string ID (instead of the English source string) is used as source string (‘msgid’), making the resulting files completely useless for the translators. (For example, the translators are asked to translate ‘download_server_error’ instead of being asked to translate ‘Connection error while scanning for quests. Try again later.’. The correct behaviour would be to use the string ID as the ‘msgctxt’ and the actual English text as the ‘msgid’.) I observe a similar behaviour for other formats, like XLS and CSV. Exporting to XLIFF seems to work, though.

This is also true for the XML export! Fuzzy strings are exported as if they were translated! I don’t know how you export the translations (from an admin interface?), @westnordost, but if the same thing happens there, this is a big problem! It could potentially lead to completely incorrect translation of StreetComplete, confusing users using the app in non-English locales.
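
The PO layout described above (string ID as ‘msgctxt’, English text as ‘msgid’) can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and it has nothing to do with how POEditor generates its exports.

```java
/** Sketch only: write one gettext PO entry for an Android string resource. */
final class PoEntry {
    /** Escapes backslashes, double quotes and newlines for a PO string literal. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\"";
    }

    /** Builds one entry: Android string ID as msgctxt, English source text as msgid. */
    static String entry(String stringId, String english, String translation, boolean fuzzy) {
        StringBuilder sb = new StringBuilder();
        if (fuzzy) {
            sb.append("#, fuzzy\n"); // gettext flag comment for translations needing review
        }
        sb.append("msgctxt ").append(quote(stringId)).append('\n');
        sb.append("msgid ").append(quote(english)).append('\n');
        sb.append("msgstr ").append(quote(translation)).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(entry("download_server_error",
                "Connection error while scanning for quests. Try again later.", "", false));
    }
}
```

With this layout, an offline PO editor shows the English sentence to the translator, while the string ID is still available to map the entry back to strings.xml.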

@westnordost
Member

westnordost commented Mar 7, 2018

To mark something "fuzzy" is optional on import on new strings. So, I guess a few times, I didn't check that checkbox. In particular, I did not check the checkbox for the "remove quotation marks" import, so that is why you do not see any fuzziness in the other translations.
Please mass-update the translations by removing the quotation marks.

This is the case at least for strings in #883 (clarifying definitions of vegetarian/vegan) and for the change of ‘Open store’ to ‘Visit app store’.

I consider the ‘Visit app store’ change to be a change in wording only for English. Each translator has enough freedom to find the best wording for their language. The meaning did not change, only the wording (in English). So I do not consider the other languages as fuzzily translated.
For vegetarian/vegan, I consider it the same thing. That "meat" does not always include fish and poultry is something I consider very language (and culture) dependent.

Usually, when there is a real change, I create a new string altogether.

For Android, strings are not exported to PO or POT, so this does not really matter.

This is also true for the XML export! Fuzzy strings are exported as if they were translated!

That is okay by me.

@huftis
Contributor Author

huftis commented Mar 7, 2018

To mark something "fuzzy" is optional on import on new strings.

That sounds like a misfeature in POEditor.

So, I guess a few times, I didn't check that checkbox. In particular, I did not check the checkbox for the "remove quotation marks" import, so that is why you do not see any fuzziness in the other translations.

It’s the opposite. The ‘remove quotation marks’ strings are (correctly) marked as fuzzy in POEditor (for the three languages I have checked).

Please mass-update the translations by removing the quotation marks.

Hm. OK, I might do this, but it still feels like a dangerous operation. There may be linguistic issues that we are missing (but that native speakers might notice).

This is the case at least for strings in #883 (clarifying definitions of vegetarian/vegan) and for the change of ‘Open store’ to ‘Visit app store’.

I consider the ‘Visit app store’ change to be a change in wording only for English. Each translator has enough freedom to find the best wording for their language. The meaning did not change, only the wording (in English). So I do not consider the other languages as fuzzily translated.

That’s just bad localisation practice. Any change in the English strings might trigger (similar or unrelated) changes in the translation. That’s why in all translation projects translations are automatically marked as ‘fuzzy’ whenever a change in the original source string occurs. Yes, even if it’s just fixing a spelling mistake. The authors of the English strings don’t have knowledge about all the world’s languages and their particularities. The translators do have this knowledge (for their own language). That’s why we leave it up to the translators to evaluate whether a string change in English necessitates a string change in the translations. (If it doesn’t, it’s just a click of a button for the translator to ‘unfuzzy’ the string.)

The string change ‘Open store’ to ‘Visit app store’ is actually a nice case in point. Because the original string was confusing, translators got confused, and may very well have translated it incorrectly. Since the string was not fuzzied, they didn’t get to notice the improved wording (which would probably trigger them to fix the translation). For example, the French translation of ‘Visit app store’ is currently wrong (it roughly means ‘an open store’).
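
As a sketch of the practice described above (fuzzy whenever the source string changes, however small the change), assuming the source strings of two imports are available as ID-to-text maps. The names are hypothetical, and this is not a description of POEditor's import logic.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Sketch only: find the strings whose translations should be marked fuzzy after an import. */
final class FuzzyOnSourceChange {
    /**
     * Returns the IDs whose English source text differs between the two imports.
     * IDs that only exist in newSource are skipped: they have no translation yet.
     */
    static Set<String> idsToMarkFuzzy(Map<String, String> oldSource, Map<String, String> newSource) {
        Set<String> changed = new TreeSet<>();
        for (Map.Entry<String, String> e : newSource.entrySet()) {
            String before = oldSource.get(e.getKey());
            if (before != null && !before.equals(e.getValue())) {
                changed.add(e.getKey());
            }
        }
        return changed;
    }
}
```

The comparison is deliberately exact: even a corrected spelling mistake marks the translation for review, and the translator decides whether anything needs to change.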

For vegetarian/vegan, I consider it the same thing. That "meat" does not always include fish and poultry is something I consider very language (and culture) dependent.

And that’s why the translators look at the English string when translating it. In some languages, ‘meat’ might mean both meat and fish (or meat and poultry). Or there may be several possible translations of the word ‘meat’, depending on the exact meaning.

The translators try to convey the same meaning as the original string (using, of course, linguistically and culturally appropriate words and grammar), and try to be true to the original source string. So when the English string says ‘meat’, they will probably try to find an equivalent translation.

(But they might also note that ‘meat’ is ambiguous in English, and infer that, since the string is talking about vegetarian meals, they probably should clarify the string to exclude fish and poultry in the translation, especially if vegetarianism is relatively uncommon in their country. Such ‘improvements’ done just in the translations, without suggesting a string change in the original English strings (on the project’s bug tracker), are common.)

With the new version of the string, they might very well have chosen a different translation. Which means that there will be a systematic difference in the meaning of translations added before the English string change and after the string change. So this too is a good example of a string that should have been fuzzied.

For Android, strings are not exported to PO or POT, so this does not really matter.

It matters for the translators! Translating using an online platform like POEditor is both slow and error-prone. It’s difficult to ensure consistent quality, and POEditor doesn’t offer any tools that help – tools that are commonly available in offline editors. Translators and translation teams may also have created custom tools for checking the translation files. So the translators may choose to download the PO files and do offline translation, using more effective PO editors and tools (followed by reuploading the translations to POEditor). When the PO files one downloads aren’t correct PO files, all this is impossible.

This is also true for the XML export! Fuzzy strings are exported as if they were translated!

That is okay by me.

Hm, I’m not sure you understand how fuzzy messages are supposed to work. A fuzzy status for a string basically means ‘this translation may not be correct’. Or in other words, it means ‘this translation might be completely incorrect, and it needs a revision before inclusion in the program’.

Commonly, fuzzy strings are added by the translation tools whenever a new string is similar enough to an old string that the old one might be used as a time-saving suggestion for the translator. (Though for POEditor, I see that there’s also an ‘automatic’ status, that’s perhaps used in these cases.)

The fuzzy status may also be added by the translators. It’s then used to signal that the translation isn’t ready to be used in the program (i.e. it’s better to use the original source string), e.g. because the translator isn’t sure how to best translate the original string.

For example, I might translate the string ‘Unhewn cobblestone’ to ‘Hm, kanskje ein spesiell form for brustein? Undersøk dette nøyare!’ and mark the string as ‘fuzzy’. It would be a small catastrophe if this ‘translation’ found its way into the actual app … (The translation means ‘Hmm, perhaps a special type of cobblestone? Investigate this further!’)
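
A minimal sketch of an export that respects the fuzzy status as described above: a translation is only shipped if it is present and not fuzzy, otherwise the English source string is used. The Entry type and its field names are made up for the example and do not mirror POEditor's data model.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: decide per string whether the translation or the English source ships. */
final class FuzzyAwareExport {
    /** One string as a translation platform might hold it; the field names are made up. */
    static final class Entry {
        final String id;
        final String source;
        final String translation;
        final boolean fuzzy;

        Entry(String id, String source, String translation, boolean fuzzy) {
            this.id = id;
            this.source = source;
            this.translation = translation;
            this.fuzzy = fuzzy;
        }
    }

    /** Maps each string ID to the text to ship: the translation unless it is missing or fuzzy. */
    static Map<String, String> export(List<Entry> entries) {
        Map<String, String> shipped = new LinkedHashMap<>();
        for (Entry e : entries) {
            boolean usable = !e.fuzzy && e.translation != null && !e.translation.isEmpty();
            shipped.put(e.id, usable ? e.translation : e.source);
        }
        return shipped;
    }
}
```

On Android, leaving a fuzzy entry out of the localized strings.xml altogether should have the same effect, since resource lookup falls back to the default strings.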

@westnordost
Member

westnordost commented Mar 8, 2018 via email

@huftis
Contributor Author

huftis commented Mar 8, 2018

Yes, I’ll manually remove the quotation marks from the other strings. (But I still think it’s a bad and risky idea to second-guess other languages’ rules about punctuation and grammar.)

I’ve now done the changes for the French translation. It will probably take several days (or even weeks) to finish it for all the translations, though. The actual change is of course trivial (I have created a one-line regexp which takes care of it for all languages – tested on the XML files). But to actually apply it, I have to manually filter and download the relevant strings for 34 languages, refilter the strings, apply the changes, and manually upload the files. And (of course ☹️), it turned out that POEditor’s export function is buggy even for exporting to the XLIFF format (fuzzy strings are treated as non-fuzzy strings when exporting them).

Any chance we could move away from POEditor in the future? It seems like almost everything about it basically sucks (importing, exporting, handling of fuzzy strings, the UI for translators, and that it’s proprietary/non-free software). I think the best alternative would be Weblate. It’s free software, feature-rich, has Git integration, OK-ish support for offline translation, and the authors provide free hosting for free software. (It also has native support for Android XML string files. I don’t know how well this works, but we could always use a2po to provide PO(T) files instead.)

BTW, I believe I have found yet another bug in POEditor. Old strings that have been removed from the app seem to be stuck (forever?) in POEditor, and available for ‘translation’, creating unnecessary work for the translators. One example is the string ‘Is "%s" road lit here?’.

@ENT8R
Contributor

ENT8R commented Mar 8, 2018

Any chance we could move away from POEditor in the future?

From the few times I used this editor, I would say that this is the easiest translation editor I've worked with so far. The UI is very simple and even non-experts can easily translate the app into their favorite language.

[..] has Git integration

POEditor has Git integration as well...

[..] and the authors provide free hosting for free software

POEditor does offer this too, though it is well hidden in the settings...

It also has native support for Android XML string files

POEditor does have this as well and this project uses this feature. I also don't know exactly why you want to use PO(T) files which makes the whole thing probably more complicated...

@huftis
Contributor Author

huftis commented Mar 8, 2018

Any chance we could move away from POEditor in the future?

From the few times I used this editor, I would say that this is the easiest translation editor I've worked with so far. The UI is very simple and even non-experts can easily translate the app into their favorite language.

I think I have used almost all online translation tools that are, or have been, available (I’ve been translating free software for 15+ years), and I would agree that many of them are not very user friendly (and many have been quite buggy!). And I agree that the apparent simplicity of POEditor is a good thing. However, it doesn’t provide any features to help the translators provide high-quality translations or work efficiently with translations.

[..] has Git integration

POEditor has Git integration as well...

But it isn’t used?

It also has native support for Android XML string files

POEditor does have this as well and this project uses this feature.

And as I have mentioned in several examples, it is quite buggy.

I don't know exactly why you want to use PO(T) files which makes the whole thing probably more complicated...

Only marginally more complicated. :) It’s extremely easy to convert from and to Android strings and PO(T) files. So in practice it would just be a one- or two-line script that would automate it. And PO files have the advantage that there exist tons of tools that work with them, and that translators can use to ensure high-quality translations.

But the reason that I mentioned it was just that if the support for Android string files weren’t as mature (i.e. bug-free) as the support for PO(T) files, this wouldn’t be a problem. I have no reason to believe that the Android string file support isn’t any good (except that it’s a more recent feature), so if it works well, there would be no problem using it. :)

@rugk
Contributor

rugk commented Mar 9, 2018

Maybe discuss this in a new issue as it is more or less unrelated to this one?

@huftis
Contributor Author

huftis commented Mar 9, 2018

Maybe discuss this in a new issue as it is more or less unrelated to this one?

Agreed. I’ll open a new issue for the ‘find a better alternative to POEditor’ issue.

Anyway, I have now removed the old quotation marks for all fuzzy strings in all of the 34 languages in POEditor. (I have not touched any non-fuzzy strings. This is based on the thinking that if the translator intentionally adds a quotation mark where none exists in the English string, they probably have a good reason for doing so.)

I have taken care to handle the various variants of quotation marks ("«»“”„) used in the languages (including the special spacing rules for French). I have left the strings as fuzzy, so that the translators can easily find and check them, thus having the final say in what is the correct grammar for the %s strings.
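
The regexp actually used is not shown in this thread. Purely as an illustration, a pattern along the following lines would cover the quotation mark variants listed above, including an optional (no-break) space between the mark and the placeholder as in French « %s ». The class name and the exact character set are assumptions.

```java
import java.util.regex.Pattern;

/** Sketch only: strip quotation marks that directly surround a %s placeholder. */
final class QuoteStripper {
    // Quotation marks: " « » “ ” „ ; inner spacing: space, no-break space, narrow no-break space.
    private static final String QUOTE = "[\"\u00AB\u00BB\u201C\u201D\u201E]";
    private static final String SPACE = "[ \u00A0\u202F]?";

    // Group 1 is the placeholder itself (%s or a positional form such as %1$s).
    private static final Pattern QUOTED_PLACEHOLDER =
            Pattern.compile(QUOTE + SPACE + "(%(?:\\d+\\$)?s)" + SPACE + QUOTE);

    /** Replaces each quoted placeholder with the bare placeholder. */
    static String strip(String text) {
        return QUOTED_PLACEHOLDER.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(strip("Is \"%s\" lit here?"));
    }
}
```

The pattern does not check that the opening and closing marks belong together, so an unusual string with two separately quoted words around an unquoted placeholder could be mangled. That is one more reason to leave the results marked as fuzzy for the translators to review.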

Finally closing this issue. :)

@huftis huftis closed this as completed Mar 9, 2018