When a feature’s name property contains a semicolon, the style should replace it with a more presentable character.
Background
When the language fallback list only includes unsupported languages, such as vi, or language codes that cannot be supported, such as mul, the style resorts to the name property of each feature, which corresponds to the name key in OSM. The name key typically represents the name in the local language, but sometimes there are multiple names of equal standing in that language or multiple local languages. In regions where official language policies promote multilingualism, OSM communities long ago coalesced around ad-hoc delimiters, such as a hyphen, solidus, or space, between each name in name.
Problem
While mappers have tended to consider these delimiters “good enough”, in most cases they aren’t the only valid delimiters to use in a map context. In fact, these delimiters cause several noticeable deficiencies:
Chaotic differences between simultaneously visible features with no apparent linguistic or stylistic explanation
Characters like the hyphen that do a poor job of visually setting text apart
Poor line breaking, with words getting orphaned on a line otherwise containing text from another language or script
A jarring inconsistency with the punctuation expected in the speaker’s preferred language or in the context of an American-style map
Meanwhile, in other regions that experience more grassroots multilingualism, or where multiple names need to be listed in a monolingual context, mappers have often applied a semicolon, for consistency with non-name keys that require machine readability. These semicolons clearly aren’t intended to be shown to the user verbatim, where they look like a rendering glitch.
There is a long, winding discussion in the community forum about delimiters in name. It isn’t clear that there’s consensus to replace ad-hoc delimiters with the semicolon en masse, but there does seem to be some popular acceptance of the existing semicolons, at least in regions where no ad-hoc delimiter is well-established.
Proposed solution
Ideally, it would be the responsibility of the data consumer to apply an appropriate delimiter in all these cases. However, the ad-hoc punctuation and whitespace delimiters are too ambiguous to interpret reliably; far too many individual names legitimately contain those characters. For now, we should focus on pretty-printing the semicolons due to their clear intent. Every text-field that refers to a name property should replace occurrences of a semicolon with a more appropriate delimiter:
A newline in the main part of a point-placed label.
A bullet in a city label’s local-language gloss, which needs to be more compact.
Replacing characters in a text-field is challenging for reasons specific to this project’s technology stack. The style specification doesn’t include an expression operator to replace a substring within a larger string: mapbox/mapbox-gl-js#4100. Fortunately, it does contain the index-of operator for finding the substring, so we can concatenate that occurrence’s prefix and suffix with the replacement string. The style specification also lacks an operator for splitting or looping, so we’d only be able to replace a fixed number of semicolons. Most multiply named features have only a few names, so this shouldn’t be a major problem in practice.
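As a rough illustration of the index-of approach, here is a minimal sketch that builds a nested expression replacing at most a fixed number of semicolons. The helper name `replaceUpTo`, the `Expression` type, and the variable names `s0`, `i0`, and so on are hypothetical and not part of this project. The sketch assumes the renderer supports the `slice`, `let`, `var`, `case`, and `concat` operators alongside `index-of`, and it has not been checked against the style specification's type checker. Features whose `name` is missing would likely need a guard such as `coalesce`.

```typescript
// Hypothetical helper: a style expression is modeled as nested arrays.
type Expression = string | number | Expression[];

/**
 * Builds an expression that replaces up to `maxCount` occurrences of
 * `needle` in `input` with `replacement`. Each level finds the first
 * occurrence with index-of, keeps the prefix, and recurses on the suffix.
 * Occurrences beyond `maxCount` are left untouched.
 */
function replaceUpTo(
  input: Expression,
  needle: string,
  replacement: string,
  maxCount: number,
  depth: number = 0
): Expression {
  if (maxCount <= 0) {
    return input;
  }
  // Depth-suffixed names avoid relying on variable shadowing rules.
  const s = `s${depth}`;
  const i = `i${depth}`;
  const suffix: Expression = [
    "slice",
    ["var", s],
    ["+", ["var", i], needle.length],
  ];
  // Nested lets, because a binding may not be able to see its siblings.
  return [
    "let", s, input,
    [
      "let", i, ["index-of", needle, ["var", s]],
      [
        "case",
        [">=", ["var", i], 0],
        [
          "concat",
          ["slice", ["var", s], 0, ["var", i]],
          replacement,
          replaceUpTo(suffix, needle, replacement, maxCount - 1, depth + 1),
        ],
        // No occurrence found: return the string unchanged.
        ["var", s],
      ],
    ],
  ];
}

// Newline for the main part of a point-placed label (limit of 3 is arbitrary).
const pointLabelName = replaceUpTo(["get", "name"], ";", "\n", 3);
// Bullet for the more compact local-language gloss of a city label.
const glossName = replaceUpTo(["get", "name"], ";", " • ", 3);
```

The expression grows linearly with the replacement limit, so the limit trades style size against how many names in a list get pretty-printed; any semicolons past the limit would still render verbatim.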
Alternatives considered
The semicolon delimiters could be pretty-printed when generating the vector tiles, but that would limit our ability to choose different delimiters, for example depending on whether the label is point- or line-placed. In any case, we’ll eventually need the ability to parse out individual values from this list in order to deduplicate glossed names: #592 (comment).
/ref gravitystorm/openstreetmap-carto#4755