-
-
Notifications
You must be signed in to change notification settings - Fork 729
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combining characters rendering in wrong place #2384
Comments
In Americana’s font stack, is the advance measure for the glyph at index 787 a negative number equivalent to −1 ch? If so, I would expect the code below to at least center the diacritic over a medium-width preceding letter such as “c”. This would still be inaccurate for wider letters like “w” and narrower letters like “i”, but better than what these screenshots show. maplibre-gl-js/src/symbol/shaping.ts Lines 374 to 377 in 6d55d5b
|
Sounds like we'll need to investigate this. The fonts are created using Font.create(...) in fonteditor-core. Specifically, this is done here: We'll need to investigate those font creation options. I don't fully understand what it's doing so there may very well be a bug in how we're creating the fontstacks. |
Below is the font object for glyph 787 in the bold fontstack. This gets written to a TTF which is then consumed by
|
This is specifically about scripts where combining characters don't affect the base character's width, which excludes most Indic scripts. #50 is the general case, and solving that will indeed solve this. This issue potentially has an easier solution than #50, but if that is fixed first, this one is moot. |
This is actually two, maybe three distinct issues:
Looking at #2384 (comment), I think I was a bit off: This is correct, but it means the glyph PBF doesn’t contain enough information to infer that the character should be centered over the previous character. Either the glyph metrics fields would need to be extended, or we’d need some sort of heuristic that a negative Alternatively, GL JS could hard-code the set of codepoints that are known to be combining characters, based on the Unicode standard. Maybe there are granular properties about how characters combine. GL JS already hard-codes some other properties about codepoints, such as whether they allow breaking or can be rotated in vertical text, so there is certainly precedent. Maybe this is what text layout engines are expected to do for combining characters too? During shaping, the following code assumes the glyph begins at zero, with no affordance for combining characters: maplibre-gl-js/src/symbol/shaping.ts Lines 701 to 702 in 6d55d5b
The glyph PBF does contain the
This code I linked in #2384 (comment) is the culprit: it assumes it should advance by the spacing even if the character is zero-width. This would also produce unexpected spacing for an invisible character such as a ZWNJ: maplibre-gl-js/src/symbol/shaping.ts Lines 374 to 377 in 6d55d5b
This part is more straightforward to fix: if the glyph’s |
#2388 attempts to work around the problem with hard-coded metrics for combining characters, though it looks like it might be possible to obtain more robust metrics from the font. maplibre/maplibre-style-spec#145 (reply in thread) has a proof of concept of first-class support for combining text by properly segmenting the text by grapheme cluster and sending all text, including Latin, through the alternative TinySDF codepath, avoiding the manual shaping code above altogether. This approach would correctly handle any Latin combining diacritics, though ligatures would remain unsupported. |
maplibre-gl-js version: 2.4.0
In some scripts, such as Latin and Thai, combining characters never change the width of the preceding character to which they attach. Such combining characters are capable of being rendered properly without a full implementation of glyph substitution, as is needed for Indic scripts (see #50). Diacritics in these scripts look best when centered above or below characters.
In Maplibre GL JS, many combining characters are being rendered off to the side from where they should be. Combining characters are being rendered typewriter-style (the naïve "one codepoint = one glyph" implementation), as zero-width characters that place a diacritic in a fixed position about the previous letter. However, Latin letters (for example) have variable widths, so the same combining diacritic might not be centered over i and w in this rendering method.
This is especially noticeable when
text-letter-spacing
is set to a value greater than 0. In this case, extra spacing is erroneously inserted between combining characters and their preceding characters. Diacritics are then placed over this extra spacing instead, causing them to appear off-center.Example 1
https://americanamap.org/#map=14.81/49.02432/-123.10257&language=mul
A park in Tsawwassen, British Columbia, Canada, has the name sməqʷəʔelə həw̓aləm̓ew̓txʷ, posted alongside its English name "Blue Heron Nest Park". The Halkomelem language has a plethora of consonants, many of which are represented by a Latin letter with a comma above. Since these consonants are not available in Unicode as precomposed characters, this language makes frequent use of the codepoint U+0313 COMBINING COMMA ABOVE.
In m̓ and w̓, MapLibre places the comma above the right edge of the character instead of the center:
Example 2
https://americanamap.org/#map=2.66/-12.43/71.09&language=th
The Thai name for the Indian Ocean is มหาสมุทรอินเดีย. The below example is in Noto Sans Thai, which may have a dramatically different style than what displays inline here, but it looks like correct Thai:
However, when
text-letter-spacing
> 0, combining characters introduce double spacing between neighboring characters, and render awkwardly to the side of the preceding character. Note that there are no spaces in this name—Thai uses spaces to separate sentences, not words:The text was updated successfully, but these errors were encountered: