This repository has been archived by the owner on Dec 8, 2017. It is now read-only.

Complex Text Rendering #4

Closed
mikemorris opened this issue Dec 9, 2015 · 62 comments

@mikemorris

What is the state of text rendering in Mapbox GL?

We currently do not render scripts that require bidirectional support or complex text shaping correctly in mapbox-gl-native or mapbox-gl-js. This ticket will track adding proper support to both projects.

Broken Arabic labels example courtesy of @mushon

What's missing?

We need to add the following functionality for proper text rendering:

  • Unicode Bidi Algorithm to cut labels into logical segments and flip the display order of RTL (right-to-left) text.
    • Necessary for proper rendering of RTL scripts and mixed script labels containing RTL scripts interspersed with LTR text runs like numerals.
  • Complex text shaping
    • Necessary for proper display of scripts where adjacent glyphs should be transformed into new glyphs or rendered as combined glyphs or ligatures.

In terms of scripts affected by this, Hebrew requires bidi support, Indic scripts like Devanagari (used for Hindi) can require complex text shaping, and Arabic requires both bidi and complex text shaping support. Additionally, implementing the Unicode line-breaking algorithm should enable smarter line breaking for scripts like Chinese.
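
To make the bidi requirement concrete, here is a minimal Python sketch of splitting a label into directional runs and reordering them for left-to-right drawing. It is a toy, not the Unicode Bidi Algorithm: it assumes neutral characters (spaces, punctuation) simply inherit the direction of the preceding run, and that any label containing RTL text is laid out as an RTL paragraph. The function names `split_runs` and `visual_order` are made up for illustration.

```python
import unicodedata

STRONG_RTL = {"R", "AL"}       # strong right-to-left classes (Hebrew, Arabic letters)
LTR_LIKE = {"L", "EN", "AN"}   # Latin letters and digits keep left-to-right order


def split_runs(text):
    """Split text into (direction, substring) runs.

    Simplification: neutral characters inherit the direction of the
    preceding run (or "ltr" at the very start of the label).
    """
    runs = []
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in STRONG_RTL:
            direction = "rtl"
        elif bidi_class in LTR_LIKE:
            direction = "ltr"
        else:
            direction = runs[-1][0] if runs else "ltr"
        if runs and runs[-1][0] == direction:
            runs[-1] = (direction, runs[-1][1] + ch)
        else:
            runs.append((direction, ch))
    return runs


def visual_order(text):
    """Reorder a label from logical order to left-to-right drawing order."""
    runs = split_runs(text)
    if not any(direction == "rtl" for direction, _ in runs):
        return text
    # Assumed RTL paragraph: runs are placed right to left, and characters
    # inside each RTL run are mirrored; LTR runs (numerals) stay as they are.
    pieces = [s[::-1] if d == "rtl" else s for d, s in reversed(runs)]
    return "".join(pieces)


# Hebrew word followed by a numeral, stored in logical order.
label = "\u05e9\u05dc\u05d5\u05dd 123"
print(split_runs(label))
print(visual_order(label))
```

Drawing the logical string left to right without this step is what produces the "backward" labels described in this ticket; the numeral run is the part that must not be flipped.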

How do we currently handle fontstack fallbacks?

Currently, the Protobuf-encoded "glyph tiles" we create with node-fontnik are a composited "fontstack": glyphs missing from fonts higher in the stack are filled in by glyphs from fonts further down the stack. We therefore end up with a combined Helvetica, Arial Unicode fontstack with per-glyph fallbacks in rendered text.

Fontstack                     Coverage
"Helvetica"                   Latin
"Arial Unicode"               Latin, Arabic
"Helvetica, Arial Unicode"    Helvetica Latin, Arial Unicode Arabic

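The per-glyph fallback in the table above can be modelled roughly as follows. This is only a toy model in Python, not node-fontnik's code: the `composite_fontstack` helper and the coverage sets are assumptions made up for illustration.

```python
def composite_fontstack(stack, coverage):
    """Return {code point: font name}, taking each glyph from the first
    font in the stack that contains it."""
    merged = {}
    for font in stack:
        for codepoint in coverage[font]:
            # setdefault keeps the entry from the font higher in the stack.
            merged.setdefault(codepoint, font)
    return merged


# Toy coverage data (not the real fonts' coverage).
coverage = {
    "Helvetica": {ord("a"), ord("b")},               # Latin only
    "Arial Unicode": {ord("a"), ord("b"), 0x0628},   # Latin plus ARABIC LETTER BEH
}
merged = composite_fontstack(["Helvetica", "Arial Unicode"], coverage)
print(merged[ord("a")], merged[0x0628])
```

Each code point ends up owned by exactly one font, which is why glyphs from different fonts can sit side by side in one rendered label.
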
Why will this not work for complex text shaping?

Because shaping tables are specific to a font file, to apply shaping properly we will need to work exclusively with glyphs from a single font. Instead of using "fontstack" glyph tiles, we will need tiles which contain all the glyphs in a given range for a single font. This approach should also limit glyph atlas duplication for multiple fontstacks with a common fallback.

How will we do this?

We will first need to segment each label into text runs with the Unicode bidi algorithm (splitting words into individual segments, and splitting Arabic text segments from numerical segments, for example). Then, for each segment, we will try each font in the fontstack in order until a match is found: shape the segment with that single font's shaping table, then check (using a glyph coverage file) whether that font can render every character in the shaped result. If coverage is incomplete, we will fall back to the next font in the stack.

(It's possible we could check glyph coverage first, but the necessary glyphs may change after shaping, and the glyph coverage check would have to be repeated. We should test performance to determine whether a possibly inaccurate initial coverage check is faster than redundant shaping passes for fonts lacking glyph coverage.)
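
The shape-then-check loop described above could look something like this sketch. Everything here is hypothetical: `pick_font`, the `Shaper` signature, and the coverage sets are stand-ins, and the "shapers" are identity functions rather than real GSUB processing.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical shaper signature: text run in, list of glyph names out.
Shaper = Callable[[str], List[str]]


def pick_font(run: str, stack: List[str], shapers: Dict[str, Shaper],
              coverage: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first font whose shaped output is fully covered, else None."""
    for font in stack:
        glyphs = shapers[font](run)      # shape with this font's tables only
        if all(glyph in coverage[font] for glyph in glyphs):
            return font                  # every shaped glyph is available
    return None


# Toy data: identity "shaping" and made-up coverage sets.
shapers = {"Open Sans": list, "Arial Unicode": list}
coverage = {
    "Open Sans": set("resume"),
    "Arial Unicode": set("resum\u00e9"),
}
stack = ["Open Sans", "Arial Unicode"]

print(pick_font("r\u00e9sum\u00e9", stack, shapers, coverage))
```

Note that the coverage check runs on the shaped output, not the input string, which is the reason a coverage-first check may need to be repeated after shaping.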

Example

For the fontstack "Open Sans, Arial Unicode" and the label résumé: no glyphs change when shaped with Open Sans/gsub.sfnt, so check whether all characters in résumé exist in Open Sans.coverage.json. If not (say é is missing), reshape with Arial Unicode/gsub.sfnt, then check whether all characters in résumé exist in Arial Unicode.coverage.json.

Once a font with matching coverage has been determined, we can request glyph tiles from a single font containing the necessary glyphs, like Arial Unicode Regular/0-255.pbf.
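
As a sketch of how a client might map a needed code point to a single-font glyph tile like the one named above: the helper below assumes ranges are 256 code points wide, which is inferred from the 0-255.pbf example and not confirmed here. `glyph_range_path` is a hypothetical name.

```python
def glyph_range_path(font: str, codepoint: int, range_size: int = 256) -> str:
    """Build a '<font>/<start>-<end>.pbf' path for the range holding codepoint.

    Assumption: ranges are aligned blocks of `range_size` code points.
    """
    start = (codepoint // range_size) * range_size
    return f"{font}/{start}-{start + range_size - 1}.pbf"


print(glyph_range_path("Arial Unicode Regular", ord("\u00e9")))
print(glyph_range_path("Arial Unicode Regular", 0x0628))
```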

How will we get/use these "shaping tables"?

Shaping tables are contained in font files as GSUB (glyph substitution), GPOS (glyph positioning) and kern (kerning) tables, which can be read by the FreeType function FT_Load_Sfnt_Table. We will need to extract these tables from uploaded font files, then request them from the client through an API. We've started work on extracting shaping tables, but it isn't quite functional yet.
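
For illustration, here is a rough Python stand-in for the extraction step. It does not use FreeType; it reads the sfnt table directory directly (a 12-byte header followed by 16-byte records of tag, checksum, offset, length, all big-endian). The function name `extract_tables` is made up, and the sketch assumes a single-font sfnt file, ignoring collections and compressed formats such as WOFF.

```python
import struct

SHAPING_TAGS = (b"GSUB", b"GPOS", b"kern")


def extract_tables(font_bytes: bytes, tags=SHAPING_TAGS) -> dict:
    """Return {tag: raw table bytes} for the requested tags found in the font."""
    # Header: uint32 sfnt version, uint16 numTables, then 6 bytes we skip.
    _version, num_tables = struct.unpack_from(">IH", font_bytes, 0)
    tables = {}
    for i in range(num_tables):
        # Table record: 4-byte tag, checksum, offset from file start, length.
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i)
        if tag in tags:
            tables[tag] = font_bytes[offset:offset + length]
    return tables
```

Usage would be along the lines of `extract_tables(open(path, "rb").read())`, with the resulting byte strings stored per font and served to the client on request.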

To use these shaping tables, we will need to pass them into HarfBuzz for mapbox-gl-native, or an emscripten port for mapbox-gl-js. I'm not sure if HarfBuzz currently has an interface for reading raw shaping tables (it generally works with full font files). If this interface doesn't currently exist, we'll need to add it.

Resources

Universal

C++

JavaScript

/cc @mapbox/gl

@mushon

mushon commented Dec 9, 2015

👍

@incanus

incanus commented Dec 9, 2015 via email

@mikemorris
Author

Made some initial progress on integrating HarfBuzz in mapbox-gl-native in https://github.com/mapbox/mapbox-gl-native/compare/harfbuzz, but the biggest stumbling block I've hit so far has been the requirement for using glyph indices (as opposed to Unicode code points) for layout.

From the ICU docs (but HarfBuzz shares the same pattern here):

Since many of the contextual forms, ligatures, and split characters needed to display complex text do not have Unicode code points, they can only be referred to by their glyph indices. Because of this, the LayoutEngine's output is a list of glyph indices. This means that the output must be displayed using an interface where the characters are specified by glyph indices rather than code points.
http://userguide.icu-project.org/layoutengine

This is complicated by our current SDF spec only tagging glyphs by char code, not glyph index, and will be another consideration in how a v2 SDF spec will need to be structured.
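
A toy Python illustration of why char-code tagging breaks down: once shaping substitutes a ligature, the output glyph has no code point to look it up by. The glyph indices and the ligature table below are invented for the example and do not come from any real font.

```python
# Toy font data (invented values).
CMAP = {0x0644: 10, 0x0627: 11}    # code point -> glyph index (LAM, ALEF)
LIGATURES = {(10, 11): 500}        # glyph pair -> ligature glyph index


def shape(text):
    """Map characters to glyph indices, then apply pairwise ligatures."""
    glyphs = [CMAP[ord(ch)] for ch in text]
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in LIGATURES:
            out.append(LIGATURES[pair])   # two input glyphs become one
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def code_point_for(glyph):
    """Reverse cmap lookup; None when the glyph has no code point."""
    reverse = {g: cp for cp, g in CMAP.items()}
    return reverse.get(glyph)


shaped = shape("\u0644\u0627")
print(shaped, [code_point_for(g) for g in shaped])
```

An SDF atlas keyed by char code has no slot for a glyph like 500 in this toy font, so a v2 spec would need to address glyphs by index.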

@jimmont

jimmont commented Apr 19, 2016

As noted, Hebrew labels are backward/flipped, in Tel Aviv for example: http://localhost:9966/#19/32.09430/34.78352
Someone who understands the language looked at my mapbox-gl-js map and said "they're nonsense" until I pointed out the labels are just backward. I added a few name:en values via OSM, which hides some of this for me, but for anyone using Mapbox in Israel to get around, this issue makes matching the map to signs (the actual wayfinding) an awkward exercise, assuming (and hoping) they notice the pattern in the first place. Is there a solution I can implement now?

@kkaefer
Member

kkaefer commented Apr 19, 2016

@mikemorris is currently working on fixing this

@mushon

mushon commented Apr 19, 2016

@mikemorris do let us know if you need some help with testing as this is definitely a pressing issue for many of us. Thanks!

@mikemorris
Author

mikemorris commented Apr 19, 2016

A little extra help would certainly be appreciated @mushon! I'll continue to post updates here as I get a better idea of how to break this project down into concrete chunks to build and test.

@jimmont My initial work in mapbox/mapbox-gl-js#1841 may be an option for you. It only handles bidirectional text (not complex shaping), but it sounds like that might be all you need currently?

@Arman92

Arman92 commented Jun 2, 2016

We are anxiously waiting for any update of the issue status.

@ghost

ghost commented Jun 7, 2016

This issue is fatal (for use with CJK).
Is there a temporary workaround?

@mikemorris
Author

mikemorris commented Jun 7, 2016

@epsg3857 Can you explain how/which CJK scripts are affected by the lack of bidirectional text or complex shaping? Are you referring to the line-breaking issue originally reported in mapbox/mapbox-gl-native#1223, vertical label support or something else?

@1ec5
Contributor

1ec5 commented Jun 7, 2016

@epsg3857, mapbox/mapbox-gl-native#5077 was incorrectly linked to this issue. This ticket tracks complex font shaping and right-to-left text support, not CJK. The issue you’re running into is mapbox/mapbox-gl-native#1681, possibly exacerbated by mapbox/mapbox-gl-native#1444.

@ghost

ghost commented Jun 7, 2016

I see.
understood.

@Arman92

Arman92 commented Aug 17, 2016

Any updates? It's been a while :-(

@mushon

mushon commented Aug 17, 2016

I second that.

Just as an FYI, this is not some "nice to have" feature, it is a very serious bug. Right now every Mapbox GL map, no matter what label language it uses, shows a lot of meaningless reversed text all around the Middle East and North Africa, as many OSM labels don't have an English name.

And I must add, I haven't found myself having to beg a company for RTL support since the early days of Macromedia Flash. Somehow Mapbox always seemed to me like a company with a different image of the world, and a different vision of how different cultures and places should be represented on the web. It is quite frustrating and frankly insulting to see how an issue that should be a blocker for any beta release and affects hundreds of millions of users is continuously disregarded. While reversing our exotic letters for our cities and streets on every GL map makes them equally meaningless to you, for us this is the image of technological colonialism. On a map.

Forgive my harsh words, but I hope this helps you finally see this long ignored blind spot and address this issue more urgently.

With otherwise utter admiration and respect,

Mushon Zer-Aviv
Mushon.com | Shual.com | @mushon



@tmcw

tmcw commented Aug 17, 2016

Hi Mushon,

We absolutely understand how this is an important issue, both in terms of parity with other technology and being able to represent all languages equally. I hope it's clear, as this ticket's intro lays out and the many referencing issues explain, that this is a very difficult issue. Flash and desktop applications can take advantage of existing C++ text-shaping logic, as well as fonts-on-disk. Mapbox GL JS loads fonts incrementally (yay) which means that this sort of problem is incredibly, deeply, months-long-difficult hard (boo).

We understand that this is a big issue - many Mapboxers aren't native English speakers and we want our maps to be a tool for equality and understanding. Unfortunately, this is, simply, incredibly hard, and thus it's taken an extremely long time to even get a prototype off the ground. If you know any tricks or want to connect us with people who know a shorter way to a solution, contributions or connections would be incredibly appreciated. But please, for the time being, understand that this isn't disregard or colonialism or etc., it's just unvarnished "difficulty".

Tom

@MahdiAstanei

Now... can I fix this on Android? Is there a way?

@mushon

mushon commented Nov 21, 2016

@ChrisLoer we're very happy to see this.
Do you have an estimate of when we can expect this to become production-ready?

@ChrisLoer

@mushon The current gl-native changes should go out in the Android SDK 4.3.0 and iOS SDK 3.5.0 releases. The timing of those releases depends on several other features, but the target is in the next few months. The gl-js changes aren't ready to merge yet, but I'm hopeful that within a few weeks I'll be able to merge them, at which point we'll know which release they'll go into.

As for the HarfBuzz support for Indic text, all I can say for sure is that I'll start working on it as soon as the current round of gl-js changes go in. So by the end of the year I should at least be able to provide a better estimate.

@ChrisLoer

I've merged the gl-native fix for line-breaking diglossic text (mapbox/mapbox-gl-native#7112), and I'm continuing to work on porting the gl-native changes to gl-js.

@ChrisLoer

mapbox/mapbox-gl-js#3758 contains an implementation of Arabic shaping and bidirectional layout for gl-js. We're not ready to merge it yet as we work through the performance implications, but if you're interested in seeing our progress or providing feedback, please take a look!

@igal1c0de4n

igal1c0de4n commented Jan 15, 2017

@ChrisLoer it's been more than a month since the last update on this issue. Can you provide an update on where it stands, what's remaining, etc.?

Also, relating to performance:
Is there a way to get some solution out there, then iterate and improve performance?
Perhaps this should be open for discussion, as quite a few people & projects are waiting for this fix (it's been more than a year since this issue was created). All projects which consider the Middle East as important are pretty much prevented from relying on mapbox-gl. How much of a problem that is for Mapbox, I'm not sure, but I do think the priority of this issue should be re-evaluated.

@ChrisLoer

@knigal For gl-js, we've settled on the idea of loading the support as a plugin (as a way to get the functionality out there as we still keep working to improve the performance). The changes and documentation are ready, we're just finalizing the relatively minor issue of how to name the plugin (mapbox-gl-arabic-text is the leading contender right now). This should go in very soon and be available in our February release of gl-js.

For gl-native, the changes are still waiting to go out as part of the Android SDK 4.3.0 and iOS SDK 3.5.0 releases, still targeted for the early months of this year.

@1ec5
Contributor

1ec5 commented Jan 15, 2017

If you’re interested in trying out this functionality ahead of time on a mobile or desktop platform, check out our instructions for building the SDKs yourself:

https://github.com/mapbox/mapbox-gl-native/tree/master/platform/android#contributing-to-the-sdk
https://github.com/mapbox/mapbox-gl-native/blob/master/platform/ios/INSTALL.md
https://github.com/mapbox/mapbox-gl-native/blob/master/platform/macos/INSTALL.md

Please file any issues you see in the mapbox-gl-native repository.

@jimmont

jimmont commented Jan 15, 2017

@ChrisLoer perhaps your team could consider mapbox-gl-rtl-text as the plugin name if it isn't already. Save a few letters.

@1ec5 @ChrisLoer how can we test the update to gl-js (that's slated to come in February)?

Thanks for this update and the effort as well. We can now map plans for 2017.

@igal1c0de4n

igal1c0de4n commented Jan 15, 2017

Q: @ChrisLoer Will the upcoming release correctly display Hebrew in addition to Arabic? If so, mapbox-gl-rtl-text would be a better name (since mapbox-gl-arabic-text implies Arabic-only).

By the way, only mapbox-gl.js is relevant to my projects, so I'm adding a +1 on @jimmont's request for a way to preview the .js plugin. I'd be happy to try it and post feedback.

@lucaswoj

Closing this ticket as part of an effort to merge this repo into the mapbox-gl-js repo. Work on this project is nearing completion. Status updates and conversation will continue at mapbox/mapbox-gl-js#3708

@lucaswoj

The initial phase of support for right-to-left and Arabic script is now complete.

The primary tracking issue for the remaining complex text challenges is now: mapbox/mapbox-gl-native#7774

@chuckhacker

Great!!!

@johnnybegood7

Any consideration of using Graphite2 to handle this on top of FT?

@ChrisLoer

@johnnybegood7 Our assumption is that if we were adding support for fonts that used Graphite, it would be through the Harfbuzz wrapper of Graphite (see discussion in mapbox/mapbox-gl-native#7774).

@AbdulrhmanBazrto

Was this problem solved for the Android SDK? :)

@zugaldia
Member

@3bo0o0odee We integrated ICU to support bidirectional text layout and Arabic text shaping with mapbox/mapbox-gl-native#6984 which is available in the Android SDK 5.x series. The remaining complex text work is ongoing and tracked on mapbox/mapbox-gl-native#7774.

@MahdiAstanei

Is there anybody who can write a step-by-step fix for Android? Can we use Mapbox on Android for Persian, or in any Arabic-speaking countries?

@kkaefer
Member

kkaefer commented Aug 7, 2017

@MahdiAstanei Upgrading Mapbox to the latest released version (currently 5.1) will include this fix. You don't have to do any special configuration.

@mushon

mushon commented Aug 7, 2017

Thanks for making progress there but what about supporting this at Mapbox Studio and supporting the static (bitmap) tiles? We really can't design like this… @ChrisLoer can you give us an update?

@ChrisLoer

@mushon The plan right now is that Studio support won't come until we've integrated RTL text into the core library. We are exploring using web assembly as a way to integrate more "native" code into GL JS -- any solution we come up with there will probably include RTL text. I know that "just enable the plugin in Studio" would be a more immediate solution for you, and that's still an open discussion, but it's not our current plan. An alternative short term solution we've discussed is having Studio preview raster tiles generated by api-gl, which would include the shaping support.

When you say static (bitmap) tiles, do you mean tiles generated by api-gl? They should already be fixed.

@mushon

mushon commented Aug 7, 2017

@ChrisLoer I would really appreciate at least a browser plugin that would give us something to work with in studio until you implement a more robust solution.
As for the bitmap tiles, you are right, they are processed correctly both for static images and for Leaflet. It's just that the static image interface is inconsistent: the preview it currently shows doesn't match what it actually generates.
