[WIP] BiDi and shaping support for Font::draw/Font::get_string_size #21791

Closed
wants to merge 1 commit into from


bruvzg
Member

@bruvzg bruvzg commented Sep 6, 2018

Changes the Font::draw, Font::draw_halign and Font::get_string_size functions and the Label and LineEdit controls to support bi-directional text and complex scripts using the ICU and HarfBuzz libraries.
Adds a Font::draw_paragraph function and the ShapedString/ShapedAttributedString classes for text display and input handling.
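
To illustrate what "bi-directional" means in practice, here is a minimal Python sketch. It is not code from this PR and not the full Unicode Bidirectional Algorithm that ICU implements; it only groups characters into left-to-right and right-to-left runs using the standard library's unicodedata.bidirectional(). The helper names strong_direction and direction_runs are made up for this example, and letting neutral characters inherit the previous run's direction is a deliberate simplification.

```python
import unicodedata


def strong_direction(ch):
    """Return 'R' for right-to-left letters, 'L' for left-to-right letters, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class in ("R", "AL"):  # Hebrew letters are "R", Arabic letters are "AL"
        return "R"
    if bidi_class == "L":
        return "L"
    return None  # spaces, digits, punctuation: resolved from context by the real algorithm


def direction_runs(text):
    """Split text into (direction, substring) runs in logical order."""
    runs = []  # list of [direction, substring]
    for ch in text:
        direction = strong_direction(ch)
        if direction is None:
            # Simplification (assumption): neutrals inherit the previous run's direction.
            direction = runs[-1][0] if runs else "L"
        if runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, ch])
    return [(direction, chunk) for direction, chunk in runs]


# Latin text, a Hebrew word, then Latin text again.
sample = "abc \u05e9\u05dc\u05d5\u05dd def"
for direction, chunk in direction_runs(sample):
    print(direction, repr(chunk))
```

In a full text pipeline, each run would then be shaped separately and the runs reordered for display, which is why drawing a string character by character in logical order is not enough for mixed-direction text.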

🔹 Third-party libraries
  • HarfBuzz (full library except non-relevant font backends (CoreText, Uniscribe) and built-in UCDN), 1.8.8, Old-MIT
  • ICU4C ("common" library part and built-in unicode base data), 62.1, Unicode License
🔸 Updates

7 Sep 2018 - Fixed build for ARMv7 Linux, added glyph offsets to get_string_size calculation.
27 Sep 2018 - Added multiline and rich text shaping APIs.
12 Oct 2018 - Added Label and LineEdit support; text shaping moved from Font to separate classes: ShapedString (for plain text) and ShapedAttributedString (for rich text).

🔹 TODO list
  • Single line plain text rendering API
  • Multiline plain text rendering API
  • Multiline spanned text rendering API
  • LineEdit control
  • Label control
  • Char-by-char text in other controls
  • RichTextLabel control
  • TextEdit control

Related:
#10546, #3081, #9961, #982

@bruvzg
Member Author

bruvzg commented Sep 6, 2018

Some visual comparison of 3.1 and this PR:

🔹 3.1 alpha 1

win_old

🔹 93a888e + this PR

win_new

🔸 References to compare rendering

Noto Nastaliq Urdu test page:
http://behdad.org/urdu/

Random Wikipedia articles:
https://he.wikipedia.org/wiki/פאדי_קונסידיין
https://km.wikipedia.org/wiki/ខេត្ដព្រះសីហនុ
https://th.wikipedia.org/wiki/อาณาจักรโชซ็อนโบราณ


Update: Some additional examples of OpenType features (Fonts: Kleymissky, Noto Sans, Fira Code)

🔹 3.1 alpha 1

screenshot 2018-09-29 at 22 42 51

🔹 this PR

screenshot 2018-09-29 at 22 42 12

@akien-mga akien-mga added this to the 3.1 milestone Sep 6, 2018
@akien-mga akien-mga requested a review from reduz September 6, 2018 09:34
@bruvzg bruvzg force-pushed the min-shaping branch 3 times, most recently from 2349a44 to 7c79abe Compare September 7, 2018 11:42
@@ -173,6 +173,26 @@ Comment: The FreeType Project
Copyright: 1996-2017, David Turner, Robert Wilhelm, and Werner Lemberg.
License: FTL

Files: ./thirdparty/harfbuzz/
Member

Sections are ordered alphabetically based on the "Files" path, so this should go after ./thirdparty/glad/.

COPYRIGHT.txt Outdated
2005, David Turner
2004, 2007, 2008, 2009, 2010, Red Hat, Inc.
1998-2004, David Turner and Werner Lemberg
License: MIT
Member

The text of HarfBuzz's MIT variant should be included, as it's not the same as the "classical" MIT license Godot uses (which Debian calls the Expat license).

Since there are dozens of MIT licenses with different wordings, I'd suggest naming this one MIT-HarfBuzz for clarity.

For reference, I checked https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/#license-specification and https://spdx.org/licenses/ but neither includes an identifier for HarfBuzz's MIT variant.
Debian itself documents it as "MIT" in https://metadata.ftp-master.debian.org/changelogs/main/h/harfbuzz/harfbuzz_1.8.8-2_copyright, but since we might end up adding more thirdparty code under other MIT variants, I prefer to be more explicit.

COPYRIGHT.txt Outdated
@@ -388,6 +408,41 @@ License: BSD-3-clause



License: Unicode
Member

Licenses are listed alphabetically too, so this should go just above License: Zlib.

SConstruct Outdated
@@ -190,6 +190,9 @@ opts.Add(BoolVariable('builtin_squish', "Use the built-in squish library", True)
opts.Add(BoolVariable('builtin_thekla_atlas', "Use the built-in thekla_altas library", True))
opts.Add(BoolVariable('builtin_zlib', "Use the built-in zlib library", True))
opts.Add(BoolVariable('builtin_zstd', "Use the built-in Zstd library", True))
opts.Add(BoolVariable('builtin_hb', "Use the built-in HarfBuzz library", True))
Member

I'd prefer builtin_harfbuzz.

Member

Should also be listed alphabetically.
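
As a rough illustration of the two requests above (rename to builtin_harfbuzz and keep the options alphabetical), here is a small Python sketch. It only looks at the option names visible in this diff hunk, so the real SConstruct has other options in between, and the helper first_out_of_order is hypothetical, not part of SCons or Godot's build scripts.

```python
def first_out_of_order(names):
    """Return the first name that sorts before its predecessor, or None if sorted."""
    for previous, current in zip(names, names[1:]):
        if current < previous:
            return current
    return None


# Option names from the diff hunk above, with the suggested rename applied
# (builtin_hb -> builtin_harfbuzz) but still in the order they were submitted.
as_submitted = [
    "builtin_squish",
    "builtin_thekla_atlas",
    "builtin_zlib",
    "builtin_zstd",
    "builtin_harfbuzz",
    "builtin_icu",
]

print(first_out_of_order(as_submitted))  # the option that breaks the ordering
print(sorted(as_submitted))              # the order the reviewer is asking for
```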

SConstruct Outdated
@@ -190,6 +190,9 @@ opts.Add(BoolVariable('builtin_squish', "Use the built-in squish library", True)
opts.Add(BoolVariable('builtin_thekla_atlas', "Use the built-in thekla_altas library", True))
opts.Add(BoolVariable('builtin_zlib', "Use the built-in zlib library", True))
opts.Add(BoolVariable('builtin_zstd', "Use the built-in Zstd library", True))
opts.Add(BoolVariable('builtin_hb', "Use the built-in HarfBuzz library", True))
opts.Add(BoolVariable('builtin_icu', "Use the built-in ICU library", True))
opts.Add(BoolVariable('use_sil_graphite2', "Use the external SIL Graphite library (LGPL licensed)", False))
Member

I don't see any LGPL text in src/hb-graphite2.{cc,h}, it has the same copyright header as HarfBuzz.

# Don't use dynamic_cast, necessary with no-rtti.
env.Append(CPPDEFINES=['NO_SAFE_CAST'])
# These flags help keep the file size down
env.Append(CPPFLAGS=["-fno-exceptions"]) #, '-fno-rtti' - dynamic_cast required by ICU
Member

I don't think we can/should do that. CC @eska014

Contributor

Yeah, turning off runtime type information saves about 200KiB last I tested, so we should avoid re-enabling it. Does ICU have some build option that avoids usage of RTTI?

@@ -124,6 +124,27 @@ Files extracted from upstream source:
- the include/ folder
- `docs/{FTL.TXT,LICENSE.TXT}`

## ICU4C
Member

Should be listed alphabetically too (and with two lines between sections).

@reduz
Member

reduz commented Sep 16, 2018

Hi Bruvzg, very glad that you are trying to get this to work; I think this is a lot of progress. How do you want to continue from here? I can imagine Label and RichTextLabel will need more complex code.

Also, one thing I was never too clear about is how big ICU is. I suppose in the range of <1 MB it could be compiled in, with the option to compile it out. If larger, it could be built in for the editor, but loaded via GDNative on exported projects (or optionally compiled in on an export template).

@bruvzg bruvzg force-pushed the min-shaping branch 2 times, most recently from 7b573fc to 20f16f1 Compare September 27, 2018 10:28
@bruvzg
Member Author

bruvzg commented Sep 27, 2018

> How do you want to continue from here? I can imagine Label and RichTextLabel will need more complex code.

I have added separate functions to break and justify lines for Label, and a separate shape_spans for RichText, but I haven't changed the controls themselves yet.

> Also, one thing I was never too clear about is how big ICU is. I suppose in the range of <1 MB it could be compiled in, with the option to compile it out. If larger, it could be built in for the editor, but loaded via GDNative on exported projects (or optionally compiled in on an export template).

Currently, the total size impact of HarfBuzz and ICU on a macOS release export template without LTO is 1.42 MB; some parts of ICU could probably be removed.

ICU break iterator data is about 6 MB for all languages and the full data is 26 MB, but it is optional (only used for line breaking and justification in some languages) and can be loaded as a resource when needed (set via the application/config/icudata project setting). The full ICU distribution also includes a tool to customize the data file.

With ICU data:
screenshot 2018-09-27 at 12 39 11

Without ICU data:
screenshot 2018-09-27 at 12 40 02
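
As a rough illustration of why break iterator data matters only for some languages: Thai (like Khmer) is normally written without spaces between words, so a breaker that only looks for whitespace finds nowhere to wrap, and ICU relies on dictionary-based break iterators for such scripts. The Python sketch below is not code from this PR; the helper whitespace_break_opportunities is hypothetical, and the Thai sample is the article title from the Wikipedia URL cited earlier in this thread.

```python
def whitespace_break_opportunities(text):
    """Indices where a naive, whitespace-only line breaker could wrap the text."""
    return [index for index, ch in enumerate(text) if ch.isspace()]


english_sample = "line breaking and justification"
thai_sample = "อาณาจักรโชซ็อนโบราณ"  # title of the Thai Wikipedia article linked above

print(whitespace_break_opportunities(english_sample))  # several candidate positions
print(whitespace_break_opportunities(thai_sample))     # empty: no spaces to break at
```

Without the dictionary data, a long Thai or Khmer paragraph can therefore only be wrapped at explicit whitespace or punctuation.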

> I don't see any LGPL text in src/hb-graphite2.{cc,h}, it has the same copyright header as HarfBuzz.

I didn't include the actual library (libgraphite2) code itself. Since it's LGPL, in most cases the library should be linked dynamically if someone needs it.

I'm not sure how best to handle this. Graphite won't be useful for most users; it's only used for rare languages and requires special smart fonts.

@bruvzg bruvzg changed the title BiDi and shaping support for Font::draw/Font::get_string_size [WIP] BiDi and shaping support for Font::draw/Font::get_string_size Sep 27, 2018
@bruvzg bruvzg force-pushed the min-shaping branch 3 times, most recently from a68b98c to 57d9eda Compare October 29, 2018 08:57
@bruvzg
Member Author

bruvzg commented Dec 29, 2018

FYI: I have uploaded a standalone version of the same thing at https://github.com/bruvzg/godot_tl

  • Works both as a built-in module (clone into Godot's modules directory and rebuild the engine) and as a GDNative module (dynamic library).
  • Same backend with some fixes, plus a paragraph and rich text edit control prototype.
  • Includes optional Graphite shaper support (LGPL/MPL).
  • In addition to String, works with raw UTF-8/UTF-16/UTF-32 PoolVectors (Godot's String on Windows uses 16-bit wchar_t but does not support surrogate pairs at any level, so it is effectively broken for everything outside the BMP).
  • Uses slightly modified font classes (based on Godot's font classes, stripped of unneeded parts) to provide smarter script-based font fallback/substitution.
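
To make the surrogate pair remark concrete, here is a small Python sketch (an illustration only, unrelated to the godot_tl code). It shows that a character inside the BMP fits in one 16-bit UTF-16 code unit, while a character outside the BMP needs two (a surrogate pair), so code that treats each 16-bit unit as one character splits it in half. The helper names utf16_code_units and is_surrogate are made up for this example.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_surrogate(unit):
    """True if a 16-bit code unit is half of a surrogate pair."""
    return 0xD800 <= unit <= 0xDFFF


bmp_char = "\u05e9"         # HEBREW LETTER SHIN, inside the BMP
astral_char = "\U0001F600"  # GRINNING FACE, outside the BMP

for ch in (bmp_char, astral_char):
    units = utf16_code_units(ch)
    print(f"U+{ord(ch):04X}:", [hex(u) for u in units],
          "surrogates" if any(is_surrogate(u) for u in units) else "single unit")
```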

@bruvzg bruvzg requested a review from a team as a code owner February 5, 2019 06:54
@bruvzg bruvzg force-pushed the min-shaping branch 2 times, most recently from 819be30 to efed218 Compare March 12, 2019 21:12
…(LineEdit, Label), line breaking and line justification using ICU and HarfBuzz libraries.
@slevy85

slevy85 commented Apr 11, 2019

Hi,
I am not sure this is the right place for this, but I have used this PR and ran into an issue when using Stretch Mode in 2D:
Project Settings/Display/Window/Stretch

The letters get mixed together:
image

image

Edit: It doesn't happen with the default Godot font, so maybe it is a font issue. I used Amiri fonts to test.

@realkotob
Contributor

realkotob commented Sep 2, 2019

@bruvzg If I need to add a Label with Arabic text, should I use this PR or the standalone version?

@MohammadKhashashneh
Contributor

MohammadKhashashneh commented Oct 1, 2019

Hi, can anyone update us on the current state of this pull request? It's been a while since the last exchange. @bruvzg Are you still working on it? How can I help?

@MohammadKhashashneh
Contributor

MohammadKhashashneh commented Oct 2, 2019

OK, I did a little research, and to answer my own question, here is what I found:

  • The branch builds and it looks really promising. Glad to see actual Arabic support happening. Great work @bruvzg
  • Aside from the above-mentioned TODO list, the following issues are currently present:
    • Could not build with the use_staitc_icu_data=true option
    • No default fallback/last resort when not specifying a font (I had to specify a font to show Arabic script)
    • No fallback either when using mixed text (2 scripts)
    • Crashes here and there (when running the scene, when modifying font properties...)
  • There is a separate module, https://github.com/bruvzg/godot_tl (I didn't try it yet),
    that is actually a bit more active, fixes some issues, supports more controls, adds fallback...

Now my new questions @reduz @akien-mga @bruvzg

  • How to proceed from here? Is fixing and continuing on this branch and ultimately merging still an option?
  • Should we just move on and use the separate module? Can't it be officially included in Godot?

I'm willing to help either way, so I'd appreciate it if someone could answer the above.

@akien-mga
Member

> • How to proceed from here? Is fixing and continuing on this branch and ultimately merging still an option?
> • Should we just move on and use the separate module? Can't it be officially included in Godot?

The plan is to have all these features built-in in the engine, I think it fulfills an important enough use case to be made available without hassle in the base engine.

But to do so, the implementation and API need to be designed to fit closely to Godot's design principles and architecture for rendering text - or Godot's architecture for text needs to evolve based on a consensus. At the time of this PR the consensus could not be reached, but later on @reduz and @bruvzg more or less reached a consensus on how to do it. Now what's needed is contributor time, but also the ability to refactor text APIs and probably break compatibility, which means waiting for Godot 4.0's development cycle.

@akien-mga akien-mga modified the milestones: 3.2, 4.0 Oct 4, 2019
@MohammadKhashashneh
Contributor

Great! Thank you @akien-mga for the clarification.
If that's the case, then 4.0 would be ideal to properly shake things up and do it right.
OK, I believe the thing to do in the meantime is to get better acquainted with this PR's details and get ready.

Will someone post the requirements/vision somewhere?

@OmarAglan

@bruvzg Any update on this?
I worked on it a bit and it's working with some add-ons on my side (I will not open a pull request for it).

@bruvzg
Member Author

bruvzg commented Oct 30, 2019

> Any update on this?

Nothing is being done here. If you need a somewhat better maintained version, see https://github.com/bruvzg/godot_tl

@aaronfranke aaronfranke marked this pull request as draft April 8, 2020 23:46
@aaronfranke
Member

aaronfranke commented May 13, 2020

@bruvzg What is the status of this PR? Is there any desire to continue this work and add it to Godot, or should users who want this feature simply be directed to your repository?

If not, abandoned pull requests will be closed in the future as announced here. Feel free to close this PR if you don't want to rebase and continue this work.

@jitspoe
Contributor

jitspoe commented Oct 14, 2020

Is this the closest thing to getting Arabic support, or is there something else being worked on? I saw this GDScript solution, but it seems like it would make more sense to support it natively in the engine: https://github.com/3akev/godot-arabic-text

Also, the GDScript version uses regular expressions, which I believe cause problems with Switch ports. (Is that still a thing?)

@bruvzg
Member Author

bruvzg commented Oct 14, 2020

@jitspoe The current version is in #41100 (for 4.0 / master); there's also https://github.com/bruvzg/godot_tl (for 3.2).

@nightblade9
Contributor

nightblade9 commented Oct 14, 2020

> I saw this GDScript solution, but it seems like it would make more sense to support it natively in the engine: https://github.com/3akev/godot-arabic-text

AFAIK that project was intended as a band-aid solution (it's mostly a port of some Python bidi libraries/code) until we can get something better. I believe there is an issue in the godot-feature-requests repo for proper bidi support.
