Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hebrew support #7512

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from
Draft

Hebrew support #7512

wants to merge 11 commits into from

Conversation

BLooperZ
Copy link

@BLooperZ BLooperZ commented Nov 5, 2024

This is an attempt to add hebrew support when rendering text

this is not yet fully working though

I couldn't figure out how to add freebidi dependency on platforms other than linux

on linux, there are some cursors displayed at random positions, even without the bidi conversion (on windows it seems to work correctly)

I hope there aren't other issues with cursor positioning
(and I haven't even added bidi support for text input😅)

help and comments will be greatly appreciated

Thank you for the amazing work on🙏

Copy link
Collaborator

@glebm glebm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please send a separate PR with just he.po? Makes reviewing easier, and we can submit it right away.

@@ -167,7 +168,7 @@ OptionalClxSpriteList LoadFont(GameFontTables size, text_color color, uint16_t r
const std::string_view language_code = GetLanguageCode();
const std::string_view language_tag = language_code.substr(0, 2);
if (language_tag == "zh" || language_tag == "ja" || language_tag == "ko"
|| (language_tag == "tr" && row == 0)) {
|| (language_tag == "tr" && row == 0)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to be some whitespace changes here that will fail clang-format
We use tabs for structural indentation but spaces for continuation indentation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will revert this change

Source/engine/render/text_render.cpp Show resolved Hide resolved
for (; !remaining.empty() && remaining[0] != '\0'
&& (next = DecodeFirstUtf8CodePoint(remaining, &cpLen)) != Utf8DecodeError;
remaining.remove_prefix(cpLen)) {
const auto drawSingleLine = [&]() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little bit unclear at a glance what this lambda captures, consider extracting the contents of the lambda into a function.

char32_t next;
std::string_view remaining = text;
size_t cpLen;
std::u32string text32 = ConvertUtf8ToUtf32(text);
Copy link
Collaborator

@glebm glebm Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We like to avoid allocations in code that runs on every frame, is it possible to implement this in a streaming fashion?

It looks like fribidi only supports UTF-32 but perhaps there is another library?
A quick search reveals https://github.com/Tehreer/SheenBidi, which supports UTF-8, and provides a number of handy functions for this. Looks like we can also replace our UTF-8 decoding implementations with calls to theirs.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, 2 notes regarding this (from my point of view, please feel free to suggest otherwise)

  1. Because I am trying to find and handle each line separately, working with UTF-32 actually greatly simplify the indexing of the string, allowing simpler logic to get each line, which I find harder doing on UTF-8.
  2. I was having trouble setting up additional dependency as I am not yet experienced with cmake. I was able to set up linking with freebidi binary on linux because it is quite popular. I am open to alternative but I would need guidance for setting them up.

Copy link
Collaborator

@glebm glebm Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I am trying to find and handle each line separately, working with UTF-32 actually greatly simplify the indexing of the string, allowing simpler logic to get each line, which I find harder doing on UTF-8.

UTF-8 is a self-synchronizing encoding, which means explicit line breaks can be found simply via find('\n'), without the need for decoding. If you use UTF-8, you'd use code unit (byte) indices rather than code point indices throughout (which necessitates a BiDi library that supports UTF-8 natively).

I am open to alternative but I would need guidance for setting them up.

It's quite simple, add a file 3rdParty/SheenBidi/CMakeLists.txt with:

include(functions/FetchContent_MakeAvailableExcludeFromAll)

include(FetchContent)
FetchContent_Declare(SheenBidi
    URL https://github.com/glebm/SheenBidi/archive/3721411b5ec2862240f34aeeb3e8ba59283ec2ec.tar.gz
    URL_HASH MD5=7dc0138bda1e16b217cee66cfd95a6f9
)
FetchContent_MakeAvailableExcludeFromAll(SheenBidi)

The target to link to is SheenBidi::sheenbidi.

@@ -2,7 +2,7 @@ if(NOT DEFINED DEVILUTIONX_ASSETS_OUTPUT_DIRECTORY)
set(DEVILUTIONX_ASSETS_OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/assets")
endif()

set(devilutionx_langs bg cs da de el es fr hr hu it ja ko pl pt_BR ro ru uk sv tr zh_CN zh_TW)
set(devilutionx_langs bg cs da de el es fr hr hu it ja ko pl pt_BR ro ru uk sv tr he zh_CN zh_TW)
Copy link
Collaborator

@glebm glebm Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Estonian has been added a month ago so there is now conflict in this line that will require rebasing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants