Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework font fallbacks #4658

Merged
merged 1 commit into from
Sep 17, 2022
Merged

Rework font fallbacks #4658

merged 1 commit into from
Sep 17, 2022

Conversation

sommerluk
Copy link
Collaborator

@sommerluk sommerluk commented Aug 21, 2022

Fixes #4653

Adds fallback fonts and simplifies the font groups.

Changes proposed in this pull request:

  • Treat Noto Sans Arabic like all other Sans fonts. Until now, it had a special treatment because it could redefine some Latin characters; therefore, it was placed behind. However, this is apparently something that happened in Noto Nashk Arabic (which we used in the past). Since a few years, we are using yet Noto Sans Arabic, which does not have this problem. Also, Arabic is not the only script for which Noto provides particular styles: Hebrew for instance has the Noto Rashi style, so Arabic is not special in this sense either.
  • Simpifies the font subgroups to only three subgroups.
  • Solves Adlam font duplicate #4653 by choosing Noto Sans Adlam Unjoined. This choise is because Noto recommends the Unjoined variant for headlines, which seems to me appropiate for carthography, but this choise is subjective.

@pnorman pnorman self-requested a review August 21, 2022 20:51
Copy link
Collaborator

@pnorman pnorman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to support such seldom characters, it makes IMO sense to re-introduce also Unifont again, which has a big coverage

Do you have some examples of glyphs in unifont that we don't have coverage of?

Do you have any suggestions of good locations to review this with?

scripts/get-fonts.sh Outdated Show resolved Hide resolved
@sommerluk
Copy link
Collaborator Author

If we want to support such seldom characters, it makes IMO sense to re-introduce also Unifont again, which has a big coverage

Do you have some examples of glyphs in unifont that we don't have coverage of?

There is a good graphical coverage table for Noto: https://notofonts.github.io/overview/ (The site takes some seconds to load…) Everything that is red (and also everything that is green, but where we don't include the font in openstreetmap-carto because it's a dead script) is not covered.

Another question is, however, if these code points are actually used in the OSM database.

Do you have any suggestions of good locations to review this with?

Honestly, no. I didn’t manage to make an overpass query to find some occurrences. For #4644 even the bug reporter could not come up with a real usage in OSM. And also for #4653 I did not find data. (I searched for “name:ff” in the hope to find some node that also has its name tag in this language, and maybe in Adlam script. No success.) It seems that Unicode support of Overpass practically does not exist, at least for regular expressions. Not only everything beyond the BMP does not work, but also Arabic letters within the BMP let the regular expression fail.

The tests that I've done locally have been done with mock-up data.

@pnorman pnorman self-requested a review August 23, 2022 10:46
@sommerluk
Copy link
Collaborator Author

I've added a commit that removes the font-directory declaration, fixing #4667.

@pnorman
Copy link
Collaborator

pnorman commented Sep 1, 2022

I've added a commit that removes the font-directory declaration, fixing #4667.

I'd prefer testing why we have #4667 to see if we have an error, and keeping it separate.

@sommerluk
Copy link
Collaborator Author

Okay. I've removed this commit.

@pnorman
Copy link
Collaborator

pnorman commented Sep 2, 2022

I'm running a query on my DB to look for names with characters outside Noto's coverage range, not counting CJK which we expect to be covered by Hanazono.

I have no concerns except the unifont, and want to validate if this is an actual issue.

@bgo-eiu
Copy link

bgo-eiu commented Sep 3, 2022

I have noticed that certain characters do not come up in taginfo either. I can add a name to the map which has a currently unsupported character if that would help

I appreciate the inclusion of my issue together with the Adlam one, but I want to reiterate that using a Nastaliq font as a fallback in Carto is not a good idea. The problem is that letter position presentations and connections are different between the two script styles (Naskh and Nastaliq, I am considering Noto Sans Arabic a Naskh font here to be clear).

Here is how lam + alef connect in a typical Naskh font like Noto Sans Arabic:
image

Here is how they connect in a Nastaliq font:
image

Now if you have a variant of lam that expects the second connection but gets the first, the result is this:
image

Here everything is Noto Sans Arabic except the lam with tah above, which has fallen back to Noto Nastaliq Urdu. These are both forms of lam and need to connect to alef the same way. This is why I use alt_name for letters like this until there is consistent font support, because the letter needs to connect the same way as other letters, not just be rendered itself.

Unifont has impressive character coverage, but apparently at least some of its Arabic characters have incomplete presentation forms and do not even connect cursively. (Their version of this character is also incorrect, the "tah" is supposed to be on top of the staff.)

Anyway, I just tried this with a few different fonts, and none of them render this correctly. Noto Sans Arabic is one of the few fonts which renders ݨ correctly, so it is good overall. I think PakType Naskh looks the best out of the fallback options:
image

However, I think you can leave the Arabic fallback out for now as this is still not great. Whenever I have time, I want to just try a hand at adding the missing characters to Sans Arabic and making a pull request to the Noto font repository

@sommerluk
Copy link
Collaborator Author

Okay. If I understand you correctly, you think…

  • There should be a fallback font, so that U+08C7 ARABIC LETTER LAM WITH SMALL ARABIC LETTER TAH ABOVE and potentially some other non-supported rare Arabic characters get rendered.
  • The fallback font should not be Noto Nastaliq Urdu, as it has a completely different design than our standard fonts and therefore does not match them well.

I chose Noto Nastaliq Urdu because it is at least in the same font family as our default fonts, so installing the font with script was trivial, and I was hoping for at least some similarities in design, and ultimately it is just a fallback font and not a default font.

However, I get your point that the fallback font has a very different style than the default font, which is visually intrusive.

I will wait for the results of @pnorman's query before moving on.

@pnorman
Copy link
Collaborator

pnorman commented Sep 8, 2022

I'm running a query on my DB to look for names with characters outside Noto's coverage range, not counting CJK which we expect to be covered by Hanazono.

I did not find any characters.

@sommerluk
Copy link
Collaborator Author

Okay. It might have been a bad idea of mine to put the Arabic script issue and the other changes in the same PR. That's why I removed the part about the Arabic script, which we can further discuss separately in #4644. Here thus remain the undisputed changes.

@pnorman pnorman merged commit 56a7a8f into gravitystorm:master Sep 17, 2022
@sommerluk sommerluk deleted the fontFallback branch September 18, 2022 07:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adlam font duplicate
3 participants