Arabic ligature diacritic mis-positioning #3069

Closed
Tracked by #3078
aaronbell opened this issue Jul 24, 2021 · 8 comments

@aaronbell

I'm trying to understand an observed behavior in the new Cascadia Code Arabic font.

Here is a sequence used for the lam_lam_heh-ar ligature, with a couple of diacritics added:
uni0644 uni064E uni0644 uni064F uni0647

The output is:
Screen Shot 2021-07-23 at 7 17 50 PM

With the sequence (LTR, for convenience):
lam_lam_hehar uni064E uni064F LIG
(The LIG is an empty glyph added to maintain alignment with the monospace grid)

However, the output should look like:
Screen Shot 2021-07-23 at 7 20 41 PM

I can achieve this output by removing the substitution that adds the LIG character:
sub [lam_lam_heh-ar allah-ar] @vocal damma-ar' by damma-ar LIG;

As far as I can tell, the font's OpenType mark-to-ligature feature is working as expected, and the problem is purely on the rendering-engine side (macOS renders it correctly). It appears that the presence of the LIG glyph changes how HarfBuzz analyzes the mark positioning, causing the mark next to the LIG to be positioned incorrectly.

While I have a workaround (shift the position of the LIG to before the ligature), I was wondering if you could help clarify what is causing this behavior.

Thanks!

For convenience, here is the latest version of the font:
CascadiaCode.ttf.zip

@khaledhosny
Collaborator

My hunch would be that splitting the damma-ar glyph is causing HarfBuzz to miscalculate what component it applies to.

@behdad
Member

behdad commented Jul 26, 2021

This was done intentionally to match Uniscribe. I can't find the commit right now. But basically there was a case demonstrated by one of the popular MS fonts, where if first e.g. A and B ligated into AB, and then that AB ligated with a C into ABC, then a mark attached to the second component of that ABC was expected to be positioned on C, NOT B. That is, the AB would be considered one ligature component after the ABC was formed.

What would our Uniscribe backend do?
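
The A, B, C scenario in this comment can be made concrete with a toy model. This is only a sketch of the component counting as described here, not HarfBuzz or Uniscribe code; `Glyph`, `ligate`, and the field names are hypothetical, and later comments in this thread point to a different cause for the reported bug.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str
    is_mark: bool = False
    num_comps: int = 1  # bases/ligatures: number of ligature components
    comp: int = 0       # marks: 1-based component index, 0 = unattached


def ligate(glyphs, name):
    """Merge every base glyph in `glyphs` into one ligature called `name`.

    Assumption taken from the comment above: each input base counts as ONE
    component, even when that base is itself an earlier ligature.
    A mark is assigned to the most recent base seen before it.
    """
    comp = 0
    marks = []
    for g in glyphs:
        if g.is_mark:
            marks.append(Glyph(g.name, is_mark=True, comp=comp))
        else:
            comp += 1
    return [Glyph(name, num_comps=comp)] + marks


# Step 1: A + B -> AB.  Step 2: AB + C (+ a mark typed after C) -> ABC.
ab = ligate([Glyph("A"), Glyph("B")], "AB")
abc = ligate(ab + [Glyph("C"), Glyph("mark", is_mark=True)], "ABC")
print(abc)
```

In this model ABC ends up with two components rather than three, and the mark carries component 2, which corresponds to C.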

@khaledhosny
Collaborator

That would be 7b84c53.

$ hb-view CascadiaCode.ttf -u 0644,064E,0644,064F,0647

h

hb-view.exe CascadiaCode.ttf -u 0644,064E,0644,064F,0647 --shaper=uniscribe

u

hb-view CascadiaCode.ttf -u 0644,064E,0644,064F,0647 --shaper=coretext

c
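
The `-u` argument in the three commands above is the glyph sequence from the original report with the `uni` prefix dropped. A small sketch of that conversion follows; the helper names `unicodes_arg` and `hb_view_cmd` are hypothetical, and the script only builds the argument list, it does not run `hb-view`. The `--shaper=` spelling is copied from the commands in this comment.

```python
def unicodes_arg(glyph_names):
    """Turn 'uni0644 uni064E ...' into '0644,064E,...' for hb-view's -u option."""
    codepoints = []
    for name in glyph_names.split():
        if not name.startswith("uni") or len(name) != 7:
            raise ValueError(f"expected a uniXXXX glyph name, got {name!r}")
        int(name[3:], 16)  # raises ValueError if the suffix is not hexadecimal
        codepoints.append(name[3:].upper())
    return ",".join(codepoints)


def hb_view_cmd(font, glyph_names, shaper=None):
    """Build (but do not execute) an hb-view command line as a list of strings."""
    cmd = ["hb-view", font, "-u", unicodes_arg(glyph_names)]
    if shaper is not None:
        cmd.append(f"--shaper={shaper}")
    return cmd


sequence = "uni0644 uni064E uni0644 uni064F uni0647"
for shaper in (None, "uniscribe", "coretext"):
    print(" ".join(hb_view_cmd("CascadiaCode.ttf", sequence, shaper)))
```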

@behdad
Member

behdad commented Jul 28, 2021

> That would be 7b84c53.

No not that one. Let me try harder to find.

@behdad
Member

behdad commented Jul 28, 2021

> No not that one. Let me try harder to find.

commit fe20c0f
https://bugzilla.gnome.org/show_bug.cgi?id=437633

@behdad
Member

behdad commented Jul 28, 2021

Thinking more, this is supposed to work. Debugging.

@behdad
Member

behdad commented Jul 28, 2021

Okay so when you do a MultipleSubst on a mark, we are not retaining its ligature-component properties I suppose. Let me try.
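
A toy illustration of the suspicion in this comment, assuming the problem is exactly what is supposed here: the glyphs produced by a one-to-many substitution do not inherit the ligature component of the mark they replace. This is not HarfBuzz's implementation; `Glyph`, `lig_comp`, `multiple_subst`, and the `keep_lig_props` switch are hypothetical names for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str
    lig_comp: int = 0  # 1-based ligature component a mark belongs to; 0 = unset


def multiple_subst(glyph, new_names, keep_lig_props):
    """Replace one glyph with several, as a MultipleSubst lookup would.

    keep_lig_props=False models the suspected bug: the outputs start with
    no component information. keep_lig_props=True models carrying it over.
    """
    comp = glyph.lig_comp if keep_lig_props else 0
    return [Glyph(n, lig_comp=comp) for n in new_names]


# In the reported sequence the damma follows the second lam, so it belongs
# to component 2 of lam_lam_heh-ar before the rule splits it into damma + LIG.
damma = Glyph("damma-ar", lig_comp=2)
print(multiple_subst(damma, ["damma-ar", "LIG"], keep_lig_props=False))
print(multiple_subst(damma, ["damma-ar", "LIG"], keep_lig_props=True))
```

With the component lost, a positioning step has nothing to tell it that the damma belongs to the second lam; with it retained, the mark can still be matched to the right ligature anchor.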

behdad closed this as completed in 6fe0d7d Jul 28, 2021
@behdad
Member

behdad commented Jul 28, 2021

Should be fixed.

DHowett pushed a commit to microsoft/cascadia-code that referenced this issue Oct 29, 2021
This is a fairly comprehensive (and spooky!) 🐛💀 update resolving many
open issues.

### Arabic bugfixes
- [x] Closes #532 👻 - Additional positional variants added
- [x] Closes #535 🍂 - Corrected hamza form
- [x] Closes #540 🎃 - Dot arrangement corrected
- [x] Closes #541 🧹 - Was due to the use of anchors on those glyphs.
  These have been removed so the glyph can render as spacing.
- [x] Closes #542 🌕 - This was partly due to a [bug in Harfbuzz]. It
  has been resolved both on the font side (through a different
  implementation) and in Harfbuzz. 
- [x] Closes #549 🦸‍♀️ - Design corrected
- [x] Closes #555 💀 - All letter glyphs removed from Arabic
  Presentation form unicode slots to avoid situations where the glyphs
  are not behaving as expected.
- [x] Related to #543 - uni0615 removed as Cascadia Arabic not intended
  to support Quranic

### Other bug fixes
- [x] Closes #488 🔪 - Finally made the www ligature have the proper
  number of `w`s. 
- [x] Closes #436 🧟‍♀️ - Extended length of Powerline 'caps' to
  avoid situations where rounding can prevent overlap. This may cause
  problems if the caps are used next to one another, but that seems an
  unlikely scenario given what I've reviewed of Powerline styles. 
- [x] Closes #521 🤖 - enlarged the size of the grave character to make
  it more recognizable / legible in code. 
- [x] Closes #524 ☠️ - Added some more differentiation in stroke, and
  also created more space using hinting. 
- [x] Closes #525 🧙‍♂️ - tweaked the braces to be more twisty and
  create better differentiation from the parens. 
- [x] Closes #529 🧛‍♀️ - Changed year :P
- [x] Closes #546 👹 - ij no longer masquerading as a mark. 
- [x] Closes #563 🧟‍♂️ - corrected `locl` feature for proper
  Serbian rendering
- [x] Closes #571 🦹‍♀️ - corrected overshoot
- [x] Closes #572 🕷 - ratio symbol added
- [x] Closes #577 🍁 - shifted heights of box drawing lines to better
  align with block glyphs. Will reduce risk of non-joining forms under
  certain conditions. 

[bug in harfbuzz]: harfbuzz/harfbuzz#3069 (comment)