Text: look for hyphenation in more words if needed #320

Merged
merged 3 commits into from
Nov 28, 2019

poire-z
Contributor

@poire-z poire-z commented Nov 27, 2019

When laying out lines and looking for wrap possibilities, we use the last breakable space found. But we also try to hyphenate the word that follows it, to grab a bit of it.
When there are more words after that last breakable space (words separated by non-breakable spaces, or by spaces where our AVOID_WRAP_BEFORE/AFTER checks decided that a wrap is to be avoided), we would only look at hyphenating the last word of this series of words: if that failed, we would not try to hyphenate the previous words.
We now check hyphenation in all these words.

Sample test case: a French line ending with pourqu'elle overflowing the max line width (wrap at ' is avoided):

  • if the max line width ($) is met at pourq$u'elle, the pour-qu hyphenation would be found and used.
  • if the max line width is met at pourqu'el$le, only elle would be checked, no hyphenation would be found, and we would wrap before pourqu; pour-qu would not be considered at all. It is now.
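
To make the two cases above concrete, here is a toy model of the search. It is only a sketch, not crengine's code: it assumes fixed-width characters (one unit each), and the names `TOY_HYPH`, `hyphen_points` and `find_wrap` are hypothetical, as is the one-entry hyphenation table that only knows pour-qu. The `all_words=False` mode mimics the previous behaviour (only the last word of the series is checked), while the default walks back through all the words that follow the last breakable space.

```python
import re

# Hypothetical hyphenation table: offsets inside a word where a hyphen may go.
# A real hyphenator would use language patterns; this only knows "pour-qu".
TOY_HYPH = {"pourqu": [4], "elle": []}


def hyphen_points(word):
    """Return the allowed hyphenation offsets for a word (toy lookup)."""
    return TOY_HYPH.get(word.lower(), [])


def find_wrap(line, last_break, max_width, all_words=True):
    """Return (cut_index, needs_hyphen) for a line that overflows max_width.

    last_break is the index just after the last breakable space.
    Every character is assumed to be one unit wide.
    """
    run = line[last_break:]
    # Words of the unbreakable series, with their absolute start index.
    words = [(last_break + m.start(), m.group())
             for m in re.finditer(r"[^\W\d_]+", run)]
    # Only words starting before the overflow can contribute to this line;
    # look at them from the last one back to the first one.
    candidates = [w for w in words if w[0] < max_width][::-1]
    if not all_words:
        candidates = candidates[:1]  # previous behaviour: last word only
    for start, word in candidates:
        for off in sorted(hyphen_points(word), reverse=True):
            if start + off + 1 <= max_width:  # +1 leaves room for the hyphen
                return start + off, True
    # No usable hyphenation: wrap at the last breakable space.
    return last_break, False


line = "il faut pourqu'elle"
print(find_wrap(line, 8, 13))                   # overflow at pourq$u'elle
print(find_wrap(line, 8, 17, all_words=False))  # overflow at pourqu'el$le, old
print(find_wrap(line, 8, 17))                   # overflow at pourqu'el$le, new
```

In this model, the second call falls back to the breakable space before pourqu, while the third finds the pour-qu split because the loop continues to the earlier word once elle yields nothing.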

Should solve the non-hyphenation issue of koreader/koreader#5645 (explanation in koreader/koreader#5645 (comment)).
With the example from this issue, with Warum made non-hyphenable by changing it to Wzrzm in the 2nd line, and with the still-wrong wrap avoidance caused by the quote/guillemet before Warum, the possible wrap at die-sem was not considered before:

Before:
image

After:
image

The really correct behaviour would be to wrap between diesem and »Warum, but this is another issue (#307 (comment)), yet to be fixed.



@poire-z
Contributor Author

poire-z commented Nov 27, 2019

Added a small commit to fix the fact that either of diesem and Warum would still be hyphenated, despite hyphens: none, with this:

<p>die Eichhörnchen so, wie sie das tun?
Mit <span style="color: blue; hyphens: none">diesem&nbsp;Warum? begeben wir uns
zurück aud das ursprünglichkindlich Fragen</span></p>

Might be useful with popup footnotes (the HTML snippet
is given to MuPDF, even if its RTL support is currently
limited).
@poire-z
Contributor Author

poire-z commented Nov 28, 2019

Added a small commit to possibly help with RTL popup footnotes, even if MuPDF support for RTL is limited (the leading ^ is correctly put on the right, but the last line is not right-aligned, and of course there is no Arabic font, which is another issue not related to crengine):
image

@Frenzie
Member

Frenzie commented Nov 28, 2019

Since we have those Arabic glyphs from FreeSerif (?) I guess that means the MuPDF patch needs a minor update. :-)

@poire-z
Contributor Author

poire-z commented Nov 28, 2019

Yes, they are in FreeSerif - but I dunno how many fallback fonts MuPDF supports (looks like it has one specific for CJK, and another for fallback, but dunno how that works in practice).
