Tested with BIRT 4.3.0, but I think the behavior is still the same with current BIRT source:
Unicode code point 173 (U+00AD) is the soft hyphen, known as SHY.
The abbreviation SHY fits perfectly, because this symbol usually hides itself and should only be visible at the end of a line.
Some languages, German in particular, tend to use very long words, for example in chemistry.
In some cases, the text containing these words is predefined somewhere, e.g. in a database.
So it makes sense to store good hyphenation points by placing the SHY symbol at these places inside the word,
e.g. for a word like "kapillargaschromatographisch" this could be stored as "kapillar-gas-chromato-graphisch".
Notes:
There are more possible hyphenation points inside this word, e.g. "ka-pil-lar-gas-chro-ma-to-graph-isch", but not all of them are good for readability.
I used the minus sign instead of the SHY symbol in these examples, because you wouldn't see a real SHY: the browser understands it and would hide it.
This does not mean we need automatic hyphenation. But if the text is already prepared for hyphenation, let's just use that info.
As far as I can tell, the PDF emitter does all the line-breaking logic inside BIRT.
So it should be possible to handle the SHY symbol inside a pre-hyphenated word specifically and to take it into account in the simple word-breaking algorithm.
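To illustrate the idea, here is a minimal sketch of SHY-aware word breaking. It is not BIRT's actual layout code: the class `SoftHyphenBreaker` and the method `breakWord` are hypothetical names, and width is measured as a plain character count, whereas a real PDF emitter would measure glyph widths with font metrics.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of BIRT: breaks one pre-hyphenated word at SHY positions.
final class SoftHyphenBreaker {
    static final char SHY = '\u00AD';

    // Assumption: every character is one unit wide (no font metrics).
    // A single fragment longer than maxChars is not split further and will overflow.
    static List<String> breakWord(String word, int maxChars) {
        List<String> lines = new ArrayList<>();
        String[] parts = word.split(String.valueOf(SHY));
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            boolean last = (i == parts.length - 1);
            // Reserve one column for the visible hyphen unless this is the final fragment.
            int needed = line.length() + parts[i].length() + (last ? 0 : 1);
            if (line.length() > 0 && needed > maxChars) {
                // The SHY becomes visible only where the line actually breaks.
                lines.add(line + "-");
                line.setLength(0);
            }
            line.append(parts[i]);
        }
        lines.add(line.toString());
        return lines;
    }
}
```

With a width of 12 characters, the word stored as kapillar, gas, chromato, graphisch (joined by SHY) comes out as the three lines "kapillargas-", "chromato-" and "graphisch". If the word fits on one line, the SHY characters simply disappear.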
It should also be possible to make this work in some way for the Word emitter. At least it seems necessary to replace a SHY character with <w:softHyphen/> in the XML output.
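For the Word side, a rough sketch of the replacement could look like the following. The class `SoftHyphenRunWriter` and its methods are hypothetical and do not reflect how BIRT's DOCX emitter is structured. The only thing taken from WordprocessingML itself is that `<w:softHyphen/>` sits inside a run (`<w:r>`) next to the `<w:t>` text elements, so the text has to be split around each SHY.

```java
// Hypothetical helper, not part of BIRT: writes one run, turning SHY into <w:softHyphen/>.
final class SoftHyphenRunWriter {
    static final char SHY = '\u00AD';

    static String toRunXml(String text) {
        StringBuilder xml = new StringBuilder("<w:r>");
        StringBuilder chunk = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == SHY) {
                // Close the text collected so far, then emit the soft hyphen element.
                flush(chunk, xml);
                xml.append("<w:softHyphen/>");
            } else {
                chunk.append(c);
            }
        }
        flush(chunk, xml);
        return xml.append("</w:r>").toString();
    }

    // Writes the pending text as a <w:t> element; does nothing for an empty chunk.
    private static void flush(StringBuilder chunk, StringBuilder xml) {
        if (chunk.length() == 0) {
            return;
        }
        xml.append("<w:t xml:space=\"preserve\">").append(escape(chunk)).append("</w:t>");
        chunk.setLength(0);
    }

    // Minimal XML escaping for element content.
    private static String escape(CharSequence s) {
        return s.toString().replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Run properties (font, size and so on) are left out here; a real emitter would have to write them into the run as well.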
I'll ask my boss if and when I can get the time to tackle this.
Demo report, containing the same text as HTML text item and plain text item.
With BIRT 4.9, the HTML text item works correctly in the DOCX output, but the plain text item does not (it shows minus symbols instead). For PDF output, neither text item works correctly.