Internationalization #26
I recommend using BCP47 language tags for language identification (per HTML standard). That would allow you for example to refer to az-Latn vs az-Arab vs az-Cyrl above. For more info about BCP47 language tags see https://www.w3.org/International/articles/language-tags/ |
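As a rough illustration of why the script subtag matters, here is a minimal sketch that splits a tag such as az-Latn or az-Arab into language, script and region. It is a simplified parser written for this thread, not a full BCP47 implementation: the `LanguageTag` type and `parse_language_tag` function are hypothetical names, and variants, extensions and extended language subtags are ignored.

```python
import re
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_language_tag(tag: str) -> LanguageTag:
    """Split a tag into language[-script][-region].

    Simplified on purpose: anything after the region (variants,
    extensions, private use) is ignored.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    script = None
    region = None
    for subtag in subtags[1:]:
        # A script is 4 letters and must come before the region.
        if script is None and region is None and re.fullmatch(r"[A-Za-z]{4}", subtag):
            script = subtag.title()
        # A region is 2 letters or 3 digits.
        elif region is None and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", subtag):
            region = subtag.upper()
    return LanguageTag(language, script, region)


if __name__ == "__main__":
    for tag in ("az-Latn", "az-Arab", "az-Cyrl", "ar-SA"):
        print(tag, parse_language_tag(tag))
```

With the script available as a separate field, a reading system can tell az-Arab (right to left) apart from az-Latn and az-Cyrl without guessing from the language alone.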
I'd have thought that Javanese written in the Javanese script (ltr) was more likely to be encountered than Javanese written in the Arabic script. (See https://r12a.github.io/scripts/javanese/ for info about the Javanese script.) |
Forgive my ignorance, but are you using XHTML only? HTML5 has a number of important additions for support of bidirectional text, introducing control for directional isolation and auto-detection of base direction for injected text. For more information see https://www.w3.org/International/articles/inline-bidi-markup/ (updated just now). Hope these comments are helpful. |
@r12a Oh yeah that’s super useful, thank you! To answer your questions:
|
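We’ll have an issue with Arabic in EPUB 2 though. Yesterday Kevin Callahan triggered what appeared to be a bug in iBooks at first sight. Except it wasn’t. To sum things up, the EPUB opened backwards (as if page-progression-direction="rtl" were set on the spine). Turns out the app used the last language element it found in the metadata, which was ar-SA, to render that.
Which reminded me EPUB 2 doesn’t have this page-progression-direction attribute, but supports anything else needed for Right-to-Left (including the dir attribute and the direction CSS property). I can indeed remember EPUB 2 publications in Arabic (at some point, a prospect asked if I could do that and after checking who could do that, I discovered services offering EPUB2 output).
What it means:
there are major Reading Systems supporting RTL in EPUB 2;
content providers have probably been using this version because it just works;
the only hint we get is the language then;
we don’t know what the main language is when multiple elements are declared;
I couldn’t find any guidance on multiple elements handling in the specs (yet);
if there is no guidance, it explains the huge interoperability issues† authors have to deal with.
How should we handle this case then?
† Footnote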
|
After further testing, this is how some major Reading Systems handle this particular case (with 2
Methodology: 6 files were tested – 2 EPUB2 files, 4 EPUB3 files. The only differences were:
For instance,
and
Note: per their guidelines, Kobo is allowing authors to use |
My file is EPUB3, FYI. And now I’m glad to know how iBooks decides which language + reading direction to use. Thanks.
… On Dec 23, 2017, at 5:39 AM, Jiminy Panoz ***@***.***> wrote:
We’ll have an issue with Arabic in EPUB 2 though.
Yesterday Kevin Callahan triggered what appeared to be a bug in iBooks at first sight <https://twitter.com/BNGOBooks/status/944260033973506048>. Except it wasn’t.
To sum things up, the EPUB opened backwards (as if page-progression-direction="rtl" were set on the spine). Turns out the app used the last language element it found in the metadata, which was ar-SA, to render that.
Which reminded me EPUB 2 doesn’t have this page-progression-direction attribute, but supports anything else needed for Right-to-Left (including the dir attribute and the direction CSS property). I can indeed remember EPUB 2 publications in Arabic (at some point, a prospect asked if I could do that and after checking who could do that, I discovered services offering EPUB2 output).
What it means:
there are major Reading Systems supporting RTL in EPUB 2;
content providers have probably been using this version because it just works;
the only hint we get is the language then;
we don’t know what the main language is when multiple elements are declared;
I couldn’t find any guidance on multiple elements handling in the specs (yet);
if there is no guidance, it explains the huge interoperability issues† authors have to deal with.
How should we handle this case then?
† Footnote
|
Ah thanks for the clarification that I should have made myself in the previous message. So yeah, I can confirm that their strategy is trusting the last |
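To make the ambiguity concrete, here is a minimal sketch of how the choice of "primary" language changes when several language elements are declared. It assumes the languages are declared as dc:language elements in the OPF metadata; the sample package, the `declared_languages` and `primary_language` helpers and the "first" strategy are made up for illustration. Only the "last element wins" behavior comes from the iBooks observation discussed in this thread.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical OPF with two declared languages, as in the case discussed above.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:language>en-US</dc:language>
    <dc:language>ar-SA</dc:language>
  </metadata>
</package>"""


def declared_languages(opf_xml: str) -> List[str]:
    """Return every non-empty dc:language value, in document order."""
    root = ET.fromstring(opf_xml)
    return [
        el.text.strip()
        for el in root.iter(f"{{{DC_NS}}}language")
        if el.text and el.text.strip()
    ]


def primary_language(languages: List[str], strategy: str = "first") -> Optional[str]:
    """Pick one language; 'last' mimics the behavior observed in iBooks."""
    if not languages:
        return None
    return languages[-1] if strategy == "last" else languages[0]


if __name__ == "__main__":
    langs = declared_languages(SAMPLE_OPF)
    print(langs)
    print("first:", primary_language(langs, "first"))
    print("last:", primary_language(langs, "last"))
```

The same file is treated as English by one strategy and as Arabic (hence right to left) by the other, which is exactly the interoperability problem when the specs give no guidance.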
Samples are now available to test and improve i18n support. |
Took a quick look at the iBooks doc to check some details related to rendition and oh boy do they put the burden on authors.
What worries me a lot is that this is consistent with the issues we quickly discovered when starting work on vertical writing in the EPUB context (paginated, page-progression, mixed writing-modes, etc.) and they put constraints we can’t necessarily impose. It also reminds me that the Japanese industry is pushing for the EPUB 3.2 revision while the current spec is barely vertical-writing-friendly. I mean, I’m pretty sure lots of contents relying on the The constraints Apple has designed, they’re just here to fill the spec gaps – but we’ll have to design heuristics because we simply can’t do that. cc @llemeurfr |
Added control captures for i18n samples. They are all available in the i18n folder. I can’t tell whether we have any issue for text rendering or not, so any review would be greatly appreciated; the only thing I can tell for sure is that Mongolian doesn’t render as expected (we have no system font to manage that). Tests those captures cover:
|
@JayPanoz, in the scope of the EPUB 3.2 revision, can you define what would be useful for handling vertical writing perfectly (or at least better)? |
Yeah no problem.
1/ I’ve already raised an issue about multiple metadata elements and the lack of guidance a.k.a. “which one should be considered the primary language?” on the EPUB-revision issue ( In practice, vendors already have to deal with this, and may impose constraints on authors, e.g. only one language item, specifying the Han Traditional or Han Simplified script for Chinese, using another prefixed meta to force direction of scroll, etc. But that’s only because there is no better way to know we should use vertical-writing from the OPF…
2/ We quickly discussed something like The thing is it’s unrealistic to use the
3/ Then there are the mixed directions and writing-modes issues i.e. when the A warning/note that not all reading systems can necessarily manage those cases would be the least the spec can do, and authors must be extra cautious about that. It currently isn’t discussed at all, and the process just started for Web Publications – but it impacts EPUB3 as well. |
Oh and yeah, a super important one. 4/ Authors shouldn’t rely on |
Hi Jiminy, |
Well, thanks for the info. If that’s Google Play Books, they’re consequently creating an interoperability issue – and we already have an awful lot of those because vendors decide to do such stuff unilaterally. It also breaks yet another fundamental rule of CSS and changes the way it works, so this is bad. This should be raised at a higher level, because it impacts the whole ecosystem and, more importantly, competitors. How are they parsing CSS in the first place? You can’t encounter this issue unless you create it and are extremely lazy – in the worst case scenario, stripping those declarations is a 2-line script… |
Opened a specific issue (#32) about prefixed properties as we’d better keep this one for i18n only. |
Oh joy… Kindle being Kindle all over again:
|
Just a question: why would we need to support this Kindle thing, as your point 3 is the standard way to act? Shouldn't we recommend authors use the standard way, plus alert them that they'll need to add the Amazon meta if they also target Kindle? |
To clarify, I would not necessarily worry about that in the short term. If you ask me “do we need to support that?”, my answer would be “No, not at the moment.” Now, [company we shouldn’t name] happens to finally care about internationalization after a decade of not caring at all. And if you had to support right to left scripts in the easiest way possible, a way that’s EPUB2-compatible and won’t raise an ePubCheck issue, a way that doesn’t require extra namespaces and doesn’t disrupt workflows, etc., guess what you’ll end up with? Yes, a In other words, in the longer term, my answer could become “yes, we need to support that, because usage is significant.” But that’s relatively new, so we should probably keep it on our radar and that’s it. |
Note: I’ll use “i18n” instead of “internationalization”.
So let’s be honest, this issue will quite probably stand as “The Readium CSS issue” since it is roadmap-blocking, is impacting other parts of Readium 2 (streamer, navigator, API, apps developed by EDRLab), and will need a lot of documentation. In other words, it’s a project on its own, nested in the Readium CSS project.
I’ve spent the last 3 weeks documenting this issue, and you can think of it as a summary of the research that has been done. I won’t list everything there but only what is critical to provide implementers and readers with a solid baseline. I’m willing to make this baseline the best we can get (say, bulletproof although rough around the edges) but it’s worth noting we’ll need help from experts in those various and diverse languages and their typography to turn it into an excellent user experience.
Roadmap
First and foremost, I’d be in favor of updating our current roadmap. Vertical writing is indeed blocking and I’d like to move it to the beta version.
What it means is that we could ship support for horizontal-tb CJK, Right to Left and Indic languages in the alpha version relatively quickly, and then focus on vertical-writing, since our work on a11y baseline has been ahead of time since the beginning of this project.
I believe we would all probably agree that the prototype has proved a solid-enough bedrock – of course we have edge cases to deal with but it’s fine for the vast majority of contents – and pushing the small and easy wins for RTL/horizontal-tb CJK/Indic on the develop branch would allow us to release an alpha on the master branch early 2018. This would probably send a good signal too since the proto was released 3 months ago already.
On a related note, we’ll start documenting columns handling (e.g. page progression) in January 2018 so it would make sense to prioritize LTR/RTL (horizontal-tb), especially as we’ll be able to document vertical-writing immediately after – quite frankly, this will be critical since there are conceptual changes to take into account.
Global needs outscoping Readium CSS
Obviously, Readium CSS won’t be able to fly in autopilot mode there. It needs either flags it can target or smart handling of its resources depending on the publication.
Minimal set of features
What we’ll need:
page-progression-direction for the spine (streamer);
<meta> (streamer);
xml:lang and/or lang attribute if it’s missing in XHTML documents (API);
dir="rtl" attribute if it’s missing in XHTML documents (API);
text-align) with a rtl direction for RTL languages (Apps);
A longer-term issue will be localization, should you want to get this need covered in the apps, as implementers might want an easy way to translate strings, etc. But it’s up to EDRLab, obviously.
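As an illustration of the two API items in the list above (adding xml:lang/lang and dir="rtl" when they are missing from an XHTML document), here is a minimal sketch. The `ensure_lang_and_dir` helper is a hypothetical name, not part of any Readium API, and it assumes the publication language and direction have already been resolved from the OPF. It never overrides attributes the author has set.

```python
import xml.etree.ElementTree as ET

# Clark-notation key ElementTree uses for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def ensure_lang_and_dir(xhtml: str, language: str, rtl: bool) -> ET.Element:
    """Return the parsed root element with language and direction filled in.

    Assumption: `language` and `rtl` come from the publication metadata.
    Existing author-provided attributes are left untouched.
    """
    root = ET.fromstring(xhtml)
    # Only add a language when the author declared none at all.
    if root.get(XML_LANG) is None and root.get("lang") is None:
        root.set(XML_LANG, language)
        root.set("lang", language)
    # Only add a direction for RTL publications, and only if it is missing.
    if rtl and root.get("dir") is None:
        root.set("dir", "rtl")
    return root


if __name__ == "__main__":
    bare = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>text</p></body></html>'
    root = ensure_lang_and_dir(bare, "ar", rtl=True)
    print(root.get(XML_LANG), root.get("lang"), root.get("dir"))
```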
Writing-mode and RTL mapping
For writing mode, these are the writing-mode values we should apply based on the language and page-progression-direction:
I propose we simplify this model for Chinese and rely on page-progression-direction with an extra check for language (zh), and not bother with all those variants.
It’s worth noting we should not add dir="rtl" there, for the CJK languages.
In Right to left, we can simply rely on page-progression-direction, if the language is not CJK (and Mongolian) but here is a mapping of languages you might encounter, just for your information:
Right to Left
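The decision rule described in the previous section (writing-mode from language plus page-progression-direction, no dir="rtl" for CJK, RTL inferred from page-progression-direction otherwise) could be sketched roughly as follows. The exact values are assumptions, since the original mapping is not reproduced in this thread: vertical-rl for CJK with an rtl progression and vertical-lr for Mongolian in the traditional script (mn-Mong) are my reading of the proposal, and `Layout` and `layout_for` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Layout(NamedTuple):
    writing_mode: str
    dir: Optional[str]  # value for the dir attribute, or None when it must not be added


CJK_LANGUAGES = {"zh", "ja", "ko"}


def layout_for(language: str, page_progression: str) -> Layout:
    """Map (language, page-progression-direction) to writing-mode and dir."""
    subtags = language.lower().split("-")
    primary = subtags[0]

    # Assumption: traditional Mongolian script is laid out vertically, left to right.
    if primary == "mn" and "mong" in subtags[1:]:
        return Layout("vertical-lr", None)

    # CJK: only the primary language and the progression are checked,
    # script/region variants are ignored, and dir="rtl" is never added.
    if primary in CJK_LANGUAGES:
        if page_progression == "rtl":
            return Layout("vertical-rl", None)  # assumption: rtl progression means vertical
        return Layout("horizontal-tb", None)

    # Everything else: an rtl progression means a right-to-left script.
    if page_progression == "rtl":
        return Layout("horizontal-tb", "rtl")
    return Layout("horizontal-tb", None)


if __name__ == "__main__":
    for lang, ppd in (("zh-Hant-TW", "rtl"), ("ja", "ltr"), ("ar", "rtl"), ("mn-Mong", "ltr")):
        print(lang, ppd, layout_for(lang, ppd))
```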
This shouldn’t be a huge issue in Readium CSS, as we only need a few adjustments, specific base and default styles, and typefaces.
Hopefully, this doesn’t impact our views (paged and scrolled) since columns will behave as expected.
Our pagination model is the following:
CSS Multicol in horizontal-tb (x-axis)
When the dir attribute is set on html, it becomes:
CSS Multicol in horizontal-tb + dir="rtl" (x-axis)
Our main CSS-related concern there should be typefaces, as we’ll need outstanding fonts to deal with typography requirements (ligatures, multi-baseline levels, joining rules, etc.).
CJK (horizontal-tb)
Similar to RTL: we only need a few adjustments, specific base and default styles, and typefaces.
This should already provide support for the vast majority of contents in Chinese (vertical-writing is not used in mainland China, but only in Taiwan, Hong Kong and Macao), and Korean.
Chinese, Japanese and Hangul share a lot in terms of typography but having a few adjustments for each language would be a plus since differences are quite minor.
Other languages
For the time being, we’re only focusing on Devanagari, which should not have a huge impact. Once again, we’ll need a few adjustments, with the main focus being typefaces.
Vertical Writing
This is by far our biggest issue in Readium CSS since we can’t necessarily manage that well, cross-platform-wise.
We don’t have anything to force the column-axis in CSS, which means that our spread model (two columns next to each other)
CSS Multicol in horizontal-tb (x-axis)
will automatically become the following in vertical-rl:
CSS Multicol in vertical-* (y-axis)
So the best we can do right now is a fragmented scrolled-view:
New fragmented scrolled-view for vertical-* (y-axis)
In other words, one column with overflowed columns on the y-axis, which 1) will force implementers to map left/right (swipe/buttons) on bottom/top and 2) won’t allow them to have page-transition animations.
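To illustrate the remapping implementers would have to do, here is a minimal sketch that turns a "forward" or "backward" navigation action into a scroll offset, depending on the writing mode. The `scroll_delta` function is hypothetical, and the sign conventions are assumptions: columns overflow on the y-axis in vertical-* as described above, and an rtl document is assumed to scroll towards negative x offsets, which may differ between rendering engines.

```python
from typing import Tuple


def scroll_delta(
    action: str,
    writing_mode: str,
    direction: str,
    page_width: int,
    page_height: int,
) -> Tuple[int, int]:
    """Return the (x, y) offset to apply for one navigation step.

    `action` is "forward" or "backward"; apps would map swipes and
    buttons to these two actions before calling this function.
    """
    if action not in ("forward", "backward"):
        raise ValueError(f"unknown action: {action}")
    step = 1 if action == "forward" else -1

    # vertical-*: fragments overflow on the y-axis, so we move top/bottom.
    if writing_mode.startswith("vertical"):
        return (0, step * page_height)

    # horizontal-tb: columns overflow on the x-axis.
    # Assumption: with dir="rtl" the next columns sit at negative x offsets.
    sign = -1 if direction == "rtl" else 1
    return (sign * step * page_width, 0)


if __name__ == "__main__":
    print(scroll_delta("forward", "horizontal-tb", "ltr", 800, 600))
    print(scroll_delta("forward", "horizontal-tb", "rtl", 800, 600))
    print(scroll_delta("forward", "vertical-rl", "ltr", 800, 600))
```

Keeping the app-level vocabulary to forward/backward means the left/right to bottom/top mapping lives in one place instead of being scattered across gesture handlers.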
Note: The only alternative to solve those issues at the moment would be writing a renderer in JavaScript. It’s worth noting that if you’re only targeting iOS, there is a solution in pure CSS though.
What’s even worse is that the same typefaces can’t necessarily be used (proportional/fixed-width depending on writing-mode), and I’ll have to make adjustments for quotes and other details in the base and default stylesheets based on writing-mode…
Note: We won’t try to manage horizontal-tb documents in vertical-rl publications in a smart way for the time being. This use case is indeed not defined in the EPUB spec. Besides, we’ve got nothing at the OPF level to deal with it, and checking the writing-mode during runtime will blow performance in extreme ways, i.e. 15 seconds to render some XHTML files… which would be worse than not supporting this use case in terms of UX.
Longer-term issues include:
-epub-properties for web apps;
rendition: align-x-center;
ibooks:respect-image-size-class (gaiji) and ibooks:scroll-axis metas (see EPUB Compat doc);
letter- and word-spacing might have to be removed, and not only for CJK);
rendition: flow of scrolled-doc.
Out of scope
There are some typography and layout issues which are not our responsibility but rendering engines’. Those issues include:
display: run-in), which is popular in CJK;
ruby and its styling;
bidi;
Documentation
In theory, I would only have to document the new fragmented scrolled-view model for vertical-writing, and adjustments for user settings.
In practice, I’m willing to go the extra mile and will document typographic and layout concepts, and make glossaries, so that Western implementers have everything at hand to deal with requirements and issues in CJK and languages they might not be familiar with.
This will obviously take time but will fix a huge pain point.
Overarching Issues
writing-mode, RTL, typefaces used/expected, etc.
Resources