Mark ruby with rb rb* rt rt* at risk? #1424

chaals · 2018-05-04T11:52:55Z

The markup pattern of using multiple rb elements with corresponding rt following them, to be allocated automatically - e.g.

<ruby><rb>一<rb>二<rb>三<rt>yi<rt>er<rt>san</ruby>

does not seem to be widely implemented. The only implementation I found is in Firefox, which puts "yi" above the first character, "er" above the second, and so on, as specified.

In the other browsers I tried (Edge, Opera, Safari), the string "ersan" follows the Chinese characters as normal text.
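To make the expected allocation concrete, here is a minimal sketch of the pairing the pattern implies: the i-th annotation goes with the i-th base. `pairRuby` is a hypothetical helper for illustration, not a browser or spec API, and padding uneven lists with empty strings is a simplifying assumption of the sketch.

```javascript
// Sketch of the pairing implied by the "rb rb* rt rt*" pattern:
// the i-th annotation is allocated to the i-th base.
// pairRuby is a hypothetical name; padding with "" is an assumption.
function pairRuby(bases, annotations) {
  const count = Math.max(bases.length, annotations.length);
  const pairs = [];
  for (let i = 0; i < count; i++) {
    pairs.push({ base: bases[i] ?? "", text: annotations[i] ?? "" });
  }
  return pairs;
}

const pairs = pairRuby(["一", "二", "三"], ["yi", "er", "san"]);
console.log(pairs.map((p) => `${p.base}:${p.text}`).join(" "));
// 一:yi 二:er 三:san
```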

Are there others? Should we mark this as "at risk"? Or alternatively warn authors that it quite often won't work as expected?

See also the i18n group's test for this

r12a · 2018-05-13T11:06:26Z

Please don't dismantle this! This 'tabular' approach has significant benefits over the currently supported 'interleaved' approach, and is needed to fully meet the needs of the Japanese community. This includes use cases such as the following:

1. the ability to inline annotations for 漢字 as 漢字(かんじ) rather than 漢(かん)字(じ)
2. the ability to correctly style 'jukugo' ruby
3. the ability to search for text such as 漢字 in a document
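The inlining and searching use cases can be illustrated with plain strings. The sketch below holds one ruby segment as two parallel lists, as in the tabular model, and produces both inline forms. The helper names are hypothetical, and this is not how any browser renders ruby.

```javascript
// Hypothetical helpers showing two inline fallbacks for one ruby segment
// held as parallel lists of bases and annotations (the tabular model).
function inlineGrouped(bases, annotations) {
  // All bases first, then all annotations inside one pair of parentheses.
  return `${bases.join("")}(${annotations.join("")})`;
}

function inlineInterleaved(bases, annotations) {
  // Each base is immediately followed by its own parenthesised annotation.
  return bases.map((b, i) => `${b}(${annotations[i] ?? ""})`).join("");
}

const bases = ["漢", "字"];
const annotations = ["かん", "じ"];
console.log(inlineGrouped(bases, annotations));     // 漢字(かんじ)
console.log(inlineInterleaved(bases, annotations)); // 漢(かん)字(じ)

// The base text stays contiguous only in the grouped form,
// so a plain substring search finds it there.
console.log(inlineGrouped(bases, annotations).includes("漢字"));     // true
console.log(inlineInterleaved(bases, annotations).includes("漢字")); // false
```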

The HTML parser already supports this markup model. What's not supported in browsers (other than Firefox) is the correct automatic display of the annotations. That's actually waiting on implementation of the CSS ruby spec, not the HTML spec. Since the CSS spec intends to rely on this particular model, removing the markup model from the HTML will damage the ability to implement the CSS spec.

Before we start blindly ripping out this stuff, why don't we first see whether we can put some more energy into getting the display supported by more browsers?

(You may want to read the following article to understand the background here: https://www.w3.org/International/articles/ruby/markup)

fantasai · 2018-05-13T11:52:29Z

+1 to @r12a’s comment. (Although #3 should be handled by more intelligent searching in consideration of ruby markup, since it's a problem when searching for phrases regardless.)

Also wrt background, this post--which was the impetus for changes to the HTML and CSS specs that resulted in the tabular model--explains the problems with the original HTML5 model and how we got to the model we have now specced: http://fantasai.inkedblade.net/weblog/2011/ruby/

frivoal · 2018-05-13T14:14:45Z

As ruby lies somewhere at the intersection of i18n and a11y, it may not be the sexiest feature but it very much deserves support and continued work. Firefox has been doing better than most, but it needs to continue and others need to catch up. For the reasons outlined by @r12a, his article, and @fantasai's, this particular kind of markup is needed if we want to be able to get both semantics and styling right for ruby in Japanese.

chaals · 2018-05-20T17:53:14Z

I agree that we should support ruby, and don't want to "blindly (or even watching carefully) rip stuff out" that is helpful and valuable. Putting it "at risk" was merely noting that browser implementation of the feature is somewhat lacking in reality, and that we should consider that explicitly in line with the way we produce the spec rather than simply continue with what is written as if it were somehow inviolable truth.

Personally, I think @r12a's argument to retain it is convincing. That said, setting up expectations for authors that are only met by 5% of deployed browsers isn't necessarily helpful either. I don't think authors generally interpret "it's parsed into a model, it just doesn't get rendered according to that model" as "implemented".

There are (as noted) many possibilities - add an editorial warning that most browsers will need some additional work to actually render the ruby, mark the thing "at risk" and even remove it from the spec, do nothing at all. The purpose of this issue is to work out which of them we should pursue.

plehegar · 2018-05-22T12:35:25Z

Given the lack of proper implementations, I don't see how we cannot put this at risk. HTML has been shipping for years without this support as far as I understand. Yes, it's very unfortunate that implementations aren't moving forward but blocking on this doesn't seem practical.

LJWatson · 2018-05-22T12:58:25Z

If we could provide evidence of wide adoption/usage on the web, or point to browser issues where intent to implement is shown and/or there is active discussion of the issues, that might help, but in the absence of interop it's a hard case to make I'm afraid.

fantasai · 2018-05-22T23:09:49Z

Getting it correctly parsed is a prerequisite for rendering it correctly. Removing that parsing support will create compatibility problems in the future if we do intend to move towards the tabular model (which I think we do). It's a good thing that widespread parsing support is preceding rendering support--it means that the parsing will happen correctly, and authors can use alternative methods to provide reasonable fallback renderings in browsers that don't have full rendering support. Without that parsing support, the result in browsers that don't have full rendering support will be very broken, it's harder to polyfill with CSS and/or JS, and this makes it that much harder to have a pleasant and useful transition period.

r12a · 2018-05-23T10:55:02Z

Can we become more specific about what we're planning to mark as at risk? Which parts of the text exactly are in danger of being removed?

I doubt that anyone is expecting to change the HTML parser, but actually that's what the HTML spec was intended to describe (positioning relies on CSS). The parser works interoperably. The only thing that's currently not working is that some of the annotations are not by default correctly shown adjacent to the base characters.

See https://jsfiddle.net/n8ffzho0/6/ which uses the tabular model and:

(on line 1) displays inline ruby with semantic relationships highlighted and parens around all the ruby annotations belonging to a given compound noun (which is one of the ways the tabular approach is more useful than the interleaved, apart from the fact that it makes searching for text easier),
(on line 2) hides the kanji and replaces it with hiragana, which is helpful for accessibility, esp for dyslexics.
(on line 3) displays the ruby as expected, given the help of a small JavaScript function.

Have the HTML WG checked for evidence that there is no wide adoption on the web, or that there are no browser issues where intent to implement is shown and/or there is active discussion of the issues? Bear in mind that the tabular model is that of the Ruby Annotation spec, which was the only game in town prior to HTML 5. (It's also the model used by TTML, fwiw.)

chaals · 2018-05-23T12:33:05Z

When I filed this issue, it was based on the observation that "this doesn't seem to work interoperably as expected in browsers". Which meant it seemed prudent to consider whether it should be changed or removed.

I don't think that changing this is easy or sensible, for the reasons various people have pointed out. Removing it without careful analysis also seems a bad idea, as people have pointed out.

Considering marking it "at risk" is essentially a precaution. If HTML goes to CR roughly on its current timeline without the feature being marked at risk, we would be unable to remove it afterwards even if it turned out to be really unimplemented, unless we removed it hastily beforehand.

My current proposal is as follows:

Don't change the parsing model.
Do ensure that the benefits of the model (searching properly, ...) are explained clearly.
Note, as a warning in the spec, that few browsers implement the expected layout. So if authors do want the layout and not what most browsers do (which is actually quite confusing), in many browsers they will have to polyfill it somehow.

Note that the script you provided gives the right layout at the expense of making the underlying semantics messier (by reverting it to the "simple" model). However, that may suit some use cases. As far as I am aware, existing interoperably implemented CSS cannot do it quite right, but again it may be possible to get an approximation that is better than what browsers currently do.

I have looked for further evidence of full support, and found little - I would like to know if there are other implementations. @howcom, do you know if PrinceXML implements this correctly?

I am also still looking for other approaches to polyfilling that work.

frivoal · 2018-05-23T12:37:07Z

If we're looking for non browser implementations, I suspect antenna house formatter does it. @MurakamiShinyu can you confirm?

kojiishi · 2018-05-27T07:22:42Z

Blink is trying to switch its layout engine to a new one, and once the switch is done, we plan to have a new Ruby implementation on top of it. Also, as Florian said, there should be non-browser implementations.

Having a layout warning in the HTML spec sounds a bit strange to me. Maybe it is more reasonable to point to the relevant CSS spec and say it's not REC yet?

chaals · 2018-05-27T11:37:36Z

@kojiishi wrote:

> Having a layout warning in the HTML spec sounds a bit strange to me. Maybe it is more reasonable to point to the relevant CSS spec and say it's not REC yet?

Yes, absolutely. The reason for talking about the layout is that the semantics of ruby are really weak, and most of what it means is related to layout. The "high-order bit" of what I have written is that the spec should be clear about what works and what doesn't. And with the particular rb rb* rt rt* structure, currently in browsers other than Firefox things generally come out really broken :(

How we explain what doesn't work is a task in front of us, that I might get done today if I am lucky.

In any event, having a new implementation is a good thing. Do you have running code at this stage, or just aspirations?

kojiishi · 2018-05-28T04:42:12Z

> Do you have running code at this stage, or just aspirations?

We built a JavaScript prototype for the design review of the new layout engine, but we're still in the middle of switching, no native code for Ruby yet.

I hope you agree that we don't want to update HTML when CSS status changes. Writing a proper warning for authors without introducing such troubles doesn't look easy, but if you have text, happy to review.

Also, we're not getting good responses from Japanese contributors, but please make sure we get responses from forks or non-browser implementations.

@MurakamiShinyu do you know if AH or your new engine supports the "rb rb* rt rt*" model?
@realskk do you know if cho-tate-gaki supports the "rb rb* rt rt*" model?

MurakamiShinyu · 2018-05-28T10:35:18Z

AH Formatter (I tested the latest V6.5MR5) renders correctly with the ruby test ruby-position-004.html which does not omit rb and rt end tags.
However, it has a problem with the test ruby-position-005.html, which has rb and rt start tags only.
See the screenshots:

Vivliostyle output depends on the browser in this ruby case. Vivliostyle with Chrome has the same problem as the Chrome browser, and has no problem with Firefox. Test URL: https://vivliostyle.github.io/vivliostyle.js/viewer/vivliostyle-viewer.html#x=https://w3c.github.io/i18n-tests/html/semantics/text-level-semantics/the-ruby-element/ruby-position-005.html

See the screenshots:

chaals · 2018-05-28T21:13:50Z

@kojiishi:

> We built a JavaScript prototype for the design review

The test you have there doesn't cover the ruby-position-004.html and ruby-position-005.html tests of this feature.

My very quick reading of that JavaScript is that it doesn't handle the case either, because the measure and flow functions assume that a ruby-text is immediately preceded by a ruby base without doing the reordering of the stack that would make that work (which is what @r12a's JavaScript patch referred to above does). Note that I could well be wrong about your JavaScript though - I just tried to run it briefly in my head.

> I hope you agree that we don't want to update HTML when CSS status changes.

In practical terms I think we are close to agreement. For various reasons I think there is value in making some editorial comments in HTML about how an unmet CSS dependency is a current problem (which I don't see any evidence will be solved within a year, W3C's HTML update cycle).

I certainly don't want to change the underlying HTML.

> Writing a proper warning for authors without introducing such troubles doesn't look easy, but if you have text, happy to review.

No, it isn't easy. I'll have one this week for wherever we get to.

kojiishi · 2018-05-29T03:09:56Z

> My very quick reading of that JavaScript is that it doesn't handle the case either...

Quite possible, it was for a design review, might not include something that we know we can do.

> In practical terms I think we are close to agreement. For various reasons...

I agree it's unlikely to happen in one year for the browser engines, and an informative layout warning pointing to the relevant CSS spec sounds reasonable. I hope you understand i18n features tend to require longer cycles, and I hope the spec text won't make the intent to implement controversial when it does happen.

> No, it isn't easy. I'll have one this week for wherever we get to.

Thank you for taking the hard part ;)

fantasai · 2018-05-29T05:12:21Z

> The reason for talking about the layout is that the semantics of ruby are really weak

The semantics of ruby is that there are parallel texts, one of which is the base text and the others the annotations. In most cases the parallel texts are alternates for the base text, but in some cases they are additional information associated with the base text. In any case, I don't see that Ruby is semantically any weaker than Tables.

> For various reasons I think there is value in making some editorial comments in HTML about how an unmet CSS dependency is a current problem (which I don't see any evidence will be solved within a year, W3C's HTML update cycle).

HTML is about semantics, not about layout. While the typical layout of ruby is what we see in CSS Ruby, other layouts are equally valid. For example, some Korean texts use smaller parenthesized text after the base text (in a single linear inline flow) to represent the annotations. A UA that uses such a rendering should be conformant to HTML, and such a rendering is trivial to produce with CSS2. Rendering suggestions in HTML are informative only, in any case.

chaals · 2018-05-29T11:23:35Z

This is getting off on a tangent...

By "weak semantics" I mean "this is some kind of annotation" is much weaker than "this is a transliteration" or "this is a translation" or "this is an explanatory note" - all things that can be reasonably done with ruby text. To provide strong semantics, one would use e.g. schema.org, or encode things as microformats.

On this measure, tables also have weak semantics, but both ruby and tables have nice clear structural relationships.

chaals · 2018-05-29T11:34:58Z

I realise that HTML generally defers to CSS for layout and rendering is considered optional. The same is true for many kinds of behaviour in respect to JavaScript. However, the rendering of ruby makes a difference to how it might be understood. That's a core reason for working on rendering capability in the first place.

Suggesting that a particular kind of markup will produce a particular rendering, when in practice we know that will not happen, makes no sense at all.

The markup pattern is generally parsed correctly, and is useful. As @kojiishi notes, important i18n features can take a long time to deploy interoperably. So removing the markup pattern for now seems a bad idea.

It is unfortunate that a new project like @kojiishi mentions doesn't take the markup pattern into account, because it gives the impression that implementors might not even be aiming to make it work. And that would be a reason to consider removing it for now, as we remove various other features that are important to users but aren't actually supported by browsers. But I think we could easily fix this case in the JS prototype. Essentially, add @r12a's code to the bit that interprets ruby, and if a ruby uses this rb rb* rt rt* pattern, reorder the segments to interleave them for layout.
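As a rough illustration of that reordering step (a sketch only, not @r12a's actual code or the Blink prototype), assume the children of a ruby element are available as a flat list of `{ tag, text }` objects. The `interleaveRuby` name and that data shape are hypothetical.

```javascript
// Hypothetical sketch: `children` is a flat list standing in for the parsed
// contents of <ruby>, e.g. [{ tag: "rb", text: "一" }, ...].
function interleaveRuby(children) {
  const rbs = children.filter((c) => c.tag === "rb");
  const rts = children.filter((c) => c.tag === "rt");
  // Only reorder the tabular pattern: a run of rb followed by a run of rt.
  const firstRt = children.findIndex((c) => c.tag === "rt");
  const isTabular =
    rbs.length > 1 &&
    firstRt === rbs.length &&
    children.slice(firstRt).every((c) => c.tag === "rt");
  if (!isTabular) return children.slice();
  const out = [];
  rbs.forEach((rb, i) => {
    out.push(rb);
    if (rts[i]) out.push(rts[i]);
  });
  // Surplus annotations are kept at the end rather than dropped (an assumption).
  out.push(...rts.slice(rbs.length));
  return out;
}

const flat = [
  { tag: "rb", text: "一" }, { tag: "rb", text: "二" }, { tag: "rb", text: "三" },
  { tag: "rt", text: "yi" }, { tag: "rt", text: "er" }, { tag: "rt", text: "san" },
];
console.log(interleaveRuby(flat).map((c) => c.text).join(" "));
// 一 yi 二 er 三 san
```

Input that is already interleaved, or that contains anything other than a run of rb followed by a run of rt, is returned unchanged, so the step could sit in front of a layout routine that expects each annotation to follow its base.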

aphillips · 2018-06-17T19:42:11Z

The I18N WG would like to invite interested individuals (including @chaals, @kojiishi, @fantasai) to discuss this with us in our next teleconference (21 June 2018 @ 15.00 UTC). Please contact me privately or via [email protected] if you need invite/dial-in information. // I18N-ACTION-727

chaals · 2018-06-22T12:16:14Z

It seems that Firefox supports ruby completely, and the Antenna House implementation does do the pairing association correctly (which is what was the strict point under discussion in this issue). In addition, I hope to see another implementation from @kojiishi in the future.

So for now I think we should leave the markup alone and close this issue without any action...

fix #1424 Clarify that rtc elements optionally contain `rp` before and/or after `rt` elements.

siusin · 2019-07-29T14:52:07Z

Thanks all.

We're closing this issue on the W3C HTML specification because the W3C and WHATWG are now working together on HTML, and all issues are being discussed on the WHATWG repository.

If you filed this issue and you still think it is relevant, please open a new issue on the WHATWG repository and reference this issue (if there is useful information here). Before you open a new issue, please check for existing issues on the WHATWG repository to avoid duplication.

If you have questions about this, please open an issue on the W3C HTML WG repository or send an email to [email protected].

chaals self-assigned this May 4, 2018

chaals added i18n-comment at risk labels May 4, 2018

chaals added this to the HTML5.3 WD4 milestone May 4, 2018

r12a added the i18n-jlreq Tracked by the japanese layout group label May 11, 2018

r12a mentioned this issue May 30, 2018

Mark ruby with rb rb* rt rt* at risk? #1424 w3c/i18n-activity#570

Closed

chaals modified the milestones: HTML5.3 WD4, HTML 5.3 WD5 Jun 19, 2018

siusin mentioned this issue Jun 19, 2018

loosen the rule of the rb element #1407

Merged

chaals mentioned this issue Jun 19, 2018

Request TAG review of HTML5.3 w3ctag/design-reviews#275

Closed

travisleithead mentioned this issue Jun 26, 2018

HTML General Review: HTML Ruby w3ctag/design-reviews#248

Closed

3 tasks

LJWatson modified the milestones: HTML5.3 WD5, HTML5.3 WD6 Jul 30, 2018

LJWatson removed this from the HTML5.3 WD6 milestone Sep 13, 2018

edent mentioned this issue Oct 25, 2018

How does w3m render Ruby? tats/w3m#104

Open

chaals added a commit that referenced this issue Nov 12, 2018

Clarify rtc content model

6c8caa7

fix #1424 Clarify that rtc elements optionally contain `rp` before and/or after `rt` elements.

scottaohara mentioned this issue Jul 27, 2019

Complete <ruby> related mappings w3c/html-aam#115

Closed

siusin closed this as completed Jul 29, 2019
