Mark ruby with rb rb* rt rt* at risk? #1424
Comments
Please don't dismantle this! This 'tabular' approach has significant benefits over the currently supported 'interleaved' approach, and is needed to fully meet the needs of the Japanese community. This includes use cases such as the following:
The HTML parser already supports this markup model. What's not supported in browsers (other than Firefox) is the correct automatic display of the annotations. That's actually waiting on implementation of the CSS ruby spec, not the HTML spec. Since the CSS spec intends to rely on this particular model, removing the markup model from the HTML will damage the ability to implement the CSS spec. Before we start blindly ripping out this stuff, why don't we first see whether we can put some more energy into getting the display supported by more browsers? (You may want to read the following article to understand the background here: https://www.w3.org/International/articles/ruby/markup) |
+1 to @r12a’s comment. (Although #3 should be handled by more intelligent searching in consideration of ruby markup, since it's a problem when searching for phrases regardless.) Also wrt background, this post--which was the impetus for changes to the HTML and CSS specs that resulted in the tabular model--explains the problems with the original HTML5 model and how we got to the model we have now specced: http://fantasai.inkedblade.net/weblog/2011/ruby/ |
As ruby lies somewhere at the intersection of i18n and a11y, it may not be the sexiest feature but it very much deserves support and continued work. Firefox has been doing better than most, but it needs to continue and others need to catch up. For the reasons outlined by @r12a, his article, and @fantasai's, this particular kind of markup is needed if we want to be able to get both semantics and styling right for ruby in Japanese. |
I agree that we should support ruby, and don't want to "blindly (or even watching carefully) rip stuff out" that is helpful and valuable. Putting it "at risk" was merely noting that browser implementation of the feature is somewhat lacking in reality, and that we should consider that explicitly in line with the way we produce the spec rather than simply continue with what is written as if it were somehow inviolable truth. Personally, I think @r12a's argument to retain it is convincing. That said, setting up expectations for authors that are only met by 5% of deployed browsers isn't necessarily helpful either. I don't think authors generally interpret "it's parsed into a model, it just doesn't get rendered according to that model" as "implemented". There are (as noted) many possibilities - add an editorial warning that most browsers will need some additional work to actually render the ruby, mark the thing "at risk" and even remove it from the spec, do nothing at all. The purpose of this issue is to work out which of them we should pursue. |
Given the lack of proper implementations, I don't see how we cannot put this at risk. HTML has been shipping for years without this support as far as I understand. Yes, it's very unfortunate that implementations aren't moving forward but blocking on this doesn't seem practical. |
If we could provide evidence of wide adoption/usage on the web, or point to browser issues where intent to implement is shown and/or there is active discussion of the issues, that might help, but in the absence of interop it's a hard case to make I'm afraid. |
Getting it correctly parsed is a prerequisite for rendering it correctly. Removing that parsing support will create compatibility problems in the future if we do intend to move towards the tabular model (which I think we do). It's a good thing that widespread parsing support is preceding rendering support--it means that the parsing will happen correctly, and authors can use alternative methods to provide reasonable fallback renderings in browsers that don't have full rendering support. Without that parsing support, the result in browsers that don't have full rendering support will be very broken, it's harder to polyfill with CSS and/or JS, and this makes it that much harder to have a pleasant and useful transition period. |
Can we become more specific about what we're planning to mark as at risk? Which parts of the text exactly are in danger of being removed? I doubt that anyone is expecting to change the HTML parser, but actually that's what the HTML spec was intended to describe (positioning relies on CSS). The parser works interoperably. The only thing that's currently not working is that some of the annotations are not by default correctly shown adjacent to the base characters. See https://jsfiddle.net/n8ffzho0/6/ which uses the tabular model and:
Have the HTML WG checked for evidence that there is no wide adoption on the web, or that there are no browser issues where intent to implement is shown and/or there is active discussion of the issues? Bear in mind that the tabular model is that of the Ruby Annotation spec, which was the only game in town prior to HTML 5. (It's also the model used by TTML, fwiw.) |
When I filed this issue, it was based on the observation that "this doesn't seem to work interoperably as expected in browsers". Which meant it seemed prudent to consider whether it should be changed or removed. I don't think that changing this is easy or sensible, for the reasons various people have pointed out. Removing it without careful analysis also seems a bad idea, as people have pointed out. Considering marking it "at risk" is essentially a precaution. If HTML goes to CR roughly on its current timeline, unless we removed it hastily we would be unable to remove the feature even if it turned out to be really unimplemented. My current proposal is as follows:
Note that the script you provided gives the right layout at the expense of making the underlying semantics messier (by reverting it to the "simple" model). However, that may suit some use cases. As far as I am aware, existing interoperably implemented CSS cannot do it quite right, but again it may be possible to get an approximation that is better than what browsers currently do. I have looked for further evidence of full support, and found little - I would like to know if there are other implementations. @howcom, do you know if PrinceXML implements this correctly? I am also still looking for other approaches to polyfilling that work. |
If we're looking for non browser implementations, I suspect antenna house formatter does it. @MurakamiShinyu can you confirm? |
Blink is trying to switch its layout engine to a new one, and once the switch is done, we plan to have a new Ruby implementation on top of it. Also, as Florian said, there should be non-browser implementations. Having a layout warning in the HTML spec sounds a bit strange to me. Maybe it is more reasonable to point to the relevant CSS spec and say it's not REC yet? |
@kojiishi wrote:
Yes, absolutely. The reason for talking about the layout is that the semantics of ruby are really weak, and most of what it means is related to layout. The "high-order bit" of what I have written is that the spec should be clear about what works and what doesn't. And with the particular How we explain what doesn't work is a task in front of us, that I might get done today if I am lucky. In any event, having a new implementation is a good thing. Do you have running code at this stage, or just aspirations? |
We built a JavaScript prototype for the design review of the new layout engine, but we're still in the middle of switching, no native code for Ruby yet. I hope you agree that we don't want to update HTML when CSS status changes. Writing a proper warning for authors without introducing such troubles doesn't look easy, but if you have text, happy to review. Also, we're not getting good responses from Japan contributors, but please make sure we get responses from forks or non-browser implementations.
|
AH Formatter (I tested the latest V6.5MR5) renders correctly with the ruby test ruby-position-004.html, which does not omit rb and rt end tags. Vivliostyle output depends on the browser in this ruby case. Vivliostyle with Chrome has the same problem as the Chrome browser, and has no problem with Firefox. Test URL: https://vivliostyle.github.io/vivliostyle.js/viewer/vivliostyle-viewer.html#x=https://w3c.github.io/i18n-tests/html/semantics/text-level-semantics/the-ruby-element/ruby-position-005.html See the screenshots: |
The test you have there doesn't cover the tests ruby-position-004.html and ruby-position-005.html of this feature. My very quick reading of that JavaScript is that it doesn't handle the case either, because the measure and flow functions assume that a ruby-text is immediately preceded by a ruby base without doing the reordering of the stack that would make that work (which is what @r12a's JavaScript patch referred to above does). Note that I could well be wrong about your JavaScript though - I just tried to run it briefly in my head.
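To make the pairing point concrete, here is a minimal sketch in Python. It is not @r12a's patch and not the Blink prototype. The `(kind, text)` tuples and the names `pair_tabular` and `pair_adjacent` are hypothetical, and it assumes one ruby element whose children are already parsed into a flat list of `rb` and `rt` items. The sample characters are only an assumption chosen to match the "yi", "er", "san" readings quoted in this issue.

```python
from itertools import zip_longest


def pair_tabular(children):
    """Pair each run of rb items, by position, with the run of rt items after it.

    `children` is a flat list of (kind, text) tuples where kind is "rb" or "rt".
    This hypothetical representation stands in for the parsed ruby children.
    """
    pairs = []
    bases, annotations = [], []

    def flush():
        # Match the n-th base with the n-th annotation; pad the shorter run.
        for base, annotation in zip_longest(bases, annotations, fillvalue=""):
            pairs.append((base, annotation))
        bases.clear()
        annotations.clear()

    for kind, text in children:
        if kind == "rb":
            if annotations:
                # A new base after annotations starts a new segment.
                flush()
            bases.append(text)
        elif kind == "rt":
            annotations.append(text)
        else:
            raise ValueError(f"unexpected child kind: {kind!r}")
    flush()
    return pairs


def pair_adjacent(children):
    """Naive pairing: an rt is attached only to an rb directly before it."""
    pairs = []
    previous = None
    for kind, text in children:
        if kind == "rt" and previous is not None and previous[0] == "rb":
            pairs.append((previous[1], text))
        previous = (kind, text)
    return pairs


tabular = [("rb", "一"), ("rb", "二"), ("rb", "三"),
           ("rt", "yi"), ("rt", "er"), ("rt", "san")]
print(pair_tabular(tabular))   # every base gets its own annotation
print(pair_adjacent(tabular))  # only the last base is annotated
```

With the naive adjacency assumption only the last base gets an annotation and the remaining annotations are left unattached, which is the kind of gap described above. The positional version also handles interleaved input, since each `rb`/`rt` pair then forms its own one-item segment.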
In practical terms I think we are close to agreement. For various reasons I think there is value in making some editorial comments in HTML about how an unmet CSS dependency is a current problem (which I don't see any evidence will be solved within a year, W3C's HTML update cycle). I certainly don't want to change the underlying HTML.
No, it isn't easy. I'll have one this week for wherever we get to. |
Quite possible, it was for a design review, might not include something that we know we can do.
I agree it's unlikely to happen in one year for the browser engines, and an informative layout warning pointing to the relevant CSS spec sounds reasonable. I hope you understand i18n features tend to require longer cycles, and I hope the spec text won't make the intent to implement controversial when it does happen.
Thank you for taking the hard part ;) |
The semantics of ruby is that there are parallel texts, one of which is the base text and the others the annotations. In most cases the parallel texts are alternates for the base text, but in some cases they are additional information associated with the base text. In any case, I don't see that Ruby is semantically any weaker than Tables.
HTML is about semantics, not about layout. While the typical layout of ruby is what we see in CSS Ruby, other layouts are equally valid. For example, some Korean texts use smaller parenthesized text after the base text (in a single linear inline flow) to represent the annotations. A UA that uses such a rendering should be conformant to HTML, and such a rendering is trivial to produce with CSS2. Rendering suggestions in HTML are informative only, in any case. |
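As an illustration of the parenthesized, single-flow rendering mentioned above, here is a minimal sketch in Python. It only shows the resulting text, not the CSS2 rules a UA would use. The function name `render_inline`, its parameters, and the choice to parenthesize after each base (rather than once after the whole base text) are assumptions made for the example.

```python
def render_inline(pairs, open_paren="(", close_paren=")"):
    """Render (base, annotation) pairs as one linear run of text.

    Each annotation is placed in parentheses right after its base.
    Bases with an empty annotation are emitted without parentheses.
    """
    parts = []
    for base, annotation in pairs:
        parts.append(base)
        if annotation:
            parts.append(f"{open_paren}{annotation}{close_paren}")
    return "".join(parts)


print(render_inline([("漢", "kan"), ("字", "ji")]))
```

The same pairs could equally drive the typical above-the-base layout. The point is that the markup carries the base-to-annotation association, and the layout is a separate choice.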
This is getting off on a tangent... By "weak semantics" I mean "this is some kind of annotation" is much weaker than "this is a transliteration" or "this is a translation" or "this is an explanatory note" - all things that can be reasonably done with ruby text. To provide strong semantics, one would use e.g. schema.org, or encode things as microformats. On this measure, tables also have weak semantics, but both ruby and tables have nice clear structural relationships. |
I realise that HTML generally defers to CSS for layout and rendering is considered optional. The same is true for many kinds of behaviour in respect to Javascript. However, the rendering of ruby makes a difference to how it might be understood. That's a core reason for working on rendering capability in the first place. Suggesting that a particular kind of markup will produce a particular rendering, when in practice we know that will not happen, makes no sense at all. The markup pattern is generally parsed correctly, and is useful. As @kojiishi notes, important i18n features can take a long time to deploy interoperably. So removing the markup pattern for now seems a bad idea. It is unfortunate that a new project like the one @kojiishi mentions doesn't take the markup pattern into account, because it gives the impression that implementors might not even be aiming to make it work. And that would be a reason to consider removing it for now, as we remove various other features that are important to users but aren't actually supported by browsers. But I think we could easily fix this case in the JS prototype. Essentially, add @r12a's code to the bit that interprets ruby, and if a ruby uses this |
The I18N WG would like to invite interested individuals (including @chaals, @kojiishi, @fantasai) to discuss this with us in our next teleconference (21 June 2018 @ 15.00 UTC). Please contact me privately or via [email protected] if you need invite/dial-in information. // I18N-ACTION-727 |
It seems that Firefox supports ruby completely, and the Antenna House implementation does do the pairing association correctly (which is what was the strict point under discussion in this issue). In addition, I hope to see another implementation from @kojiishi in the future. So for now I think we should leave the markup alone and close this issue without any action... |
fix #1424 Clarify that rtc elements optionally contain `rp` before and/or after `rt` elements.
Thanks all. We're closing this issue on the W3C HTML specification because the W3C and WHATWG are now working together on HTML, and all issues are being discussed on the WHATWG repository. If you filed this issue and you still think it is relevant, please open a new issue on the WHATWG repository and reference this issue (if there is useful information here). Before you open a new issue, please check for existing issues on the WHATWG repository to avoid duplication. If you have questions about this, please open an issue on the W3C HTML WG repository or send an email to [email protected]. |
The markup pattern of using multiple `rb` elements with corresponding `rt` following them, to be allocated automatically - e.g. does not seem to have a lot of implementation, with the only one I found being in Firefox, which will put "yi" above the first character, "er" above the second, etc. as specified.
In other browsers I tried (Edge, Opera, Safari), the string "ersan" will follow the Chinese characters as normal text.
Are there others? Should we mark this as "at risk"? Or alternatively warn authors that it quite often won't work as expected?
See also the i18n group's test for this
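For readers comparing the two markup models discussed in this thread, here is a minimal sketch in Python that builds both from the same pairs. The inline example from the issue description did not survive in this copy, so the characters 一二三 are an assumption inferred from the readings "yi", "er", "san". The function names are hypothetical, and end tags are written out explicitly rather than omitted.

```python
from html import escape


def tabular_markup(pairs):
    """Tabular model: all rb elements first, then the corresponding rt elements."""
    bases = "".join(f"<rb>{escape(base)}</rb>" for base, _ in pairs)
    annotations = "".join(f"<rt>{escape(text)}</rt>" for _, text in pairs)
    return f"<ruby>{bases}{annotations}</ruby>"


def interleaved_markup(pairs):
    """Interleaved model: each base is followed directly by its rt."""
    body = "".join(f"{escape(base)}<rt>{escape(text)}</rt>" for base, text in pairs)
    return f"<ruby>{body}</ruby>"


# Assumed sample data, not the original example from the issue.
pairs = [("一", "yi"), ("二", "er"), ("三", "san")]
print(tabular_markup(pairs))
print(interleaved_markup(pairs))
```

In the tabular output the base text stays contiguous, and the association between the n-th `rb` and the n-th `rt` is left to the user agent, which is the behaviour this issue reports as implemented only in Firefox.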