Interoperable handling of invalid markup #136
Comments
I certainly would not want error reporting to be placing any constraints on browsers here, so I'd agree to any wording here that tells authors not to produce bad markup, but basically authorises browsers to do whatever they find convenient if there is bad markup. If making a zero sized subtree works (could be made to work) for all the current implementations, specifying that to get interoperable behaviour would also be Ok I think. |
Adding mathml4 label too, as we might want to adjust the behavior. |
cc @rwlbuis |
Just for the record, there is a third option, which is probably a bit more complicated to implement/describe but more aligned with what people expect with CSS / HTML5: Add anonymous |
Quoting @bfgeek
|
I'm not sure there is a canonical way to "fix" invalid markup (Ignore extra children? Add empty anonymous? Wrap a subset of children into anonymous containers?). My preference would be to align on WebKit's behavior as it's the easiest to specify/test/implement. This is already how we (at Igalia) implement it in our Chromium branch, and we would only have to fix Firefox. Basically, my proposal would be:
Putting this on the agenda of the next meeting. |
WebKit's implementation causes issues in other areas of the browser. For example the "invalid" is still accessible from VoiceOver, etc. This type of "easy" fixup produces technical debt in the codebase, and complexity later on, which I'd be strongly opposed to adding in Chromium. Having said that, one guiding principle that CSS usually follows is that it should attempt to display all content. This is probably a good principle for MathML to also follow. I think that for display types which can only accept up to "N" children, inserting any additional children in an anonymous mrow type would be acceptable. E.g. <mfrac><mi>a</mi><mi>b</mi><mi>c</mi></mfrac> would render as: <mfrac><mi>a</mi><mrow><mi>b</mi><mi>c</mi></mrow></mfrac> |
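For readers following along, here is a minimal sketch of the wrapping described in the comment above. It is not how any browser implements it: a real engine would create anonymous boxes in the layout tree, whereas this sketch rewrites a parsed tree with Python's standard `xml.etree.ElementTree` purely to make the transformation concrete. The `MAX_CHILDREN` table and the `wrap_extra_children` helper are hypothetical names introduced for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical table for this sketch: elements that accept at most N children.
MAX_CHILDREN = {"mfrac": 2}


def wrap_extra_children(element: ET.Element) -> None:
    """Move children beyond the allowed count into one trailing mrow, in place."""
    limit = MAX_CHILDREN.get(element.tag)
    children = list(element)
    if limit is None or len(children) <= limit:
        return  # Nothing to fix up.
    # Keep the first (limit - 1) children; the last slot becomes an mrow
    # holding every remaining child, so no content is dropped.
    mrow = ET.Element("mrow")
    for child in children[limit - 1:]:
        element.remove(child)
        mrow.append(child)
    element.append(mrow)


frac = ET.fromstring("<mfrac><mi>a</mi><mi>b</mi><mi>c</mi></mfrac>")
wrap_extra_children(frac)
print(ET.tostring(frac, encoding="unicode"))
# <mfrac><mi>a</mi><mrow><mi>b</mi><mi>c</mi></mrow></mfrac>
```

The sketch only covers the "too many children" case; what to do with too few children is not addressed by it.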
@tabatkins also might have some good thoughts. |
So the alternative proposal would be:
My main concern (in addition to adding more work) is how dynamic changes will be handled. In the past, WebKit generated some anonymous elements (sometimes two levels of nested anonymous e.g. for scripted elements) in the Renderer/Layout classes and set/removed anonymous style on them. That caused at best inconsistencies after dynamic changes and at worst asserts/crashes. Hopefully, this proposal is a bit safer and we could maybe even handle the Render/Layout tree fixup & reorganization from the DOM classes. However, I agree that the proposed behavior would be more aligned with HTML5 / CSS (adding the label now btw) and more natural for users, so it's probably worth the added complexity/work. I don't really like the alternative of "making sure that all the MathML algorithms can deal with the edge cases" as that makes the algorithm more complex and harder to read. For example, the advantage of WebKit's current implementation is that once we know an element is valid we can just take the children and the associated variables numerator/denominator/base/superscript etc., and that makes the algorithm (which is already relatively complex when we follow all TeX / OpenType MATH metrics) much easier to read and understand. |
Since VoiceOver was mentioned above, I however have to say that adding such anonymous children could help accessibility ( w3c/mathml#9 ). Typically,
would become
and then we can add roles numerator/denominator on the anonymous mrow's while still keeping roles fraction on the mfrac. Otherwise, this fixup would have to be done in the a11y tree. |
Doesn't HTML5's parsing spec already say what should happen for syntax errors such as missing close tags? I didn't see in the parsing spec what should happen when the wrong number of arguments are given. Maybe I missed it or maybe because that isn't related to what gets built in the DOM. HTML5 has some elements that must be used in ordered pairs (i.e., they don't make sense out of context). For example, If there is no precedent, then I like the idea of adding anonymous |
@NSoiffer Yes the parsing spec does specify what should happen with missing close tags, but that isn't what is being discussed.
The primary issue here is that DOM/CSS tries to display all content on the page; HTML doesn't really have a concept of an "incorrect" number of children for any given element (there may be a counterexample here, but I'm not aware of any). E.g. a |
An alternative option could be to say that invalid markup is just laid out as an mrow. That way we keep a simple fallback similar to the empty 0x0 box but ensure that all the content is displayed. |
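A rough sketch of this fallback, assuming validity is judged only by child count: the check decides whether an element gets its own layout or is treated like an mrow. The `REQUIRED_CHILDREN` table and the `layout_mode` function are hypothetical names for this illustration, and mmultiscripts is left out because its validity rule is not a simple count.

```python
import xml.etree.ElementTree as ET

# Child counts these elements require to be valid; elements not listed
# here are not checked by this sketch.
REQUIRED_CHILDREN = {
    "mfrac": 2, "mroot": 2,
    "msub": 2, "msup": 2, "msubsup": 3,
    "munder": 2, "mover": 2, "munderover": 3,
}


def layout_mode(element: ET.Element) -> str:
    """Return the layout to use: the element's own, or "mrow" if invalid."""
    expected = REQUIRED_CHILDREN.get(element.tag)
    if expected is not None and len(element) != expected:
        # Invalid child count: lay the children out in a row so that
        # all the content is still displayed.
        return "mrow"
    return element.tag


# Zero, one, two, and three mfrac children.
for markup in (
    "<mfrac/>",
    "<mfrac><mi>a</mi></mfrac>",
    "<mfrac><mi>a</mi><mi>b</mi></mfrac>",
    "<mfrac><mi>a</mi><mi>b</mi><mi>c</mi></mfrac>",
):
    print(markup, "->", layout_mode(ET.fromstring(markup)))
```

Only the two-child mfrac keeps its fraction layout; the other three cases fall back to "mrow". Once the check passes, a layout algorithm can safely refer to numerator and denominator without handling edge cases itself.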
@bfgeek I think the html spec is trying to specify this https://html.spec.whatwg.org/#mathml says
which means that
should render as
That said, if that proves tricky to specify in an interoperable way, we should change the spec. |
At the risk of drifting off-topic, I think the merror description in the spec needs a tweak. I've added w3c/mathml#66 about that. Basically that tweak gives a way of describing the error without interfering with the display of the content for this use of With that tweak, @fred-wang's solution of using |
@davidcarlisle Whatever is decided here should be specified in the MathML spec and removed from the HTML spec. I've spoken to Mozilla engineers about that behaviour and they dislike it, and the technical debt it creates. |
@davidcarlisle Replacing the content with
|
@bfgeek I'm perfectly happy to let implementation issues take the lead here, I just wanted to flag the reference in the HTML5 spec so if we end up specifying something different here that gets pushed back to the w3c and whatwg html editors to update the spec (there are references to the mathml3 W3C CVS draft spec location rather than the currently maintained github drafts which also need updating in the html5 spec). |
@bfgeek I also don't at all like Mozilla / MathML3 / HTML5's suggestion to render as if an "merror" was inserted. Firefox actually implements it by displaying an "invalid-markup" message, which is not really helpful. The standard way to report an error is via the browser console, which is what Firefox does. If we want to be consistent with HTML, we should try to render all the content without extra visual indication of the error, so I suggest one of these options: |
Just to add more to the discussion, my colleague @rwlbuis pointed out that SVG has some error rules https://www.w3.org/TR/SVG11/implnote.html#ErrorProcessing that imply not displaying some parts of the content. I'm not sure how it is actually implemented in browsers. There are also empty elements in MathML like mspace that are not supposed to have any child. I wonder how we should handle them compared to e.g. empty HTML elements. See web-platform-tests/wpt#15722 |
Also add some edge cases (zero/one/more than two mfrac children).
Consensus from 2019/09/12 was:
|
From 2020/05/18 meeting: There won't be a visual indication of the cleanup, but a warning should be sent to the console. |
Adding the label back because I don't think there is any consensus on the proposed "wrapping up the extra elements" approach and, unless I'm missing what @rbuis was referring to, the layout-side experiment he made some months ago (for the mfrac element only) didn't demonstrate at all that "it was easy"; actually it was removed from our branch because it added extra complexity, and it is not being upstreamed. There are still open questions, like what happens with mmultiscripts, whether we are OK with the "not perfectly equivalent" mentioned above, or whether we should consider an anonymous box fixup approach instead (which is probably what is best in the long term), etc. So I don't think the CG is in a position to draw any conclusion here without more work on implementations. On the other hand, the current approach with mrow-like fallback is well described in the text of the spec, implementing it is very easy, Chromium reviewers are fine with it, and it is what the WPT tests assume. |
Adding "need polyfill" as it should be easy to implement in JS what the people have requested to fixup the markup. |
w3c/mathml#175 was resolved with 'render as an mrow'. I'm not sure what is left, but @bkardell pointed out this can't be closed until there is agreement from the other implementations that rendering as an mrow is ok with them -- they do something different now as pointed out in the initial comment. |
@fred-wang: I don't think this can be closed unless you have heard from other implementers, as per @bkardell's comment. |
Checking the discussions, it seems Mozilla and Apple are happy with the "render as mrow" approach, but let's see what @bkardell says. |
For my benefit and the benefit of others that are interested in what was said by Mozilla and Apple, can you provide a link to the discussions? |
This was discussed in an Apple/Igalia private chat, but Brian will be able to explain. About Mozilla, I don't have a link either but I'm pretty sure @emilio had mentioned several times in the past that having this error message layout was complicating things (I'll ping him again). I actually had written a WIP patch one year ago https://bugzilla.mozilla.org/show_bug.cgi?id=1583037 |
Closing this bug:
There is also a WebKit bug to align with the spec: https://bugs.webkit.org/show_bug.cgi?id=123348 |
https://mathml-refresh.github.io/mathml/chapter2.html#interf.error
Gecko renders a (not very useful) "invalid-markup" message
WebKit just lays out a 0x0 box for the invalid subtree
I think the latter is more efficient and less code so I'd prefer it. Probably if we want to log errors, it should be in the console instead.