(Filed as a result of mozilla/readability#392 ; I'm not 100% sure whether this should be considered a DOM issue or an HTML parser issue; feel free to move as appropriate )
At the moment, the DOM includes 2 or 3 tables with an attribute whose name is "0", as evidenced from the console log.
The original markup of the page as inspected via "View Source", at time of writing, looked something like:
<table width="90% border="0>
Note the opening quote before 90% and 'closing' quote after border=.
Obviously the markup's intent is a table with two attributes, width="90%" and border="0". But both browsers parse this as an attribute with name '0' and the empty string as its value. I assume this parsing is prescribed by the spec, but I haven't tried to look for the specifics there.
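To make the parse above concrete, here is a minimal sketch of how a tokenizer can end up with an attribute named "0". It is a heavily simplified approximation of attribute tokenization, not the WHATWG state machine: `sketchParseAttributes` is a hypothetical helper, it only handles quoted, unquoted, and valueless attributes, and it ignores duplicate attributes and character references.

```javascript
// Simplified sketch of attribute tokenization inside a start tag.
// `tagBody` is the text between the tag name and the closing ">".
// Hypothetical helper for illustration only; not the full HTML algorithm.
function sketchParseAttributes(tagBody) {
  const isSpace = (c) =>
    c === " " || c === "\t" || c === "\n" || c === "\f" || c === "\r";
  const attrs = [];
  let i = 0;
  const skipSpace = () => {
    while (i < tagBody.length && isSpace(tagBody[i])) i++;
  };

  while (true) {
    // Between attributes: skip whitespace and stray slashes.
    while (i < tagBody.length && (isSpace(tagBody[i]) || tagBody[i] === "/")) i++;
    if (i >= tagBody.length) break;

    // Name: the first character is always taken, then everything up to
    // whitespace, "/" or "=". No check against the XML Name production.
    let name = tagBody[i++].toLowerCase();
    while (
      i < tagBody.length &&
      !isSpace(tagBody[i]) &&
      tagBody[i] !== "/" &&
      tagBody[i] !== "="
    ) {
      name += tagBody[i++].toLowerCase();
    }

    skipSpace();
    let value = "";
    if (tagBody[i] === "=") {
      i++;
      skipSpace();
      const quote = tagBody[i];
      if (quote === '"' || quote === "'") {
        i++; // step past the opening quote
        while (i < tagBody.length && tagBody[i] !== quote) value += tagBody[i++];
        i++; // step past the closing quote (if any)
      } else {
        while (i < tagBody.length && !isSpace(tagBody[i])) value += tagBody[i++];
      }
    }
    attrs.push({ name, value });
  }
  return attrs;
}

// The broken markup: the quote after "border=" closes the value opened
// before "90%", so the trailing 0 starts a new attribute name.
const broken = sketchParseAttributes('width="90% border="0');
// The markup the author presumably intended.
const intended = sketchParseAttributes('width="90%" border="0"');
```

In this sketch `broken` contains a `width` attribute whose value is `90% border=`, followed by an attribute named `0` with an empty value, which matches what the console log shows.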
The problem arises when rote DOM manipulation reads through element.attributes and tries to set these same attributes on a new element. Element.setAttribute throws an InvalidCharacterError because, as noted in https://dom.spec.whatwg.org/#dom-element-setattribute , "0" "does not match the Name production in XML", viz. https://www.w3.org/TR/xml/#NT-Name .
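For reference, the Name check can be approximated outside the DOM. The sketch below transcribes the XML 1.0 Name production (NameStartChar followed by any number of NameChar) into a regular expression; `isXmlName` is a hypothetical helper, and the character ranges are my reading of the production, so treat it as an illustration rather than a conformance test.

```javascript
// Character ranges transcribed from the XML 1.0 Name production.
// String.raw keeps the backslashes so RegExp sees the \u{...} escapes.
const NAME_START_CHAR = String.raw`:A-Z_a-z\u{C0}-\u{D6}\u{D8}-\u{F6}\u{F8}-\u{2FF}\u{370}-\u{37D}\u{37F}-\u{1FFF}\u{200C}-\u{200D}\u{2070}-\u{218F}\u{2C00}-\u{2FEF}\u{3001}-\u{D7FF}\u{F900}-\u{FDCF}\u{FDF0}-\u{FFFD}\u{10000}-\u{EFFFF}`;
// NameChar additionally allows "-", ".", digits and a few combining ranges.
const NAME_CHAR =
  NAME_START_CHAR + String.raw`\-.0-9\u{B7}\u{300}-\u{36F}\u{203F}-\u{2040}`;

const XML_NAME = new RegExp(`^[${NAME_START_CHAR}][${NAME_CHAR}]*$`, "u");

// Hypothetical helper: true if `name` matches the Name production.
function isXmlName(name) {
  return XML_NAME.test(name);
}

// Digits and "." are valid inside a name but not as its first character,
// which is why "0", "1" and "." are rejected.
const results = ["width", "border", "0", "1", ".", "a0"].map((n) => [n, isXmlName(n)]);
```

Here "width", "border", and "a0" pass, while "0", "1", and "." fail because a digit or a dot cannot start a Name.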
Scripts can currently work around this issue (in reasonably complete DOM implementations) by using element.attributes.setNamedItem(otherElement.attributes[i].cloneNode()), though this isn't very elegant.
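A minimal sketch of that workaround, wrapped in a helper. `copyAttributes` is a hypothetical name; the only DOM calls it relies on are `element.attributes`, `Attr.cloneNode()` and `NamedNodeMap.setNamedItem()`, as in the one-liner above.

```javascript
// Copy every attribute from `source` to `target` without going through
// setAttribute, so parser-created names such as "0" do not throw.
// Hypothetical helper for illustration.
function copyAttributes(source, target) {
  // Array.from turns the array-like NamedNodeMap into a plain array.
  for (const attr of Array.from(source.attributes)) {
    // An Attr still owned by another element cannot be attached directly,
    // so attach a detached clone instead.
    target.attributes.setNamedItem(attr.cloneNode());
  }
}
```

In a page this would be called with two elements, for example `copyAttributes(oldTable, newTable)`, where both variables are assumed to be existing Element objects.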
I think the inconsistency here is unfortunate. I would argue for one of the following improvements:
1. Parsing an HTML document should validate attribute names the same way the DOM spec says to validate them (cf. https://dom.spec.whatwg.org/#validate and https://dom.spec.whatwg.org/#dom-element-setattribute ). If that is too problematic for backwards compatibility reasons (i.e. where document authors apparently intend for the element to have an attribute named e.g. "1" or "."), the parser should validate only where it is recovering from questionable markup such as the above.
2. setAttribute DOM API validation should be relaxed to the same standard that HTML parsing uses. If that is not possible for backwards compatibility reasons, it should be relaxed for documents with text/html content types and/or HTML (rather than XHTML/XML-based) parsing models.
gijsk changed the title from "Attribute DOM representation and parsing is confusing" to "Attribute DOM representation and parsing is inconsistent" on Jan 7, 2019
See also whatwg/dom#449. To summarize, this problem is known, but resolving it requires a lot of careful compatibility testing that nobody seems to be willing to invest in.