How to identify namespaces / SVG + MathML #72
Comments
Honestly, I have no idea what a good answer is here. Would appreciate feedback. Several proposals below. IMHO, they are all viable, but with different aesthetics. In an XML-centric world -- which is where namespaces were invented -- this is easy: QNames everywhere => problem solved. It's just that HTML didn't go the XML-centric route, so I'm not sure whether that should be our answer here. I think there are two questions: whether to consider fixed vs. arbitrary namespaces, and syntax. Should we consider arbitrary namespaces, or only the browser-supported ones? That is, should a Sanitizer config be able to specify There'd be a number of ways to represent this (depending a bit on the previous question):
|
A few ill-thought-out ideas while I'm already stepping away from my desk. I'm hopeful we don't have to solve this for arbitrary namespaces but can hardcode a couple of them? I don't expect lots of new namespaces with their own parsing rules to crop up any minute (I might be wrong!) |
We can absolutely do that, and IMHO it'd be simpler. I would like to be sure that we aren't overlooking a use case for arbitrary namespaces, though. If we don't strongly care about that - which I don't - then I think we have a pretty easy solution. |
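For illustration, here is a minimal sketch of what "hardcode a couple of them" could look like. Everything in it is an assumption made for the example, not something the spec or this thread defines: the `prefix:name` string syntax, the prefixes `html`, `svg` and `math`, the helper name `parseElementName`, and the choice to treat unprefixed names as HTML.

```typescript
// Fixed prefix table: only the namespaces the HTML parser creates elements in.
// The prefix spellings are hypothetical; the namespace URIs are the standard ones.
const NAMESPACES = {
  html: "http://www.w3.org/1999/xhtml",
  svg: "http://www.w3.org/2000/svg",
  math: "http://www.w3.org/1998/Math/MathML",
} as const;

type KnownPrefix = keyof typeof NAMESPACES;

interface QualifiedName {
  namespace: string;
  localName: string;
}

function isKnownPrefix(prefix: string): prefix is KnownPrefix {
  return Object.prototype.hasOwnProperty.call(NAMESPACES, prefix);
}

// Hypothetical helper: turns "svg:a" into (SVG namespace, "a").
// Unprefixed names are assumed to mean HTML; unknown prefixes are rejected,
// which is what makes the namespace set "fixed" rather than arbitrary.
function parseElementName(name: string): QualifiedName {
  const colon = name.indexOf(":");
  if (colon === -1) {
    return { namespace: NAMESPACES.html, localName: name };
  }
  const prefix = name.slice(0, colon);
  if (!isKnownPrefix(prefix)) {
    throw new RangeError(`Unknown namespace prefix: ${prefix}`);
  }
  return { namespace: NAMESPACES[prefix], localName: name.slice(colon + 1) };
}
```

With a closed table like this, a config never has to carry its own prefix-to-URI mapping, which is the main simplification over arbitrary namespaces.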
Can we make the namespaces implicit? For instance: |
What about |
I'd assume that if element is valid for more than one namespace, it would be valid in all of them. So simple I'm aware of five tags that are valid in two namespaces: [edit]: The reasoning for that is from my experience developers mainly are aware of tag names, not the namespaces shenanigans. So that should match their expectations. |
I like this. It seems both simple and effective. Also, amusingly, it means that the spec "by accident" already mostly does the right thing, since this would mean we act on the tag names only, and largely ignore namespaces. Disadvantages, for completeness' sake:
I very much suspect this is correct. |
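To make the "tag names only" idea concrete, here is a rough sketch of namespace-agnostic matching. The `ElementLike` shape and the helper `isAllowedByLocalName` are made up for the example; a real implementation would presumably read `Element.namespaceURI` and `Element.localName` from the DOM.

```typescript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";
const MATHML_NS = "http://www.w3.org/1998/Math/MathML";

// Minimal stand-in for a DOM element, so the sketch runs without a browser.
interface ElementLike {
  namespaceURI: string;
  localName: string;
}

// Hypothetical helper: an element passes if its local name is listed,
// regardless of which namespace it lives in.
function isAllowedByLocalName(
  el: ElementLike,
  allowed: ReadonlySet<string>,
): boolean {
  return allowed.has(el.localName);
}

const allowedElements = new Set(["a", "title"]);

// "a" exists in both HTML and SVG; one entry covers both.
const htmlAnchor: ElementLike = { namespaceURI: HTML_NS, localName: "a" };
const svgAnchor: ElementLike = { namespaceURI: SVG_NS, localName: "a" };
// Not listed, so it is rejected no matter the namespace.
const mathRow: ElementLike = { namespaceURI: MATHML_NS, localName: "mrow" };

const results = [htmlAnchor, svgAnchor, mathRow].map((el) =>
  isAllowedByLocalName(el, allowedElements),
);
```

The flip side is visible in the same sketch: there is no way to allow the HTML `a` while blocking the SVG `a`.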
Maybe we can have a bit of both worlds:
|
The HTML parser knows about these namespaces for elements:
and these for attributes:
How would we deal with allowedAttributes and namespaces? The least-surprise would probably be to use For extra fun, the attribute with local name
https://html.spec.whatwg.org/multipage/dom.html#the-lang-and-xml:lang-attributes Personally, I don't mind a sanitizer throwing away that useless attribute. For conformance authors are required to use the |
Oh my goodness. This is very sad and very useful, thanks for bringing these insights here @zcorpan! :) It's on @otherdaniel to update his proposal in #103, but what he and I discussed seems along the lines of what you suggested.
I'm not entirely sure what you mean by "the appropriate namespaced attributes" for |
I mean
So to clarify, "
The HTML parser can produce both! On HTML elements, you get the no namespace variant. On SVG and MathML elements, you get the namespaced variant. On HTML elements, the namespaced variant is not allowed, but the no-namespace one is allowed. (You can get it with DOM APIs, not with the HTML parser.) And actually, the
https://html.spec.whatwg.org/multipage/dom.html#global-attributes:html-namespace-2 |
As far as I know, The only XSS-risky one I think is |
It's been a while. The current status is:
This purposely limits Sanitizer to support the exact same namespaced content as HTML does. I think this resolves the issue; but I'll wait for a bit with closing it in case anyone disagrees. |
The spec isn't very clear about how the attribute match list namespace mapping works, imo. Is the namespace mapping always applied? Or only when the element is " |
Thanks for the feedback! The intent is:
I'll try to clarify the wording. (My own problem when trying to spec that was that, knowing how namespaces work in XML, I found the HTML adoption of this concept fairly strange. But I believe our current solution faithfully matches what HTML does.) |
OK. Note that the HTML syntax only supports these namespaced attributes on foreign elements. |
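A small sketch of that behaviour, as I understand the HTML parser's "adjust foreign attributes" step: on SVG and MathML elements a few prefixed attribute names are mapped to a namespace plus local name, while on HTML elements the same name stays a plain no-namespace attribute. The table below is only a subset of the parser's list, and `resolveAttribute` and `AttrName` are names invented for the example.

```typescript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";
const XLINK_NS = "http://www.w3.org/1999/xlink";
const XML_NS = "http://www.w3.org/XML/1998/namespace";
const XMLNS_NS = "http://www.w3.org/2000/xmlns/";

interface AttrName {
  namespace: string | null; // null means "no namespace"
  localName: string;
}

// Subset of the attribute names the parser adjusts on foreign elements.
const FOREIGN_ATTRIBUTES = new Map<string, AttrName>([
  ["xlink:href", { namespace: XLINK_NS, localName: "href" }],
  ["xlink:title", { namespace: XLINK_NS, localName: "title" }],
  ["xml:lang", { namespace: XML_NS, localName: "lang" }],
  ["xml:space", { namespace: XML_NS, localName: "space" }],
  ["xmlns", { namespace: XMLNS_NS, localName: "xmlns" }],
  ["xmlns:xlink", { namespace: XMLNS_NS, localName: "xlink" }],
]);

// Hypothetical helper: what (namespace, local name) pair a parsed attribute
// ends up with, given the namespace of the element it sits on.
function resolveAttribute(elementNamespace: string, name: string): AttrName {
  if (elementNamespace !== HTML_NS) {
    const adjusted = FOREIGN_ATTRIBUTES.get(name);
    if (adjusted !== undefined) {
      return adjusted;
    }
  }
  // HTML elements, and unlisted names on foreign elements, keep the full
  // name as the local name with no namespace.
  return { namespace: null, localName: name };
}

const onSvg = resolveAttribute(SVG_NS, "xlink:href");
const onHtml = resolveAttribute(HTML_NS, "xlink:href");
```

Here `onSvg` is the XLink-namespaced `href`, while `onHtml` is a no-namespace attribute whose local name is literally `xlink:href`. An attribute match list that applies the mapping unconditionally would treat the two as the same thing.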
Good point, I had overlooked that. This is presently mis-specified in Sanitizer. |
The current spec uses a tag's local name, meaning it cannot distinguish between SVG, MathML, and HTML element names. Clearly, that won't do long-term.