lxml trees parsed by html5lib can not be used with lxml.clean #102
Comments
We unfortunately cannot return an lxml HTML tree unless we entirely break namespace support, which seems undesirable in the extreme, thus closing as wontfix.
So, what is We can set
What are you actually wanting to use Regardless, it seems like, on the face of it, that it should support XML trees (as it is, it doesn't work for XHTML parsed by lxml either!). In general terms, I don't think it's worthwhile to support. (Looking at the implementation of
Hi, I'm very interested in html5lib being able to use lxml's HtmlMixin. @gsnedders, can you please explain why it would break namespace support (at the code level)?
Same here. The functionality it provides like @gsnedders, do you have any suggestions? @davirtavares, did you ever figure out anything?
Hey @requiredfield, in the end I wrote my own versions of these methods as functions, based on lxml's code: https://github.com/lxml/lxml/blob/master/src/lxml/html/__init__.py#L455. Unfortunately I had no time to fix it in a more elegant way :/
Responding to this (and independent of this issue as it was originally filed), would it make sense for
Yes, given HTML has been defined, and implemented in browsers, to put things in namespaces for almost ten years now. That might need changes in libxml2 too, though.
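To make the namespace point concrete, here is a minimal sketch of what namespaced HTML element names look like in an ElementTree-style API, and why code that matches on bare tag names such as "p" does not recognize them. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml, and strip_namespace is a hypothetical helper written for this illustration, not part of html5lib or lxml.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def strip_namespace(root, namespace=XHTML_NS):
    """Hypothetical helper: remove one namespace from every tag, in place."""
    prefix = "{" + namespace + "}"
    for element in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(element.tag, str) and element.tag.startswith(prefix):
            element.tag = element.tag[len(prefix):]
    return root


root = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hi</p></body></html>'
)

# Tags carry the namespace in Clark notation: "{namespace}localname".
namespaced_tags = [element.tag for element in root.iter()]

strip_namespace(root)
plain_tags = [element.tag for element in root.iter()]

print(namespaced_tags)
print(plain_tags)
```

Stripping the namespace like this discards information, which is the trade-off the maintainers describe above, so it is only reasonable when the tree is known to contain nothing but HTML elements.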
This simple script fails with html5lib.
The problem is that lxml.html.document_fromstring returns an element of type lxml.html.HtmlElement, but HTMLParser.parse returns an object of type lxml.etree._ElementTree.