User documentation? #489
Comments
https://html5lib.readthedocs.io/en/latest/ has docs, though it doesn't answer the why. https://html5lib.readthedocs.io/en/latest/html5lib.html#html5lib.html5parser.HTMLParser specifically mentions the namespacing option. The why is quite simple: it's what the HTML spec says (almost all elements parsed from HTML get inserted in the HTML namespace); you can see browsers do this via
Thanks - creating a parser object explicitly with the proper args gives the desired results. New users reading the overview might benefit from some rationale (pros and cons) for choosing a particular tree type, and a note pointing out that namespacing is one of the options that can be specified. It could be argued that new users might not be aware of namespacing and should not get it by default, while those who do need it would know enough to opt in.
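For anyone landing here later, a minimal sketch of the "explicit parser" approach mentioned above. It assumes the default etree tree builder and uses the `namespaceHTMLElements` argument of `html5lib.HTMLParser`; the sample document and variable names are made up for illustration.

```python
import html5lib

# Illustrative input, not taken from the original report.
doc = "<!DOCTYPE html><title>Hi</title><p>Hello</p>"

# Default behaviour: elements are placed in the XHTML namespace, so
# ElementTree reports tags in "{namespace}name" form.
namespaced = html5lib.parse(doc)
print(namespaced.tag)  # {http://www.w3.org/1999/xhtml}html

# Explicit parser with namespacing switched off.
parser = html5lib.HTMLParser(
    tree=html5lib.getTreeBuilder("etree"),
    namespaceHTMLElements=False,
)
plain = parser.parse(doc)
print(plain.tag)  # html

# Plain tag names now work in ElementTree lookups.
print(plain.find("body/p").text)  # Hello
```

The trade-off, as far as I can tell, is that the un-namespaced tree no longer matches what the HTML spec and browsers produce, so it is probably best treated as a convenience for simple scripts.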
I think a basic usage example would be helpful. Example: parse HTML, replace the innerHTML of all elements of a given tag. Up to now the docs are just about parsing. Please add an example of how to process the parsed data.
Something like this would be nice to have in the docs:
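A minimal sketch of the kind of processing example requested above (not the snippet originally posted): parse, replace the contents of every `p` element, and serialize back to a string. The function name `replace_paragraph_text` and the sample input are hypothetical; the html5lib calls used are `html5lib.parse` and `html5lib.serialize` with the etree tree type, and namespacing is turned off only to keep the lookups short.

```python
import html5lib


def replace_paragraph_text(html, new_text):
    """Replace the contents of every <p> with plain text and re-serialize."""
    # Parse without namespacing so "p" can be used directly in lookups.
    root = html5lib.parse(html, treebuilder="etree", namespaceHTMLElements=False)

    # Materialize the list first, since the tree is modified while looping.
    for p in list(root.iter("p")):
        # Remove child elements (their tail text goes with them), then set
        # the new text. This is roughly "innerHTML = plain text".
        for child in list(p):
            p.remove(child)
        p.text = new_text

    # omit_optional_tags=False keeps closing tags such as </p> in the output.
    return html5lib.serialize(root, tree="etree", omit_optional_tags=False)


result = replace_paragraph_text(
    "<p>old <b>bold</b> text</p><p>second</p>", "Replaced"
)
print(result)
```

Setting `new_text` as element text means it is escaped on output; inserting real markup would require parsing a fragment and appending the resulting elements instead.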
Is there any tutorial documentation for this package? Something that would answer questions like when parsing an HTML file:
Why does the result look like this?
A naive user would expect tag to return the literal value contained in the HTML element, not the tag prefixed with a qualifier of some sort. It would be helpful to have a document that explains why the prefix has been injected and how to configure the library to return unadorned tags.
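To illustrate what the prefix is: ElementTree writes a namespaced tag as `{namespace-uri}localname`, and html5lib's default output puts HTML elements in the XHTML namespace. The sketch below uses only the standard library and a hand-built tree standing in for parser output (an assumption made so it runs without html5lib); `local_name` is a hypothetical helper, not part of either library.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def local_name(tag):
    """Return the tag without its '{namespace}' prefix, if it has one."""
    if isinstance(tag, str) and tag.startswith("{"):
        return tag.split("}", 1)[1]
    return tag


# Hand-built stand-in for a namespaced tree.
root = ET.Element("{%s}html" % XHTML_NS)
body = ET.SubElement(root, "{%s}body" % XHTML_NS)
para = ET.SubElement(body, "{%s}p" % XHTML_NS)
para.text = "Hello"

print(root.tag)              # {http://www.w3.org/1999/xhtml}html
print(local_name(root.tag))  # html

# Namespace-aware lookup using a prefix map instead of stripping anything.
ns = {"h": XHTML_NS}
found = root.find("h:body/h:p", ns)
print(found.text)            # Hello
```

Either approach works on a namespaced tree: strip the prefix when only the element name matters, or pass a prefix map to `find`/`findall` when doing path lookups.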