Impact
The HTML parser in lxml does not properly handle context-switching for special HTML tags such as <svg>, <math> and <noscript>. This behavior deviates from how web browsers parse and interpret such tags. Specifically, content in CSS comments is ignored by lxml_html_clean but may be interpreted differently by web browsers, enabling malicious scripts to bypass the cleaning process. This vulnerability could lead to Cross-Site Scripting (XSS) attacks, compromising the security of users relying on lxml_html_clean in its default configuration for sanitizing untrusted HTML content.
Patches
Users employing the HTML cleaner in a security-sensitive context should upgrade to lxml_html_clean 0.4.0, which addresses this issue.
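To check whether an environment already has the fixed release, a small version comparison can help. The following is a minimal sketch, assuming the 0.4.0 release in question is that of the lxml_html_clean distribution; the helper names `parse_release`, `is_patched` and `installed_is_patched` are hypothetical and not part of any library.

```python
import re
from importlib import metadata

# Assumption: the fix shipped in release 0.4.0 of the lxml_html_clean distribution.
PATCHED_RELEASE = (0, 4, 0)


def parse_release(version: str) -> tuple:
    """Return the leading dotted-integer part of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_patched(version: str) -> bool:
    """True if the release part of `version` is at least PATCHED_RELEASE.

    Pre-release suffixes (e.g. "0.4.0rc1") are ignored by this simple check.
    """
    release = parse_release(version)
    # Pad short versions such as "0.4" so they compare as (0, 4, 0).
    release += (0,) * (len(PATCHED_RELEASE) - len(release))
    return release >= PATCHED_RELEASE


def installed_is_patched():
    """Check the installed distribution; returns None if it is not installed."""
    try:
        return is_patched(metadata.version("lxml_html_clean"))
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_is_patched()` at application start-up and refusing to sanitize untrusted input when it returns `False` is one possible way to use this check.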
Workarounds
As a temporary mitigation, users can configure lxml_html_clean with the following settings to prevent the exploitation of this vulnerability:
remove_tags: Specify tags to remove - their content is moved to their parents' tags.
kill_tags: Specify tags to be removed completely.
allow_tags: Restrict the set of permissible tags, excluding context-switching tags like <svg>, <math> and <noscript>.
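The settings above are passed as keyword arguments to the `Cleaner` class. The sketch below shows one possible configuration, assuming that removing the three tags together with their content via `kill_tags` is acceptable for the application; the advisory does not prescribe which of the three settings to use. The names `CONTEXT_SWITCHING_TAGS`, `hardened_cleaner_options` and `make_hardened_cleaner` are hypothetical.

```python
# Tags named in the advisory as triggering parser context switches.
CONTEXT_SWITCHING_TAGS = ("svg", "math", "noscript")


def hardened_cleaner_options() -> dict:
    """Keyword arguments for Cleaner that drop the risky elements entirely."""
    # Assumption: kill_tags is chosen here because it removes each element
    # together with its content; remove_tags would keep the content instead.
    return {"kill_tags": list(CONTEXT_SWITCHING_TAGS)}


def make_hardened_cleaner():
    """Build a Cleaner with the options above (requires lxml_html_clean)."""
    # Imported lazily so the options can be inspected without the package installed.
    from lxml_html_clean import Cleaner

    return Cleaner(**hardened_cleaner_options())
```

With the package installed, `make_hardened_cleaner().clean_html(untrusted_html)` returns the sanitized markup. An allowlist passed as `allow_tags` that simply omits these three tags is the alternative approach described in the last item.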