Some HTML entities are double-decoded. #2927
Comments
Thanks @RichardNeill - you're right, we're prematurely decoding those entities. It's not really double conversion, since we're not creating HTML at all, but directly instantiating the corresponding SVG elements. I don't believe there are any security holes here (if you do find one, please do let us know!), but that is the reason we're extra constrained: because plotly.js is used on our public-facing platform, we can't allow users to take responsibility for the security of their own plots. We need the conversion to be safe no matter what string is supplied.
Hi Alex, Thanks for your comment. The immediate source of confusion for me is that both So for example, if I have
Then what I see is: So, sometimes the tags are literal, and sometimes they are not. The handling is inconsistent, depending on the tag. An end user can certainly inject raw html into a chart - though so far, I haven't been able to find a way to exploit it. But it is definitely true that the system is inconsistent... a There is no way, for example, to create a chart with the following, literal, title:
Not one of them will allow me to write a literal open-angle,b,close-angle into the title of the chart! I hope that makes it clearer.
Yes, we have an open item to document this better plotly/documentation#766
Certainly by using your own javascript on your own page you can add whatever new elements you want, but you cannot inject any raw HTML or javascript using just the figure Anyway I'll have a fix shortly for the immediate issue of decoding entities improperly. Turns out there are also some easy performance and functionality gains to be had. We are not likely to support the full set of named HTML entities, just the special characters (&, <, >) that are difficult to enter explicitly, and a few that we've supported for a long time (μ, nbsp, ×, ±, °) but I think we can support all of the numbered entities. So for any named entity not in that list, just use a unicode literal or the numbered entity. There is a way to get the browser to decode arbitrary named entities, eg https://gomakethings.com/decoding-html-entities-with-vanilla-javascript/, but it involves
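To make the approach described in the comment above concrete, here is a minimal sketch of a decoder that accepts a small whitelist of named entities plus all numbered entities, and decodes each one exactly once. This is not the plotly.js implementation: `NAMED_ENTITIES` and `decodeEntities` are hypothetical names, and the whitelist is an assumption based on the characters listed in the comment, using their standard HTML entity names.

```javascript
// Hypothetical whitelist (an assumption, not plotly.js source): the special
// characters plus the few long-supported names mentioned in the comment.
const NAMED_ENTITIES = {
  amp: "&",
  lt: "<",
  gt: ">",
  nbsp: "\u00a0",
  mu: "\u03bc",
  times: "\u00d7",
  plusmn: "\u00b1",
  deg: "\u00b0",
};

// Decode entities in a single left-to-right pass. String.prototype.replace
// does not rescan its own output, so "&amp;lt;" becomes "&lt;" and stops there.
function decodeEntities(text) {
  const pattern = /&(#x[0-9a-fA-F]+|#\d+|[a-zA-Z][a-zA-Z0-9]*);/g;
  return text.replace(pattern, (whole, body) => {
    if (body[0] === "#") {
      // Numbered entity: decimal (&#60;) or hexadecimal (&#x3c;).
      const isHex = body[1] === "x";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range code points untouched rather than throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : whole;
    }
    // Named entity: decode only if whitelisted, otherwise keep the text as-is.
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : whole;
  });
}

console.log(decodeEntities("Threshold &lt; 3"));
console.log(decodeEntities("&eacute; stays as typed"));
```

Under these assumptions, a named entity outside the whitelist passes through unchanged, which matches the advice above to use a unicode literal or the numbered entity instead.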
Thank you. I confirm that this is now fixed, at least for what I needed.
Titles etc have a strange double-decoding bug for HTML entities.
For example, it is impossible to include a literal `<b>` within a title or tooltip - it is always converted to `<b>`. This has two consequences:

1. If I actually want a raw angle-bracket, for example to say `Threshold < 3`, then this is brittle, because sometimes the bracket could be interpreted as beginning a tag.
2. There is a possible security risk - it is impossible to enforce an "htmlspecialchars()" conversion to make user input safe, because entities are double-decoded.
Try for example:
This wrongly shows the words "could break" in bold. This is a bug.
However, there seems to be some special-case handling of "script" because the alert does not trigger, and the script tag is shown literally. This indicates a belt-and-braces fix in the special-case of "script", I think. So while the behaviour is safe, it's confusing, given (1).
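As a rough illustration of why decoding more than once defeats escaping, the sketch below escapes a string in the spirit of `htmlspecialchars()` and then decodes it once and twice. It is a generic demonstration, not plotly.js code: `escapeHtml` and `decodeOnce` are hypothetical helpers, and the sample strings are made up for this example.

```javascript
// htmlspecialchars()-style escaping; "&" must be replaced first so the
// entities produced by the later replacements are not escaped again.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Minimal single-pass decoder for the same four entities.
function decodeOnce(text) {
  const table = { amp: "&", lt: "<", gt: ">", quot: '"' };
  return text.replace(/&(amp|lt|gt|quot);/g, (whole, name) => table[name]);
}

// Escaping followed by exactly one decode round-trips the input.
const userInput = "Threshold < 3";
const roundTrip = decodeOnce(escapeHtml(userInput));
console.log(roundTrip === userInput);

// Text that is meant to display the characters "&lt;b&gt;" literally:
const intendedLiteral = "&amp;lt;b&amp;gt;";
console.log(decodeOnce(intendedLiteral));             // one decode: still inert text
console.log(decodeOnce(decodeOnce(intendedLiteral))); // two decodes: a raw tag
```

The point of the sketch is only that any second decoding pass turns text the author escaped on purpose back into markup, which is why a single, consistent decode step matters.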
Thanks for your time and your help.