Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve card header word breaks #628

Closed
wants to merge 1 commit into from

Conversation

nfelt
Copy link
Contributor

@nfelt nfelt commented Oct 5, 2017

Insert tags into the tag names used in card headers, so that in the case of long tag names, they will break more gracefully at word boundaries.

Before:

image

After:

image

Note that incidentally this also seems to address a bug where the long unbroken tag name pushed the containing div out too far, and bumped the "show description" circled-i icon out of view, making it inaccessible. The bug is not 100% fixed because in the case of a tag name which has an unbreakable component that exceeds the card width (e.g. "someVeryLongUnbrokenRunNameWithNoHyphensSlashesOrUnderscores/loss/scalar_summary") we would still have the problem. But that seems much less likely with this change in place.

@nfelt nfelt requested a review from jart October 5, 2017 22:29

// Break the string at natural points, including commas, equals, and slashes
_breakString: function(originalString) {
return originalString.replace(/([\/=\-_,])/g, "$1<wbr>");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a security risk. See this. Also:

  • import HtmlSanitizer from 'goog:goog.html.sanitizer.HtmlSanitizer';
  • HtmlSanitizer.sanitize(...)
  • @io_bazel_rules_closure//closure/library.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right... should have thought more about that, this was just code I copy-pasted from tf-runs-selector.html as introduced in 3113df4, but I did think the innerHtml "workaround" was a bit funny.

There are a few other places using the innerHtml stuff that seem like they could also be problematic, e.g. here there's a homegrown HTML sanitizer function that while perhaps sufficient for that case would be better off sticking to HtmlSanitizer so nobody copies it into a place where it's inappropriate:

I can do a separate PR to fix all those spots to use HtmlSanitizer and then come back and update this one.

@@ -168,6 +171,11 @@
this.$.descriptionDialog.positionTarget = e.target;
this.$.descriptionDialog.toggle();
},

// Break the string at natural points, including commas, equals, and slashes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why equals? They might be used to encode hyper parameters. If I have a run name like learning_rate=0.1,filters=64,batch_size=100 then it would be nice if made an attempt to show me one association per line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied the same set of characters used for run breaking in tf-runs-selector.html, since it seemed like it was worth having the same behavior. If you like I can restrict one or both to not break on equals signs though.

@@ -62,7 +62,10 @@
<template is="dom-if" if="[[_tagLabel]]">
<div class="heading-row">
<div class="heading-label">
tag: <span itemprop="tag">[[_tagLabel]]</span>
tag: <div
itemprop="tag"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not run too? Possibly also do name as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run was just an oversight because it's not in the scalars plugin card headers. I'll redo some test data with one of the plugins that does show runs in the headers and add this behavior to run as well.

Re: name, that didn't seem as useful since custom display names can easily have spaces in them (and it'd be a little weird to break mid-word when there are space breaks available, IMO - unlike a book we're not trying to justify text to fill a column). We could potentially insert <wbr> only for display names that don't already have whitespace, but overall this seems less important than the run or the tag because 1) human-choosen names will tend to be short and 2) automatic names are based only on the node name without any enclosing namespaces, so they'll also tend to be much shorter than the tag name (or run name).

@jart jart changed the title Add word breaking to card header tag names Improve card header word breaks Oct 6, 2017
@jart
Copy link
Contributor

jart commented Oct 6, 2017 via email

@wchargin
Copy link
Contributor

wchargin commented Oct 8, 2017

I'm not sure that goog.html.sanitize.HtmlSanitizer is what we want. We want to transform "<strong>hi</strong>" to "&lt;strong&gt;hi&lt;/strong&gt;", but HtmlSanitizer will happily act as the identity on that input.

The purpose of HtmlSanitizer is to sanitize an actual HTML string, returning an actual HTML string. For instance:

  • '<a href="http://google.com/">test</a>' is unchanged by sanitize, but
  • '<a href="javascript:void0">test</a>' becomes '<a>test</a>'.

By contrast, we really just want to embed a plain-text string into HTML: escape or quote would be a better term than sanitize.

For that purpose, all that you need to do is replace < and & with their entities. (In the code that you linked, I also replace >, but as noted that is just because I didn't like the asymmetry of escaping < but not >! It's not necessary.) HTML text nodes have no other metacharacters, so this is sufficient.

We can definitely still have a standard function escapeHtmlText(x: string): string—name up for grabs, but it should be clear that the result should be used only in a text node context as opposed to, say, an attribute context. (Properly supporting attribute escaping is tricky.) This function should go into tf_utils/utils.ts.

Alternatively, if you can find a suitable closure-library function that does the right thing, I'd be very happy with that, too!

@nfelt
Copy link
Contributor Author

nfelt commented Oct 9, 2017

@wchargin Good point. I was assuming HtmlSanitizer could be configured to have a tag whitelist of just <wbr> but you're right that escaping is a better model than sanitizing for what we want here. Given the spec reference you cited, I'm comfortable sticking to explicitly escaping <&> with entity references.

One question: I was contemplating making a polymer element for a <span> with wordbreaking for its content, that does the <wbr> insertion and HTML escaping in a more reusable way, since it's now going to be used in at least three places. Does this seem reasonable? I'd probably put the escapeHtmlText() function in utils.ts and then have the element definition live in tf_dashboard_common since that seems like the right place for a generic element.

@wchargin
Copy link
Contributor

wchargin commented Oct 9, 2017

could be configured to have a tag whitelist of just <wbr>

Not unreasonable, but there's still a difference: a run called "a<wbr>z" should become "a&lt;wbr&gt;z".

I was contemplating making a polymer element for a <span> with wordbreaking for its content, that does the <wbr> insertion and HTML escaping in a more reusable way

Semantically, this is—in my opinion—entirely reasonable.

If I were writing the code, though, I probably wouldn't write such a component: defining new components in Polymer feels extremely clunky to me, so that components are "heavyweight" instead of "lightweight," and it's not clear to me that this component pulls its weight. By that I mean: for a Polymer component, you need your own file, some DOM-module and import boilerplate, a script tag where you declare some properties, a computed property for the escaped text, and then a template with the implementation. (Contrast something like React, where the entire component is literally export const WordBreaks = (props) => <span>{addWordBreaks(props.children)}</span>; in an existing utilityComponents.ts file or similar.)

Note that I'm not asking that you not implement it, just letting you know where I'm coming from.

One question: does it make sense to have the same component handle <wbr>s and HTML escaping?

have the element definition live in tf_dashboard_common since that seems like the right place for a generic element

We're trying to get rid of tf_dashboard_common, because all components in tensorboard/components/ should by definition be common across all dashboards—otherwise, they'd be in tensorboard/plugins/*/. So just create tensorboard/components/tf_word_breaks/tf-word-breaks.html (name up to you).

@nfelt
Copy link
Contributor Author

nfelt commented Oct 9, 2017

The polymer component approach is a little more heavyweight, but the appeal to me was that then we can encapsulate the somewhat messy business of setting inner-h-t-m-l=[[breakWords(escapeHtml(...))]] so that it just takes care of the escaping and HTML insertion for you, with the ultimate goal of auditing and restricting usage of the inner-h-t-m-l hack so that as jart@ said we can be more confident about XSS protection for TensorBoards showing logfile data across trust boundaries.

Also, the interplay of word-breaking and escaping is also a little subtle; you can't escape after breakWords() since then you'd escape <wbr> itself, but if breakWords() was ever changed to break on e.g. & or ; then you'd start inserting <wbr> into entity references incorrectly. Making that process bulletproof would mean escaping each word segment individually before joining them with <wbr>, but at that point you could just use a "native" dom-repeat to do the join and avoid the inner-h-t-m-l usage entirely, which is an advantage you only get with a polymer component.

Re: tf_dashboard_common, thanks for the heads up. Assuming this sounds reasonable I'll go ahead and make a top-level tf_word_breaker component.

@jart
Copy link
Contributor

jart commented Oct 9, 2017

Ah yes, sanitization is somewhat different from entity escaping. If Polymer has a way to properly address the XSS concern without needing Closure Library, then that might be better.

Regarding the tf_word_breaker component I would suggest instead defining vz_strings and then have it absorb the functionality from vz_sorting.

@stephanwlee
Copy link
Contributor

Will close this PR since we have #1602 for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants