Skip to content
This repository has been archived by the owner on Sep 11, 2024. It is now read-only.

Remove escape backslashes in non-Markdown messages #4694

Merged
merged 1 commit into from
Jun 17, 2020
Merged

Remove escape backslashes in non-Markdown messages #4694

merged 1 commit into from
Jun 17, 2020

Conversation

justin-sleep
Copy link
Contributor

@justin-sleep justin-sleep commented Jun 3, 2020

Iterating on #4008, fixes element-hq/element-web#11230. A couple of changes:

  1. I only call toPlaintext if there's a backslash present in the message.
  2. I attempt to account for HTML entities not being converted by removing the override of HtmlRenderer's out function. As far as I can tell, we do want the messages to be entity-encoded when calling toPlaintext (at least, none of the tests broke! 😄).

@t3chguy t3chguy requested a review from a team June 3, 2020 22:12
@@ -42,6 +42,10 @@ export function htmlSerializeIfNeeded(model: EditorModel, {forceHTML = false} =
if (!parser.isPlainText() || forceHTML) {
return parser.toHTML();
}
// ensure removal of escape backslashes in non-Markdown messages
if (md.indexOf("\\") > -1) {
return parser.toPlaintext();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't this mean we'd lose formatting on stuff like \*hello\* **world**? ("hello" should be escaped, but world should not)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is caught by the earlier toHTML call (isPlaintext returns false for that message), and results in a formatted_body of *hello* <strong>world</strong> as expected.

Comment on lines -178 to -185
// The default `out` function only sends the input through an XML
// escaping function, which causes messages to be entity encoded,
// which we don't want in this case.
renderer.out = function(s) {
// The `lit` function adds a string literal to the output buffer.
this.lit(s);
};

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are we accomplishing by removing this? Surely we still want to avoid the encoding the comment describes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the out function it's overriding:

function out(s) {
  this.lit(this.esc(s, false));
}

and esc:

function(s, preserve_entities) {
    if (reXmlSpecial.test(s)) {
        if (preserve_entities) {
            return s.replace(reXmlSpecialOrEntity, replaceUnsafeChar);
        } else {
            return s.replace(reXmlSpecial, replaceUnsafeChar);
        }
    } else {
        return s;
    }
}

Removing it accounts for this problem. Since toPlaintext emits HTML (rather unintuitively, given the name), if we don't encode HTML entities they are processed literally.

As an example, if the out override is left in, the message \*hello\* world '<' returns a formatted_body of *hello* world '<'. Since the message is HTML, Riot produces this:

2020-06-05-13:14:07

@turt2live turt2live self-requested a review June 8, 2020 16:05
Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, this seems sane enough to try it. Will merge after the RC is cut just in case.

@turt2live turt2live merged commit 803b7bb into matrix-org:develop Jun 17, 2020
@justin-sleep justin-sleep deleted the remove-escape-backslashes branch June 23, 2020 18:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't escape > at the start of message with \ in the new composer
2 participants