This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

URL previews contain entities #14708

Closed
clokep opened this issue Dec 19, 2022 · 1 comment · Fixed by #14781
Labels
A-URL-Preview Issues related to generating server-side previews of remote URLs O-Occasional Affects or can be seen by some users regularly or most users rarely S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@clokep
Member

clokep commented Dec 19, 2022

(I'm fairly certain this is a Synapse bug and not a client bug.)

Previews of some URLs, e.g. https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/ end up with HTML entities in them:

image

The page does have an og:title element (with a value that matches the title and twitter:title elements). This includes the HTML entity (&#039;) in it. I'm guessing Synapse should be unescaping this before returning the JSON blob.

Relevant HTML
<!-- This site is optimized with the Yoast SEO plugin v12.1 - https://yoast.com/wordpress/plugins/seo/ -->
<title>Why JSON isn&#039;t a Good Configuration Language - Lucidchart</title>
<meta name="description" content="Learn why JSON falls short as a configuration language and what you can use instead as you create a new application, framework, or library."/>
<link rel="canonical" href="https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/" />
<meta property="og:locale" content="en_US" />
<meta property="og:type" content="article" />
<meta property="og:title" content="Why JSON isn&#039;t a Good Configuration Language - Lucidchart" />
<meta property="og:description" content="Learn why JSON falls short as a configuration language and what you can use instead as you create a new application, framework, or library." />
<meta property="og:url" content="https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/" />
<meta property="og:site_name" content="Lucidchart" />
<meta property="article:publisher" content="https://www.facebook.com/lucidchart" />
<meta property="article:tag" content="configuration" />
<meta property="article:tag" content="json" />
<meta property="article:section" content="Behind the Scenes" />
<meta property="article:published_time" content="2018-07-16T16:28:37-06:00" />
<meta property="article:modified_time" content="2019-05-09T22:30:50-06:00" />
<meta property="og:updated_time" content="2019-05-09T22:30:50-06:00" />
<meta property="og:image" content="https://www.lucidchart.com/techblog/wp-content/uploads/2019/11/LucidBlogDefaultImage.png" />
<meta property="og:image:secure_url" content="https://www.lucidchart.com/techblog/wp-content/uploads/2019/11/LucidBlogDefaultImage.png" />
<meta property="og:image:width" content="1600" />
<meta property="og:image:height" content="686" />
<meta name="twitter:card" content="summary_large_image" />
<meta name="twitter:description" content="Learn why JSON falls short as a configuration language and what you can use instead as you create a new application, framework, or library." />
<meta name="twitter:title" content="Why JSON isn&#039;t a Good Configuration Language - Lucidchart" />
<meta name="twitter:site" content="@lucidchart" />
<meta name="twitter:image" content="https://www.lucidchart.com/techblog/wp-content/uploads/2019/11/LucidBlogDefaultImage.png" />
<meta name="twitter:creator" content="@lucidchart" />
<script type='application/ld+json' class='yoast-schema-graph yoast-schema-graph--main'>{"@context":"https://schema.org","@graph":[{"@type":"WebSite","@id":"https://www.lucidchart.com/techblog/#website","url":"https://www.lucidchart.com/techblog/","name":"Lucidchart","potentialAction":{"@type":"SearchAction","target":"https://www.lucidchart.com/techblog/?s={search_term_string}","query-input":"required name=search_term_string"}},{"@type":"WebPage","@id":"https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/#webpage","url":"https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/","inLanguage":"en-US","name":"Why JSON isn&#039;t a Good Configuration Language - Lucidchart","isPartOf":{"@id":"https://www.lucidchart.com/techblog/#website"},"datePublished":"2018-07-16T16:28:37-06:00","dateModified":"2019-05-09T22:30:50-06:00","author":{"@id":"https://www.lucidchart.com/techblog/#/schema/person/51674496bb30a2f40b3b602a0bd0e488"},"description":"Learn why JSON falls short as a configuration language and what you can use instead as you create a new application, framework, or library."},{"@type":["Person"],"@id":"https://www.lucidchart.com/techblog/#/schema/person/51674496bb30a2f40b3b602a0bd0e488","name":"Thayne McCombs","image":{"@type":"ImageObject","@id":"https://www.lucidchart.com/techblog/#authorlogo","url":"https://secure.gravatar.com/avatar/78e2a9fed0b202a590a912246647f520?s=96&d=mm&r=g","caption":"Thayne McCombs"},"sameAs":[]}]}</script>
<!-- / Yoast SEO plugin. -->
Processed Open Graph response
{
  "og:locale": "en_US",
  "og:type": "article",
  "og:title": "Why JSON isn&#8217;t a Good Configuration Language",
  "og:description": "Why JSON isn\u2019t a Good Configuration Language",
  "og:url": "https://www.lucidchart.com/techblog/2018/07/16/why-json-isnt-a-good-configuration-language/",
  "og:site_name": "Lucidchart",
  "og:updated_time": "2019-05-09T22:30:50-06:00",
  "og:image:secure_url": "https://www.lucidchart.com/techblog/wp-content/uploads/2019/11/LucidBlogDefaultImage.png",
  "og:image:width": "1600",
  "og:image:height": "686"
}

As an aside, the processed description seems to be incorrect. This seems to be because the page also offers an oEmbed link to https://www.lucidchart.com/techblog/wp-json/oembed/1.0/embed?url=https%3A%2F%2Fwww.lucidchart.com%2Ftechblog%2F2018%2F07%2F16%2Fwhy-json-isnt-a-good-configuration-language%2F, which has worse info in it:

oEmbed Response
{
  "version": "1.0",
  "provider_name": "Lucidchart",
  "provider_url": "https:\/\/www.lucidchart.com\/techblog",
  "author_name": "Thayne McCombs",
  "author_url": "https:\/\/www.lucidchart.com\/techblog\/author\/thayne-mccombs\/",
  "title": "Why JSON isn&#8217;t a Good Configuration Language",
  "type": "rich",
  "width": 600,
  "height": 338,
  "html": "<blockquote class=\"wp-embedded-content\" data-secret=\"BVNcfqyYa5\"><a href=\"https:\/\/www.lucidchart.com\/techblog\/2018\/07\/16\/why-json-isnt-a-good-configuration-language\/\">Why JSON isn&#8217;t a Good Configuration Language<\/a><\/blockquote><iframe sandbox=\"allow-scripts\" security=\"restricted\" src=\"https:\/\/www.lucidchart.com\/techblog\/2018\/07\/16\/why-json-isnt-a-good-configuration-language\/embed\/#?secret=BVNcfqyYa5\" width=\"600\" height=\"338\" title=\"&#8220;Why JSON isn&#8217;t a Good Configuration Language&#8221; &#8212; Lucidchart\" data-secret=\"BVNcfqyYa5\" frameborder=\"0\" marginwidth=\"0\" marginheight=\"0\" scrolling=\"no\" class=\"wp-embedded-content\"><\/iframe><script type=\"text\/javascript\">\n\/*! This file is auto-generated *\/\n!function(c,l){\"use strict\";var e=!1,o=!1;if(l.querySelector)if(c.addEventListener)e=!0;if(c.wp=c.wp||{},c.wp.receiveEmbedMessage);else if(c.wp.receiveEmbedMessage=function(e){var t=e.data;if(!t);else if(!(t.secret||t.message||t.value));else if(\/[^a-zA-Z0-9]\/.test(t.secret));else{for(var r,s,a,i=l.querySelectorAll('iframe[data-secret=\"'+t.secret+'\"]'),n=l.querySelectorAll('blockquote[data-secret=\"'+t.secret+'\"]'),o=0;o<n.length;o++)n[o].style.display=\"none\";for(o=0;o<i.length;o++)if(r=i[o],e.source!==r.contentWindow);else{if(r.removeAttribute(\"style\"),\"height\"===t.message){if(1e3<(s=parseInt(t.value,10)))s=1e3;else if(~~s<200)s=200;r.height=s}if(\"link\"===t.message)if(s=l.createElement(\"a\"),a=l.createElement(\"a\"),s.href=r.getAttribute(\"src\"),a.href=t.value,a.host===s.host)if(l.activeElement===r)c.top.location.href=t.value}}},e)c.addEventListener(\"message\",c.wp.receiveEmbedMessage,!1),l.addEventListener(\"DOMContentLoaded\",t,!1),c.addEventListener(\"load\",t,!1);function t(){if(o);else{o=!0;for(var e,t,r,s=-1!==navigator.appVersion.indexOf(\"MSIE 
10\"),a=!!navigator.userAgent.match(\/Trident.*rv:11\\.\/),i=l.querySelectorAll(\"iframe.wp-embedded-content\"),n=0;n<i.length;n++){if(!(r=(t=i[n]).getAttribute(\"data-secret\")))r=Math.random().toString(36).substr(2,10),t.src+=\"#?secret=\"+r,t.setAttribute(\"data-secret\",r);if(s||a)(e=t.cloneNode(!0)).removeAttribute(\"security\"),t.parentNode.replaceChild(e,t);t.contentWindow.postMessage({message:\"ready\",secret:r},\"*\")}}}}(window,document);\n<\/script>\n"
}
@clokep clokep added S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. O-Occasional Affects or can be seen by some users regularly or most users rarely A-URL-Preview Issues related to generating server-side previews of remote URLs labels Dec 19, 2022
@clokep
Member Author

clokep commented Dec 19, 2022

It looks like this is due to the oEmbed title field having the HTML entity in it. The oEmbed spec doesn't say this is valid, but 🤷
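
A minimal sketch of the kind of fix this points at, assuming the server builds its Open Graph dict from the oEmbed JSON: decode HTML entities in the oEmbed title before returning it. The function name `oembed_to_open_graph` and the trimmed-down sample payload are hypothetical, for illustration only; this is not Synapse's actual code, just the standard-library `html.unescape` applied to the title field.

```python
import html
import json


def oembed_to_open_graph(oembed_json: str) -> dict:
    """Hypothetical helper: map an oEmbed response to Open Graph keys,
    decoding any HTML entities found in the title."""
    oembed = json.loads(oembed_json)
    open_graph = {}

    title = oembed.get("title")
    if isinstance(title, str):
        # "&#8217;" becomes the real character U+2019 instead of leaking
        # into the preview as literal entity text.
        open_graph["og:title"] = html.unescape(title)

    provider = oembed.get("provider_name")
    if isinstance(provider, str):
        open_graph["og:site_name"] = html.unescape(provider)

    return open_graph


# Trimmed version of the oEmbed response quoted in this issue.
raw = json.dumps(
    {
        "version": "1.0",
        "provider_name": "Lucidchart",
        "title": "Why JSON isn&#8217;t a Good Configuration Language",
        "type": "rich",
    }
)

result = oembed_to_open_graph(raw)
print(result["og:title"])
```

Doing the unescape on the server keeps clients simple, since they can keep treating og:title as plain text rather than each having to guess whether a title still contains entities.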
