Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

πŸ› Fixed unexpected conversion of single-quoted attributes in HTML cards #19727

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 17 additions & 17 deletions ghost/core/test/e2e-api/content/__snapshots__/posts.test.js.snap

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions ghost/core/test/utils/fixtures/data-generator.js
Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,7 @@ DataGenerator.Content = {
id: '618ba1ffbe2896088840a6e7',
title: 'Not so short, bit complex',
slug: 'not-so-short-bit-complex',
// NOTE: this has some invalid HTML, has a missing `</p>` after `</nav>`
mobiledoc: DataGenerator.markdownToMobiledoc('<p><nav><ul><li><a href=\"#nowhere\" title=\"Anchor URL\">Lorem</a></li><li><a href=\"__GHOST_URL__/about#nowhere\" title=\"Relative URL\">Aliquam</a></li><li><a href=\"//somewhere.com/link#nowhere\" title=\"Protocol Relative URL\">Tortor</a></li><li><a href=\"http://somewhere.com/link#nowhere\" title=\"Absolute URL\">Morbi</a></li><li><a href=\"#nowhere\" title=\"Praesent dapibus, neque id cursus faucibus\">Praesent</a></li><li><a href=\"#nowhere\" title=\"Pellentesque fermentum dolor\">Pellentesque</a></li></ul></nav><p>Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae, ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam egestas semper. Aenean ultricies mi vitae est. Mauris placerat eleifend leo.</p><table><thead><tr><th>1</th><th>2</th><th>3</th><th>4</th></tr></thead><tbody><tr><td>a</td><td>b</td><td>c</td><td>d</td></tr><tr><td>e</td><td>f</td><td>g</td><td>h</td></tr><tr><td>i</td><td>j</td><td>k</td><td>l</td></tr></tbody></table><dl><dt>Definition list</dt><dd>Consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</dd><dt>Lorem ipsum dolor sit amet</dt><dd>Consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</dd></dl><ul><li>Morbi in sem quis dui placerat ornare. Pellentesque odio nisi, euismod in, pharetra a, ultricies in, diam. Sed arcu. Cras consequat.</li><li>Praesent dapibus, neque id cursus faucibus, tortor neque egestas augue, eu vulputate magna eros eu erat. Aliquam erat volutpat. Nam dui mi, tincidunt quis, accumsan porttitor, facilisis luctus, metus.</li><li>Phasellus ultrices nulla quis nibh. Quisque a lectus. Donec consectetuer ligula vulputate sem tristique cursus. Nam nulla quam, gravida non, commodo a, sodales sit amet, nisi.</li><li>Pellentesque fermentum dolor. Aliquam quam lectus, facilisis auctor, ultrices ut, elementum vulputate, nunc.</li></ul></p>'),
published_at: new Date('2015-01-04'),
featured: true
Expand Down
95 changes: 70 additions & 25 deletions ghost/link-replacer/lib/link-replacer.js
Original file line number Diff line number Diff line change
Expand Up @@ -8,39 +8,84 @@ class LinkReplacer {
* @returns {Promise<string>}
*/
async replace(html, replaceLink, options = {}) {
const cheerio = require('cheerio');
const {tokenize} = require('html5parser');
const entities = require('entities');

try {
const $ = cheerio.load(html, {
xml: {
// This makes sure we use the faster and less destructive htmlparser2 parser
xmlMode: false
},
// Do not replace &, ', " and others with HTML entities (is bugged because it replaces &map_ with something weird (&#x21A6;))
decodeEntities: false
}, false);

for (const el of $('a').toArray()) {
const href = $(el).attr('href');
if (href) {
let url;
const path = entities.decode(href);
try {
url = new URL(path, options.base);
} catch (e) {
// Ignore invalid URLs
const tokens = tokenize(html); // IToken[]
const replacements = [];

let inAnchor = false;
let inHref = false;

// interface IToken {
// start: number;
// end: number;
// value: string;
// type: TokenKind;
// }

// const enum TokenKind {
// 0 Literal,
// 1 OpenTag, // trim leading '<'
// 2 OpenTagEnd, // trim tailing '>', only could be '/' or ''
// 3 CloseTag, // trim leading '</' and tailing '>'
// 4 Whitespace, // the whitespace between attributes
// 5 AttrValueEq,
// 6 AttrValueNq,
// 7 AttrValueSq,
// 8 AttrValueDq,
// }

for (const token of tokens) {
if (token.type === 1 && token.value === 'a') {
inAnchor = true;
}

if (inAnchor) {
if (token.type === 2) {
inAnchor = false;
inHref = false;
}

if (token.type === 6 && token.value === 'href') {
inHref = true;
}
if (url) {
url = await replaceLink(url, path);
const str = url.toString();
$(el).attr('href', str);

if (inHref && token.type === 8) {
const path = entities.decode(token.value.substring(1, token.value.length - 1));
let url;
try {
url = new URL(path, options.base);
} catch (e) {
// Ignore invalid URLs
}
if (url) {
url = await replaceLink(url, path);
const str = url.toString();
replacements.push({url: str, start: token.start + 1, end: token.end - 1});
}

inHref = false;
}
}
}

return $.html();
let offsetAdjustment = 0;

replacements.forEach(({url, start, end}) => {
const originalLength = end - start;
const replacementLength = url.length;

html = html.slice(0, start + offsetAdjustment) + url + html.slice(end + offsetAdjustment);

offsetAdjustment += replacementLength - originalLength;
});

return html;
} catch (e) {
// Catch errors from cheerio
// do nothing in case of error,
// we don't want to break the content for the sake of member attribution
return html;
}
}
Expand Down
5 changes: 3 additions & 2 deletions ghost/link-replacer/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -18,13 +18,14 @@
"lib"
],
"devDependencies": {
"@probe.gl/bench": "^4.0.6",
"c8": "8.0.1",
"mocha": "10.2.0",
"should": "13.2.3",
"sinon": "15.2.0"
},
"dependencies": {
"cheerio": "0.22.0",
"entities": "4.5.0"
"entities": "4.5.0",
"html5parser": "^2.0.2"
}
}
14 changes: 11 additions & 3 deletions ghost/link-replacer/test/LinkReplacer.test.js
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
const assert = require('assert/strict');
const linkReplacer = require('../lib/link-replacer');
const cheerio = require('cheerio');
const html5parser = require('html5parser');
const sinon = require('sinon');

describe('LinkReplacementService', function () {
Expand Down Expand Up @@ -83,12 +83,20 @@ describe('LinkReplacementService', function () {
assert.equal(replaced, expected);
});

it('Ignores cheerio errors', async function () {
sinon.stub(cheerio, 'load').throws(new Error('test'));
it('Ignores parse errors', async function () {
sinon.stub(html5parser, 'tokenize').throws(new Error('test'));
const html = '<a href="http://localhost:2368/dir/path">link</a>';

const replaced = await linkReplacer.replace(html, () => 'valid');
assert.equal(replaced, html);
});

it('Doesn\'t replace single-quote attributes with double-quote', async function () {
const html = '<div data-graph-name=\'The "all-in" cost of a grant\'>Test</div>';
const expected = '<div data-graph-name=\'The "all-in" cost of a grant\'>Test</div>';

const replaced = await linkReplacer.replace(html, () => new URL('https://google.com/test-dir?test-query'));
assert.equal(replaced, expected);
});
});
});
14 changes: 14 additions & 0 deletions ghost/link-replacer/test/benchmark.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
const {readFileSync} = require('node:fs');
const {Bench} = require('@probe.gl/bench');
const linkReplacer = require('../lib/link-replacer');
const linkReplacerNew = require('../lib/link-replacer-new');

// load html from file in ./fixtures/example-post.html
const html = readFileSync('./test/fixtures/example-post.html', {encoding: 'utf8', flag: 'r'});

const bench = new Bench()
.group('LinkReplacer')
.addAsync('cheerio', () => linkReplacer.replace(html, () => new URL('https://google.com/test-dir?test-query')))
.addAsync('html5parser', () => linkReplacerNew.replace(html, () => new URL('https://google.com/test-dir?test-query')));

bench.run();
78 changes: 78 additions & 0 deletions ghost/link-replacer/test/fixtures/example-post.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
<p>Faster and more robust than ever before, we just shipped a complete rewrite of the Ghost editor. This is our third
major iteration of the Ghost editor, packed with new features, including:</p>
<ul>
<li><a href="https://ghost.org/changelog/image-editor/"><strong>Native image editing</strong></a> - so you can
adjust photos on the fly</li>
<li><a href="https://ghost.org/changelog/post-history/"><strong>Post history</strong></a> - so you can see who
edited what, when, and restore old versions</li>
<li><a href="https://ghost.org/changelog/create-landing-pages/"><strong>Landing page cards</strong></a> - so you can
build beautiful custom experiences</li>
<li><a href="https://ghost.org/changelog/bookmarker/"><strong>Bookmarking</strong></a> - so you can collect links
from around the web for your posts</li>
</ul>
<p>And some fixes for longstanding issues with our previous editor, like:</p>
<ul>
<li><strong>Faster overall performance</strong> - things just feel more <em>snappy</em></li>
<li><strong>Improved handling of very large posts</strong> - which, in the past, was... painful</li>
<li><strong>Better undo/redo chaining</strong> - a smoother experience when fixing mistakes</li>
<li><strong>Much improved mobile editing</strong> - so you can write on the go in iOS / Android</li>
<li><strong>Nested lists</strong> - for structuring your bulleted thoughts<ul>
<li>Which wasn't possible before<ul>
<li>But is now</li>
</ul>
</li>
</ul>
</li>
<li><strong>More keyboard shortcuts</strong> - find the full list in the post settings menu</li>
</ul>
<p>The new editor is now available across all Ghost installs. <a
href="https://ghost.org/pricing/"><strong>Ghost(Pro)</strong></a> users can log into their sites to give it a
try. If you're a developer, self-hosting Ghost, you'll need to <a href="https://ghost.org/docs/update/">update</a>
to the latest version to get access to everything that's new.</p>
<hr>
<h2 id="developer-changes">Developer changes</h2>
<p>Keep reading below if you're curious about the technical details behind the new editor, and what it means if you're
building API integrations with Ghost.</p>
<figure class="kg-card kg-image-card kg-width-wide"><img
src="https://ghost.org/changelog/content/images/2023/10/Frame-1--4-.png" class="kg-image" alt="" loading="lazy"
width="2000" height="1052"></figure>
<p>As we worked on this new editor, one of our main goals was to keep things the same. We made a few visual tweaks here
and there, but for the most part it's still the same editor you know and love... it just works better than it did
before.</p>
<p>Under the hood, though, the technical changes we've made to the editor unlock exciting possibilities for the future.
</p>
<p>Ghost's editor, called Koenig, was previously built in <a href="https://emberjs.com/">Ember.js</a> on an open
JSON-based document storage format called <a href="https://github.com/bustle/mobiledoc-kit">MobileDoc</a>. We loved
how it worked, but MobileDoc never became widely adopted, so the technology underpinning our editor became a bit
stagnant. This limited our ability to build new features, or solve frustrating core bugs (like better mobile
support).</p>
<p>Koenig has now been rebuilt on a new stack: <a href="https://react.dev/">React.js</a> and <a
href="https://lexical.dev/">Lexical</a> β€” both of which are open source frameworks developed by Meta. So, Ghost
is now using the same underlying technology that powers every single editor, comment box, or user input for billions
of users across Facebook and Instagram.</p>
<figure class="kg-card kg-image-card kg-width-wide kg-card-hascaption"><img
src="https://ghost.org/changelog/content/images/2023/10/[email protected]"
class="kg-image" alt="" loading="lazy" width="2000" height="1246">
<figcaption><span style="white-space: pre-wrap;">Try the new Koenig editor for yourself β€” </span><a
href="https://koenig.ghost.org/"><span style="white-space: pre-wrap;">https://koenig.ghost.org</span></a>
</figcaption>
</figure>
<p>Ghost is the first independent company outside of Meta to build a full-scale dynamic editor on top of Lexical, and we
worked directly with the Lexical core team to make it happen. Today's announcement reflects over a year of quiet,
dedicated work by both teams to get to where we are now.</p>
<p>We have lots of plans for continuing to improve Ghost's editing experience, and this shift in architecture has opened
a lot of new doors for what's possible next.</p>
<p>For developers building integrations with Ghost, check out our updated API docs, which cover how to interact with
Lexical content stored in the database:</p>
<figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container"
href="https://ghost.org/docs/admin-api/#posts">
<div class="kg-bookmark-content">
<div class="kg-bookmark-title">Ghost Admin API Documentation</div>
<div class="kg-bookmark-description">Manage content via Ghost’s Admin API, with secure role-based
authentication. Read more on Ghost Docs πŸ‘‰</div>
<div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://ghost.org/favicon.ico"
alt=""><span class="kg-bookmark-author">Ghost - The Professional Publishing Platform</span></div>
</div>
<div class="kg-bookmark-thumbnail"><img src="https://ghost.org/images/meta/ghost-docs.png" alt=""></div>
</a></figure>
<p></p>
2 changes: 1 addition & 1 deletion ghost/member-attribution/test/outbound-link-tagger.test.js
Original file line number Diff line number Diff line change
Expand Up @@ -189,7 +189,7 @@ describe('OutboundLinkTagger', function () {
}
});
const html = await service.addToHtml('<a href="test">Hello world</a><a href="">Hello world</a>');
assert.equal(html, '<a href="test">Hello world</a><a href>Hello world</a>');
assert.equal(html, '<a href="test">Hello world</a><a href="">Hello world</a>');
});

it('keeps HTML if throws', async function () {
Expand Down
39 changes: 38 additions & 1 deletion yarn.lock
Original file line number Diff line number Diff line change
Expand Up @@ -1852,6 +1852,13 @@
dependencies:
regenerator-runtime "^0.13.4"

"@babel/runtime@^7.0.0":
version "7.23.9"
resolved "https://registry.yarnpkg.com/@babel/runtime/-/runtime-7.23.9.tgz#47791a15e4603bb5f905bc0753801cf21d6345f7"
integrity sha512-0CX6F+BI2s9dkUqr08KFrAIZgNFj75rdBU/DjCyYLIaV/quFjkk6T+EJ2LkZHyZTbEV4L5p97mNkUsHl2wLFAw==
dependencies:
regenerator-runtime "^0.14.0"

"@babel/runtime@^7.10.5", "@babel/runtime@^7.12.0", "@babel/runtime@^7.12.5", "@babel/runtime@^7.13.10", "@babel/runtime@^7.17.8", "@babel/runtime@^7.18.3", "@babel/runtime@^7.18.6", "@babel/runtime@^7.20.6", "@babel/runtime@^7.21.0", "@babel/runtime@^7.23.2", "@babel/runtime@^7.5.5", "@babel/runtime@^7.8.4", "@babel/runtime@^7.8.7", "@babel/runtime@^7.9.2":
version "7.23.2"
resolved "https://registry.yarnpkg.com/@babel/runtime/-/runtime-7.23.2.tgz#062b0ac103261d68a966c4c7baf2ae3e62ec3885"
Expand Down Expand Up @@ -3963,6 +3970,29 @@
resolved "https://registry.yarnpkg.com/@prisma/prisma-fmt-wasm/-/prisma-fmt-wasm-4.17.0-16.27eb2449f178cd9fe1a4b892d732cc4795f75085.tgz#030f8a4448892c345b3c5c0558ca0ebf4642f3de"
integrity sha512-zYz3rFwPB82mVlHGknAPdnSY/a308dhPOblxQLcZgZTDRtDXOE1MgxoRAys+jekwR4/bm3+rZDPs1xsFMsPZig==

"@probe.gl/bench@^4.0.6":
version "4.0.6"
resolved "https://registry.yarnpkg.com/@probe.gl/bench/-/bench-4.0.6.tgz#1aacc78f747259e4522ad4c852145f2727b88fc6"
integrity sha512-uwveDPyMPaGo/5HcBiOFQHPzFEIfsjV2VoW2qAsVVr7+FHie5QVDh6jnipAYeFsk6RZLX7vskbjRX3aFWu5HxA==
dependencies:
"@babel/runtime" "^7.0.0"
"@probe.gl/log" "4.0.6"

"@probe.gl/[email protected]":
version "4.0.6"
resolved "https://registry.yarnpkg.com/@probe.gl/env/-/env-4.0.6.tgz#ea73bfb60ed862dd37654b833ca2e38160d53f8b"
integrity sha512-nF7/LrBgp5YU2va+7pgKRHbh22zK8OIUhVw/N1O9pqM9AbifIGwoi0rFN5QIO4bxAvxcC6iUutgLQq5Y5yRr8A==
dependencies:
"@babel/runtime" "^7.0.0"

"@probe.gl/[email protected]":
version "4.0.6"
resolved "https://registry.yarnpkg.com/@probe.gl/log/-/log-4.0.6.tgz#820808bb958b9ec4df588ade684bed60ce2195ff"
integrity sha512-w4rESrMxLF+nsgxqBFUMlf/dFwOW3o+PDBzl5pAPpyhiYCUEwYCTgD4FwE/uguzpK1Q+ms3fDF7jSnoIqMR0fQ==
dependencies:
"@babel/runtime" "^7.0.0"
"@probe.gl/env" "4.0.6"

"@protobufjs/aspromise@^1.1.1", "@protobufjs/aspromise@^1.1.2":
version "1.1.2"
resolved "https://registry.yarnpkg.com/@protobufjs/aspromise/-/aspromise-1.1.2.tgz#9b8b0cc663d669a7d8f6f5d0893a14d348f30fbf"
Expand Down Expand Up @@ -18594,6 +18624,13 @@ [email protected]:
prompts "^2.0.0"
semver "^7.0.0"

html5parser@^2.0.2:
version "2.0.2"
resolved "https://registry.yarnpkg.com/html5parser/-/html5parser-2.0.2.tgz#dd504884761e024e682e1535ee9b5b997cc47293"
integrity sha512-L0y+IdTVxHsovmye8MBtFgBvWZnq1C9WnI/SmJszxoQjmUH1psX2uzDk21O5k5et6udxdGjwxkbmT9eVRoG05w==
dependencies:
tslib "^2.2.0"

htmlparser2@^3.10.1, htmlparser2@^3.9.1:
version "3.10.1"
resolved "https://registry.yarnpkg.com/htmlparser2/-/htmlparser2-3.10.1.tgz#bd679dc3f59897b6a34bb10749c855bb53a9392f"
Expand Down Expand Up @@ -29591,7 +29628,7 @@ tslib@^1.11.1, tslib@^1.13.0, tslib@^1.9.0:
resolved "https://registry.yarnpkg.com/tslib/-/tslib-1.14.1.tgz#cf2d38bdc34a134bcaf1091c41f6619e2f672d00"
integrity sha512-Xni35NKzjgMrwevysHTCArtLDpPvye8zV/0E4EyYn43P7/7qvQwPh9BGkHewbMulVntbigmcT7rdX3BNo9wRJg==

tslib@^2.0.0, tslib@^2.0.1, tslib@^2.0.3, tslib@^2.1.0, tslib@^2.3.0, tslib@^2.3.1, tslib@^2.4.0, tslib@^2.5.0:
tslib@^2.0.0, tslib@^2.0.1, tslib@^2.0.3, tslib@^2.1.0, tslib@^2.2.0, tslib@^2.3.0, tslib@^2.3.1, tslib@^2.4.0, tslib@^2.5.0:
version "2.6.2"
resolved "https://registry.yarnpkg.com/tslib/-/tslib-2.6.2.tgz#703ac29425e7b37cd6fd456e92404d46d1f3e4ae"
integrity sha512-AEYxH93jGFPn/a2iVAwW87VuUIkR1FVUKB77NwMF7nBTDkDrrT/Hpt/IrCJ0QXhW27jTBDcf5ZY7w6RiqTMw2Q==
Expand Down
Loading