Relax default HTML allowlist #383

Merged 5 commits on Sep 8, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -42,6 +42,7 @@
- Bump markdown-it to [v14.1.0](https://github.com/markdown-it/markdown-it/blob/master/CHANGELOG.md#1410---2024-03-19), and follow the latest spec of [CommonMark 0.31.2](https://spec.commonmark.org/0.31.2/)
- Support for CSS nesting (`cssNesting` constructor option)
- Use simpler CSS minification when `minifyCSS` option is enabled ([#381](https://github.com/marp-team/marp-core/pull/381))
- Relax HTML allowlist: Allowed a lot of HTML elements and attributes by default ([#301](https://github.com/marp-team/marp-core/issues/301), [#383](https://github.com/marp-team/marp-core/pull/383))

* Upgrade development Node.js to v20 LTS ([#359](https://github.com/marp-team/marp-core/pull/359))
* Upgrade dependent packages to the latest version ([#380](https://github.com/marp-team/marp-core/pull/380))
3 changes: 2 additions & 1 deletion README.md
@@ -243,6 +243,7 @@ const marp = new Marp({

Sets whether to render raw HTML in Markdown. It is an alias of `markdown.html` ([markdown-it option](https://markdown-it.github.io/markdown-it/#MarkdownIt.new)) with an additional HTML allowlist feature.

- (default): Use Marp's default allowlist.
- `true`: All HTML will be allowed.
- `false`: All HTML, except the tags supported in Marpit Markdown, will be disallowed.

@@ -265,7 +266,7 @@ By passing `object`, you can set the allowlist to specify allowed tags and attri
}
```

-Marp core allows only `<br>` tag by default. That is defined in [a readonly `html` member in `Marp` class](https://github.com/marp-team/marp-core/blob/38fb33680c5837f9c48d8a88ac94b9f0862ab6c7/src/marp.ts#L34).
+By default, Marp Core allows known HTML elements and attributes that are considered safe. The list is defined as a readonly `html` member in the `Marp` class. [See the full default allowlist in the source code.](src/html/allowlist.ts)

> [!NOTE]
> Whichever option is selected, `<!-- HTML comment -->` and `<style>` tags are always parsed by Marpit for directives / tweaking style.
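
The settings of the `html` option described above can be summarized as a small resolver. This is only a sketch under assumptions: `resolveHtmlOption` and `ResolvedHtml` are hypothetical names that do not exist in Marp Core, and the two-entry `defaultAllowList` is a shortened stand-in for the real default list.

```typescript
// Sketch only: `resolveHtmlOption` and `ResolvedHtml` are hypothetical names,
// not Marp Core APIs. The type mirrors the documented shape of the `html` option.
type HTMLAllowList = {
  [tag: string]:
    | string[]
    | { [attr: string]: boolean | ((value: string) => string) }
}

type ResolvedHtml = { allowAll: boolean; allowList: HTMLAllowList }

// Stand-in for the real default allowlist (assumption: heavily shortened).
const defaultAllowList: HTMLAllowList = { br: [], span: { class: true } }

function resolveHtmlOption(html?: boolean | HTMLAllowList): ResolvedHtml {
  // Option omitted: fall back to the default allowlist.
  if (html === undefined) return { allowAll: false, allowList: defaultAllowList }
  // `true`: every tag passes through untouched.
  if (html === true) return { allowAll: true, allowList: {} }
  // `false`: nothing is allowed by the filter.
  if (html === false) return { allowAll: false, allowList: {} }
  // Object: the caller's allowlist is used as given.
  return { allowAll: false, allowList: html }
}

console.log(resolveHtmlOption({ b: [] }))
```

Keeping "allow everything" as a separate flag, rather than as a special allowlist value, is a design choice of this sketch.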
1 change: 0 additions & 1 deletion marp.config.mjs
@@ -7,7 +7,6 @@ export default {
     path.dirname(new URL(import.meta.url).pathname),
     './sandbox',
   ),
-  html: true,
   options: {
     minifyCSS: false,
   },
242 changes: 242 additions & 0 deletions src/html/allowlist.ts
@@ -0,0 +1,242 @@
export type HTMLAllowList = {
  [tag: string]:
    | string[]
    | { [attr: string]: boolean | ((value: string) => string) }
}

const globalAttrs = {
  class: true,
  dir: (value) => {
    const normalized = value.toLowerCase()
    return ['rtl', 'ltr', 'auto'].includes(normalized) ? normalized : ''
  },
  lang: true,
  title: true,
} as const satisfies HTMLAllowList[string]

const generateUrlSanitizer =
  (schemas: string[]) =>
  (value: string): string => {
    if (value.includes(':')) {
      // Check the URL schema if it exists
      const trimmed = value.trim().toLowerCase()
      const schema = trimmed.split(':', 1)[0]

      for (const allowedSchema of schemas) {
        if (schema === allowedSchema) return value
        if (allowedSchema.includes(':') && trimmed.startsWith(allowedSchema))
          return value
      }

      return ''
    }
    return value
  }

const webUrlSanitizer = generateUrlSanitizer(['http', 'https'])
const imageUrlSanitizer = generateUrlSanitizer(['http', 'https', 'data:image/'])
const srcSetSanitizer = (value: string): string => {
  for (const src of value.split(',')) {
    if (!imageUrlSanitizer(src)) return ''
  }
  return value
}

export const defaultHTMLAllowList = {
  a: {
    ...globalAttrs,
    href: webUrlSanitizer,
    name: true, // deprecated attribute, but still useful in Marp for making stable anchor link
    rel: true,
    target: true,
  },
  abbr: globalAttrs,
  address: globalAttrs,
  article: globalAttrs,
  aside: globalAttrs,
  audio: {
    ...globalAttrs,
    autoplay: true,
    controls: true,
    loop: true,
    muted: true,
    preload: true,
    src: webUrlSanitizer,
  },
  b: globalAttrs,
  bdi: globalAttrs,
  bdo: globalAttrs,
  big: globalAttrs,
  blockquote: {
    ...globalAttrs,
    cite: webUrlSanitizer,
  },
  br: globalAttrs,
  caption: globalAttrs,
  center: globalAttrs, // deprecated
  cite: globalAttrs,
  code: globalAttrs,
  col: {
    ...globalAttrs,
    align: true,
    valign: true,
    span: true,
    width: true,
  },
  colgroup: {
    ...globalAttrs,
    align: true,
    valign: true,
    span: true,
    width: true,
  },
  dd: globalAttrs,
  del: {
    ...globalAttrs,
    cite: webUrlSanitizer,
    datetime: true,
  },
  details: {
    ...globalAttrs,
    open: true,
  },
  div: globalAttrs,
  dl: globalAttrs,
  dt: globalAttrs,
  em: globalAttrs,
  figcaption: globalAttrs,
  figure: globalAttrs,
  // footer: globalAttrs, // Inserted by Marpit directives so disallowed to avoid confusion
  h1: globalAttrs,
  h2: globalAttrs,
  h3: globalAttrs,
  h4: globalAttrs,
  h5: globalAttrs,
  h6: globalAttrs,
  // header: globalAttrs, // Inserted by Marpit directives so disallowed to avoid confusion
  hr: globalAttrs,
  i: globalAttrs,
  img: {
    ...globalAttrs,
    align: true, // deprecated attribute, but still useful in Marp for aligning image
    alt: true,
    decoding: true,
    height: true,
    loading: true,
    src: imageUrlSanitizer,
    srcset: srcSetSanitizer,
    title: true,
    width: true,
  },
  ins: {
    ...globalAttrs,
    cite: webUrlSanitizer,
    datetime: true,
  },
  kbd: globalAttrs,
  li: {
    ...globalAttrs,
    type: true,
    value: true,
  },
  mark: globalAttrs,
  nav: globalAttrs,
  ol: {
    ...globalAttrs,
    reversed: true,
    start: true,
    type: true,
  },
  p: globalAttrs,
  picture: globalAttrs,
  pre: globalAttrs,
  source: {
    height: true,
    media: true,
    sizes: true,
    src: imageUrlSanitizer,
    srcset: srcSetSanitizer,
    type: true,
    width: true,
  },
  q: {
    ...globalAttrs,
    cite: webUrlSanitizer,
  },
  rp: globalAttrs,
  rt: globalAttrs,
  ruby: globalAttrs,
  s: globalAttrs,
  section: globalAttrs,
  small: globalAttrs,
  span: globalAttrs,
  sub: globalAttrs,
  summary: globalAttrs,
  sup: globalAttrs,
  strong: globalAttrs,
  strike: globalAttrs,
  table: {
    ...globalAttrs,
    width: true,
    border: true,
    align: true,
    valign: true,
  },
  tbody: {
    ...globalAttrs,
    align: true,
    valign: true,
  },
  td: {
    ...globalAttrs,
    width: true,
    rowspan: true,
    colspan: true,
    align: true,
    valign: true,
  },
  tfoot: {
    ...globalAttrs,
    align: true,
    valign: true,
  },
  th: {
    ...globalAttrs,
    width: true,
    rowspan: true,
    colspan: true,
    align: true,
    valign: true,
  },
  thead: {
    ...globalAttrs,
    align: true,
    valign: true,
  },
  time: {
    ...globalAttrs,
    datetime: true,
  },
  tr: {
    ...globalAttrs,
    rowspan: true,
    align: true,
    valign: true,
  },
  u: globalAttrs,
  ul: globalAttrs,
  video: {
    ...globalAttrs,
    autoplay: true,
    controls: true,
    loop: true,
    muted: true,
    playsinline: true,
    poster: imageUrlSanitizer,
    preload: true,
    src: webUrlSanitizer,
    height: true,
    width: true,
  },
  wbr: globalAttrs,
} as const satisfies HTMLAllowList
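
To see how the URL sanitizers in this file treat typical attribute values, the sketch below copies `generateUrlSanitizer` and `srcSetSanitizer` from the diff above and runs them on a few inputs. The sample URLs and the `results` object are made up for illustration and are not part of the PR.

```typescript
// Copied from src/html/allowlist.ts above so the example is self-contained.
const generateUrlSanitizer =
  (schemas: string[]) =>
  (value: string): string => {
    if (value.includes(':')) {
      const trimmed = value.trim().toLowerCase()
      const schema = trimmed.split(':', 1)[0]

      for (const allowedSchema of schemas) {
        if (schema === allowedSchema) return value
        if (allowedSchema.includes(':') && trimmed.startsWith(allowedSchema))
          return value
      }
      return ''
    }
    return value
  }

const webUrlSanitizer = generateUrlSanitizer(['http', 'https'])
const imageUrlSanitizer = generateUrlSanitizer(['http', 'https', 'data:image/'])
const srcSetSanitizer = (value: string): string => {
  for (const src of value.split(',')) {
    if (!imageUrlSanitizer(src)) return ''
  }
  return value
}

// Illustrative inputs (assumptions, not taken from the PR).
const results = {
  https: webUrlSanitizer('https://example.com/'), // allowed schema, kept as-is
  relative: webUrlSanitizer('./slides/intro.html'), // no colon, kept as-is
  script: webUrlSanitizer('javascript:alert(1)'), // '' (schema not allowed)
  mailto: webUrlSanitizer('mailto:user@example.com'), // '' (schema not allowed)
  dataImage: imageUrlSanitizer('data:image/png;base64,AAAA'), // prefix match, kept
  dataHtml: imageUrlSanitizer('data:text/html,<b>x</b>'), // '' (not an image)
  srcsetOk: srcSetSanitizer('a.png 1x, https://example.com/b.png 2x'), // kept
  srcsetBad: srcSetSanitizer('a.png 1x, javascript:alert(1) 2x'), // '' (one bad entry)
}

console.log(results)
```

As written, any value containing a colon is treated as having a schema, so a relative path such as `a:b.html` is also cleared to an empty string.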
7 changes: 3 additions & 4 deletions src/html/html.ts
@@ -1,6 +1,5 @@
 import selfClosingTags from 'self-closing-tags'
-import { FilterXSS } from 'xss'
-import { friendlyAttrValue, escapeAttrValue } from 'xss/lib/default'
+import { FilterXSS, friendlyAttrValue, escapeAttrValue } from 'xss'
 import { MarpOptions } from '../marp'

 const selfClosingRegexp = /\s*\/?>$/
@@ -12,7 +11,7 @@ const xhtmlOutFilter = new FilterXSS({
     }
     return html
   },
-  whiteList: {},
+  allowList: {},
 })

 export function markdown(md): void {
@@ -60,7 +59,7 @@ export function markdown(md): void {
       }

       const filter = new FilterXSS({
-        whiteList: allowList,
+        allowList,
         onIgnoreTag: (_, rawHtml) => (html === true ? rawHtml : undefined),
         safeAttrValue: (tag, attr, value) => {
           let ret = friendlyAttrValue(value)
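
The `HTMLAllowList` type lets each tag map either to an array of attribute names or to an object whose values are `true`/`false` or a sanitizer function. The body of `safeAttrValue` is cut off in the hunk above, so the following is not the PR's implementation; it is a minimal sketch of how such an entry could be applied to one attribute. `applyAllowList`, `hasOwn`, and the sample `allowList` are hypothetical.

```typescript
type AttrRule = boolean | ((value: string) => string)
type HTMLAllowList = {
  [tag: string]: string[] | { [attr: string]: AttrRule }
}

// Own-property check so that inherited keys like "constructor" never match.
const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key)

// Hypothetical helper: returns the attribute value to emit,
// or undefined when the tag or attribute is not allowed.
function applyAllowList(
  allowList: HTMLAllowList,
  tag: string,
  attr: string,
  value: string,
): string | undefined {
  if (!hasOwn(allowList, tag)) return undefined
  const rules = allowList[tag]

  // Array form: a plain list of allowed attribute names.
  if (Array.isArray(rules)) return rules.includes(attr) ? value : undefined

  // Object form: `true` keeps the value, a function rewrites it.
  if (!hasOwn(rules, attr)) return undefined
  const rule = rules[attr]
  if (typeof rule === 'function') return rule(value)
  return rule ? value : undefined
}

// Sample allowlist (assumption: made up for this example).
const allowList: HTMLAllowList = {
  img: ['alt'],
  p: {
    class: true,
    dir: (value: string) => {
      const normalized = value.toLowerCase()
      return ['rtl', 'ltr', 'auto'].includes(normalized) ? normalized : ''
    },
  },
}

console.log(applyAllowList(allowList, 'p', 'dir', 'RTL'))
```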
11 changes: 3 additions & 8 deletions src/marp.ts
@@ -9,6 +9,7 @@ import * as autoScalingPlugin from './auto-scaling'
 import * as customElements from './custom-elements'
 import * as emojiPlugin from './emoji/emoji'
 import { generateHighlightJSInstance } from './highlightjs'
+import { defaultHTMLAllowList, type HTMLAllowList } from './html/allowlist'
 import * as htmlPlugin from './html/html'
 import * as mathPlugin from './math/math'
 import * as scriptPlugin from './script/script'
@@ -17,13 +18,7 @@ import * as slugPlugin from './slug/slug'

 export interface MarpOptions extends Options {
   emoji?: emojiPlugin.EmojiOptions
-  html?:
-    | boolean
-    | {
-        [tag: string]:
-          | string[]
-          | { [attr: string]: boolean | ((value: string) => string) }
-      }
+  html?: boolean | HTMLAllowList
   markdown?: object
   math?: mathPlugin.MathOptions
   minifyCSS?: boolean
@@ -36,7 +31,7 @@ export class Marp extends Marpit {

   private _highlightjs: HLJSApi | undefined

-  static readonly html = { br: [] }
+  static readonly html = defaultHTMLAllowList

   constructor(opts: MarpOptions = {}) {
     const mdOpts: Record<string, any> = {
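
Because the static `html` member now holds the full default allowlist, a custom allowlist can plausibly be built by spreading it and adding entries. The sketch below shows only the object composition and is self-contained: `defaultHTMLAllowList` here is a shortened stand-in for `Marp.html`, and `httpsOnly` and the `iframe` entry are hypothetical additions. Passing the result as `new Marp({ html: customAllowList })` is an assumption that follows from the `MarpOptions` type above and is not exercised here.

```typescript
type HTMLAllowList = {
  [tag: string]:
    | string[]
    | { [attr: string]: boolean | ((value: string) => string) }
}

// Shortened stand-in for `Marp.html` (the real list is in src/html/allowlist.ts).
const defaultHTMLAllowList: HTMLAllowList = {
  br: { class: true },
  img: { alt: true, src: (value: string) => value },
}

// Hypothetical sanitizer: keep only https URLs, clear everything else.
const httpsOnly = (value: string): string =>
  value.trim().toLowerCase().startsWith('https://') ? value : ''

// Start from the defaults and add one more tag.
const customAllowList: HTMLAllowList = {
  ...defaultHTMLAllowList,
  iframe: { src: httpsOnly, width: true, height: true },
}

console.log(Object.keys(customAllowList))
```

Spreading creates a new object, so the default list itself is left unmodified.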