Define speculative HTML parsing #5959

Merged
merged 6 commits into from
Sep 14, 2021
Changes from 5 commits
268 changes: 246 additions & 22 deletions source
Expand Up @@ -110134,6 +110134,21 @@ dictionary <dfn dictionary>StorageEventInit</dfn> : <span>EventInit</span> {
particular <var>intended parent</var>, the UA must run the following steps:</p>

<ol>
<li><p>If the <span>active speculative HTML parser</span> is not null, then return the result of
<span data-x="create a speculative mock element">creating a speculative mock element</span>
given <var>given namespace</var>, the tag name of the given token, and the attributes of the
given token.</p></li>

<li>
<p>Otherwise, optionally <span>create a speculative mock element</span> given <var>given
namespace</var>, the tag name of the given token, and the attributes of the given token.</p>

<p class="note">The result is not used. This step allows for a <span>speculative fetch</span> to
be initiated from non-speculative parsing. The fetch is still speculative at this point,
because, for example, by the time the element is inserted, <var>intended parent</var> might
have been removed from the document.</p>
</li>

<li><p>Let <var>document</var> be <var>intended parent</var>'s <span>node document</span>.</li>

<li><p>Let <var>local name</var> be the tag name of the token.</p></li>
Expand Down Expand Up @@ -110867,20 +110882,27 @@ document.body.appendChild(text);
<p><span data-x="acknowledge self-closing flag">Acknowledge the token's <i data-x="self-closing flag">self-closing
flag</i></span>, if it is set.</p>

<p id="meta-charset-during-parse">If the element has a <code
data-x="attr-meta-charset">charset</code> attribute, and <span>getting an encoding</span> from
its value results in an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>, then
<span>change the encoding</span> to the resulting encoding.</p>
<p>If the <span>active speculative HTML parser</span> is null, then:</p>

<ol>
<li><p id="meta-charset-during-parse">If the element has a <code
data-x="attr-meta-charset">charset</code> attribute, and <span>getting an encoding</span> from
its value results in an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>,
then <span>change the encoding</span> to the resulting encoding.</p></li>

<li><p>Otherwise, if the element has an <code data-x="attr-meta-http-equiv">http-equiv</code>
attribute whose value is an <span>ASCII case-insensitive</span> match for the string "<code
data-x="">Content-Type</code>", and the element has a <code
data-x="attr-meta-content">content</code> attribute, and applying the <span>algorithm for
extracting a character encoding from a <code>meta</code> element</span> to that attribute's
value returns an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>,
then <span>change the encoding</span> to the extracted encoding.</p></li>
</ol>

<p>Otherwise, if the element has an <code data-x="attr-meta-http-equiv">http-equiv</code>
attribute whose value is an <span>ASCII case-insensitive</span> match for the string "<code
data-x="">Content-Type</code>", and the element has a <code
data-x="attr-meta-content">content</code> attribute, and applying the <span>algorithm for
extracting a character encoding from a <code>meta</code> element</span> to that attribute's
value returns an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>, then
<span>change the encoding</span> to the extracted encoding.</p>
<p class="note">The <span>speculative HTML parser</span> doesn't speculatively apply character
encoding declarations in order to reduce implementation complexity.</p>
</dd>

<dt>A start tag whose tag name is "title"</dt>
Expand Down Expand Up @@ -112362,8 +112384,8 @@ document.body.appendChild(text);

<dt id="scriptEndTag">An end tag whose tag name is "script"</dt>
<dd>
<p>If the <span>JavaScript execution context stack</span> is empty, <span>perform a microtask
checkpoint</span>.</p>
<p>If the <span>active speculative HTML parser</span> is null and the <span>JavaScript execution
context stack</span> is empty, then <span>perform a microtask checkpoint</span>.</p>

<p>Let <var>script</var> be the <span>current node</span> (which will be a
<code>script</code> element).</p>
Expand All @@ -112378,10 +112400,11 @@ document.body.appendChild(text);

<p>Increment the parser's <span>script nesting level</span> by one.</p>

<p><span data-x="prepare a script">Prepare</span> the <var>script</var>. This might
cause some script to execute, which might cause <span data-x="dom-document-write">new characters
to be inserted into the tokenizer</span>, and might cause the tokenizer to output more tokens,
resulting in a <a href="#nestedParsing">reentrant invocation of the parser</a>.</p>
<p>If the <span>active speculative HTML parser</span> is null, then <span data-x="prepare a
script">prepare</span> the <var>script</var>. This might cause some script to execute, which
might cause <span data-x="dom-document-write">new characters to be inserted into the
tokenizer</span>, and might cause the tokenizer to output more tokens, resulting in a <a
href="#nestedParsing">reentrant invocation of the parser</a>.</p>

<p>Decrement the parser's <span>script nesting level</span> by one. If the parser's <span>script
nesting level</span> is zero, then set the <span>parser pause flag</span> to false.</p>
Expand Down Expand Up @@ -112417,6 +112440,9 @@ document.body.appendChild(text);
<li><p>Let <var>the script</var> be the <span>pending parsing-blocking
script</span>. There is no longer a <span>pending parsing-blocking script</span>.</p></li>

<li><p><span>Start the speculative HTML parser</span> for this instance of the HTML
parser.</p></li>

<li><p>Block the <span data-x="tokenization">tokenizer</span> for this instance of the
<span>HTML parser</span>, such that the <span>event loop</span> will not run <span
data-x="concept-task">tasks</span> that invoke the <span
Expand All @@ -112438,6 +112464,9 @@ document.body.appendChild(text);
<code>Document</code>.</p>
</li>

<li><p><span>Stop the speculative HTML parser</span> for this instance of the HTML
parser.</p></li>

<li><p>Unblock the <span data-x="tokenization">tokenizer</span> for this instance of the
<span>HTML parser</span>, such that <span data-x="concept-task">tasks</span> that invoke the
<span data-x="tokenization">tokenizer</span> can again be run.</p></li>
Expand Down Expand Up @@ -113914,9 +113943,9 @@ document.body.appendChild(text);
<p>Increment the parser's <span>script nesting level</span> by one. Set the <span>parser pause
flag</span> to true.</p>

<p><a href="https://www.w3.org/TR/SVGMobile12/script.html#ScriptContentProcessing">Process the
SVG <code data-x="">script</code> element</a> according to the SVG rules, if the user agent
supports SVG. <ref spec=SVG></p>
<p>If the <span>active speculative HTML parser</span> is null and the user agent supports SVG,
then <a href="https://www.w3.org/TR/SVGMobile12/script.html#ScriptContentProcessing">process the
SVG <code data-x="">script</code> element</a> according to the SVG rules. <ref spec=SVG></p>

<p class="note">Even if this causes <span data-x="dom-document-write">new characters to be
inserted into the tokenizer</span>, the parser will not be executed reentrantly, since the
Expand Down Expand Up @@ -113974,6 +114003,9 @@ document.body.appendChild(text);
<ol>
<!-- this happens as part of one of the tasks that runs the parser -->

<li><p>If the <span>active speculative HTML parser</span> is not null, then <span>stop the
speculative HTML parser</span> and return.</p></li>

<li><p>Set the <span>insertion point</span> to undefined.</p></li>

<li><p><span>Update the current document readiness</span> to "<code
Expand Down Expand Up @@ -114106,6 +114138,8 @@ document.body.appendChild(text);
<li><p>Throw away any pending content in the <span>input stream</span>, and discard any future
content that would have been added to it.</p></li>

<li><p><span>Stop the speculative HTML parser</span> for this HTML parser.</p></li>

<li><p><span>Update the current document readiness</span> to "<code
data-x="">interactive</code>".</p></li>

Expand All @@ -114123,6 +114157,193 @@ document.body.appendChild(text);
</div>


<div w-nodev>

<h4>Speculative HTML parsing</h4>

<p>User agents may implement an optimization, as described in this section, to speculatively fetch
resources that are declared in the HTML markup while the HTML parser is waiting for a
<span>pending parsing-blocking script</span> to be fetched and executed, or during normal parsing,
at the time <span data-x="create an element for the token">an element is created for a token</span>.
While this optimization is not defined in precise detail, there are some rules to consider for
interoperability.</p>

<p>Each <span>HTML parser</span> can have an <dfn export>active speculative HTML parser</dfn>. It
is initially null.</p>

<p>The <dfn export>speculative HTML parser</dfn> must act like the normal HTML parser (e.g., the
tree builder rules apply), with some exceptions:</p>

<ul>
<li>
<p>The state of the normal HTML parser and the document itself must not be affected.</p>

<p class="example">For example, the <span>next input character</span> or the <span>stack of open
elements</span> for the normal HTML parser is not affected by the <span>speculative HTML
parser</span>.</p>
</li>

<li>
<p>Bytes pushed into the HTML parser's <span>input byte stream</span> must also be pushed into
the speculative HTML parser's <span>input byte stream</span>. Bytes read from the streams must
be independent.</p>
</li>

<li>
<p>The result of the speculative parsing is primarily a series of <span data-x="speculative
fetch">speculative fetches</span>. Which kinds of resources to speculatively fetch is
<span>implementation-defined</span>, but user agents must not speculatively fetch resources that
would not be fetched with the normal HTML parser, under the assumption that the script that is
blocking the HTML parser does nothing.</p>

<p class="note">It is possible that the same markup is seen multiple times from the
<span>speculative HTML parser</span> and then the normal HTML parser. It is expected that
duplicated fetches will be prevented by caching rules, which are not yet fully specified.</p>
</li>
</ul>
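The byte-stream rule above (bytes are pushed to both parsers, but each reads independently) can be pictured with a minimal Python sketch. The names `ByteStream` and `push_to_parsers` are hypothetical and are not part of the specification; the sketch only assumes that each stream keeps its own read cursor.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ByteStream:
    """Hypothetical input byte stream with its own read cursor."""
    buffer: bytearray = field(default_factory=bytearray)
    cursor: int = 0

    def push(self, data: bytes) -> None:
        # Append newly received bytes to this stream only.
        self.buffer.extend(data)

    def read(self, n: int) -> bytes:
        # Reading advances this stream's cursor and no other stream's.
        chunk = bytes(self.buffer[self.cursor:self.cursor + n])
        self.cursor += len(chunk)
        return chunk


def push_to_parsers(data: bytes, normal: ByteStream,
                    speculative: Optional[ByteStream]) -> None:
    """Push bytes into the normal stream and, if present, the speculative one."""
    normal.push(data)
    if speculative is not None:
        speculative.push(data)
```

Because each stream owns its buffer and cursor, the speculative parser can read ahead without moving the normal parser's position.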

<p>A <dfn>speculative fetch</dfn> for a <span>speculative mock element</span> <var>element</var>
must follow these rules:</p>

<p class="XXX">Should some of these things be applied to the document "for real", even
though they are found speculatively?</p>

<ul>
<li>
<p>If the <span>speculative HTML parser</span> encounters one of the following elements, then
act as if that element is processed for the purpose of its effect on subsequent speculative
fetches.</p>

<ul class="brief">
<li>A <code>base</code> element.</li>

<li>A <code>meta</code> element whose <code data-x="attr-meta-http-equiv">http-equiv</code>
attribute is in the <span data-x="attr-meta-http-equiv-content-security-policy">Content
security policy</span> state.</li>

<li>A <code>meta</code> element whose <code data-x="attr-meta-name">name</code> attribute is an
<span>ASCII case-insensitive</span> match for "<code
data-x="meta-referrer">referrer</code>".</li>

<li>A <code>meta</code> element whose <code data-x="attr-meta-name">name</code> attribute is an
<span>ASCII case-insensitive</span> match for "<code data-x="">viewport</code>". (This can
affect whether a media query list <span>matches the environment</span>.) <ref
spec=CSSDEVICEADAPT></li>
</ul>
</li>

<li><p>Let <var>url</var> be the <span>URL</span> that <var>element</var> would fetch if it was
processed normally. If there is no such <span>URL</span> or if it is the empty string, then do
nothing. Otherwise, if <var>url</var> is already in the <span>list of speculative fetch
URLs</span>, then do nothing. Otherwise, fetch <var>url</var> as if the element was processed
normally, and add <var>url</var> to the <span>list of speculative fetch URLs</span>.</p></li>
</ul>

<p>Each <code>Document</code> has a <dfn>list of speculative fetch URLs</dfn>, which is a
<span>list</span> of <span data-x="URL">URLs</span>, initially empty.</p>
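As a rough illustration of the URL rule and the per-document list above, here is a minimal Python sketch. The function name `speculative_fetch`, its parameters, and the `fetch` callback are assumptions made for this example. Resolving which URL an element "would fetch if it was processed normally" is left to the caller, since the text does not define it in detail.

```python
from typing import Callable, List, Optional


def speculative_fetch(
    url: Optional[str],
    speculative_fetch_urls: List[str],
    fetch: Callable[[str], None],
) -> bool:
    """Apply the URL rule; return True if a fetch was started."""
    # No URL, or the empty string: do nothing.
    if url is None or url == "":
        return False
    # Already fetched speculatively for this document: do nothing.
    if url in speculative_fetch_urls:
        return False
    # Otherwise fetch, then record the URL in the document's list.
    fetch(url)
    speculative_fetch_urls.append(url)
    return True
```

A caller would keep one `speculative_fetch_urls` list per document, so repeated sightings of the same markup do not start a second speculative fetch.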

<p>To <dfn>start the speculative HTML parser</dfn> for an instance of an HTML parser
<var>parser</var>:</p>

<ol>
<li><p>Optionally, return. (This allows user agents to opt out of speculative HTML
parsing.)</p></li>

<li>
<p>If <var>parser</var>'s <span>active speculative HTML parser</span> is not null, then
<span>stop the speculative HTML parser</span> for <var>parser</var>.</p>

<p class="note">This can happen when <code data-x="dom-document-write">document.write()</code>
writes another parser-blocking script. For simplicity, this specification always restarts
speculative parsing, but user agents can implement a more efficient strategy, so long as the end
result is equivalent.</p>
</li>

<li><p>Let <var>speculativeParser</var> be a new <span>speculative HTML parser</span>, with the
same state as <var>parser</var>.</p></li>

<li><p>Let <var>speculativeDoc</var> be a new isomorphic representation of <var>parser</var>'s
<code>Document</code>, where all elements are instead <span data-x="speculative mock
element">speculative mock elements</span>. Let <var>speculativeParser</var> parse into
<var>speculativeDoc</var>.</p></li>

<li><p>Set <var>parser</var>'s <span>active speculative HTML parser</span> to
<var>speculativeParser</var>.</p></li>

<li><p><span>In parallel</span>, run <var>speculativeParser</var> until it is stopped or until it
reaches the end of its <span>input stream</span>.</p></li>
</ol>


<p>To <dfn>stop the speculative HTML parser</dfn> for an instance of an HTML parser
<var>parser</var>:</p>

<ol>
<li><p>Let <var>speculativeParser</var> be <var>parser</var>'s <span>active speculative HTML
parser</span>.</p></li>

<li><p>If <var>speculativeParser</var> is null, then return.</p></li>

<li><p>Throw away any pending content in <var>speculativeParser</var>'s <span>input
stream</span>, and discard any future content that would have been added to it.</p></li>

<li><p>Set <var>parser</var>'s <span>active speculative HTML parser</span> to null.</p></li>
</ol>
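The start and stop algorithms above can be sketched as follows. This is a simplified model, not an implementation: the `Parser` and `SpeculativeParser` classes, the `pending_input` field, and the `opt_out` flag are hypothetical. The mock document and the "in parallel" run step are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SpeculativeParser:
    """Hypothetical speculative parser state: only pending input is modeled."""
    pending_input: List[bytes] = field(default_factory=list)
    stopped: bool = False


@dataclass
class Parser:
    """Hypothetical normal parser with an optional active speculative parser."""
    pending_input: List[bytes] = field(default_factory=list)
    active_speculative_parser: Optional[SpeculativeParser] = None


def stop_speculative_parser(parser: Parser) -> None:
    spec = parser.active_speculative_parser
    if spec is None:
        return
    # Throw away pending content and mark the parser as stopped.
    spec.pending_input.clear()
    spec.stopped = True
    parser.active_speculative_parser = None


def start_speculative_parser(parser: Parser, opt_out: bool = False) -> None:
    # "Optionally, return": a user agent may skip speculative parsing.
    if opt_out:
        return
    # Restart if a speculative parser is already active.
    if parser.active_speculative_parser is not None:
        stop_speculative_parser(parser)
    # Copy the state so the normal parser is never affected.
    spec = SpeculativeParser(pending_input=list(parser.pending_input))
    parser.active_speculative_parser = spec
    # A real user agent would now run `spec` in parallel; omitted here.
```

Copying `pending_input` rather than sharing it reflects the requirement that the normal parser's state must not be affected by speculation.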

<p>The <span>speculative HTML parser</span> will create <span
data-x="speculative mock element">speculative mock elements</span> instead of normal elements. DOM
operations that the tree builder normally does on elements are expected to work appropriately on
speculative mock elements.</p>

<p>A <dfn>speculative mock element</dfn> is a <span>struct</span> with the following <span
data-x="struct item">items</span>:</p>

<ul>
<li><p>A <span>string</span> <dfn data-x="concept-mock-namespace">namespace</dfn>, corresponding
to an element's <span data-x="concept-element-namespace">namespace</span>.</p></li>

<li><p>A <span>string</span> <dfn data-x="concept-mock-local-name">local name</dfn>,
corresponding to an element's <span data-x="concept-element-local-name">local
name</span>.</p></li>

<li><p>A <span>list</span> <dfn data-x="concept-mock-attribute-list">attribute list</dfn>,
corresponding to an element's <span>attribute list</span>.</p></li>

<li><p>A <span>list</span> <dfn data-x="concept-mock-children">children</dfn>, corresponding to
an element's <span data-x="concept-tree-child">children</span>.</p></li>
</ul>

<p>To <dfn>create a speculative mock element</dfn> given a <var>namespace</var>,
<var>tagName</var>, and <var>attributes</var>:</p>

<ol>
<li><p>Let <var>element</var> be a new <span>speculative mock element</span>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-namespace">namespace</span> to
<var>namespace</var>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-local-name">local name</span> to
<var>tagName</var>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-attribute-list">attribute list</span>
to <var>attributes</var>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-children">children</span> to a new
empty <span>list</span>.</p></li>

<li><p>Optionally, perform a <span>speculative fetch</span> for <var>element</var>.</p></li>

<li><p>Return <var>element</var>.</p></li>
</ol>
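The struct and the creation steps above map fairly directly onto a small data class. The following Python sketch is illustrative only: the class and function names, the `(name, value)` tuple used for attributes, and the optional `speculative_fetch` callback are assumptions, and copying the attribute list is a choice made here rather than something the steps require.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Attribute = Tuple[str, str]  # hypothetical (name, value) pair


@dataclass
class SpeculativeMockElement:
    namespace: str
    local_name: str
    attribute_list: List[Attribute]
    children: List["SpeculativeMockElement"] = field(default_factory=list)


def create_speculative_mock_element(
    namespace: str,
    tag_name: str,
    attributes: List[Attribute],
    speculative_fetch: Optional[Callable[[SpeculativeMockElement], None]] = None,
) -> SpeculativeMockElement:
    # Set namespace, local name, and attribute list; children start as a new empty list.
    element = SpeculativeMockElement(
        namespace=namespace,
        local_name=tag_name,
        attribute_list=list(attributes),
    )
    # The speculative fetch step is optional, so the callback may be omitted.
    if speculative_fetch is not None:
        speculative_fetch(element)
    return element
```

Passing no callback models a user agent that builds mock elements without initiating any fetch.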

<p>When the tree builder says to insert an element into a <code>template</code> element's
<span>template contents</span>, if that is a <span>speculative mock element</span>, instead do
nothing. URLs found speculatively inside <code>template</code> elements might themselves be
templates, and must not be speculatively fetched.</p>

</div>


<div w-nodev>

<h4>Coercing an HTML DOM into an infoset</h4>
Expand Down Expand Up @@ -125288,6 +125509,9 @@ INSERT INTERFACES HERE
<dt id="refsCSSCOLORADJUST">[CSSCOLORADJUST]</dt>
<dd><cite><a href="https://drafts.csswg.org/css-color-adjust/">CSS Color Adjustment Module</a></cite>, E. Etemad, R. Atanassov, R. Lillesveen, T. Atkins. W3C.</dd>

<dt id="refsCSSDEVICEADAPT">[CSSDEVICEADAPT]</dt>
<dd><cite><a href="https://drafts.csswg.org/css-device-adapt/">CSS Device Adaptation</a></cite>, F. Rivoal, M. Rakow. W3C.</dd>

<dt id="refsCSSDISPLAY">[CSSDISPLAY]</dt>
<dd><cite><a href="https://drafts.csswg.org/css-display/">CSS Display</a></cite>, T. Atkins, E. Etemad. W3C.</dd>
