Define speculative HTML parsing
Fixes #5624.
zcorpan authored Sep 14, 2021
1 parent 257604d commit 92a152c
Showing 1 changed file with 249 additions and 22 deletions.
271 changes: 249 additions & 22 deletions source
@@ -110297,6 +110297,21 @@ dictionary <dfn dictionary>StorageEventInit</dfn> : <span>EventInit</span> {
particular <var>intended parent</var>, the UA must run the following steps:</p>

<ol>
<li><p>If the <span>active speculative HTML parser</span> is not null, then return the result of
<span data-x="create a speculative mock element">creating a speculative mock element</span>
given <var>given namespace</var>, the tag name of the given token, and the attributes of the
given token.</p></li>

<li>
<p>Otherwise, optionally <span>create a speculative mock element</span> given <var>given
namespace</var>, the tag name of the given token, and the attributes of the given token.</p>

<p class="note">The result is not used. This step allows for a <span>speculative fetch</span> to
be initiated from non-speculative parsing. The fetch is still speculative at this point,
because, for example, by the time the element is inserted, <var>intended parent</var> might
have been removed from the document.</p>
</li>

<li><p>Let <var>document</var> be <var>intended parent</var>'s <span>node document</span>.</p></li>

<li><p>Let <var>local name</var> be the tag name of the token.</p></li>
@@ -111030,20 +111045,27 @@ document.body.appendChild(text);
<p><span data-x="acknowledge self-closing flag">Acknowledge the token's <i data-x="self-closing flag">self-closing
flag</i></span>, if it is set.</p>

<p id="meta-charset-during-parse">If the element has a <code
data-x="attr-meta-charset">charset</code> attribute, and <span>getting an encoding</span> from
its value results in an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>, then
<span>change the encoding</span> to the resulting encoding.</p>
<p>If the <span>active speculative HTML parser</span> is null, then:</p>

<ol>
<li><p id="meta-charset-during-parse">If the element has a <code
data-x="attr-meta-charset">charset</code> attribute, and <span>getting an encoding</span> from
its value results in an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>,
then <span>change the encoding</span> to the resulting encoding.</p></li>

<li><p>Otherwise, if the element has an <code data-x="attr-meta-http-equiv">http-equiv</code>
attribute whose value is an <span>ASCII case-insensitive</span> match for the string "<code
data-x="">Content-Type</code>", and the element has a <code
data-x="attr-meta-content">content</code> attribute, and applying the <span>algorithm for
extracting a character encoding from a <code>meta</code> element</span> to that attribute's
value returns an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>,
then <span>change the encoding</span> to the extracted encoding.</p></li>
</ol>

<p>Otherwise, if the element has an <code data-x="attr-meta-http-equiv">http-equiv</code>
attribute whose value is an <span>ASCII case-insensitive</span> match for the string "<code
data-x="">Content-Type</code>", and the element has a <code
data-x="attr-meta-content">content</code> attribute, and applying the <span>algorithm for
extracting a character encoding from a <code>meta</code> element</span> to that attribute's
value returns an <span>encoding</span>, and the
<span data-x="concept-encoding-confidence">confidence</span> is currently <i>tentative</i>, then
<span>change the encoding</span> to the extracted encoding.</p>
<p class="note">The <span>speculative HTML parser</span> doesn't speculatively apply character
encoding declarations in order to reduce implementation complexity.</p>
</dd>

<dt>A start tag whose tag name is "title"</dt>
@@ -112525,8 +112547,8 @@ document.body.appendChild(text);

<dt id="scriptEndTag">An end tag whose tag name is "script"</dt>
<dd>
<p>If the <span>JavaScript execution context stack</span> is empty, <span>perform a microtask
checkpoint</span>.</p>
<p>If the <span>active speculative HTML parser</span> is null and the <span>JavaScript execution
context stack</span> is empty, then <span>perform a microtask checkpoint</span>.</p>

<p>Let <var>script</var> be the <span>current node</span> (which will be a
<code>script</code> element).</p>
@@ -112541,10 +112563,11 @@ document.body.appendChild(text);

<p>Increment the parser's <span>script nesting level</span> by one.</p>

<p><span data-x="prepare a script">Prepare</span> the <var>script</var>. This might
cause some script to execute, which might cause <span data-x="dom-document-write">new characters
to be inserted into the tokenizer</span>, and might cause the tokenizer to output more tokens,
resulting in a <a href="#nestedParsing">reentrant invocation of the parser</a>.</p>
<p>If the <span>active speculative HTML parser</span> is null, then <span data-x="prepare a
script">prepare</span> the <var>script</var>. This might cause some script to execute, which
might cause <span data-x="dom-document-write">new characters to be inserted into the
tokenizer</span>, and might cause the tokenizer to output more tokens, resulting in a <a
href="#nestedParsing">reentrant invocation of the parser</a>.</p>

<p>Decrement the parser's <span>script nesting level</span> by one. If the parser's <span>script
nesting level</span> is zero, then set the <span>parser pause flag</span> to false.</p>
@@ -112580,6 +112603,9 @@ document.body.appendChild(text);
<li><p>Let <var>the script</var> be the <span>pending parsing-blocking
script</span>. There is no longer a <span>pending parsing-blocking script</span>.</p></li>

<li><p><span>Start the speculative HTML parser</span> for this instance of the HTML
parser.</p></li>

<li><p>Block the <span data-x="tokenization">tokenizer</span> for this instance of the
<span>HTML parser</span>, such that the <span>event loop</span> will not run <span
data-x="concept-task">tasks</span> that invoke the <span
@@ -112601,6 +112627,9 @@ document.body.appendChild(text);
<code>Document</code>.</p>
</li>

<li><p><span>Stop the speculative HTML parser</span> for this instance of the HTML
parser.</p></li>

<li><p>Unblock the <span data-x="tokenization">tokenizer</span> for this instance of the
<span>HTML parser</span>, such that <span data-x="concept-task">tasks</span> that invoke the
<span data-x="tokenization">tokenizer</span> can again be run.</p></li>
@@ -114077,9 +114106,9 @@ document.body.appendChild(text);
<p>Increment the parser's <span>script nesting level</span> by one. Set the <span>parser pause
flag</span> to true.</p>

<p><a href="https://www.w3.org/TR/SVGMobile12/script.html#ScriptContentProcessing">Process the
SVG <code data-x="">script</code> element</a> according to the SVG rules, if the user agent
supports SVG. <ref spec=SVG></p>
<p>If the <span>active speculative HTML parser</span> is null and the user agent supports SVG,
then <a href="https://www.w3.org/TR/SVGMobile12/script.html#ScriptContentProcessing">process the
SVG <code data-x="">script</code> element</a> according to the SVG rules. <ref spec=SVG></p>

<p class="note">Even if this causes <span data-x="dom-document-write">new characters to be
inserted into the tokenizer</span>, the parser will not be executed reentrantly, since the
@@ -114137,6 +114166,9 @@ document.body.appendChild(text);
<ol>
<!-- this happens as part of one of the tasks that runs the parser -->

<li><p>If the <span>active speculative HTML parser</span> is not null, then <span>stop the
speculative HTML parser</span> and return.</p></li>

<li><p>Set the <span>insertion point</span> to undefined.</p></li>

<li><p><span>Update the current document readiness</span> to "<code
@@ -114269,6 +114301,8 @@ document.body.appendChild(text);
<li><p>Throw away any pending content in the <span>input stream</span>, and discard any future
content that would have been added to it.</p></li>

<li><p><span>Stop the speculative HTML parser</span> for this HTML parser.</p></li>

<li><p><span>Update the current document readiness</span> to "<code
data-x="">interactive</code>".</p></li>

@@ -114286,6 +114320,196 @@ document.body.appendChild(text);
</div>


<div w-nodev>

<h4>Speculative HTML parsing</h4>

<p>User agents may implement an optimization, as described in this section, to speculatively fetch
resources that are declared in the HTML markup while the HTML parser is waiting for a
<span>pending parsing-blocking script</span> to be fetched and executed, or during normal parsing,
at the time <span data-x="create an element for the token">an element is created for a token</span>.
While this optimization is not defined in precise detail, there are some rules to consider for
interoperability.</p>

<p>Each <span>HTML parser</span> can have an <dfn export>active speculative HTML parser</dfn>. It
is initially null.</p>

<p>The <dfn export>speculative HTML parser</dfn> must act like the normal HTML parser (e.g., the
tree builder rules apply), with some exceptions:</p>

<ul>
<li>
<p>The state of the normal HTML parser and the document itself must not be affected.</p>

<p class="example">For example, the <span>next input character</span> or the <span>stack of open
elements</span> for the normal HTML parser is not affected by the <span>speculative HTML
parser</span>.</p>
</li>

<li>
<p>Bytes pushed into the HTML parser's <span>input byte stream</span> must also be pushed into
the speculative HTML parser's <span>input byte stream</span>. Bytes read from the streams must
be independent.</p>
</li>

<li>
<p>The result of the speculative parsing is primarily a series of <span data-x="speculative
fetch">speculative fetches</span>. Which kinds of resources to speculatively fetch is
<span>implementation-defined</span>, but user agents must not speculatively fetch resources that
would not be fetched with the normal HTML parser, under the assumption that the script that is
blocking the HTML parser does nothing.</p>

<p class="note">The same markup can be seen more than once: first by the
<span>speculative HTML parser</span> and then by the normal HTML parser. It is expected that
duplicated fetches will be prevented by caching rules, which are not yet fully specified.</p>
</li>
</ul>
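
The byte-stream rule in the list above can be pictured as one shared buffer with two independent read positions. The following Python sketch is illustrative only: `SharedByteStream`, `push`, `read`, and the reader labels are hypothetical names that do not appear in the specification, and a real user agent is free to implement this differently.

```python
class SharedByteStream:
    """Hypothetical model: bytes are pushed once, each reader keeps its own cursor."""

    def __init__(self):
        self._buffer = bytearray()
        # One read position per parser, so reading from one never moves the other.
        self._cursors = {"normal": 0, "speculative": 0}

    def push(self, data: bytes) -> None:
        # Bytes pushed to the HTML parser are also available to the speculative parser.
        self._buffer.extend(data)

    def read(self, reader: str, n: int) -> bytes:
        # Return up to n bytes for the given reader and advance only its cursor.
        start = self._cursors[reader]
        chunk = bytes(self._buffer[start:start + n])
        self._cursors[reader] = start + len(chunk)
        return chunk


stream = SharedByteStream()
stream.push(b"<script src=a.js></script><img src=b.png>")
ahead = stream.read("speculative", 100)  # the speculative parser runs ahead
behind = stream.read("normal", 8)        # the normal parser is unaffected by that
```

The speculative reader consuming the whole buffer leaves the normal reader's position untouched, which is the independence the rule asks for.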

<p>A <dfn>speculative fetch</dfn> for a <span>speculative mock element</span> <var>element</var>
must follow these rules:</p>

<p class="XXX">Should some of these things be applied to the document "for real", even
though they are found speculatively?</p>

<ul>
<li>
<p>If the <span>speculative HTML parser</span> encounters one of the following elements, then
act as if that element is processed for the purpose of its effect on subsequent speculative
fetches.</p>

<ul class="brief">
<li>A <code>base</code> element.</li>

<li>A <code>meta</code> element whose <code data-x="attr-meta-http-equiv">http-equiv</code>
attribute is in the <span data-x="attr-meta-http-equiv-content-security-policy">Content
security policy</span> state.</li>

<li>A <code>meta</code> element whose <code data-x="attr-meta-name">name</code> attribute is an
<span>ASCII case-insensitive</span> match for "<code
data-x="meta-referrer">referrer</code>".</li>

<li>A <code>meta</code> element whose <code data-x="attr-meta-name">name</code> attribute is an
<span>ASCII case-insensitive</span> match for "<code data-x="">viewport</code>". (This can
affect whether a media query list <span>matches the environment</span>.) <ref
spec=CSSDEVICEADAPT></li>
</ul>
</li>

<li><p>Let <var>url</var> be the <span>URL</span> that <var>element</var> would fetch if it were
processed normally. If there is no such <span>URL</span> or if it is the empty string, then do
nothing. Otherwise, if <var>url</var> is already in the <span>list of speculative fetch
URLs</span>, then do nothing. Otherwise, fetch <var>url</var> as if the element were processed
normally, and add <var>url</var> to the <span>list of speculative fetch URLs</span>.</p></li>
</ul>

<p>Each <code>Document</code> has a <dfn>list of speculative fetch URLs</dfn>, which is a
<span>list</span> of <span data-x="URL">URLs</span>, initially empty.</p>
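
The de-duplication rule for speculative fetches can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `speculative_fetch`, `speculative_fetch_urls`, and the `fetch` callback are hypothetical names, URLs are modeled as plain strings, and working out which URL an element would fetch is left out entirely.

```python
from typing import Callable, List, Optional


def speculative_fetch(
    url: Optional[str],
    speculative_fetch_urls: List[str],
    fetch: Callable[[str], None],
) -> bool:
    """Hypothetical helper: return True if a fetch was started for url."""
    # No URL, or the empty string: do nothing.
    if url is None or url == "":
        return False
    # Already speculatively fetched for this document: do nothing.
    if url in speculative_fetch_urls:
        return False
    # Otherwise fetch, then record the URL in the per-document list.
    fetch(url)
    speculative_fetch_urls.append(url)
    return True


started: List[str] = []  # stands in for the network layer
seen: List[str] = []     # stands in for the Document's list of speculative fetch URLs
results = [
    speculative_fetch("https://example.com/a.js", seen, started.append),
    speculative_fetch("https://example.com/a.js", seen, started.append),
    speculative_fetch("", seen, started.append),
    speculative_fetch(None, seen, started.append),
]
```

Only the first call starts a fetch; the repeated URL, the empty string, and the missing URL are all skipped.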

<p>To <dfn>start the speculative HTML parser</dfn> for an instance of an HTML parser
<var>parser</var>:</p>

<ol>
<li>
<p>Optionally, return.</p>

<p class="note">This step allows user agents to opt out of speculative HTML parsing.</p>
</li>

<li>
<p>If <var>parser</var>'s <span>active speculative HTML parser</span> is not null, then
<span>stop the speculative HTML parser</span> for <var>parser</var>.</p>

<p class="note">This can happen when <code data-x="dom-document-write">document.write()</code>
writes another parser-blocking script. For simplicity, this specification always restarts
speculative parsing, but user agents can implement a more efficient strategy, so long as the end
result is equivalent.</p>
</li>

<li><p>Let <var>speculativeParser</var> be a new <span>speculative HTML parser</span>, with the
same state as <var>parser</var>.</p></li>

<li><p>Let <var>speculativeDoc</var> be a new isomorphic representation of <var>parser</var>'s
<code>Document</code>, where all elements are instead <span data-x="speculative mock
element">speculative mock elements</span>. Let <var>speculativeParser</var> parse into
<var>speculativeDoc</var>.</p></li>

<li><p>Set <var>parser</var>'s <span>active speculative HTML parser</span> to
<var>speculativeParser</var>.</p></li>

<li><p><span>In parallel</span>, run <var>speculativeParser</var> until it is stopped or until it
reaches the end of its <span>input stream</span>.</p></li>
</ol>


<p>To <dfn>stop the speculative HTML parser</dfn> for an instance of an HTML parser
<var>parser</var>:</p>

<ol>
<li><p>Let <var>speculativeParser</var> be <var>parser</var>'s <span>active speculative HTML
parser</span>.</p></li>

<li><p>If <var>speculativeParser</var> is null, then return.</p></li>

<li><p>Throw away any pending content in <var>speculativeParser</var>'s <span>input
stream</span>, and discard any future content that would have been added to it.</p></li>

<li><p>Set <var>parser</var>'s <span>active speculative HTML parser</span> to null.</p></li>
</ol>
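
The start and stop algorithms above can be sketched as a small state machine. This Python model is an assumption-laden illustration, not the specification's method: `HTMLParser`, `SpeculativeParser`, `speculation_enabled`, and `pending_input` are hypothetical names, the input stream is reduced to a string, and both the mock document and the "in parallel" run are omitted.

```python
from typing import Optional


class SpeculativeParser:
    """Hypothetical stand-in for a speculative HTML parser."""

    def __init__(self, pending_input: str):
        self.pending_input = pending_input
        self.stopped = False


class HTMLParser:
    """Hypothetical stand-in for an HTML parser instance."""

    def __init__(self, pending_input: str = "", speculation_enabled: bool = True):
        self.pending_input = pending_input
        self.speculation_enabled = speculation_enabled
        # The active speculative HTML parser is initially null.
        self.active_speculative_parser: Optional[SpeculativeParser] = None

    def start_speculative_parser(self) -> None:
        # "Optionally, return": models a user agent that opts out.
        if not self.speculation_enabled:
            return
        # Restart if speculation is already under way.
        if self.active_speculative_parser is not None:
            self.stop_speculative_parser()
        # New speculative parser starting from the same state as this parser.
        self.active_speculative_parser = SpeculativeParser(self.pending_input)

    def stop_speculative_parser(self) -> None:
        speculative = self.active_speculative_parser
        if speculative is None:
            return
        # Throw away pending content and mark the parser as stopped.
        speculative.pending_input = ""
        speculative.stopped = True
        self.active_speculative_parser = None


parser = HTMLParser("<img src=a.png>")
parser.start_speculative_parser()
first = parser.active_speculative_parser
parser.start_speculative_parser()  # e.g. document.write() wrote another blocking script
second = parser.active_speculative_parser
parser.stop_speculative_parser()
```

Starting twice stops the first speculative parser and replaces it with a fresh one, matching the always-restart simplification described in the note.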

<p>The <span>speculative HTML parser</span> will create <span
data-x="speculative mock element">speculative mock elements</span> instead of normal elements. DOM
operations that the tree builder normally does on elements are expected to work appropriately on
speculative mock elements.</p>

<p>A <dfn>speculative mock element</dfn> is a <span>struct</span> with the following <span
data-x="struct item">items</span>:</p>

<ul>
<li><p>A <span>string</span> <dfn data-x="concept-mock-namespace">namespace</dfn>, corresponding
to an element's <span data-x="concept-element-namespace">namespace</span>.</p></li>

<li><p>A <span>string</span> <dfn data-x="concept-mock-local-name">local name</dfn>,
corresponding to an element's <span data-x="concept-element-local-name">local
name</span>.</p></li>

<li><p>A <span>list</span> <dfn data-x="concept-mock-attribute-list">attribute list</dfn>,
corresponding to an element's <span>attribute list</span>.</p></li>

<li><p>A <span>list</span> <dfn data-x="concept-mock-children">children</dfn>, corresponding to
an element's <span data-x="concept-tree-child">children</span>.</p></li>
</ul>

<p>To <dfn>create a speculative mock element</dfn> given a <var>namespace</var>,
<var>tagName</var>, and <var>attributes</var>:</p>

<ol>
<li><p>Let <var>element</var> be a new <span>speculative mock element</span>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-namespace">namespace</span> to
<var>namespace</var>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-local-name">local name</span> to
<var>tagName</var>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-attribute-list">attribute list</span>
to <var>attributes</var>.</p></li>

<li><p>Set <var>element</var>'s <span data-x="concept-mock-children">children</span> to a new
empty <span>list</span>.</p></li>

<li><p>Optionally, perform a <span>speculative fetch</span> for <var>element</var>.</p></li>

<li><p>Return <var>element</var>.</p></li>
</ol>
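
As a rough illustration of the struct and the creation steps above, here is a Python sketch. The class and function names mirror the specification's terms but are otherwise hypothetical; attributes are modeled as simple name and value pairs, and the optional speculative fetch is represented by an optional callback.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Attribute = Tuple[str, str]  # assumed shape: (name, value)


@dataclass
class SpeculativeMockElement:
    namespace: str
    local_name: str
    attribute_list: List[Attribute]
    children: List["SpeculativeMockElement"] = field(default_factory=list)


def create_speculative_mock_element(
    namespace: str,
    tag_name: str,
    attributes: List[Attribute],
    speculative_fetch: Optional[Callable[[SpeculativeMockElement], None]] = None,
) -> SpeculativeMockElement:
    # Set namespace, local name, and attribute list; children start as an empty list.
    element = SpeculativeMockElement(
        namespace=namespace,
        local_name=tag_name,
        attribute_list=list(attributes),
    )
    # "Optionally, perform a speculative fetch": only when a callback is supplied.
    if speculative_fetch is not None:
        speculative_fetch(element)
    return element


HTML_NAMESPACE = "http://www.w3.org/1999/xhtml"
fetched: List[SpeculativeMockElement] = []
img = create_speculative_mock_element(
    HTML_NAMESPACE, "img", [("src", "a.png")], fetched.append
)
```

Passing no callback models a user agent that skips the optional fetch step; the returned element is the same either way.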

<p>When the tree builder says to insert an element into a <code>template</code> element's
<span>template contents</span>, if that element is a <span>speculative mock element</span>,
instead do nothing. URLs found speculatively inside <code>template</code> elements might
themselves be templates, and must not be speculatively fetched.</p>

</div>


<div w-nodev>

<h4>Coercing an HTML DOM into an infoset</h4>
@@ -125478,6 +125702,9 @@ INSERT INTERFACES HERE
<dt id="refsCSSCOLORADJUST">[CSSCOLORADJUST]</dt>
<dd><cite><a href="https://drafts.csswg.org/css-color-adjust/">CSS Color Adjustment Module</a></cite>, E. Etemad, R. Atanassov, R. Lillesveen, T. Atkins. W3C.</dd>

<dt id="refsCSSDEVICEADAPT">[CSSDEVICEADAPT]</dt>
<dd><cite><a href="https://drafts.csswg.org/css-device-adapt/">CSS Device Adaptation</a></cite>, F. Rivoal, M. Rakow. W3C.</dd>

<dt id="refsCSSDISPLAY">[CSSDISPLAY]</dt>
<dd><cite><a href="https://drafts.csswg.org/css-display/">CSS Display</a></cite>, T. Atkins, E. Etemad. W3C.</dd>
