Normative: Make B.1.2 "String Literals" normative.
(Part of Annex B reform, see PR #1595.)

This one was trickier than B.1.1.

B.1.2's extension isn't simply to add a production to the grammar
(AKA add another right-hand-side to an existing production).
Instead it *replaces* a production/alternative with something more general.

So, when it comes to strict mode disallowing the general syntax,
we can't simply say that the general production
is a Syntax Error in strict mode,
because strict mode still has to allow the restricted syntax.

Instead, we say that if we're in strict mode code (or a TemplateLiteral),
an instance of the new production is a Syntax Error
*unless* it's an instance of the restricted syntax.
To express the latter condition,
we use the cover grammar machinery.
(It could be done in other ways, but I think this is clearest.)
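
The rule described above can be sketched in JavaScript. This is only an illustration of the intended behaviour, not the spec's algorithm or any engine's implementation: the function names (`matchLegacyOctal`, `coversStrictZero`, `isEarlyError`) are hypothetical, and `rest` is assumed to be the source text immediately following the backslash.

```javascript
// Hypothetical helpers; none of these names come from the spec.

// Match a LegacyOctalEscapeSequence at the start of `rest`.
// Returns the matched digits, or null if `rest` starts some other escape.
// The alternatives are tried longest first:
//   ZeroToThree OctalDigit OctalDigit
//   FourToSeven OctalDigit, or ZeroToThree OctalDigit [lookahead not OctalDigit]
//   OctalDigit [lookahead not OctalDigit]
function matchLegacyOctal(rest) {
  const m = /^(?:[0-3][0-7][0-7]|[0-7][0-7]|[0-7])/.exec(rest);
  return m ? m[0] : null;
}

// StrictZeroEscapeSequence :: `0` [lookahead not DecimalDigit]
function coversStrictZero(rest) {
  return /^0(?![0-9])/.test(rest);
}

// In strict mode code or in a template literal, an instance of
// EscapeSequence :: LegacyOctalEscapeSequence is an early error
// unless it covers a StrictZeroEscapeSequence.
function isEarlyError(rest, { strict = false, template = false } = {}) {
  if (matchLegacyOctal(rest) === null) return false; // not this production
  if (!strict && !template) return false;            // allowed, though deprecated
  return !coversStrictZero(rest);
}
```

Under these assumptions, `isEarlyError("0", { strict: true })` is `false` (the restricted syntax), while `isEarlyError("08", { strict: true })` and `isEarlyError("101", { strict: true })` are both `true`.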
jmdyck committed Feb 23, 2020
1 parent ad7ad3e commit bf7b1e2
Showing 1 changed file with 99 additions and 79 deletions.
178 changes: 99 additions & 79 deletions spec.html
@@ -11536,12 +11536,10 @@ <h2>Syntax</h2>

EscapeSequence ::
CharacterEscapeSequence
`0` [lookahead &lt;! DecimalDigit]
LegacyOctalEscapeSequence
HexEscapeSequence
UnicodeEscapeSequence
</emu-grammar>
<p>A conforming implementation, when processing strict mode code, must not extend the syntax of |EscapeSequence| to include <emu-xref href="#prod-annexB-LegacyOctalEscapeSequence"></emu-xref> as described in <emu-xref href="#sec-additional-syntax-string-literals"></emu-xref>.</p>
<emu-grammar type="definition">

CharacterEscapeSequence ::
SingleEscapeCharacter
NonEscapeCharacter
@@ -11558,6 +11556,18 @@ <h2>Syntax</h2>
`x`
`u`

LegacyOctalEscapeSequence ::
OctalDigit [lookahead &lt;! OctalDigit]
ZeroToThree OctalDigit [lookahead &lt;! OctalDigit]
FourToSeven OctalDigit
ZeroToThree OctalDigit OctalDigit

ZeroToThree :: one of
`0` `1` `2` `3`

FourToSeven :: one of
`4` `5` `6` `7`

HexEscapeSequence ::
`x` HexDigit HexDigit

@@ -11573,6 +11583,24 @@ <h2>Syntax</h2>
<p>&lt;LF&gt; and &lt;CR&gt; cannot appear in a string literal, except as part of a |LineContinuation| to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as `\\n` or `\\u000A`.</p>
</emu-note>

<h2>Supplemental Syntax</h2>
<p>When processing an instance of the production <emu-grammar>LegacyOctalEscapeSequence :: OctalDigit</emu-grammar> the following production is used to refine the interpretation of |LegacyOctalEscapeSequence|.</p>
<emu-grammar type="definition">
StrictZeroEscapeSequence ::
`0` [lookahead &lt;! DecimalDigit]
</emu-grammar>

<emu-clause id="sec-string-literals-early-errors">
<h1>Static Semantics: Early Errors</h1>
<emu-grammar>
EscapeSequence :: LegacyOctalEscapeSequence
</emu-grammar>
<ul>
<li>It is a Syntax Error if the source code matching this production is strict mode code and |EscapeSequence| is not covering a |StrictZeroEscapeSequence|.</li>
</ul>
<emu-note>In non-strict code, this syntax is allowed, but deprecated.</emu-note>
</emu-clause>

<emu-clause id="sec-string-literals-static-semantics-stringvalue">
<h1>Static Semantics: StringValue</h1>
<emu-see-also-para op="StringValue"></emu-see-also-para>
@@ -11586,7 +11614,7 @@ <h1>Static Semantics: StringValue</h1>
</emu-alg>
</emu-clause>

<emu-clause id="sec-static-semantics-sv">
<emu-clause id="sec-static-semantics-sv" oldids="sec-additional-syntax-string-literals-static-semantics">
<h1>Static Semantics: SV</h1>
<p>A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of code unit values contributed by the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value (MV), as described below or in <emu-xref href="#sec-literals-numeric-literals"></emu-xref>.</p>
<ul>
@@ -11648,7 +11676,7 @@ <h1>Static Semantics: SV</h1>
The SV of <emu-grammar>EscapeSequence :: CharacterEscapeSequence</emu-grammar> is the SV of |CharacterEscapeSequence|.
</li>
<li>
The SV of <emu-grammar>EscapeSequence :: `0`</emu-grammar> is the code unit 0x0000 (NULL).
The SV of <emu-grammar>EscapeSequence :: LegacyOctalEscapeSequence</emu-grammar> is the SV of |LegacyOctalEscapeSequence|.
</li>
<li>
The SV of <emu-grammar>EscapeSequence :: HexEscapeSequence</emu-grammar> is the SV of |HexEscapeSequence|.
@@ -11813,6 +11841,18 @@ <h1>Static Semantics: SV</h1>
<li>
The SV of <emu-grammar>NonEscapeCharacter :: SourceCharacter but not one of EscapeCharacter or LineTerminator</emu-grammar> is the UTF16Encoding of the code point value of |SourceCharacter|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: OctalDigit</emu-grammar> is the code unit whose value is the MV of |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: ZeroToThree OctalDigit</emu-grammar> is the code unit whose value is (8<sub>ℝ</sub> times the MV of |ZeroToThree|) plus the MV of |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: FourToSeven OctalDigit</emu-grammar> is the code unit whose value is (8<sub>ℝ</sub> times the MV of |FourToSeven|) plus the MV of |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit</emu-grammar> is the code unit whose value is (64<sub>ℝ</sub> (that is, 8<sup>2</sup>) times the MV of |ZeroToThree|) plus (8<sub>ℝ</sub> times the MV of the first |OctalDigit|) plus the MV of the second |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>HexEscapeSequence :: `x` HexDigit HexDigit</emu-grammar> is the code unit whose value is (16<sub>ℝ</sub> times the MV of the first |HexDigit|) plus the MV of the second |HexDigit|.
</li>
@@ -11827,6 +11867,36 @@ <h1>Static Semantics: SV</h1>
</li>
</ul>
</emu-clause>

<emu-clause id="sec-string-literals-static-semantics-mv">
<h1>Static Semantics: MV</h1>
<ul>
<li>
The MV of <emu-grammar>ZeroToThree :: `0`</emu-grammar> is 0<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `1`</emu-grammar> is 1<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `2`</emu-grammar> is 2<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `3`</emu-grammar> is 3<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `4`</emu-grammar> is 4<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `5`</emu-grammar> is 5<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `6`</emu-grammar> is 6<sub>ℝ</sub>.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `7`</emu-grammar> is 7<sub>ℝ</sub>.
</li>
</ul>
</emu-clause>
</emu-clause>

<emu-clause id="sec-literals-regular-expression-literals">
@@ -11964,11 +12034,20 @@ <h2>Syntax</h2>
CodePoint ::
HexDigits [> but only if MV of |HexDigits| &le; 0x10FFFF]
</emu-grammar>
<p>A conforming implementation must not use the extended definition of |EscapeSequence| described in <emu-xref href="#sec-additional-syntax-string-literals"></emu-xref> when parsing a |TemplateCharacter|.</p>
<emu-note>
<p>|TemplateSubstitutionTail| is used by the |InputElementTemplateTail| alternative lexical goal.</p>
</emu-note>

<emu-clause id="sec-template-literal-lexical-components-early-errors">
<h1>Early Errors</h1>
<emu-grammar>
TemplateCharacter :: `\` EscapeSequence
</emu-grammar>
<ul>
<li>It is a Syntax Error if |EscapeSequence| is an instance of <emu-grammar>EscapeSequence :: LegacyOctalEscapeSequence</emu-grammar> and |EscapeSequence| is not covering a |StrictZeroEscapeSequence|.</li>
</ul>
</emu-clause>

<emu-clause id="sec-static-semantics-tv-and-trv">
<h1>Static Semantics: TV and TRV</h1>
<p>A template literal component is interpreted as a sequence of Unicode code points. The Template Value (TV) of a literal component is described in terms of code unit values (SV, <emu-xref href="#sec-literals-string-literals"></emu-xref>) contributed by the various parts of the template literal component. As part of this process, some Unicode code points within the template component are interpreted as having a mathematical value (MV, <emu-xref href="#sec-literals-numeric-literals"></emu-xref>). In determining a TV, escape sequences are replaced by the UTF-16 code unit(s) of the Unicode code point represented by the escape sequence. The Template Raw Value (TRV) is similar to a Template Value with the difference that in TRVs escape sequences are interpreted literally.</p>
@@ -12064,7 +12143,7 @@ <h1>Static Semantics: TV and TRV</h1>
The TRV of <emu-grammar>EscapeSequence :: CharacterEscapeSequence</emu-grammar> is the TRV of |CharacterEscapeSequence|.
</li>
<li>
The TRV of <emu-grammar>EscapeSequence :: `0`</emu-grammar> is the code unit 0x0030 (DIGIT ZERO).
The TRV of <emu-grammar>EscapeSequence :: LegacyOctalEscapeSequence</emu-grammar> is the code unit 0x0030 (DIGIT ZERO).
</li>
<li>
The TRV of <emu-grammar>EscapeSequence :: HexEscapeSequence</emu-grammar> is the TRV of |HexEscapeSequence|.
@@ -24947,9 +25026,6 @@ <h1>Forbidden Extensions</h1>
<li>
The Syntactic Grammar must not be extended in any manner that allows the token `:` to immediately follow source text that matches the |BindingIdentifier| nonterminal symbol.
</li>
<li>
|TemplateCharacter| must not be extended to include <emu-xref href="#prod-annexB-LegacyOctalEscapeSequence"></emu-xref> as defined in <emu-xref href="#sec-additional-syntax-string-literals"></emu-xref>.
</li>
<li>
When processing strict mode code, the extensions defined in <emu-xref href="#sec-labelled-function-declarations"></emu-xref>, <emu-xref href="#sec-block-level-function-declarations-web-legacy-compatibility-semantics"></emu-xref>, <emu-xref href="#sec-functiondeclarations-in-ifstatement-statement-clauses"></emu-xref>, and <emu-xref href="#sec-initializers-in-forin-statement-heads"></emu-xref> must not be supported.
</li>
@@ -41701,9 +41777,15 @@ <h1>Lexical Grammar</h1>
<emu-prodref name=SingleEscapeCharacter></emu-prodref>
<emu-prodref name=NonEscapeCharacter></emu-prodref>
<emu-prodref name=EscapeCharacter></emu-prodref>
<emu-prodref name=LegacyOctalEscapeSequence></emu-prodref>
<emu-prodref name=ZeroToThree></emu-prodref>
<emu-prodref name=FourToSeven></emu-prodref>
<emu-prodref name=HexEscapeSequence></emu-prodref>
<emu-prodref name=UnicodeEscapeSequence></emu-prodref>
<emu-prodref name=Hex4Digits></emu-prodref>
<p>When processing an instance of the production <emu-prodref name=LegacyOctalEscapeSequence></emu-prodref> the following production is used to refine the interpretation of |LegacyOctalEscapeSequence|.</p>
<emu-prodref name=StrictZeroEscapeSequence></emu-prodref>
<p>&nbsp;</p>
<emu-prodref name=RegularExpressionLiteral></emu-prodref>
<emu-prodref name=RegularExpressionBody></emu-prodref>
<emu-prodref name=RegularExpressionChars></emu-prodref>
@@ -42039,73 +42121,11 @@ <h1>Numeric Literals</h1>

<emu-annex id="sec-additional-syntax-string-literals">
<h1>String Literals</h1>
<p>The syntax and semantics of <emu-xref href="#sec-literals-string-literals"></emu-xref> is extended as follows except that this extension is not allowed for strict mode code:</p>
<h2>Syntax</h2>
<emu-grammar type="definition">
EscapeSequence ::
CharacterEscapeSequence
LegacyOctalEscapeSequence
HexEscapeSequence
UnicodeEscapeSequence

LegacyOctalEscapeSequence ::
OctalDigit [lookahead &lt;! OctalDigit]
ZeroToThree OctalDigit [lookahead &lt;! OctalDigit]
FourToSeven OctalDigit
ZeroToThree OctalDigit OctalDigit

ZeroToThree :: one of
`0` `1` `2` `3`

FourToSeven :: one of
`4` `5` `6` `7`
<p>The following syntax from <emu-xref href="#sec-literals-string-literals"></emu-xref>, and its associated semantics, used to be normative optional:</p>
<emu-grammar>
EscapeSequence :: LegacyOctalEscapeSequence
</emu-grammar>
<p>This definition of |EscapeSequence| is not used in strict mode or when parsing |TemplateCharacter|.</p>

<emu-annex id="sec-additional-syntax-string-literals-static-semantics">
<h1>Static Semantics</h1>
<ul>
<li>
The SV of <emu-grammar>EscapeSequence :: LegacyOctalEscapeSequence</emu-grammar> is the SV of |LegacyOctalEscapeSequence|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: OctalDigit</emu-grammar> is the code unit whose value is the MV of |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: ZeroToThree OctalDigit</emu-grammar> is the code unit whose value is (8 times the MV of |ZeroToThree|) plus the MV of |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: FourToSeven OctalDigit</emu-grammar> is the code unit whose value is (8 times the MV of |FourToSeven|) plus the MV of |OctalDigit|.
</li>
<li>
The SV of <emu-grammar>LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit</emu-grammar> is the code unit whose value is (64 (that is, 8<sup>2</sup>) times the MV of |ZeroToThree|) plus (8 times the MV of the first |OctalDigit|) plus the MV of the second |OctalDigit|.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `0`</emu-grammar> is 0.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `1`</emu-grammar> is 1.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `2`</emu-grammar> is 2.
</li>
<li>
The MV of <emu-grammar>ZeroToThree :: `3`</emu-grammar> is 3.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `4`</emu-grammar> is 4.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `5`</emu-grammar> is 5.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `6`</emu-grammar> is 6.
</li>
<li>
The MV of <emu-grammar>FourToSeven :: `7`</emu-grammar> is 7.
</li>
</ul>
</emu-annex>
<p>and the productions for |LegacyOctalEscapeSequence|, |ZeroToThree|, and |FourToSeven|.</p>
</emu-annex>

<emu-annex id="sec-html-like-comments">
@@ -42305,7 +42325,7 @@ <h1>Static Semantics: CharacterValue</h1>
</emu-alg>
<emu-grammar>CharacterEscape :: LegacyOctalEscapeSequence</emu-grammar>
<emu-alg>
1. Evaluate the SV of |LegacyOctalEscapeSequence| (see <emu-xref href="#sec-additional-syntax-string-literals"></emu-xref>) to obtain a code unit _cu_.
1. Evaluate the SV of |LegacyOctalEscapeSequence| (see <emu-xref href="#sec-static-semantics-sv"></emu-xref>) to obtain a code unit _cu_.
1. Return the numeric value of _cu_.
</emu-alg>
</emu-annex>
@@ -43281,7 +43301,7 @@ <h1>The Strict Mode of ECMAScript</h1>
A conforming implementation, when processing strict mode code, must disallow instances of the productions <emu-grammar>NonDecimalIntegerLiteral :: LegacyOctalIntegerLiteral</emu-grammar> and <emu-grammar>DecimalIntegerLiteral :: NonOctalDecimalIntegerLiteral</emu-grammar>.
</li>
<li>
A conforming implementation, when processing strict mode code, may not extend the syntax of |EscapeSequence| to include |LegacyOctalEscapeSequence| as described in <emu-xref href="#sec-additional-syntax-string-literals"></emu-xref>.
A conforming implementation, when processing strict mode code, must disallow instances of the production <emu-grammar>EscapeSequence :: LegacyOctalEscapeSequence</emu-grammar> that do not cover a |StrictZeroEscapeSequence|.
</li>
<li>
Assignment to an undeclared identifier or otherwise unresolvable reference does not create a property in the global object. When a simple assignment occurs within strict mode code, its |LeftHandSideExpression| must not evaluate to an unresolvable Reference. If it does a *ReferenceError* exception is thrown (<emu-xref href="#sec-putvalue"></emu-xref>). The |LeftHandSideExpression| also may not be a reference to a data property with the attribute value { [[Writable]]: *false* }, to an accessor property with the attribute value { [[Set]]: *undefined* }, nor to a non-existent property of an object whose [[Extensible]] internal slot has the value *false*. In these cases a `TypeError` exception is thrown (<emu-xref href="#sec-assignment-operators"></emu-xref>).

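As a rough check on the SV rules for |LegacyOctalEscapeSequence| moved into the main body by this diff, the arithmetic can be sketched as follows. The function name `legacyOctalSV` is hypothetical, and the input is assumed to be just the digits of the escape (without the backslash). The loop computes the same value as the spec's "64 times ... plus 8 times ... plus ..." wording, just written in Horner form.

```javascript
// Hypothetical helper: returns the numeric value of the code unit that the
// SV rules assign to a LegacyOctalEscapeSequence, given its digits.
function legacyOctalSV(digits) {
  // Accept only the shapes the grammar allows: one octal digit, two octal
  // digits, or three octal digits whose first digit is a ZeroToThree.
  if (!/^(?:[0-3][0-7][0-7]|[0-7][0-7]|[0-7])$/.test(digits)) {
    throw new RangeError(`not a LegacyOctalEscapeSequence: ${digits}`);
  }
  let value = 0;
  for (const d of digits) {
    // The MV of each digit is its offset from U+0030 (DIGIT ZERO).
    value = value * 8 + (d.charCodeAt(0) - 0x30);
  }
  return value;
}
```

For example, `legacyOctalSV("101")` is 64 + 0 + 1 = 65, so `String.fromCharCode(legacyOctalSV("101"))` is `"A"`, and the largest value, `legacyOctalSV("377")`, is 255.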