From bf7b1e2b2af46add3a6d0d1bdd0fa7a769ba3faa Mon Sep 17 00:00:00 2001
From: Michael Dyck
Date: Fri, 2 Aug 2019 23:36:08 -0400
Subject: [PATCH] Normative: Make B.1.2 "String Literals" normative.

(Part of Annex B reform, see PR #1595.)

This one was trickier than B.1.1.

B.1.2's extension isn't simply to add a production to the grammar
(AKA add another right-hand-side to an existing production).
Instead it *replaces* a production/alternative with something more general.

So, when it comes to strict mode disallowing the general syntax,
we can't simply say that the general production is a Syntax Error
in strict mode, because strict mode still has to allow the restricted syntax.

Instead, we say that if we're in strict mode code (or a TemplateLiteral),
an instance of the new production is a Syntax Error *unless*
it's an instance of the restricted syntax.

To express the latter condition, we use the cover grammar machinery.
(It could be done in other ways, but I think this is clearest.)
---
 spec.html | 178 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 99 insertions(+), 79 deletions(-)

diff --git a/spec.html b/spec.html
index 9e00f585c6c..29b719fc1b0 100644
--- a/spec.html
+++ b/spec.html
@@ -11536,12 +11536,10 @@

Syntax

 EscapeSequence ::
   CharacterEscapeSequence
-  `0` [lookahead <! DecimalDigit]
+  LegacyOctalEscapeSequence
   HexEscapeSequence
   UnicodeEscapeSequence
-
-

A conforming implementation, when processing strict mode code, must not extend the syntax of |EscapeSequence| to include as described in .

- + CharacterEscapeSequence :: SingleEscapeCharacter NonEscapeCharacter @@ -11558,6 +11556,18 @@

Syntax

   `x`
   `u`

+LegacyOctalEscapeSequence ::
+  OctalDigit [lookahead <! OctalDigit]
+  ZeroToThree OctalDigit [lookahead <! OctalDigit]
+  FourToSeven OctalDigit
+  ZeroToThree OctalDigit OctalDigit
+
+ZeroToThree :: one of
+  `0` `1` `2` `3`
+
+FourToSeven :: one of
+  `4` `5` `6` `7`
+
 HexEscapeSequence ::
   `x` HexDigit HexDigit
@@ -11573,6 +11583,24 @@

Syntax

<LF> and <CR> cannot appear in a string literal, except as part of a |LineContinuation| to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as `\\n` or `\\u000A`.

+

Supplemental Syntax

+

When processing an instance of the production LegacyOctalEscapeSequence :: OctalDigit the following production is used to refine the interpretation of |LegacyOctalEscapeSequence|.

+ + StrictZeroEscapeSequence :: + `0` [lookahead <! DecimalDigit] + + + +

Static Semantics: Early Errors

+ + EscapeSequence :: LegacyOctalEscapeSequence + +
    +
  • It is a Syntax Error if the source code matching this production is strict mode code and |EscapeSequence| is not covering a |StrictZeroEscapeSequence|.
  • +
+ In non-strict code, this syntax is allowed, but deprecated. +
+
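The early error rule above can be read as a small predicate. The following is a minimal sketch in TypeScript, not part of the patch: the helper names `coversStrictZeroEscape` and `isEarlyError` are hypothetical, and the sketch assumes the caller has already matched a LegacyOctalEscapeSequence and knows the single character that follows it.

```typescript
// Hypothetical helpers, not part of the patch.
// `digits` is the source text matched by LegacyOctalEscapeSequence (the octal
// digits after the backslash); `next` is the single character that follows
// it, or undefined at the end of the literal.
function coversStrictZeroEscape(digits: string, next?: string): boolean {
  // StrictZeroEscapeSequence :: `0` [lookahead is not a DecimalDigit]
  const nextIsDecimalDigit = next !== undefined && /^[0-9]$/.test(next);
  return digits === "0" && !nextIsDecimalDigit;
}

function isEarlyError(
  digits: string,
  next: string | undefined,
  strict: boolean
): boolean {
  // Non-strict code accepts every LegacyOctalEscapeSequence (deprecated).
  // Strict mode code accepts only the instance that covers a
  // StrictZeroEscapeSequence, that is, `\0` not followed by a decimal digit.
  return strict && !coversStrictZeroEscape(digits, next);
}
```

Under these assumptions, `isEarlyError("0", "a", true)` is `false`, while `isEarlyError("0", "8", true)` and `isEarlyError("07", undefined, true)` are both `true`. The commit message states that the same condition applies inside a TemplateLiteral; the sketch models only the strict mode case.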

Static Semantics: StringValue

@@ -11586,7 +11614,7 @@

Static Semantics: StringValue

- +

Static Semantics: SV

A string literal stands for a value of the String type. The String value (SV) of the literal is described in terms of code unit values contributed by the various parts of the string literal. As part of this process, some Unicode code points within the string literal are interpreted as having a mathematical value (MV), as described below or in .

    @@ -11648,7 +11676,7 @@

    Static Semantics: SV

    The SV of EscapeSequence :: CharacterEscapeSequence is the SV of |CharacterEscapeSequence|.
  • - The SV of EscapeSequence :: `0` is the code unit 0x0000 (NULL). + The SV of EscapeSequence :: LegacyOctalEscapeSequence is the SV of |LegacyOctalEscapeSequence|.
  • The SV of EscapeSequence :: HexEscapeSequence is the SV of |HexEscapeSequence|. @@ -11813,6 +11841,18 @@

    Static Semantics: SV

  • The SV of NonEscapeCharacter :: SourceCharacter but not one of EscapeCharacter or LineTerminator is the UTF16Encoding of the code point value of |SourceCharacter|.
  • +
  • + The SV of LegacyOctalEscapeSequence :: OctalDigit is the code unit whose value is the MV of |OctalDigit|. +
  • +
  • + The SV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit is the code unit whose value is (8 times the MV of |ZeroToThree|) plus the MV of |OctalDigit|. +
  • +
  • + The SV of LegacyOctalEscapeSequence :: FourToSeven OctalDigit is the code unit whose value is (8 times the MV of |FourToSeven|) plus the MV of |OctalDigit|. +
  • +
  • + The SV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit is the code unit whose value is (64 (that is, 8<sup>2</sup>) times the MV of |ZeroToThree|) plus (8 times the MV of the first |OctalDigit|) plus the MV of the second |OctalDigit|. +
  • The SV of HexEscapeSequence :: `x` HexDigit HexDigit is the code unit whose value is (16 times the MV of the first |HexDigit|) plus the MV of the second |HexDigit|.
  • @@ -11827,6 +11867,36 @@

    Static Semantics: SV

+ + +

Static Semantics: MV

+
    +
  • + The MV of ZeroToThree :: `0` is 0. +
  • +
  • + The MV of ZeroToThree :: `1` is 1. +
  • +
  • + The MV of ZeroToThree :: `2` is 2. +
  • +
  • + The MV of ZeroToThree :: `3` is 3. +
  • +
  • + The MV of FourToSeven :: `4` is 4. +
  • +
  • + The MV of FourToSeven :: `5` is 5. +
  • +
  • + The MV of FourToSeven :: `6` is 6. +
  • +
  • + The MV of FourToSeven :: `7` is 7. +
  • +
+
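The LegacyOctalEscapeSequence grammar and the SV and MV rules added above can be illustrated together. This is a minimal sketch in TypeScript, not part of the patch: `OctalMatch` and `matchLegacyOctalEscape` are hypothetical names, and the input is assumed to be the source text immediately after the backslash.

```typescript
// Hypothetical helper, not part of the patch.
interface OctalMatch {
  digits: string;   // the source text matched by LegacyOctalEscapeSequence
  codeUnit: number; // the SV, as a code unit value
}

function matchLegacyOctalEscape(text: string): OctalMatch | null {
  const isOctal = (ch: string | undefined): boolean =>
    ch !== undefined && ch >= "0" && ch <= "7";

  if (!isOctal(text[0])) return null;

  // The lookahead restrictions make the match greedy: a second digit is
  // taken whenever one follows, and a third only when the first digit is a
  // ZeroToThree, so the value never exceeds 0o377 (255).
  let length = 1;
  if (isOctal(text[1])) {
    length = 2;
    if (text[0] <= "3" && isOctal(text[2])) length = 3;
  }

  const digits = text.slice(0, length);
  // For three digits the SV is 64 * MV(first) + 8 * MV(second) + MV(third),
  // which equals reading the matched digits as a base-8 numeral.
  return { digits, codeUnit: parseInt(digits, 8) };
}
```

With this sketch, `matchLegacyOctalEscape("101")` yields code unit 65 (`A`), whereas `matchLegacyOctalEscape("400")` matches only `40` (code unit 32) because the first digit is a FourToSeven; the trailing `0` is left as an ordinary character.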
@@ -11964,11 +12034,20 @@

Syntax

CodePoint :: HexDigits [> but only if MV of |HexDigits| ≤ 0x10FFFF]
-

A conforming implementation must not use the extended definition of |EscapeSequence| described in when parsing a |TemplateCharacter|.

|TemplateSubstitutionTail| is used by the |InputElementTemplateTail| alternative lexical goal.

+ +

Early Errors

+ + TemplateCharacter :: `\` EscapeSequence + + +
+

Static Semantics: TV and TRV

A template literal component is interpreted as a sequence of Unicode code points. The Template Value (TV) of a literal component is described in terms of code unit values (SV, ) contributed by the various parts of the template literal component. As part of this process, some Unicode code points within the template component are interpreted as having a mathematical value (MV, ). In determining a TV, escape sequences are replaced by the UTF-16 code unit(s) of the Unicode code point represented by the escape sequence. The Template Raw Value (TRV) is similar to a Template Value with the difference that in TRVs escape sequences are interpreted literally.

@@ -12064,7 +12143,7 @@

Static Semantics: TV and TRV

The TRV of EscapeSequence :: CharacterEscapeSequence is the TRV of |CharacterEscapeSequence|.
  • - The TRV of EscapeSequence :: `0` is the code unit 0x0030 (DIGIT ZERO). + The TRV of EscapeSequence :: LegacyOctalEscapeSequence is the code unit 0x0030 (DIGIT ZERO).
  • The TRV of EscapeSequence :: HexEscapeSequence is the TRV of |HexEscapeSequence|. @@ -24947,9 +25026,6 @@

    Forbidden Extensions

  • The Syntactic Grammar must not be extended in any manner that allows the token `:` to immediately follow source text that matches the |BindingIdentifier| nonterminal symbol.
  • -
  • - |TemplateCharacter| must not be extended to include as defined in . -
  • When processing strict mode code, the extensions defined in , , , and must not be supported.
  • @@ -41701,9 +41777,15 @@

    Lexical Grammar

    + + + +

    When processing an instance of the production the following production is used to refine the interpretation of |LegacyOctalEscapeSequence|.

    + +

     

    @@ -42039,73 +42121,11 @@

    Numeric Literals

    String Literals

    -

    The syntax and semantics of is extended as follows except that this extension is not allowed for strict mode code:

    -

    Syntax

-
-EscapeSequence ::
-  CharacterEscapeSequence
-  LegacyOctalEscapeSequence
-  HexEscapeSequence
-  UnicodeEscapeSequence
-
-LegacyOctalEscapeSequence ::
-  OctalDigit [lookahead <! OctalDigit]
-  ZeroToThree OctalDigit [lookahead <! OctalDigit]
-  FourToSeven OctalDigit
-  ZeroToThree OctalDigit OctalDigit
-
-ZeroToThree :: one of
-  `0` `1` `2` `3`
-
-FourToSeven :: one of
-  `4` `5` `6` `7`
+

    The following syntax from , and its associated semantics, used to be normative optional:

    + + EscapeSequence :: LegacyOctalEscapeSequence -

    This definition of |EscapeSequence| is not used in strict mode or when parsing |TemplateCharacter|.

    - - -

    Static Semantics

    -
      -
    • - The SV of EscapeSequence :: LegacyOctalEscapeSequence is the SV of |LegacyOctalEscapeSequence|. -
    • -
    • - The SV of LegacyOctalEscapeSequence :: OctalDigit is the code unit whose value is the MV of |OctalDigit|. -
    • -
    • - The SV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit is the code unit whose value is (8 times the MV of |ZeroToThree|) plus the MV of |OctalDigit|. -
    • -
    • - The SV of LegacyOctalEscapeSequence :: FourToSeven OctalDigit is the code unit whose value is (8 times the MV of |FourToSeven|) plus the MV of |OctalDigit|. -
    • -
    • - The SV of LegacyOctalEscapeSequence :: ZeroToThree OctalDigit OctalDigit is the code unit whose value is (64 (that is, 8<sup>2</sup>) times the MV of |ZeroToThree|) plus (8 times the MV of the first |OctalDigit|) plus the MV of the second |OctalDigit|. -
    • -
    • - The MV of ZeroToThree :: `0` is 0. -
    • -
    • - The MV of ZeroToThree :: `1` is 1. -
    • -
    • - The MV of ZeroToThree :: `2` is 2. -
    • -
    • - The MV of ZeroToThree :: `3` is 3. -
    • -
    • - The MV of FourToSeven :: `4` is 4. -
    • -
    • - The MV of FourToSeven :: `5` is 5. -
    • -
    • - The MV of FourToSeven :: `6` is 6. -
    • -
    • - The MV of FourToSeven :: `7` is 7. -
    • -
    -
    +

    and the productions for |LegacyOctalEscapeSequence|, |ZeroToThree|, and |FourToSeven|.

    @@ -42305,7 +42325,7 @@

    Static Semantics: CharacterValue

    CharacterEscape :: LegacyOctalEscapeSequence - 1. Evaluate the SV of |LegacyOctalEscapeSequence| (see ) to obtain a code unit _cu_. + 1. Evaluate the SV of |LegacyOctalEscapeSequence| (see ) to obtain a code unit _cu_. 1. Return the numeric value of _cu_.
    @@ -43281,7 +43301,7 @@

    The Strict Mode of ECMAScript

    A conforming implementation, when processing strict mode code, must disallow instances of the productions NonDecimalIntegerLiteral :: LegacyOctalIntegerLiteral and DecimalIntegerLiteral :: NonOctalDecimalIntegerLiteral.
  • - A conforming implementation, when processing strict mode code, may not extend the syntax of |EscapeSequence| to include |LegacyOctalEscapeSequence| as described in . + A conforming implementation, when processing strict mode code, must disallow instances of the production EscapeSequence :: LegacyOctalEscapeSequence that do not cover a |StrictZeroEscapeSequence|.
  • Assignment to an undeclared identifier or otherwise unresolvable reference does not create a property in the global object. When a simple assignment occurs within strict mode code, its |LeftHandSideExpression| must not evaluate to an unresolvable Reference. If it does a *ReferenceError* exception is thrown (). The |LeftHandSideExpression| also may not be a reference to a data property with the attribute value { [[Writable]]: *false* }, to an accessor property with the attribute value { [[Set]]: *undefined* }, nor to a non-existent property of an object whose [[Extensible]] internal slot has the value *false*. In these cases a `TypeError` exception is thrown ().