Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend ECMA-262 syntax into a superset of JSON #1188

Merged
merged 1 commit into from
Jul 18, 2018

Conversation

gibson042
Copy link
Contributor

This implements https://github.com/tc39/proposal-json-superset , currently at stage 3.

 - Narrow the StringLiteral restriction in sec-line-terminators
@ljharb ljharb added normative change Affects behavior required to correctly evaluate some ECMAScript source text pending stage 4 This proposal has not yet achieved stage 4, but may otherwise be ready to merge. proposal This is related to a specific proposal, and will be closed/merged when the proposal reaches stage 4. labels May 7, 2018
@ljharb
Copy link
Member

ljharb commented May 7, 2018

Tests: tc39/test262#1543 / tc39/test262#1544

@ljharb ljharb requested review from bterlson and bmeck May 7, 2018 22:11
@littledan
Copy link
Member

Thanks for your contribution, @gibson042 .

spec.html Outdated
@@ -9908,7 +9908,7 @@ <h2>Syntax</h2>
<!-- es6num="11.3" -->
<emu-clause id="sec-line-terminators">
<h1>Line Terminators</h1>
<p>Like white space code points, line terminator code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (<emu-xref href="#sec-automatic-semicolon-insertion"></emu-xref>). A line terminator cannot occur within any token except a |StringLiteral|, |Template|, or |TemplateSubstitutionTail|. Line terminators may only occur within a |StringLiteral| token as part of a |LineContinuation|.</p>
<p>Like white space code points, line terminator code points are used to improve source text readability and to separate tokens (indivisible lexical units) from each other. However, unlike white space code points, line terminators have some influence over the behaviour of the syntactic grammar. In general, line terminators may occur between any two tokens, but there are a few places where they are forbidden by the syntactic grammar. Line terminators also affect the process of automatic semicolon insertion (<emu-xref href="#sec-automatic-semicolon-insertion"></emu-xref>). A line terminator cannot occur within any token except a |StringLiteral|, |Template|, or |TemplateSubstitutionTail|. &lt;LF&gt; and &lt;CR&gt; line terminators cannot occur within a |StringLiteral| token except as part of a |LineContinuation|.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity and future explorers, can you elaborate on why this restriction is narrowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the fundamental change of proposal-json-superset, just in a part of the spec that I didn't realize included affected text until working on tc39/test262#1565 . As of this PR landing, StringLiteral will be able to include any LineTerminator as part of a LineContinuation (backslash followed by LineTerminatorSequence, used to statically define long strings over multiple lines without affecting their content), and—invalidating the old text—will also be able to include U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR line terminators in any position (as is the case for JSON strings).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, no line terminators could occur in string literals (except as part of a LineContinuation).

With the change, U+2028 and U+2029 can occur in string literals, even outside of LineContinuation.

In other words, this change clarifies that LineContinuations are unaffected by the change, i.e. \ followed by U+2028 or U+2029 in a string literal still evaluates to the empty string. This was already clear from the proposal, but updating this note as part of this patch improves accuracy.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, sounds good!

@littledan
Copy link
Member

This proposal reached Stage 4 in the May 2018 TC39 meeting. Is the patch ready to land?

@littledan littledan removed the pending stage 4 This proposal has not yet achieved stage 4, but may otherwise be ready to merge. label Jun 24, 2018
@ljharb ljharb assigned ljharb and unassigned bterlson Jul 18, 2018
@ljharb ljharb merged commit cef8d49 into tc39:master Jul 18, 2018
@mathiasbynens
Copy link
Member

Side note: this commit has just been merged, yet it doesn’t show up in the first few entries on https://github.com/tc39/ecma262/commits/master. Instead, it’s further down the list here:

Note that May 7th is the date the original PR was created.

Is this confusing to anyone else or just me?

AFAICT until now, the commit log always followed merge order. Are we changing that now?

@ljharb
Copy link
Member

ljharb commented Jul 18, 2018

This is actually a bug in github; it sorts by (and has sorted by) commit date, not actual merge order. cc @keithamus; is this something that we could request?

(The reason this happened here is that when I rebased the PR, i used --committer-date-is-author-date, which avoids obscuring the original commit date)

@mathiasbynens
Copy link
Member

(The reason this happened here is that when I rebased the PR, i used --committer-date-is-author-date, which avoids obscuring the original commit date)

This way, the commit order doesn’t make sense even in git log. Can we not do that please?

@ljharb
Copy link
Member

ljharb commented Jul 18, 2018

My git log locally works in proper merge order (ie, starting at the branch HEAD, and moving backwards one commit parent at a time). I'm not sure why it's working differently for you; perhaps aliases, or an outdated git version?

@mathiasbynens
Copy link
Member

You’re right, git log does show in proper merge order. My bad — I was confused again by the fact that a) the commit dates don’t match the merge dates for recent commits, as explained before, and b) the commit overview doesn’t link to the PRs for any of the recently-merged commits (another difference from how things have been done until now).

Please don’t use --committer-date-is-author-date, and include the PR links in the commit message when merging, just like we’ve always done for this repository.

@ljharb
Copy link
Member

ljharb commented Jul 19, 2018

I'll definitely include the PR links; that's a convention that wasn't stressed in our editors' meeting (however, if you click on any commit, the header on github shows the PR numbers, branches, and tags that it's contained in - so that link isn't strictly required).

jmdyck added a commit to jmdyck/ecma262 that referenced this pull request May 25, 2020
PR tc39#1188 deleted the text
"but using the alternative definition of |DoubleStringCharacter| provided below"
from step 4 of JSON.parse's algorithm.

At that point, the accompanying Note's phrase
"as modified by Step 4 above"
became obsolete.
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request May 26, 2020
PR tc39#1188 deleted the text
"but using the alternative definition of |DoubleStringCharacter| provided below"
from step 4 of JSON.parse's algorithm.

At that point, the accompanying Note's phrase
"as modified by Step 4 above"
became obsolete.
jmdyck added a commit to jmdyck/ecma262 that referenced this pull request Oct 15, 2020
PR tc39#1188 deleted the text
"but using the alternative definition of |DoubleStringCharacter| provided below"
from step 4 of JSON.parse's algorithm.

At that point, the accompanying Note's phrase
"as modified by Step 4 above"
became obsolete.
bakkot pushed a commit that referenced this pull request Oct 15, 2020
PR #1188 deleted the text
"but using the alternative definition of |DoubleStringCharacter| provided below"
from step 4 of JSON.parse's algorithm.

At that point, the accompanying Note's phrase
"as modified by Step 4 above"
became obsolete.
ljharb pushed a commit to jmdyck/ecma262 that referenced this pull request Dec 1, 2020
…2013)

PR tc39#1188 deleted the text "but using the alternative definition of |DoubleStringCharacter| provided below" from step 4 of JSON.parse's algorithm.

At that point, the accompanying Note's phrase "as modified by Step 4 above" became obsolete.
jmdyck added a commit to jmdyck/test262-parser-tests that referenced this pull request Jan 30, 2021
These two tests were expected to raise a syntax error
because they involve a string literal containing
U+2028 (LINE SEPARATOR) or U+2029 (PARAGRAPH SEPARATOR),
which used to be disallowed.

However, those code points have been allowed in string literals
since the merge of tc39/ecma262#1188
michaelficarra pushed a commit to tc39/test262-parser-tests that referenced this pull request May 27, 2021
These two tests were expected to raise a syntax error
because they involve a string literal containing
U+2028 (LINE SEPARATOR) or U+2029 (PARAGRAPH SEPARATOR),
which used to be disallowed.

However, those code points have been allowed in string literals
since the merge of tc39/ecma262#1188
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
has test262 tests normative change Affects behavior required to correctly evaluate some ECMAScript source text proposal This is related to a specific proposal, and will be closed/merged when the proposal reaches stage 4.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants