📎 [WIP] AST updates #1868
Replies: 12 comments
-
This looks amazing and thank you for writing it down.
This is a bit misleading in my view because my understanding is that an assignment consists of a value that gets assigned to something. My initial thought was, let's name this assignee but it seems that assignees must be people. |
Beta Was this translation helpful? Give feedback.
-
I think English is pretty flexible in that regard. You can totally personify an object or concept with a word like “assignee”. I could go either way, “Assignment” feels similar enough to “Binding” to me; both describing the operation that’s happening on that target. “Binding” also could have been “BindingTarget” but that feels overly qualified. |
Beta Was this translation helpful? Give feedback.
-
Binary Expressions & Assignment ExpressionsAll of these expressions (binary and assignment) fall under the category of "lhs op rhs" but there are important differences between many of them:
Most JavaScript ASTs only distinguish the first (assignment vs binary) and third (binary vs logical), and they do so somewhat inconsistently:
I would argue for these reasons we should split all of these up: JsComparativeExpression = '<' | '>' | '<=' | '>=' | '==' | '===' | '!=' | '!=='
JsArithmeticExpression = '+' | '-' | '*' | '/' | '%' | '**'
JsBitwiseExpression = '<<' | '>>' | '>>>' | '&' | '|' | '^'
JsLogicalExpression = '??' | '||' | '&&'
JsInExpression = 'in'
JsInstanceofExpression = 'instanceof'
JsSimpleAssignmentExpression = '='
JsArithmeticAssignmentExpression = '+=' | '-=' | '*=' | '%=' | '**=' |
JsBitwiseAssignmentExpression = '>>=' | '<<=' | '>>>=' | '&=' | '|=' | '^='
JsLogicalAssignmentExpression = '&&=' | '||=' | '??='
AnyJsAssignmentExpression =
JsSimpleAssignmentExpression
| JsArithmeticAssignmentExpression
| JsBitwiseAssignmentExpression
| JsLogicalAssignmentExpression Notes:
Thinking about it, I kinda find the concept of a "binary operator" a little bit poorly defined at the level of abstraction we have in our AST. I think it might be more meaningful in a higher-level representation where things like I'm not entirely opposed to having an enum that captures all these related ast nodes, but I'm not sure if it's really necessary.
|
Beta Was this translation helpful? Give feedback.
-
Identifiers & NamesThe That's why we should introduce a new
I further suggest introducing the new "concept" of
|
Beta Was this translation helpful? Give feedback.
-
ImportsThis is more of an exploration import "mod"
import foo from "mod"
import { foo } from "mod"
import * as foo from "mod"
import foo, { bar } from "mod"
import foo, * as bar from "mod"
// TypeScript:
import type foo from "mod"
import type { foo } from "mod"
import { type foo } from "mod"
import type * as foo from "mod"
import foo, { type bar } from "mod"
import foo = require("mod")
import foo = a.b.c
// Invalid:
import type foo, { bar } from "mod"
import type foo, * as bar from "mod"
import foo = a
import type foo = a Lexical GrammarIf we try to enforce specifier = 'ident' ('as' 'ident')?
type_specifier = 'type'? specifier
specifiers = (specifier (',' specifier) ','?)?
type_specifiers = (type_specifier (',' type_specifier) ','?)?
assert_entry = 'ident' ':' 'string'
assert = 'assert' '{' (assert_entry (',' assert_entry) ','?)? '}'
import = 'import' 'string' assert? ';'?
import = 'import' 'type'? 'ident' 'from' 'string' assert? ';'?
import = 'import' 'type'? '{' specifiers '}' 'from' 'string' assert? ';'?
import = 'import' '{' type_specifiers '}' 'from' 'string' assert? ';'?
import = 'import' 'type'? '*' 'as' 'ident' 'from' 'string' assert? ';'?
import = 'import' 'ident' ',' '{' type_specifiers '}' 'from' 'string' assert? ';'?
import = 'import' 'ident' ',' '*' 'as' 'ident' 'from' 'string' assert? ';'?
import = 'import' 'type'? 'ident' '=' 'require' '(' 'string' ')' ';'?
import = 'import' 'ident' '=' 'ident' ('.' 'ident')* ';'? If we validate specifier = 'type'? 'ident' ('as' 'ident')?
specifiers = '{' (specifier (',' specifier) ','?)? '}'
assert_entry = 'ident' ':' 'string'
assert = 'assert' '{' (assert_entry (',' assert_entry) ','?)? '}'
import = 'import' 'string' assert? ';'?
import = 'import' 'type'? 'ident' 'from' 'string' assert? ';'?
import = 'import' 'type'? specifiers 'from' 'string' assert? ';'?
import = 'import' 'type'? '*' 'as' 'ident' 'from' 'string' assert? ';'?
import = 'import' 'ident' ',' specifiers 'from' 'string' assert? ';'?
import = 'import' 'ident' ',' '*' 'as' 'ident' 'from' 'string' assert? ';'?
import = 'import' 'type'? 'ident' '=' 'require' '(' 'string' ')' ';'?
import = 'import' 'ident' '=' 'ident' ('.' 'ident')* ';'? If we validate commas at runtime: specifier = 'type'? 'ident' ('as' 'ident')?
specifiers = '{' (specifier (',' specifier) ','?)? '}'
assert_entry = 'ident' ':' 'string'
assert = 'assert' '{' (assert_entry (',' assert_entry) ','?)? '}'
default_clause = 'ident'
named_clause = specifiers | ('*' 'as' 'ident')
import = 'import' 'string' assert? ';'?
import = 'import' 'type'? default_clause? ','? named_clause? 'from' 'string' assert? ';'
import = 'import' 'type'? 'ident' '=' 'require' '(' 'string' ')' ';'?
import = 'import' 'ident' '=' 'ident' ('.' 'ident')* ';'? In the interest of making imports easier to work with (they can be incredibly frustrating), I think we should trade off correctness for simplicity here. Syntax GrammarJsNamedImportSpecifier = 'type'? JsName ('as' JsIdentifierBinding)?
JsNamedImportSpecifierList = '{' (JsImportSpecifier (',' JsImportSpecifier) ','?)? '}'
JsNamespaceImportSpecifier = '*' 'as' JsIdentifierBinding
JsDefaultImportSpecifier = JsIdentifierBinding
JsImportSource = 'string'
JsImportAssertionEntry = 'ident' ':' 'string'
JsImportAssertion = 'assert' '{' (JsImportAssertionEntry (',' JsImportAssertionEntry) ','?)? '}'
JsImportBareClause = JsImportSource JsImportAssertion?
JsImportSpecifiersClause = 'type'? JsDefaultImportSpecifier? ','? (JsNamedImportSpecifierList | JsNamespaceImportSpecifier)? 'from' JsImportSource JsImportAssertion?
TsImportRequireClause = 'ident' '=' 'require' '(' 'string' ')'
TsImportAliasClause = 'ident' '=' TsNamespace
AnyJsImportClause = JsImportBareClause | JsImportSpecifiersClause | TsImportRequireClause | TsImportAliasClause
JsImport = 'import' AnyJsImportClause ';'? There might be some issues with how I defined that |
Beta Was this translation helpful? Give feedback.
-
In working on the export stuff, I realized that we might want to have a separate export import x = require("m") This is different from |
Beta Was this translation helpful? Give feedback.
-
I haven't had much time to look into One thing I noticed is that this syntax is adding TS syntax to JS nodes. My understanding was that we are OK having unions that contain as well JS and TS syntax but that JS nodes should only contain JS children, meaning, we would need to introduce a
I think we should split the
It would resemble what we have for bindings: |
Beta Was this translation helpful? Give feedback.
-
Oh, I think that not allowing JS nodes to contain TS nodes as children would make the AST a lot less usable. We'd need to have forks of: imports, exports, variables, functions, classes, jsx, try-catch, for/for-of/in, ast, expression statements, and probably some others I'm missing. I think it's okay for JS nodes to have something like
Thoughts? |
Beta Was this translation helpful? Give feedback.
-
This is genius! I love it. And I think it solves the most problematic cases of when "illegal" identifiers bubble up as you suggested in our meeting yesterday. This would involve the following changes:
There may be more cases where syntax errors can "bubble up" but my impression is that they're less widespread and more local than what we've seen with The downside is that this is a |
Beta Was this translation helpful? Give feedback.
-
I extended your grammar a bit:
One question that popped up while I worked on this is if we should introduce a |
Beta Was this translation helpful? Give feedback.
-
ExportsBuh, exports are killing it. This is insane... One important finding is that
|
Beta Was this translation helpful? Give feedback.
-
Additional changes:
|
Beta Was this translation helpful? Give feedback.
-
Description
Reviewing the new Ungrammar AST carefully there are a number of changes I think should be made for these reasons (in this order):
There are some places where being accurate isn't always reasonable:
When it comes to naming there's a couple high-level guidances (not strict rules) that I've come up with:
Parameter
notParam
)items
notbody
)There are a couple "category" words that I've defined. In all cases, any node that falls into these categories should be swappable with any other node that falls into the same category. They are:
There's also a somewhat special category that I think needs better defining:
O(1)
access to CST nodes, so we should consider how we use it and if we should "defensively" use it.There are some other names that I've started to come up with names of that only really apply to node attributes:
%
orin
)2 % 2
), and they should have the same type (i.e.for (let x in y)
should not be named left/right).Then there is the structure of a node name:
The important parts here are:
Any
orBogus
(which I'll get to later), should come first because they describe properties of how we write ASTs (with enums and error recovery nodes) and this helps us from ever confusingTsAnyType (the enum)
withTsAnyType (the ts any type)
That all being said, here are some changes that I'd make to follow the above thinking:
JsAny*
→AnyJs*
JsUnknown*
→BogusJs*
JsRoot.program = JsModule | JsScript
*Statement
*Expression
*Type
*Literal
JsImportCallExpression
→JsImportExpression
*AssignmentTarget
→*Assignment
*Pattern
(etc) →*BindingPattern
&*AssignmentPattern
*Binding
&*Assignment
should not include patternsTs{Keyword}Type
→Ts{Keyword}KeywordType
Ts{Literal}LiteralType
→Ts{Literal}Type
TsObjectType
→TsLiteralType
(or better name)Unknown*
→Bogus*
For{In,Of}Statement.left
→For{In,Of}Statement.initializer
For{In,Of}Statement.initializer = JsVariableDeclaration | JsAnyAssignmentPattern
Breaking some of those down more specifically:
JsAny*
→AnyJs*
Using the rules about how these "meta" names should be included.
JsUnknown*
→BogusJs*
Names like
TsUnknownType
(if we ever decided to have that as an error recovery node) would be incredibly confusing if not conflicting. Initially I was just going to say we should name itUnknownTsType
but then @sebmck and I sat down and thought about how that doesn't leave exactly the right impression.We sat around with a thesaurus and tried out a bunch of different names, we ended up liking "bogus" because:
That being said, we're also worried that ESL speakers might not be familiar with that word, and we're not super tied to it, so if someone has suggestions please share them.
JsRoot.program = JsModule | JsScript
JsScript
cannot contain imports or exports,JsModule
can. This is a very important distinction in how you interpret these files and should be reflected in the AST.All statements end in
*Statement
The distinction of "declaration" gets confusing because some declarations are statements and others are not. Doubly confusing when some statement declarations are named like
VariableDeclarationStatement
.All expressions end in
*Expression
Babel uses nodes like
StringLiteral
in non-expression positions (such asimport "source"
). I don't believe we're doing that anywhere (but we shouldn't) so using*Expression
everywhere makes it clear that you should only use them in expression positions.Drop the
*Literal*
Using
JsStringLiteralExpression
feels like we're qualifying its type too much.JsStringExpression
is plenty specific. Same goes for all the others.All types end in
*Type
For consistency's sake. Encourages correct usage.
Imports/Exports are not "statements"
In the sense that imports/exports can not appear in any statement position means we can either consider them "special" statements, or something else entirely. In order to simplify our idea of
JsAnyStatement
we should consider imports/exports to be their own thing.I'm not sure if it makes sense to try and unify the different import/export types behind singular
JsImport
andJsExport
nodes, but I think we might want to try.JsImportCallExpression
→JsImportExpression
Despite the syntax,
import()
should not be considered a "call" to a function namedimport
.If there were a
JsAnyCallExpression
, it would be confusing if it included imports because they have substantially different behavior. Especially when you consider the static nature of how they are used in bundlers. Removing*Call*
from the name makes it clearer they are unrelated.JsReferenceIdentifierExpression
→JsIdentifierExpression
Identifiers in expression positions can only ever be references.
If we use
Reference
here, there are arguably other places we should add that qualifier such asJSThisReferenceExpression
orJSSuperReferenceExpression
. That feels like an unnecessary amount of qualification.Previously this was called
JsReferenceIdentifier
so that it could be used in non-expression positions. But that feels like the wrong separation.*AssignmentTarget
→*Assignment
AssignmentTarget
is a really long qualifier. Previous in Rome Classic we just usedAssignment
and that worked fine.[WIP]
Binding(Pattern)
&Assignment(Pattern)
I think it's important to distinguish when something is a "binding" or "assignment" vs when it's a "pattern". An "Object Binding" seems nonsensical. And this actually aligns with the naming of the spec.
AnyJsBinding
AnyJsAssignment
id
JsIdentifierBinding
JsIdentifierAssignment
expr.name
JsStaticMemberAssignment
expr[expr]
JsComputedMemberAssignment
target=expr
JsBindingDefault
JsAssignmentDefault
AnyJsBindingPattern
AnyJsAssignmentPattern
{}
JsObjectBindingPattern
JsObjectAssignmentPattern
AnyJsObjectBindingPatternMember
AnyJsObjectAssignmentPatternMember
{name|[expr]:id}
JsObjectBindingPatternProperty
JsObjectAssignmentPatternProperty
{id}
JsObjectBindingPatternShorthandProperty
JsObjectAssignmentPatternShorthandProperty
{...id}
JsObjectBindingPatternRest
JsObjectAssignmentPatternRest
[]
JsArrayBindingPattern
JsArrayAssignmentPattern
AnyJsArrayBindingPatternElement
AnyJsArrayAssignmentPatternElement
[...target]
JsArrayBindingPatternRestElement
JsArrayAssignmentPatternRestElement
Ts{Keyword}Type
→Ts{Keyword}KeywordType
&Ts{Literal}LiteralType
→Ts{Literal}Type
This feels like a better alignment:
So I think we should use
*KeywordType
:This also happens to be how TypeScript names these "keywords".
TsObjectType
→TsLiteralType
In TypeScript
type t = {}
doesn't really mean "an object", it describes the shape of a value, and can describe types such as functions. This can be confusing to a lot of TypeScript users, and developers working with Rome's AST might be doubly confused if it's called an "object".TypeScript and Babel both call this node a
TypeLiteral
, which flipping it to follow our "All types end in*Type
" rule, isTsLiteralType
.Calling this a "literal" is a bit odd though, it doesn't really meet any definition for "literal" that I can come up with. So I'm very open to other ideas.
I have a couple other things I'm thinking about, but I wanted to share what I had so far.
Beta Was this translation helpful? Give feedback.
All reactions