- Initialized/called by js2-parse
- Only one active scanner/token stream per buffer
- The lexer (aka scanner in the docs) runs JIT (on demand), just ahead of the parser itself
- State is managed using buffer-local variables. The main variables that the scanner/parsers use are:
  - js2-ti-tokens: Vector of recently lexed (not necessarily parsed) tokens
  - js2-ti-tokens-cursor: Pointer into js2-ti-tokens with the "current" token
  - js2-ti-lookahead: Counter recording how many tokens have been lexed but not parsed
- js2-get-token is used to advance the current token cursor, which automatically pulls in more tokens if js2-ti-tokens is out
  - js2-get-token is the function called by most of the token-dealing code. It looks in the buffer, and dispatches to js2-get-token-internal if it needs to read a new token
  - js2-get-token-internal calls the scanner (using js2-get-token-internal-1), processes comments, and fontifies errors and keywords
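The interplay between the token vector, the cursor, and the lookahead counter can be sketched as a small ring buffer. This is an illustrative Python model, not js2-mode's Emacs Lisp: the class name, the method names, the buffer size, and the "EOF" sentinel are all assumptions made for the example.

```python
class TokenStream:
    """Ring buffer of recent tokens with a cursor and a lookahead counter."""

    def __init__(self, scanner, size=4):
        self._scanner = scanner        # iterator that lexes tokens on demand
        self._tokens = [None] * size   # plays the role of js2-ti-tokens
        self._cursor = 0               # plays the role of js2-ti-tokens-cursor
        self._lookahead = 0            # plays the role of js2-ti-lookahead

    def get_token(self):
        """Advance the cursor, lexing a new token only when none is buffered."""
        self._cursor = (self._cursor + 1) % len(self._tokens)
        if self._lookahead > 0:
            self._lookahead -= 1       # token was lexed earlier, just consume it
        else:
            self._tokens[self._cursor] = next(self._scanner, "EOF")
        return self._tokens[self._cursor]

    def unget_token(self):
        """Step back one token; it stays in the buffer as lookahead."""
        assert self._lookahead < len(self._tokens) - 1, "too much lookahead"
        self._cursor = (self._cursor - 1) % len(self._tokens)
        self._lookahead += 1

    def peek_token(self):
        """Look at the next token without consuming it."""
        tok = self.get_token()
        self.unget_token()
        return tok
```

With `ts = TokenStream(iter("var x = 1".split()))`, a call to `ts.peek_token()` lexes one token ahead, and the following `ts.get_token()` returns that same token from the buffer without touching the scanner again.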
js2-parse:
- initializes (buffer-local) state variables
- calls js2-do-parse
- does error reporting
- runs post-parse hooks
- returns the AST from js2-do-parse
js2-do-parse creates the AST root instance and runs a while loop that parses all of the top-level blocks. Functions are parsed using js2-parse-function-stmt (called after consuming the "function" keyword) and other statements are parsed using js2-parse-statement.
There's also the "use strict" handling here and some comment handling. At the end, post-parse-callbacks are run and undeclared-vars are highlighted.
Local variables:
- root: (make-js2-ast-root :buffer (buffer-name) :pos pos)
- tt: (js2-get-token)
- n: One of (js2-parse-function-expr) or (js2-parse-function-stmt)
js2-parse-function-stmt:
- Checks for a generator function (star-p)
- Ensures the function is not unnamed: (js2-must-match-name "msg.unnamed.function.stmt")
- Parses out the Rhino-specific "member expression", if any
- Parses out the left paren
- Calls (js2-parse-function 'FUNCTION_STATEMENT pos star-p async-p name) and returns its value
js2-parse-function calls js2-parse-function-internal and handles re-parsing of the function if strict directives are found.
js2-parse-function-internal:
- Binds its name (if it's a statement) in the parent scope
- Creates a new scope (incl. handling dynamic scope)
- Parses function params using js2-parse-function-params
- Checks for arrow syntax
- Dispatches body parsing based on whether there are curly braces (the braceless form is a deprecated SpiderMonkey feature); mostly uses (js2-parse-function-body fn-node)
- Checks for inconsistent returns
- Binds its name (maybe) in its own scope
js2-parse-function-body:
- Creates a new block node
- Pushes statement nodes (js2-parse-statement) onto the block node
- Checks for a "use strict" directive and throws 'reparse t if found
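The throw-and-reparse handshake between the body parser and its caller can be modeled roughly as follows. This is a simplified Python sketch under stated assumptions: statements are plain strings, a Python exception stands in for the Emacs Lisp throw of 'reparse, and every name here (Reparse, parse_function_body, parse_function) is hypothetical rather than part of js2-mode.

```python
class Reparse(Exception):
    """Raised when a "use strict" directive is found in non-strict mode."""


def is_string_literal(stmt):
    return len(stmt) >= 2 and stmt[0] == stmt[-1] and stmt[0] in "'\""


def parse_function_body(statements, strict):
    """Collect statements into a block, bailing out on a newly found directive."""
    block = []
    in_prologue = True  # directives only count before the first real statement
    for stmt in statements:
        if in_prologue and is_string_literal(stmt):
            if stmt[1:-1] == "use strict" and not strict:
                raise Reparse()
        else:
            in_prologue = False
        block.append(stmt)
    return {"strict": strict, "body": block}


def parse_function(statements):
    """Parse once; if the body asks for strict mode, parse it again strictly."""
    try:
        return parse_function_body(statements, strict=False)
    except Reparse:
        return parse_function_body(statements, strict=True)
```

The point of the design is that strictness is only known after part of the body has been read, so the simplest correct response is to discard the partial result and parse the function a second time with the strict flag already set.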
js2-parse-statement:
- Checks for user input to interrupt
- Checks for no-side-effects code
- Parses statements using js2-statement-helper
js2-statement-helper:
- Dynamic dispatch based on token type using (aref js2-parsers tt)
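Indexing a table of parser functions by token type, as (aref js2-parsers tt) does, can be illustrated with a short Python sketch. The token codes, the parser functions, and the fallback to an expression-statement parser are assumptions made for this example, not js2-mode's actual table contents.

```python
# Hypothetical token type codes; js2-mode defines its own numeric constants.
TT_IF, TT_RETURN, TT_VAR, TT_NAME = range(4)


def parse_if(rest):
    return ("if-stmt", rest)


def parse_return(rest):
    return ("return-stmt", rest)


def parse_var(rest):
    return ("var-stmt", rest)


def parse_expr_stmt(rest):
    return ("expr-stmt", rest)


# One slot per token type. Unlisted types fall back to the
# expression-statement parser (an assumption of this sketch).
PARSERS = [parse_expr_stmt] * 4
PARSERS[TT_IF] = parse_if
PARSERS[TT_RETURN] = parse_return
PARSERS[TT_VAR] = parse_var


def statement_helper(tt, rest):
    """Pick the statement parser by indexing the table with the token type."""
    return PARSERS[tt](rest)
```

A table lookup keeps the dispatch constant-time and makes adding a statement type a one-line change, compared with a long chain of conditionals.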
Variable declaration statements:
- Parses variable declarations and wraps them in a statement node using make-js2-expr-stmt-node
- Calls js2-parse-variables for the actual parsing
js2-parse-variables:
- Handles destructuring, name assignment in scope, declaration checks
- Uses js2-parse-destruct-primary-expr for destructuring assignments
- Uses js2-parse-assign-expr for the assignment expressions (i.e., the RHS)
js2-parse-assign-expr:
- For yield expressions, calls js2-parse-return-or-yield
- For assignment (incl. augmented), calls itself recursively and ensures proper identifiers
- For arrow functions, calls js2-parse-function (detecting whether it's async or not)
- Everything else is handled through js2-parse-cond-expr
js2-parse-cond-expr uses js2-parse-or-expr to parse out one expression, then checks for the "? __ : __" pattern to create a cond-node.
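From js2-parse-or-expr downwards this is a classic recursive descent parser, where each level calls the next. In order from loosest-binding to tightest-binding: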
- OR
- AND
- BIT OR
- BIT XOR
- BIT AND
- EQ [EQ, NE, SHEQ, SHNE]
- REL [IN, INSTANCEOF, LE, LT, GE, GT]
- SHIFT [LSH, URSH, RSH]
- ADD/SUB
- MUL/DIV/MOD
- EXPON [incl. fix for right-associativity]
- UNARY [VOID, NOT, BITNOT, TYPEOF, POS, NEG, INC, DEC, DELPROP]
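A cut-down version of such a precedence chain, with the exponent level recursing into itself to get right-associativity, might look like the Python sketch below. It covers only a few of the levels listed above, works on a pre-split token list, and builds nested tuples instead of AST nodes; all function names are hypothetical and none of this is js2-mode code.

```python
def parse_expression(tokens):
    """Parse a token list into nested tuples, e.g. ("+", 1, 2)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def left_assoc(operators, parse_operand):
        """Build the parser for one left-associative precedence level."""
        def parse_level():
            node = parse_operand()
            while peek() in operators:
                op = advance()
                node = (op, node, parse_operand())
            return node
        return parse_level

    def parse_unary():
        if peek() == "-":
            advance()
            return ("neg", parse_unary())
        return advance()  # leaf: a number or a name

    def parse_expon():
        base = parse_unary()
        if peek() == "**":
            advance()
            # Recurse into the same level so that ** groups to the right.
            return ("**", base, parse_expon())
        return base

    # Each level is built on the next tighter-binding one.
    parse_mul = left_assoc({"*", "/", "%"}, parse_expon)
    parse_add = left_assoc({"+", "-"}, parse_mul)
    parse_and = left_assoc({"&&"}, parse_add)
    parse_or = left_assoc({"||"}, parse_and)

    tree = parse_or()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

The left-associative levels loop and fold to the left, so `1 - 2 - 3` groups as `(1 - 2) - 3`, while the exponent level recurses on its right operand, so `2 ** 3 ** 2` groups as `2 ** (3 ** 2)`.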
Unary is a more complicated parser: it checks for valid increment/decrement sequences, that delprop ("delete a.b") is called on a valid property, etc. It also dispatches parsing to (js2-parse-member-expr-tail t (js2-parse-xml-initializer)) if it finds an XML snippet, and otherwise finally dispatches to (js2-parse-member-expr t).
js2-parse-member-expr:
- Handles calls using "new" directly, or dispatches to js2-parse-primary-expr
js2-parse-primary-expr parses a literal (leaf) expression of some sort:
- class
- function/async function
- Array literals or comprehensions
- Object literal
- Let expression
- Parenthesized expression or generator comprehension
- XMLATTR [E4X attribute expression using @ sign, e.g., @attr]
- names
- numbers
- strings
- template literals
- regexp literal
- null, this, super, false, true
- tripledot (only valid as a rest param of an arrow function)
- reserved keyword
- EOF
- scanner error
js2-parse-xml-initializer parses a literal XML fragment. The tokenizer changes here to js2-get-next-xml-token, which emits the following token types:
- js2-ERROR
- js2-XML: Generic XML fragment, including CDATA, comments, doctype. Stops at any bare {
- js2-XMLEND: Found the last close-tag
The tokenizer tracks significantly more state than it emits (in-tag, in-attr, num-open-tags, etc.)
After every js2-XML token returned by js2-get-next-xml-token, the fragment is made into a string (not fontified as such), and js2-parse-expr (with the normal tokenizer) is used to parse the expression in curly braces. The expression is then wrapped in a js2-xml-js-expr-node.
Hypothesis: Extend JS2-JSX mode and advise js2-parse-xml-initializer to call our own XML parser when in said mode. Re-use as much of the js2 parser as possible (e.g., js2-parse-member-expr) and use the grammar at https://facebook.github.io/jsx/ to coordinate between them.