docs: various updates and fixes (#519)
Showing 12 changed files with 242 additions and 34 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,153 @@ | ||
--- | ||
title: Architecture (WIP) | ||
description: How Biome works. | ||
title: Architecture | ||
description: How Biome works under the hood. | ||
--- | ||
|
||
Biome uses a server-client architecture to run its tasks. | ||
This document covers some of the internals of Biome, and how they are used inside the project. | ||
|
||
## Parser and CST | ||
|
||
The architecture of the parser is powered by an internal fork of [rowan], a library | |
that implements the [Green and Red tree] pattern. | ||
|
||
The CST (Concrete Syntax Tree) is a data structure very similar to an AST (Abstract Syntax Tree) that keeps track of all the information in a program, trivia included. | |
|
||
**Trivia** are all the pieces of information that are not important for a program to run: | |
- spaces | ||
- tabs | ||
- comments | ||
|
||
Trivia are attached to a node. A node can have leading trivia and trailing trivia. If you read code from left to right, leading trivia appear before a keyword, and trailing trivia appear after a keyword. | |
|
||
Leading trivia and trailing trivia are categorized as follows: | ||
- All trivia up to the token/keyword (including line breaks) are the **leading trivia**; | |
- Everything after the token/keyword until the next line break (but not including it) is the **trailing trivia**. | |
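
The two rules above can be sketched as a small function. This is only an illustration under stated assumptions, not Biome's implementation: `splitTrivia` and the `{ kind, text }` piece shape are hypothetical names. Given the trivia pieces found between two tokens, everything before the first newline becomes trailing trivia of the previous token, and the rest (starting with that newline) becomes leading trivia of the next token.

```js
// Hypothetical sketch: `splitTrivia` and the `{ kind, text }` piece shape are
// illustrative assumptions, not part of Biome's API.
function splitTrivia(pieces) {
  // The first newline ends the trailing trivia of the previous token.
  const firstNewline = pieces.findIndex((piece) => piece.kind === "Newline");
  const cut = firstNewline === -1 ? pieces.length : firstNewline;
  return {
    trailingOfPrevious: pieces.slice(0, cut),
    leadingOfNext: pieces.slice(cut),
  };
}

// Trivia found between a `;` and the next `const` keyword, where the `;` is
// followed by a comment on the same line and another comment on the next line.
const between = [
  { kind: "Whitespace", text: " " },
  { kind: "Comments", text: "// comment 1" },
  { kind: "Newline", text: "\n" },
  { kind: "Comments", text: "// comment 2" },
  { kind: "Newline", text: "\n" },
];

const result = splitTrivia(between);
console.log(result.trailingOfPrevious.map((piece) => piece.kind));
console.log(result.leadingOfNext.map((piece) => piece.kind));
```

When no newline is present, as with the single space after a keyword, every piece stays with the previous token as trailing trivia.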
|
||
Given the following JavaScript snippet, `// comment 1` is a trailing trivia of the token `;`, and `// comment 2` is a leading trivia of the keyword `const`. Below is a minimized version of the CST that Biome produces: | |
|
||
```js | ||
const a = "foo"; // comment 1 | ||
// comment 2 | ||
const b = "bar"; | ||
``` | ||
|
||
``` | ||
0: [email protected] | ||
... | ||
1: [email protected] ";" [] [Whitespace(" "), Comments("// comment 1")] | ||
1: [email protected] | ||
... | ||
1: [email protected] "const" [Newline("\n"), Comments("// comment 2"), Newline("\n")] [Whitespace(" ")] | ||
3: [email protected] "" [] [] | ||
``` | ||
|
||
By design, the CST is never directly accessible. A developer reads its information through the Red tree, using a number of APIs that are autogenerated from the grammar of the language. | |
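
As a rough illustration of the Green and Red tree pattern, the sketch below assumes that green elements are immutable and only know their kind and text length, while red nodes are facades created on demand that add a parent pointer and an absolute offset. All names (`greenToken`, `greenNode`, `RedNode`, the `BLOCK` kind) are hypothetical and do not come from Biome or rowan.

```js
// Hypothetical sketch of the Green and Red tree idea; none of these names
// come from Biome or rowan.

// Green elements are immutable and position-independent: they know their
// kind and text length, but not their parent or their offset in the file.
function greenToken(kind, text) {
  return Object.freeze({ kind, text, length: text.length, children: [] });
}

function greenNode(kind, children) {
  const length = children.reduce((sum, child) => sum + child.length, 0);
  return Object.freeze({ kind, length, children: Object.freeze(children) });
}

// Red nodes are thin facades built on demand while walking the tree. They add
// the parent pointer and the absolute offset that green elements lack.
class RedNode {
  constructor(green, parent, offset) {
    this.green = green;
    this.parent = parent;
    this.offset = offset;
  }

  get kind() {
    return this.green.kind;
  }

  // Text range as [start, end).
  get range() {
    return [this.offset, this.offset + this.green.length];
  }

  children() {
    let offset = this.offset;
    return this.green.children.map((child) => {
      const red = new RedNode(child, this, offset);
      offset += child.length;
      return red;
    });
  }
}

const root = new RedNode(
  greenNode("BLOCK", [greenToken("L_CURLY", "{"), greenToken("R_CURLY", "}")]),
  null,
  0,
);

for (const child of root.children()) {
  console.log(child.kind, child.range, child.parent.kind);
}
```

Because green elements carry no positions, the same green subtree could be shared between trees; positions are computed only when a red node is requested.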
|
||
|
||
#### Resilient and recoverable parser | ||
|
||
In order to construct a CST, a parser needs to be error resilient and recoverable: | |
- resilient: a parser that is able to resume parsing after encountering syntax errors that belong to the language; | |
- recoverable: a parser that is able to **understand** where an error occurred, and to resume parsing by creating **correct** information. | |
|
||
The recoverable part of the parser is not an exact science, and there aren't any rules set in stone. This means that, depending on what the parser was parsing and where an error occurred, the parser might be able to recover in an expected way. | |
|
||
To protect the consumers from consuming incorrect syntax, the parser also uses `Bogus` nodes. These nodes are used to decorate the broken code caused by a syntax error. | |
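
The idea of missing slots and bogus nodes can be sketched with a toy parser. This is a simplified illustration, not Biome's recovery logic: `parseWhile`, the string-based tokens, and the node shapes are all hypothetical. An expected token that is absent is recorded as missing without consuming input, and any leftover tokens that cannot be understood are wrapped in a bogus node.

```js
// Hypothetical sketch: a toy recovery strategy, not Biome's parser.
// Tokens are plain strings; it only handles `while (<token>) {}`.
function parseWhile(tokens) {
  let pos = 0;
  const errors = [];

  // Consume `expected` if it is next; otherwise record an error and mark the
  // slot as missing without consuming anything.
  function expect(expected) {
    if (tokens[pos] === expected) {
      pos += 1;
      return expected;
    }
    const found = tokens[pos] ?? "EOF";
    errors.push(`expected \`${expected}\` but instead found \`${found}\``);
    return "missing (required)";
  }

  const node = { kind: "WhileStatement" };
  node.whileToken = expect("while");
  node.lParen = expect("(");
  // The condition is any single token that cannot close the parentheses or
  // start the body.
  if (pos < tokens.length && tokens[pos] !== ")" && tokens[pos] !== "{") {
    node.test = tokens[pos];
    pos += 1;
  } else {
    errors.push("expected a condition");
    node.test = "missing (required)";
  }
  node.rParen = expect(")");
  node.lCurly = expect("{");
  node.rCurly = expect("}");

  // Anything left over could not be understood: wrap it in a bogus node so
  // that consumers know not to trust it.
  const items = [node];
  if (pos < tokens.length) {
    items.push({ kind: "BogusStatement", items: tokens.slice(pos) });
  }
  return { items, errors };
}

console.log(JSON.stringify(parseWhile(["while", "{", "}"]), null, 2));
console.log(JSON.stringify(parseWhile(["while", "(", "x", ")", "{", "}", "}"]), null, 2));
```

In the first call the parentheses and the condition end up marked as missing while the body is still parsed; in the second call the stray `}` ends up inside the bogus node.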
|
||
In the following example, the parentheses of the `while` are missing, but the parser is able to recover gracefully and to represent the code with a decent CST. The parentheses and the condition of the loop are marked as missing, and the code block is correctly parsed: | |
|
||
|
||
```js | ||
while {} | ||
``` | ||
|
||
## Daemon | ||
``` | ||
JsModule { | ||
interpreter_token: missing (optional), | ||
directives: JsDirectiveList [], | ||
items: JsModuleItemList [ | ||
JsWhileStatement { | ||
while_token: WHILE_KW@0..6 "while" [] [Whitespace(" ")], | |
l_paren_token: missing (required), | |
test: missing (required), | |
r_paren_token: missing (required), | |
body: JsBlockStatement { | |
l_curly_token: L_CURLY@6..7 "{" [] [], | |
statements: JsStatementList [], | |
r_curly_token: R_CURLY@7..8 "}" [] [], | |
}, | |
}, | |
], | |
eof_token: EOF@8..8 "" [] [], | |
} | ||
``` | ||
|
||
A [daemon](<https://en.wikipedia.org/wiki/Daemon_(computing)>) is a long-running server | ||
This is the error emitted during parsing: | |
|
||
``` | ||
main.tsx:1:7 parse ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ | ||
✖ expected `(` but instead found `{` | ||
> 1 │ while {} | ||
│ ^ | ||
ℹ Remove { | ||
``` | ||
|
||
The same can't be said for the following snippet. The parser can't properly understand the syntax during the recovery phase, so it needs to rely on the bogus nodes to mark some syntax as erroneous. Notice the `JsBogusStatement`: | ||
|
||
```js | ||
function} | ||
``` | ||
|
||
``` | ||
JsModule { | ||
interpreter_token: missing (optional), | ||
directives: JsDirectiveList [], | ||
items: JsModuleItemList [ | ||
TsDeclareFunctionDeclaration { | ||
async_token: missing (optional), | ||
function_token: FUNCTION_KW@0..8 "function" [] [], | ||
id: missing (required), | ||
type_parameters: missing (optional), | ||
parameters: missing (required), | ||
return_type_annotation: missing (optional), | ||
semicolon_token: missing (optional), | ||
}, | ||
JsBogusStatement { | ||
items: [ | ||
R_CURLY@8..9 "}" [] [], | ||
], | ||
}, | ||
], | ||
eof_token: EOF@9..9 "" [] [], | ||
} | ||
``` | ||
|
||
This is the error we get from the parsing phase: | ||
|
||
``` | ||
main.tsx:1:9 parse ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ | ||
✖ expected a name for the function in a function declaration, but found none | ||
> 1 │ function} | ||
│ ^ | ||
``` | ||
|
||
## Formatter (WIP) | ||
|
||
## Linter (WIP) | ||
|
||
## Daemon (WIP) | ||
|
||
Biome uses a server-client architecture to run its tasks. | ||
|
||
A [daemon] is a long-running server | ||
that Biome spawns in the background and uses to process requests from the editor and CLI. | ||
|
||
|
||
[rowan]: https://github.com/rust-analyzer/rowan | ||
[Green and Red tree]: https://learn.microsoft.com/en-us/archive/blogs/ericlippert/persistence-facades-and-roslyns-red-green-trees | ||
[daemon]: https://en.wikipedia.org/wiki/Daemon_(computing) |