WIP: dump asts #1041

Open
wants to merge 24 commits into
base: master
Changes from 18 commits
1 change: 1 addition & 0 deletions tools/bin/main.ml
@@ -31,6 +31,7 @@ let version = Version.version

let main () =
match Sys.argv |> Array.to_list |> List.tl with
| ["dump"; file] -> Tools.dump file |> logAndExit
| "doc" :: rest -> (
match rest with
| ["-h"] | ["--help"] -> logAndExit (Ok docHelp)
13 changes: 13 additions & 0 deletions tools/npm/Tools_Docgen.res
@@ -17,10 +17,21 @@ type constructor = {
payload?: constructorPayload,
}

type rec typeInSignature = {
path: string,
genericTypeParameters: array<typeInSignature>,
}

type signatureDetais = {
parameters: array<typeInSignature>,
returnType: typeInSignature,
}

@tag("kind")
type detail =
| @as("record") Record({items: array<field>})
| @as("variant") Variant({items: array<constructor>})
| @as("alias") Signature(signatureDetais)

type source = {
filepath: string,
@@ -38,6 +49,8 @@ type rec item =
name: string,
deprecated?: string,
source: source,
/** Additional documentation of signature, if available. */
detail?: detail,
})
| @as("type")
Type({
14 changes: 14 additions & 0 deletions tools/npm/Tools_Docgen.resi
@@ -16,10 +16,22 @@ type constructor = {
deprecated?: string,
payload?: constructorPayload,
}

type rec typeInSignature = {
path: string,
genericTypeParameters: array<typeInSignature>,
}

type signatureDetais = {
parameters: array<typeInSignature>,
returnType: typeInSignature,
}

@tag("kind")
type detail =
| @as("record") Record({items: array<field>})
| @as("variant") Variant({items: array<constructor>})
| @as("alias") Signature(signatureDetais)

type source = {
filepath: string,
@@ -37,6 +49,8 @@ type rec item =
name: string,
deprecated?: string,
source: source,
/** Additional documentation of signature, if available. */
detail?: detail,
})
| @as("type")
Type({
271 changes: 271 additions & 0 deletions tools/src/prettier_printer.ml
@@ -0,0 +1,271 @@
module DSL = struct
type namedField = {name: string; value: oak}

and oak =
| Application of string * oak
| Record of namedField list
| Ident of string
| Tuple of namedField list
| List of oak list
| String of string
Comment on lines +4 to +10
Collaborator

Just to make sure I understand the justification of introducing another representation - the intention here is to have a generic pretty printer that we can use for debugging, correct? Where tast is the first thing we've made a debug printer for, but we can extend it as needed.

Provided the answer to the above is "yes", here's another important question:

  • We usually deal with loc:s and cursor position in the editor tooling. It's important to know whether the loc of something is a) a regular loc b) a ghost loc c) an empty/broken loc. It's also important to be able to mark in pretty printing whether the cursor is inside of the item's loc. Would it be difficult to extend this pretty printer to handle loc:s?

Even if we don't use it right now, I'd like to see a PoC that it's doable before we commit to a specific pretty printing DSL.

Contributor Author

> Just to make sure I understand the justification of introducing another representation - the intention here is to have a generic pretty printer that we can use for debugging, correct? Where tast is the first thing we've made a debug printer for, but we can extend it as needed.

Yes, I wanted to pattern match in tools.ml but wasn't sure what shape I was aiming for. So, I created a new representation and wrote a printer for it. In the future, this could be reused for the untyped tree.

I believe the loc aspect can be captured as you described. It would be helpful to have a concrete example to ensure I fully understand what you would like to see.

Collaborator

Sure! Have a look through this file: https://github.com/rescript-lang/rescript-vscode/blob/master/analysis/src/DumpAst.ml

That prints (parts of) the parsetree, and marks structures with whether they hold the cursor or not (or if the loc is broken).

Contributor Author

Okay, in the tree traversal you would like to pass the cursor position and have that print something special if a node contains the cursor?

Collaborator

Exactly, or if the loc is broken. Reason is that both of those are very important when working with things like hovers, autocomplete, etc. I don't need to see the actual locs I think, just if the cursor is in there and/or if the loc is broken. For actual locs, it's easy enough to print via bsc.

Contributor Author

A couple of questions there:

I'm not sure I understand what a broken loc is. What does that mean?

> I don't need to see the actual locs

I'm a little surprised by this. Can you elaborate? Or is it more of a practical thing?

A while ago you told me that bsc -dtypedtree dumps the typed tree; however, inside a real project you need to pass in the dependencies and other flags. Is there an easy way to find out what other arguments would be required? Like what is bunx rescript eventually gonna pass down?

Collaborator

  1. Broken loc: A broken loc is what the parser inserts when it produces a syntax error that doesn't have a clear loc of its own. Remember that the parser and editor tooling do most of their work on broken rather than complete sources. So, sometimes the parser inserts things like %rescript.exprHole with a "broken" loc at a place where it couldn't figure out more about what type of expression or pattern you're trying to write. These are important cues that help when doing autocomplete.
  2. Not needing the actual locs: Exactly, it's a practical thing. They are needed occasionally when debugging something intricate, but mostly what you're after when working on editor features (or debugging existing features) is figuring out where the cursor is located, not what those exact locs are.
  3. When you just care about what the editor sees/works on, you don't need to know much about what bsc would run when compiling; those flags only matter for compilation. If you want to see precisely what the editor sees via the parser, this command is good enough (and it requires no knowledge of libs, PPXes etc): bsc whatever.res -dparsetree -ignore-parse-errors -only-parse -bs-no-builtin-ppx -bs-loc. This will give you just the parsetree back, just as the parser sees it (which is what the editor operates on). No PPXes run (the builtin ones are disabled too), and -bs-loc gives you the actual locs.
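
To make the request in this thread concrete, here is a rough sketch of how the DSL could carry loc information without printing the locs themselves. Everything in it is an assumption for illustration: `pos`, `loc`, `classify`, `marker`, `withLoc` and the marker strings are hypothetical names, "broken" is approximated as an empty loc (start equals end), and the `oak` type is cut down to two constructors. Real tooling would work on the compiler's location type instead.

```ocaml
(* Hypothetical, simplified positions and locations. *)
type pos = {line: int; col: int}
type loc = {loc_start: pos; loc_end: pos; ghost: bool}

type locKind = Regular | Ghost | Broken

(* Assumption: a loc whose start and end coincide counts as empty/broken. *)
let classify loc =
  if loc.ghost then Ghost
  else if loc.loc_start = loc.loc_end then Broken
  else Regular

(* Structural compare orders positions by line first, then by column. *)
let hasCursor ~cursor loc =
  compare loc.loc_start cursor <= 0 && compare cursor loc.loc_end <= 0

(* The marker strings are made up for this sketch. *)
let marker ~cursor loc =
  match classify loc with
  | Ghost -> "__ghost__"
  | Broken -> "__broken__"
  | Regular -> if hasCursor ~cursor loc then "<*>" else ""

(* Same constructor names as the DSL in the diff, reduced to two cases. *)
type oak = Application of string * oak | Ident of string

(* Wrap a node in a marker application only when there is something to report. *)
let withLoc ~cursor loc (oak : oak) : oak =
  match marker ~cursor loc with
  | "" -> oak
  | m -> Application (m, oak)
```

Because the marker is just another `Application`, the printer itself would not need to change; a node holding the cursor would print as `<*>(...)` and a hole with an empty loc as `__broken__(...)`.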

end

(** Transform the Oak types to string *)
module CodePrinter = struct
open DSL

(**
The idea is that we capture events in a context type.
Doing this allows us to reason about the current state of the writer
and whether the next expression fits on the current line or not.
*)

type writerEvents =
| Write of string
| WriteLine
| IndentBy of int
| UnindentBy of int

type writerMode = Standard | TrySingleLine | ConfirmedMultiline

(* Type representing the writer context during code printing

- [indent_size] is the configured indentation size, typically 2
- [max_line_length] is the maximum line length before we break the line
- [current_indent] is the current indentation size
- [current_line_column] is the number of characters written on the current line
- [line_count] is the number of lines written
- [events] is the write events in reverse order, head event is last written
- [mode] is the current writer mode (Standard, TrySingleLine or ConfirmedMultiline)
*)
type context = {
indent_size: int;
max_line_length: int;
current_indent: int;
current_line_column: int;
line_count: int;
events: writerEvents list;
mode: writerMode;
}

type appendEvents = context -> context

let emptyContext =
{
indent_size = 2;
max_line_length = 120;
current_indent = 0;
current_line_column = 0;
line_count = 0;
events = [];
mode = Standard;
}

(** Fold all the events in context into text.
Events are stored newest first, so [List.fold_right] replays them in the
order they were written. *)
let dump (ctx : context) =
let buf = Buffer.create 1024 in
let addSpaces n = Buffer.add_string buf (String.make n ' ') in

List.fold_right
(fun event current_indent ->
match event with
| Write str ->
Buffer.add_string buf str;
current_indent
| WriteLine ->
Buffer.add_char buf '\n';
addSpaces current_indent;
current_indent
| IndentBy n -> current_indent + n
| UnindentBy n -> current_indent - n)
ctx.events ctx.current_indent
|> ignore;
Buffer.contents buf

let debug_context (ctx : context) =
let mode =
match ctx.mode with
| Standard -> "Standard"
| TrySingleLine -> "TrySingleLine"
| ConfirmedMultiline -> "ConfirmedMultiline"
in
Format.printf
"Current indent: %d, Current column: %d, # Lines: %d Events: %d, Mode: %s\n"
ctx.current_indent ctx.current_line_column ctx.line_count
(List.length ctx.events) mode;
ctx

let updateMode (newlineWasAdded : bool) (ctx : context) =
match ctx.mode with
| Standard -> ctx
| ConfirmedMultiline -> ctx
| TrySingleLine ->
{
ctx with
mode =
(if newlineWasAdded || ctx.current_line_column > ctx.max_line_length
then ConfirmedMultiline
else TrySingleLine);
}

let id x = x

(** add a write event to the context *)
let ( !- ) str ctx =
{
ctx with
events = Write str :: ctx.events;
current_line_column = ctx.current_line_column + String.length str;
}
|> updateMode false

(** compose two context transforming functions; [g] is skipped once [f]
leaves the writer in ConfirmedMultiline mode *)
let ( +> ) f g ctx =
let fCtx = f ctx in
match fCtx.mode with
| ConfirmedMultiline -> fCtx
| _ -> g fCtx

let sepNln ctx =
{
ctx with
events = WriteLine :: ctx.events;
current_line_column = ctx.current_indent;
line_count = ctx.line_count + 1;
}
|> updateMode true

let sepSpace ctx = !-" " ctx
let sepComma ctx = !-", " ctx
let sepSemi ctx = !-"; " ctx
let sepOpenT ctx = !-"(" ctx
let sepCloseT ctx = !-")" ctx
let sepOpenR ctx = !-"{" ctx
let sepCloseR ctx = !-"}" ctx
let sepOpenL ctx = !-"[" ctx
let sepCloseL ctx = !-"]" ctx
let sepEq ctx = !-" = " ctx
let wrapInParentheses f = sepOpenT +> f +> sepCloseT
let indent ctx =
let nextIndent = ctx.current_indent + ctx.indent_size in
{
ctx with
current_indent = nextIndent;
current_line_column = nextIndent;
events = IndentBy ctx.indent_size :: ctx.events;
}
let unindent ctx =
let nextIndent = ctx.current_indent - ctx.indent_size in
{
ctx with
current_indent = nextIndent;
current_line_column = nextIndent;
events = UnindentBy ctx.indent_size :: ctx.events;
}

let indentAndNln f = indent +> sepNln +> f +> unindent

let col (f : 't -> appendEvents) (intertwine : appendEvents) items ctx =
let rec visit items ctx =
match items with
| [] -> ctx
| [item] -> f item ctx
| item :: rest ->
let ctx' = (f item +> intertwine) ctx in
visit rest ctx'
in
visit items ctx

let expressionFitsOnRestOfLine (f : appendEvents) (fallback : appendEvents)
(ctx : context) =
match ctx.mode with
| ConfirmedMultiline -> ctx
| _ -> (
let shortCtx =
match ctx.mode with
| Standard -> {ctx with mode = TrySingleLine}
| _ -> ctx
in
let resultCtx = f shortCtx in
match resultCtx.mode with
| ConfirmedMultiline -> fallback ctx
| TrySingleLine -> {resultCtx with mode = ctx.mode}
| Standard ->
failwith "Unexpected Standard mode after trying SingleLine mode")

let rec genOak (oak : oak) : appendEvents =
match oak with
| Application (name, argument) -> genApplication name argument
| Record record -> genRecord record
| Ident ident -> genIdent ident
| String str -> !-(Format.sprintf "\"%s\"" str)
| Tuple ts -> genTuple ts
| List xs -> genList xs

and genApplication (name : string) (argument : oak) : appendEvents =
let short = !-name +> sepOpenT +> genOak argument +> sepCloseT in
let long =
!-name +> sepOpenT
+> (match argument with
| List _ | Record _ -> genOak argument
| _ -> indentAndNln (genOak argument) +> sepNln)
+> sepCloseT
in
expressionFitsOnRestOfLine short long

and genRecord (recordFields : namedField list) : appendEvents =
let short =
match recordFields with
| [] -> sepOpenR +> sepCloseR
| fields ->
sepOpenR +> sepSpace
+> col genNamedField sepSemi fields
+> sepSpace +> sepCloseR
in
let long =
sepOpenR
+> indentAndNln (col genNamedField sepNln recordFields)
+> sepNln +> sepCloseR
in
expressionFitsOnRestOfLine short long

and genTuple (oaks : namedField list) : appendEvents =
let short = col genNamedField sepComma oaks in
let long = col genNamedField sepNln oaks in
expressionFitsOnRestOfLine short long

and genIdent (ident : string) : appendEvents = !-ident

and genNamedField (field : namedField) : appendEvents =
let genValue =
match field.value with
| Tuple _ -> sepOpenT +> genOak field.value +> sepCloseT
| _ -> genOak field.value
in
let short = !-(field.name) +> sepEq +> genValue in
let long =
!-(field.name) +> sepEq
+>
match field.value with
| List _ | Record _ -> genOak field.value
| _ -> indentAndNln genValue
in
expressionFitsOnRestOfLine short long

and genList (items : oak list) : appendEvents =
let genItem = function
| Tuple _ as item -> wrapInParentheses (genOak item)
| item -> genOak item
in
let short =
match items with
| [] -> sepOpenL +> sepCloseL
| _ ->
sepOpenL +> sepSpace +> col genItem sepSemi items +> sepSpace
+> sepCloseL
in
let long =
sepOpenL +> indentAndNln (col genItem sepNln items) +> sepNln +> sepCloseL
in
expressionFitsOnRestOfLine short long
end
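
The core idea of the printer above, trying the short layout first and falling back to the long one if it does not fit, can be shown in isolation. The following is a stripped-down sketch rather than the PR's code: `ctx`, `start`, `settle`, `write`, `nln`, `fits`, `render` and `list_ab` are hypothetical names, indentation tracking is left out, and output is kept as a plain list of strings instead of writer events.

```ocaml
type mode = Standard | TrySingleLine | ConfirmedMultiline

(* Hypothetical, reduced context: output chunks are kept newest first. *)
type ctx = {column: int; max_width: int; chunks: string list; mode: mode}

let start max_width = {column = 0; max_width; chunks = []; mode = Standard}

(* While trying a single line, a newline or an overlong line settles the question. *)
let settle ~newline ctx =
  match ctx.mode with
  | TrySingleLine when newline || ctx.column > ctx.max_width ->
    {ctx with mode = ConfirmedMultiline}
  | _ -> ctx

let write s ctx =
  {ctx with chunks = s :: ctx.chunks; column = ctx.column + String.length s}
  |> settle ~newline:false

let nln ctx =
  {ctx with chunks = "\n" :: ctx.chunks; column = 0} |> settle ~newline:true

(* Composition stops early once the single-line attempt has failed. *)
let ( +> ) f g ctx =
  let ctx' = f ctx in
  match ctx'.mode with
  | ConfirmedMultiline -> ctx'
  | _ -> g ctx'

(* Run [short] as a trial; if it fails, discard it and run [long] from [ctx]. *)
let fits short long ctx =
  match ctx.mode with
  | ConfirmedMultiline -> ctx
  | Standard | TrySingleLine -> (
    let trial = short {ctx with mode = TrySingleLine} in
    match trial.mode with
    | ConfirmedMultiline -> long ctx
    | _ -> {trial with mode = ctx.mode})

let render ctx = String.concat "" (List.rev ctx.chunks)

(* A two-element list with a short and a long layout. *)
let list_ab =
  fits
    (write "[ " +> write "a; b" +> write " ]")
    (write "[" +> nln +> write "  a" +> nln +> write "  b" +> nln +> write "]")
```

With a generous width, `render (list_ab (start 80))` gives `[ a; b ]`; with `start 4` the trial overflows after the second chunk, the trial context is thrown away, and the long layout is written from the original context. Since contexts are immutable values, abandoning the trial costs nothing beyond the work already done.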