Slang #1032

Janther · 2024-08-15T08:40:38Z

This is a proposal for a version 2 of this plugin.

It has been rewritten leveraging the power of Nomic Foundation's Slang to provide a much more scalable and better supported parser.

While we are still in beta we have disabled browser support until we can make sure we can fully provide this functionality.

The coverage has been reduced as well since slang's AST is much more detailed than solidity-parser's and while all our tests still pass, they don't provide every possible scenario.

Using with the old parser is still possible but a deprecated warning will be logged.

… in testing

Tests passing without altering the AST

# Conflicts: # package-lock.json # package.json

src/slang-nodes/index.d.ts

src/slang-printers/types.d.ts

OmarTawfik

I did an initial pass and left a couple of comments. Thanks for working on this. Exciting stuff!

README.md

…ead of naming every single class

Janther · 2024-08-22T00:20:28Z

src/slang-nodes/index.d.ts

+  | StrictAstNode
+  | Comment
+  | Identifier
+  | YulIdentifier


@OmarTawfik It would be beneficial if Identifier and YulIdentifier were in the types.
For example:

export declare class EnumDefinition { readonly cst: NonterminalNode; private readonly fetch; constructor(cst: NonterminalNode); get enumKeyword(): TerminalNode; get name(): TerminalNode; get openBrace(): TerminalNode; get members(): EnumMembers; get closeBrace(): TerminalNode; }

I care about getting the location information of name but not from enumKeyword, openBrace and closeBrace.
I went through every TerminalNode in the AST and plucked out the Identifiers and YulIdentifiers but if you make a change in the AST I have no Type security that will trigger an error on my side if I need to adapt. So I would appreciate if it would be possible to deliver something like:

export declare class EnumDefinition { readonly cst: NonterminalNode; private readonly fetch; constructor(cst: NonterminalNode); get enumKeyword(): TerminalNode; get name(): Identifier; get openBrace(): TerminalNode; get members(): EnumMembers; get closeBrace(): TerminalNode; }

Thanks for the suggestion!

but if you make a change in the AST I have no Type security that will trigger an error on my side

To make sure I understand you correctly, do you mean getNodeMetadata().offsets?
Is that because you are storing offsets as an array, and accessing it by a numeric index? Would it help if it was changed to be an object/dict with the field name as keys? for example offsets.name here won't break if the position of name was changed relative to its siblings.

It would be beneficial if Identifier and YulIdentifier were in the types.

How would that help in this case? if they were non-terminal nodes? how would this be tracked otherwise?

One other way you can use Slang API if you want, so that you don't have to do the manual tracking of offsets yourself, is to use cursors.
For example, in getNodeMetadata(), instead of iterating over ast.cst.children(), you can ast.cst.createCursor(), and then use cursor.goToNextSibling() to iterate, and cursor.textOffset() or cursor.textRange() will be updated.

🤔 using a Map with the cst as the key could really improve this aspect of the implementation.
it would even render the passing of an offset moot since the we are already passing the AST to the constructors.
I'll experiment with this and see how it behaves.

I was unable to use the Node as the key to the Map because the Node object I get from iterating over cst.children() is not the same as when I check ast.cst.

const source = ` pragma solidity ^0.8.26; `; const language = new Language('0.8.26'); const parsed = new SourceUnit( language.parse(NonterminalKind.SourceUnit, source).tree() ); parsed.cst.children()[0]; // NonterminalNode { kind: 'SourceUnitMembers', textLength: { utf8: 26, utf16: 26, line: 2, column: 0 }, type: 'Nonterminal', children: λ, createCursor: λ, toJSON: λ, unparse: λ } parsed.members.cst; // NonterminalNode { kind: 'SourceUnitMembers', textLength: { utf8: 26, utf16: 26, line: 2, column: 0 }, type: 'Nonterminal', children: λ, createCursor: λ, toJSON: λ, unparse: λ } parsed.cst.children()[0] === parsed.members.cst; // false

It probably happens because the cst node is not reused but reinstantiated either by Rust, TypeScript, or the communication between Rust and TypeScript.

at the moment I can't use names in an object/dict in the current solution since when I'm iterating over there cst.children() I have no clue about the ast object.

I made a proof of concept in a different branch and even if the tests don't pass because of the described issue, the change impacts greatly in a good way cleaning up a lot.

If you could make the cst point to the same object or maybe add an id field to Terminal and Nonterminal Nodes, let me know so I can make a PR with these changes?

By the way, the idea to have Identifier and YulIdentifier as a specific type is partly related to the offsets array but also related to having clarity to which Terminal Nodes can have comments attached to them on the Prettier side.

Thanks for pointing out. The id already exists in the Rust API, and I would be happy to add it in TypeScript. It should be a simple change.
When I'm done with WASM, I will also be able to expose the field names themselves as a strongly-typed EdgeLabel enum, so you would be able to refer to and persist the name of get name(): Identifier field as EdgeLabel.Name.

… could be removed.

…ment

…are using instead of a range

fvictorio · 2024-09-11T09:06:42Z

I didn't do an in-depth review of the PR; this is just the result of manually testing it and skimming the code for a couple of hours. If there are things you would like me to review in depth, please point them out to me, because I don't think I'll be able to do a line-by-line review of the whole PR.

Also, leaving inline comments in a PR this big is impossible (GitHub's UI sucks for big diffs), so I will just list my comments here.

Pushed commits

I added a couple of commits. The first one clears the current line when printing a warning. Without this, if you do prettier Foo.sol and get a warning, the line says Foo.sol[prettier-solidity] 'solidity-parse' has been deprecated, please use 'slang-solidity'. The Foo.sol part is a temporary write that prettier does to stdout, and the warning messes with the cleanup of that temporary warning.

The other commit just makes src/index.ts clearer, by using explicit names for antlr and slang variables.

`Comments.sol` tests

These tests are only run with the ANTLR parser, and the run-format-test.js file has a comment that says // TODO: finish Comments. Does this mean that there's something pending around comments when using the Slang parser?

And, as an aside, all the Slang-related code in run-format-test.js doesn't seem to make sense? shouldTestSlang will always return false, so all the code used in that scenario seems like dead code.

Defaulting to the latest supported version

I mentioned this already before, but defaulting to the latest Solidity version supported by slang won't work for projects with multiple versions of Solidity, which is a common thing for projects that have been around for some years.

I don't know what's the right solution here. I kind of feel that Slang should export some helper utility to let you infer the version to use for a given code (by getting the pragmas doing what solidity-analyzer does, and then returning the latest supported version that matches the pragmas). But there might be better solutions.

Other comments

There are some commented out lines in CI.yml
The options.parentParser trick used in the SourceUnit nodes is, if I remember correctly, done to avoid printing empty trailing lines in Solidity snippets within a markdown file. Can we add a comment explaining that? Otherwise it's a bit obscure.
Out of scope for the Slang refactor, but I think we should use tsx instead of ts-node.

Janther · 2024-09-17T01:27:21Z

I didn't do an in-depth review of the PR; this is just the result of manually testing it and skimming the code for a couple of hours. If there are things you would like me to review in depth, please point them out to me, because I don't think I'll be able to do a line-by-line review of the whole PR.

The only thing that pops up is the slang-comments section which is new.

Comments.sol tests

These tests are only run with the ANTLR parser, and the run-format-test.js file has a comment that says // TODO: finish Comments. Does this mean that there's something pending around comments when using the Slang parser?

These were some extremely edge cases with the positioning of comments where honestly comments should not go. I pushed a commit making these scenarios, not the same as with the antlr parser but closer to what prettier does with JS in a similar situation.

And, as an aside, all the Slang-related code in run-format-test.js doesn't seem to make sense? shouldTestSlang will always return false, so all the code used in that scenario seems like dead code.

yeah this was dead code (just removed it) that was helpful when I was slowly implementing new printers and had a proper list of tests that needed to be tested with both parsers.

Defaulting to the latest supported version

I mentioned this already before, but defaulting to the latest Solidity version supported by slang won't work for projects with multiple versions of Solidity, which is a common thing for projects that have been around for some years.

I don't know what's the right solution here. I kind of feel that Slang should export some helper utility to let you infer the version to use for a given code (by getting the pragmas doing what solidity-analyzer does, and then returning the latest supported version that matches the pragmas). But there might be better solutions.

This is planned but not in the scope of this PR.

Other comments

There are some commented out lines in CI.yml

While we slang doesn't support WASM, the build process is broken and the CI step for browser should be skipped.
Once they start supporting it, the CI comes right back.

The options.parentParser trick used in the SourceUnit nodes is, if I remember correctly, done to avoid printing empty trailing lines in Solidity snippets within a markdown file. Can we add a comment explaining that? Otherwise it's a bit obscure.

Done.

Out of scope for the Slang refactor, but I think we should use tsx instead of ts-node.

Will look into this. The main thing would be that it would need to be compatible with jest-light-runner

Janther added 30 commits August 15, 2024 20:07

Adding slang

67fb2f2

WIP: reading the new parsed object and basic structure of the printers

f2dcfed

Adding a printed 'TODO' to unimplemented printers so they can be seen…

d54e8f5

… in testing

WIP: more printers implemented

f05707a

WIP: 82/210 printers

9728d2b

WIP: 91/210 printers implemented

837387d

WIP: 102/210 printers implemented

88242a1

WIP: 116/210 printers implemented

b6a349b

small details in format

9248284

WIP: 122/210 printers implemented

acde001

WIP: 131/210 printers implemented

b48c8a2

WIP: 145/210 printers implemented

1a6b077

WIP: 159/210 printers implemented

7d63be0

WIP: 165/210 printers implemented

9ffcc08

WIP: 168/213 printers implemented

4cdb43c

WIP: 172/213 printers implemented

cfcbe87

WIP: 183/213 printers implemented

d24ea6b

WIP: 188/213 printers implemented

a239f8a

respecting preferred quotes in string literals

1862de7

indentation after equal sign

4c846a1

Experimental Ternaries

ea0f5ab

Member Access Chains

c84ad19

Additional edge cases

589f422

Binary Operations

13af26e

slang v0.15.0

cc892d7

WIP: 201/215 printers implemented

b1b5be8

WIP: 204/215 printers implemented

4920d6d

Tests passing without altering the AST

adding parentheses in binary operations

f8c3f86

WIP: 206/215 printers implemented

9b5e403

recolect cst Metadata at the general parser level

f9ed6f8

Janther added 2 commits August 19, 2024 17:40

Merge remote-tracking branch 'origin/main' into slang

77f5c56

# Conflicts: # package-lock.json # package.json

Comments now are AstNodes, cleaning up types and logic

159a78e

OmarTawfik reviewed Aug 19, 2024

View reviewed changes

src/slang-nodes/index.d.ts Outdated Show resolved Hide resolved

OmarTawfik reviewed Aug 19, 2024

View reviewed changes

src/slang-printers/types.d.ts Show resolved Hide resolved

OmarTawfik reviewed Aug 19, 2024

View reviewed changes

Janther added 2 commits August 20, 2024 12:14

move utility functions to their own files

7eb71b4

Renaming HubNode as CollectionNode

945b4d5

OmarTawfik reviewed Aug 20, 2024

View reviewed changes

README.md Outdated Show resolved Hide resolved

renaming 'slang' to 'slang-solidity'

61fa38e

Janther force-pushed the slang branch from f42e2c8 to 61fa38e Compare August 21, 2024 01:35

Janther added 2 commits August 21, 2024 15:16

allow console warning in linting

fa6a294

generating CollectionNode and BinaryOperation from StrictAstNode inst…

f851b8e

…ead of naming every single class

Janther commented Aug 22, 2024

View reviewed changes

Janther and others added 9 commits August 22, 2024 15:34

printing undefined as empty string so most of the Ternary operators…

65f4a66

… could be removed.

Removing unneeded helper function isLineComment

e33e1f8

only adding space if value is declared in YulVariableDeclarationState…

1e33ba6

…ment

cleaning up Object.create cases

4ceae94

fixing AST comparison tests

453c6a1

update deps

cbb9eaf

Since Slang is under active development we better fix the version we …

f213b18

…are using instead of a range

Clear line when printing warnings

e7ad30d

Use explicit names for antlr and slang parsers, printers, etc.

9703ff8

add comments for new line decision at the end of SourceUnit

d3f7141

Comments running closer to prettier for edge cases

22edc8e

Janther force-pushed the slang branch from ba7c71d to 22edc8e Compare September 17, 2024 01:29

fvictorio approved these changes Sep 18, 2024

View reviewed changes

Janther merged commit 0fded41 into v2 Sep 18, 2024
7 checks passed

Janther deleted the slang branch September 18, 2024 08:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slang #1032

Slang #1032

Janther commented Aug 15, 2024

OmarTawfik left a comment

Janther Aug 22, 2024

OmarTawfik Aug 22, 2024

Janther Aug 22, 2024

Janther Aug 23, 2024 •

edited

Loading

OmarTawfik Aug 23, 2024

fvictorio commented Sep 11, 2024

Janther commented Sep 17, 2024 •

edited

Loading

`Comments.sol` tests

Defaulting to the latest supported version

Other comments

Slang #1032

Slang #1032

Conversation

Janther commented Aug 15, 2024

OmarTawfik left a comment

Choose a reason for hiding this comment

Janther Aug 22, 2024

Choose a reason for hiding this comment

OmarTawfik Aug 22, 2024

Choose a reason for hiding this comment

Janther Aug 22, 2024

Choose a reason for hiding this comment

Janther Aug 23, 2024 • edited Loading

Choose a reason for hiding this comment

OmarTawfik Aug 23, 2024

Choose a reason for hiding this comment

fvictorio commented Sep 11, 2024

Pushed commits

Comments.sol tests

Defaulting to the latest supported version

Other comments

Janther commented Sep 17, 2024 • edited Loading

Comments.sol tests

Defaulting to the latest supported version

Other comments

Janther Aug 23, 2024 •

edited

Loading

`Comments.sol` tests

Janther commented Sep 17, 2024 •

edited

Loading

`Comments.sol` tests