Slang #1032
Tests passing without altering the AST
# Conflicts: # package-lock.json # package.json
I did an initial pass and left a couple of comments. Thanks for working on this. Exciting stuff!
…ead of naming every single class
| StrictAstNode
| Comment
| Identifier
| YulIdentifier
@OmarTawfik It would be beneficial if `Identifier` and `YulIdentifier` were in the types.
For example:
```typescript
export declare class EnumDefinition {
  readonly cst: NonterminalNode;
  private readonly fetch;
  constructor(cst: NonterminalNode);
  get enumKeyword(): TerminalNode;
  get name(): TerminalNode;
  get openBrace(): TerminalNode;
  get members(): EnumMembers;
  get closeBrace(): TerminalNode;
}
```
I care about getting the location information of `name` but not of `enumKeyword`, `openBrace` and `closeBrace`.
I went through every `TerminalNode` in the AST and plucked out the `Identifier`s and `YulIdentifier`s, but if you make a change in the AST I have no type safety that will trigger an error on my side when I need to adapt. So I would appreciate it if it were possible to deliver something like:
```typescript
export declare class EnumDefinition {
  readonly cst: NonterminalNode;
  private readonly fetch;
  constructor(cst: NonterminalNode);
  get enumKeyword(): TerminalNode;
  get name(): Identifier;
  get openBrace(): TerminalNode;
  get members(): EnumMembers;
  get closeBrace(): TerminalNode;
}
```
Thanks for the suggestion!
> but if you make a change in the AST I have no type safety that will trigger an error on my side

To make sure I understand you correctly, do you mean `getNodeMetadata().offsets`?
Is that because you are storing `offsets` as an array and accessing it by a numeric index? Would it help if it were changed to an object/dict with the field names as keys? For example, `offsets.name` here won't break if the position of `name` changes relative to its siblings.
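The suggestion above can be illustrated with a minimal sketch. It assumes hypothetical `Field` and `EnumFieldName` shapes that are not part of Slang or of this PR; the only point is that a lookup by field name keeps working regardless of where the field sits among its siblings.

```typescript
// Hypothetical shapes for illustration only; not part of the Slang API.
type EnumFieldName = "enumKeyword" | "name" | "openBrace";

interface Field {
  name: EnumFieldName; // field name on the AST node
  length: number;      // length of the field's text, trailing whitespace included
}

// Walk the fields in source order and record where each one starts.
function offsetsByName(
  fields: Field[],
  start = 0,
): Partial<Record<EnumFieldName, number>> {
  const offsets: Partial<Record<EnumFieldName, number>> = {};
  let offset = start;
  for (const field of fields) {
    offsets[field.name] = offset;
    offset += field.length;
  }
  return offsets;
}

// "enum Color {" split into keyword, name and brace.
const enumFields: Field[] = [
  { name: "enumKeyword", length: 5 },
  { name: "name", length: 6 },
  { name: "openBrace", length: 1 },
];

const enumOffsets = offsetsByName(enumFields);
console.log(enumOffsets.name); // 5
```

Because the keys are a string union, a misspelled or removed field name becomes a compile-time error instead of a silently wrong array index.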
> It would be beneficial if Identifier and YulIdentifier were in the types.

How would that help in this case? If they were non-terminal nodes? How would this be tracked otherwise?
Another way to use the Slang API, so that you don't have to track offsets manually yourself, is to use cursors.
For example, in `getNodeMetadata()`, instead of iterating over `ast.cst.children()`, you can call `ast.cst.createCursor()` and then use `cursor.goToNextSibling()` to iterate; `cursor.textOffset()` or `cursor.textRange()` will be updated as you go.
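To make the cursor idea concrete, here is a self-contained sketch. `SiblingCursor`, `Child` and `createSiblingCursor` are stand-ins written for this example; the real Slang cursor has a richer API and its exact signatures may differ. Only the method names `goToNextSibling()` and `textOffset()` are taken from the comment above.

```typescript
// Stand-in types for illustration; NOT the real Slang cursor.
interface SiblingCursor {
  kind(): string;
  textOffset(): number;
  goToNextSibling(): boolean;
}

interface Child {
  kind: string;
  text: string;
}

// Minimal cursor over a flat list of siblings that tracks the running offset,
// so the caller never has to add up text lengths by hand.
function createSiblingCursor(children: Child[]): SiblingCursor {
  if (children.length === 0) throw new Error("expected at least one child");
  let index = 0;
  let offset = 0;
  const current = (): Child => children[index] as Child;
  return {
    kind: () => current().kind,
    textOffset: () => offset,
    goToNextSibling: () => {
      if (index + 1 >= children.length) return false;
      offset += current().text.length;
      index += 1;
      return true;
    },
  };
}

// Visit every sibling and read its offset straight from the cursor.
function collectOffsets(cursor: SiblingCursor): Array<[string, number]> {
  const result: Array<[string, number]> = [];
  do {
    result.push([cursor.kind(), cursor.textOffset()]);
  } while (cursor.goToNextSibling());
  return result;
}

const siblings: Child[] = [
  { kind: "EnumKeyword", text: "enum " },
  { kind: "Identifier", text: "Color " },
  { kind: "OpenBrace", text: "{" },
];

const collected = collectOffsets(createSiblingCursor(siblings));
console.log(collected); // [["EnumKeyword", 0], ["Identifier", 5], ["OpenBrace", 11]]
```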
🤔 Using a Map with the cst as the key could really improve this aspect of the implementation.
It would even render passing an offset moot, since we are already passing the AST to the constructors.
I'll experiment with this and see how it behaves.
I was unable to use the `Node` as the key to the Map because the `Node` object I get from iterating over `cst.children()` is not the same as the one I get from `ast.cst`.
```typescript
const source = `
pragma solidity ^0.8.26;
`;
const language = new Language('0.8.26');
const parsed = new SourceUnit(
  language.parse(NonterminalKind.SourceUnit, source).tree()
);
parsed.cst.children()[0]; // NonterminalNode { kind: 'SourceUnitMembers', textLength: { utf8: 26, utf16: 26, line: 2, column: 0 }, type: 'Nonterminal', children: λ, createCursor: λ, toJSON: λ, unparse: λ }
parsed.members.cst; // NonterminalNode { kind: 'SourceUnitMembers', textLength: { utf8: 26, utf16: 26, line: 2, column: 0 }, type: 'Nonterminal', children: λ, createCursor: λ, toJSON: λ, unparse: λ }
parsed.cst.children()[0] === parsed.members.cst; // false
```
It probably happens because the cst node is not reused but reinstantiated either by Rust, TypeScript, or the communication between Rust and TypeScript.
At the moment I can't use names in an object/dict in the current solution, since when I'm iterating over the `cst.children()` I have no clue about the ast object.
I made a proof of concept in a different branch and, even though the tests don't pass because of the issue described above, the change cleans up the code considerably.
If you could make the cst point to the same object, or maybe add an `id` field to Terminal and Nonterminal nodes, let me know so I can make a PR with these changes.
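The difference between the two options can be sketched without Slang at all. `NodeLike` and `fetchNode` below are hypothetical stand-ins that simulate a boundary returning a fresh wrapper object on every access; the `id` field is the one being requested in this thread and is assumed here.

```typescript
// Hypothetical node shape; the `id` field is an assumption for this sketch.
interface NodeLike {
  id: number;
  kind: string;
}

// Simulates an API boundary that builds a new wrapper object on every call,
// even though both wrappers describe the same underlying node.
function fetchNode(): NodeLike {
  return { id: 7, kind: "SourceUnitMembers" };
}

const first = fetchNode();
const second = fetchNode();

// A Map keyed by the object compares keys by identity, so the lookup misses.
const byIdentity = new Map<NodeLike, number>();
byIdentity.set(first, 0);
const identityHit = byIdentity.has(second);

// A Map keyed by a stable id finds the entry regardless of which wrapper is used.
const byId = new Map<number, number>();
byId.set(first.id, 0);
const idHit = byId.has(second.id);

console.log(identityHit, idHit); // false true
```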
By the way, the idea of having `Identifier` and `YulIdentifier` as specific types is partly related to the offsets array, but also to making it clear which Terminal nodes can have comments attached to them on the Prettier side.
Thanks for pointing that out. The `id` already exists in the Rust API, and I would be happy to add it in TypeScript. It should be a simple change.
When I'm done with WASM, I will also be able to expose the field names themselves as a strongly-typed `EdgeLabel` enum, so you would be able to refer to and persist the name of the `get name(): Identifier` field as `EdgeLabel.Name`.
… could be removed.
…are using instead of a range
I didn't do an in-depth review of the PR; this is just the result of manually testing it and skimming the code for a couple of hours. If there are things you would like me to review in depth, please point them out to me, because I don't think I'll be able to do a line-by-line review of the whole PR. Also, leaving inline comments in a PR this big is impossible (GitHub's UI sucks for big diffs), so I will just list my comments here.
Pushed commits
I added a couple of commits. The first one clears the current line when printing a warning. Without this, if you do
The other commit just makes
The only thing that pops up is the
These were some extreme edge cases in the positioning of comments, in places where comments honestly should not go. I pushed a commit that handles these scenarios; the output is not the same as with the antlr parser, but it is closer to what Prettier does with JS in a similar situation.
Yeah, this was dead code (just removed it). It was helpful when I was slowly implementing new printers and had a proper list of tests that needed to be run with both parsers.
This is planned but not in the scope of this PR.
While Slang doesn't support WASM, the build process is broken and the CI step for the browser should be skipped.
Done.
Will look into this. The main thing would be that it would need to be compatible with
This is a proposal for a version 2 of this plugin.
It has been rewritten to leverage Nomic Foundation's Slang, which provides a much more scalable and better-supported parser.
While we are still in beta we have disabled browser support until we can make sure we can fully provide this functionality.
The coverage has been reduced as well, since `slang`'s AST is much more detailed than `solidity-parser`'s and, while all our tests still pass, they don't cover every possible scenario.
Using the plugin with the old parser is still possible, but a deprecation warning will be logged.