Generate treesitter grammar #14527
Comments
Helix also uses it, exclusively. |
@vzarytovskii Do you have any thoughts about how that would work with the lex filter? |
Yeah, no specific ideas just yet; we should probably figure it out when we start working on it. |
How can someone help to get this started? I'm interested in contributing. |
@Eliemer This document has some context on how to proceed: https://tree-sitter.github.io/tree-sitter/creating-parsers |
I am aware of this grammar, but if you look at the README, you'll see that it does not cover all language features or the whitespace-sensitive aspects. Generating it from the fslexyacc files and the lexfilter (if possible, of course) has the benefit of keeping it up to date whenever we update them with new features. |
While looking for an ANTLR grammar for F#, I discovered a few things that might be interesting. First, there are a gazillion similar formats, obviously. 😊 So I dug deep into this ecosystem, and there are all sorts of compilers in every direction, some more maintained than others. As an example, I discovered an EBNF <--> Treesitter compiler. There is also a similar project that goes only from Treesitter to EBNF, and it shows an already generated EBNF file for OCaml: https://github.com/mingodad/plgh/blob/main/tree-sitter-ocaml.ebnf
So EBNF seems to be a considerably easier format, I think. At that point, editing the existing EBNF of OCaml and then translating it to Treesitter might be an option. 🤷🏻♂️ I don't know how it compares to generating from Yacc and Lex. 🙈
I also found a couple of other very interesting projects that would help to generate the ANTLR file I am striving to create for OneDev. So if going the route from EBNF to Treesitter sounds acceptable, this would provide a path for both ANTLR and Treesitter.
P.S.: If all that doesn't help, I also stumbled across a couple of articles that might help to implement treesitter directly and to understand its format. https://derek.stride.host/posts/comprehensive-introduction-to-tree-sitter https://gist.github.com/Aerijo/df27228d70c633e088b0591b8857eeef |
OCaml syntax does not account for whitespace sensitivity (i.e. the lexfilter), so it won't be much help here, unfortunately. |
Yeah, I actually considered another way now. Going from .fsy to EBNF and then to Treesitter. This doesn't involve OCaml at all. |
Fsy to EBNF likely won't work either; it won't cover whitespace sensitivity. |
If anyone is interested, I’ve been slowly working on an F# treesitter grammar that supports indentation-based scoping. |
Nice |
I would like to help with testing and improving it. |
How is whitespace significance breaking either of the protocols? Or do you think it's lost in the translation? |
Yeah, I think there's a possibility of losing a bunch of info during conversions. Besides, fslexyacc alone doesn't carry the indent/whitespace info. |
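To illustrate why the indent/whitespace info is not in the grammar files alone, here is a minimal sketch of a layout pass that sits between the lexer and the parser and turns leading spaces into synthetic block tokens. It is only an illustration under simplifying assumptions: the token kinds (`LINE`, `INDENT`, `DEDENT`) and the function name `layout_tokens` are hypothetical, tabs and inconsistent dedents are ignored, and the real F# lexfilter is considerably more context-dependent than this. The tree-sitter Python grammar handles the same concern in a hand-written external scanner rather than in the grammar rules.

```python
from typing import Iterator, List, Tuple

Token = Tuple[str, str]  # (kind, text); the kind names are hypothetical


def layout_tokens(source: str) -> Iterator[Token]:
    """Yield LINE tokens plus synthetic INDENT/DEDENT tokens from leading spaces."""
    stack: List[int] = [0]  # indentation columns of the currently open blocks
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines neither open nor close a block
        col = len(raw) - len(raw.lstrip(" "))
        if col > stack[-1]:
            # Deeper than the enclosing block: open a new block.
            stack.append(col)
            yield ("INDENT", "")
        while col < stack[-1]:
            # Shallower: close every block that started further right.
            # (A dedent to a column that was never opened is not reported here.)
            stack.pop()
            yield ("DEDENT", "")
        yield ("LINE", raw.strip())
    while len(stack) > 1:
        # Close blocks that are still open at end of input.
        stack.pop()
        yield ("DEDENT", "")


example = "let f x =\n    let y = x + 1\n    y * 2\nlet z = f 3\n"
kinds = [kind for kind, _ in layout_tokens(example)]
print(kinds)
```

A context-free grammar written over `LINE`, `INDENT` and `DEDENT` can then describe blocks without knowing anything about columns, which is the part a plain .fsy to EBNF conversion would not capture.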
Yeah, I will see. Considering Python is popular, I guess this info is not being lost.
What else does? Chet told me the files are in the compiler repo: https://github.com/dotnet/fsharp/blob/main/src/Compiler/pars.fsy |
@vzarytovskii any help is very welcome.
|
lexfilter in the repo |
Yeah, I already found your previous comment on Discord about that, many thanks. @Nsidorenco I am testing it with Helix, but I am unsure why it currently fails. So I can't provide you with any meaningful feedback as of now, and I hope I can do so in the future. Thanks a lot for developing this, you're great 🥳 |
Is your feature request related to a problem? Please describe.
Currently, more and more tooling and editors rely on treesitter for navigation, parsing and semantic highlighting (e.g. in-browser VS Code, nvim, GitHub). We should provide a TS grammar for F#.
Describe the solution you'd like
The TS grammar should be (if possible) generated from our fsl/fsy and hosted in the repo.
Links
Treesitter docs: https://tree-sitter.github.io/tree-sitter/
Existing grammars, incl. some ws-sensitive:
OCaml: https://github.com/tree-sitter/tree-sitter-ocaml
Python: https://github.com/tree-sitter/tree-sitter-python
YAML: https://github.com/ikatyang/tree-sitter-yaml
Haskell: https://github.com/tree-sitter/tree-sitter-haskell