Question: expand for and struct grammar with headers? #142

Open
2 tasks done
simonmandlik opened this issue Jun 21, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@simonmandlik

simonmandlik commented Jun 21, 2024

Did you check existing issues?

  • I have read all the tree-sitter docs if it relates to using the parser
  • I have searched the existing issues of tree-sitter-julia

Tree-Sitter CLI Version, if relevant (output of tree-sitter --version)

No response

Describe the bug

This is not a bug, but more like a question/feature request.

I'm trying to update and fix the Julia queries for Neovim, and I'm having a very hard time with for loops and structs.

This Python example

for a in range(10):
    pass

is parsed as follows:

(module ; [0, 0] - [2, 0]
  (for_statement ; [0, 0] - [1, 8]
    left: (identifier) ; [0, 4] - [0, 5]
    right: (call ; [0, 9] - [0, 18]
      function: (identifier) ; [0, 9] - [0, 14]
      arguments: (argument_list ; [0, 14] - [0, 18]
        (integer))) ; [0, 15] - [0, 17]
    body: (block ; [1, 4] - [1, 8]
      (pass_statement)))) ; [1, 4] - [1, 8]

and this Julia example

for a in 1:10, b in 1:10
    print(a)
end

is parsed as

(source_file ; [0, 0] - [3, 0]
  (for_statement ; [0, 0] - [2, 3]
    (for_binding ; [0, 4] - [0, 13]
      (identifier) ; [0, 4] - [0, 5]
      (range_expression ; [0, 9] - [0, 13]
        (integer_literal) ; [0, 9] - [0, 10]
        (integer_literal))) ; [0, 11] - [0, 13]
    (for_binding ; [0, 15] - [0, 24]
      (identifier) ; [0, 15] - [0, 16]
      (range_expression ; [0, 20] - [0, 24]
        (integer_literal) ; [0, 20] - [0, 21]
        (integer_literal))) ; [0, 22] - [0, 24]
    (call_expression ; [1, 4] - [1, 12]
      (identifier) ; [1, 4] - [1, 9]
      (argument_list ; [1, 9] - [1, 12]
        (identifier))))) ; [1, 10] - [1, 11]

Because the two for_binding nodes are not grouped together in any way and are siblings of the call_expression, I couldn't write a query that correctly selects the loop "header" (regardless of the number of variables iterated over), nor a query that selects the body without the "header". This may be because I'm no expert in TS queries, but for Python such queries are really simple.
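
To make the problem concrete, here is a minimal Python sketch of what a consumer currently has to do by hand: partition the named children of the for_statement by node type, then merge each group into a single span. The `Node` class and `split_for_statement` function are hypothetical stand-ins for illustration, not part of tree-sitter or its bindings; the positions are copied from the Julia tree above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]    # (row, column), as printed in the parse trees
Span = Tuple[Point, Point]  # (start, end)


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a parser node: a type name and its span."""
    type: str
    start: Point
    end: Point


def split_for_statement(children: List[Node]) -> Tuple[Span, Optional[Span]]:
    """Split the named children of a for_statement into header and body spans.

    Assumption: every for_binding child belongs to the header and every
    other named child belongs to the body.
    """
    bindings = [c for c in children if c.type == "for_binding"]
    body = [c for c in children if c.type != "for_binding"]
    if not bindings:
        raise ValueError("expected at least one for_binding")
    # The header runs from the first binding's start to the last binding's end.
    header_span = (bindings[0].start, bindings[-1].end)
    # An empty loop body has no span at all.
    body_span = (body[0].start, body[-1].end) if body else None
    return header_span, body_span


# Named children of the for_statement from the Julia example.
children = [
    Node("for_binding", (0, 4), (0, 13)),
    Node("for_binding", (0, 15), (0, 24)),
    Node("call_expression", (1, 4), (1, 12)),
]

header, body = split_for_statement(children)
print(header)  # ((0, 4), (0, 24))
print(body)    # ((1, 4), (1, 12))
```

With a grouping node in the grammar, both spans would simply be the range of a single node, which is what makes the Python queries so short.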

The situation is similar with struct definitions:

struct A{B, C} <: D
    x
    y
end

is parsed as

(source_file ; [0, 0] - [4, 0]
  (struct_definition ; [0, 0] - [3, 3]
    name: (identifier) ; [0, 7] - [0, 8]
    (type_parameter_list ; [0, 8] - [0, 14]
      (identifier) ; [0, 9] - [0, 10]
      (identifier)) ; [0, 12] - [0, 13]
    (type_clause ; [0, 15] - [0, 19]
      (operator) ; [0, 15] - [0, 17]
      (identifier)) ; [0, 18] - [0, 19]
    (identifier) ; [1, 4] - [1, 5]
    (identifier))) ; [2, 4] - [2, 5]

Again, the struct header nodes type_parameter_list and type_clause are siblings of the struct body.

Is there a reason not to group struct and loop "headers" together, similar to how Python is parsed?

@simonmandlik simonmandlik added the bug Something isn't working label Jun 21, 2024
@simonmandlik
Author

if statements in Python also provide a consequence child:

if True:
    pass
elif False:
    pass
else:
    pass

(module ; [0, 0] - [6, 0]
  (if_statement ; [0, 0] - [5, 8]
    condition: (true) ; [0, 3] - [0, 7]
    consequence: (block ; [1, 4] - [1, 8]
      (pass_statement)) ; [1, 4] - [1, 8]
    alternative: (elif_clause ; [2, 0] - [3, 8]
      condition: (false) ; [2, 5] - [2, 10]
      consequence: (block ; [3, 4] - [3, 8]
        (pass_statement))) ; [3, 4] - [3, 8]
    alternative: (else_clause ; [4, 0] - [5, 8]
      body: (block ; [5, 4] - [5, 8]
        (pass_statement))))) ; [5, 4] - [5, 8]

whereas in Julia all "consequence" expressions are siblings of the condition:

if true
    1
    1
elseif false
    1
else
    1
end

(source_file ; [0, 0] - [8, 0]
  (if_statement ; [0, 0] - [7, 3]
    condition: (boolean_literal) ; [0, 3] - [0, 7]
    (integer_literal) ; [1, 4] - [1, 5]
    (integer_literal) ; [2, 4] - [2, 5]
    alternative: (elseif_clause ; [3, 0] - [5, 0]
      condition: (boolean_literal) ; [3, 7] - [3, 12]
      (integer_literal)) ; [4, 4] - [4, 5]
    alternative: (else_clause ; [5, 0] - [7, 0]
      (integer_literal)))) ; [6, 4] - [6, 5]

@savq
Collaborator

savq commented Jun 23, 2024

There are two separate issues here, so I'll address them separately.

Querying inner blocks

The block rule used in the grammar is not visible (see #73). There's no technical limitation here, but making it visible is a breaking change that would require updating almost all tests.

Querying "headers"

If blocks were visible, querying headers would be really simple, since they're always "the thing before the block".

For now, I can only think of a couple of workarounds:

  • if and while conditions are a single expression, so this would work:
    (if_statement . (_) @condition)
  • for and let have their own rules for bindings, so this would work:
    (for_statement ((for_binding) ("," (for_binding))*) @bindings)

In the case of structs... The way they're currently parsed is awful. I took a much simpler approach for the lezer-julia grammar, and that should probably get ported here.

@simonmandlik
Author

@savq thanks for the reply!

I prepared a PR nvim-treesitter/nvim-treesitter-textobjects#639, any comments would be greatly appreciated!

> The block rule used in the grammar is not visible (see #73). There's no technical limitation here, but making it visible is a breaking change that would require updating almost all tests.

Yes, this would really help a lot. For ifs, for example, conditions are easy because they are under the condition field, but selecting blocks is more difficult: it would have to rely on the matching algorithm, since an elseif clause, for example, is a sibling of all the nodes in the block.
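
As a rough illustration of that difficulty, the sketch below recovers the first branch of the Julia if_statement shown earlier by collecting the children that carry no field name. `Child` and `consequence_span` are hypothetical names used only for this example, and the positions are copied from that parse tree.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[int, int]    # (row, column)
Span = Tuple[Point, Point]  # (start, end)


@dataclass(frozen=True)
class Child:
    """Hypothetical stand-in for one named child of an if_statement."""
    field: Optional[str]  # field name such as "condition", or None
    type: str
    start: Point
    end: Point


def consequence_span(children: List[Child]) -> Optional[Span]:
    """Span of the unlabeled children, i.e. the body of the first branch.

    Assumption: the condition and every elseif/else clause carry a field
    name, so whatever is left without one must be the consequence.
    """
    block = [c for c in children if c.field is None]
    if not block:
        return None  # empty branch, nothing to select
    return (block[0].start, block[-1].end)


# Named children of the if_statement from the Julia example.
if_children = [
    Child("condition", "boolean_literal", (0, 3), (0, 7)),
    Child(None, "integer_literal", (1, 4), (1, 5)),
    Child(None, "integer_literal", (2, 4), (2, 5)),
    Child("alternative", "elseif_clause", (3, 0), (5, 0)),
    Child("alternative", "else_clause", (5, 0), (7, 0)),
]

print(consequence_span(if_children))  # ((1, 4), (2, 5))
```

A visible block node would make this a plain field lookup, as with consequence in the Python grammar.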

> (for_statement ((for_binding) ("," (for_binding))*) @bindings)

I tested this, and it selects only one for_binding at a time, not all of them.

@savq savq mentioned this issue Sep 21, 2024