Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplifying grammar for ocaml-tree-sitter introduces conflict #286

Closed
aryx opened this issue Feb 18, 2022 · 10 comments
Closed

Simplifying grammar for ocaml-tree-sitter introduces conflict #286

aryx opened this issue Feb 18, 2022 · 10 comments
Assignees
Labels
blocking A task in some other project depends on this bug Something isn't working priority:medium to do, not blocking users

Comments

@aryx
Copy link
Collaborator

aryx commented Feb 18, 2022

If you try ./add-simple-language https://github.com/alex-pinkus/tree-sitter-swift
and then
./test-lang swift,
then the original grammar is fine, as well as the test, but when we try to use ocaml-tree-sitter-semgrep on it, we get a conflict:

make gen-c
make[2]: Entering directory '/home/pad/github/ocaml-tree-sitter-semgrep/lang/swift'
if [ ! -e src/parser.c ] \
          || [ ! -e "../semgrep-grammars/lang/swift/src/grammar.json" ] \
          || [ "../semgrep-grammars/lang/swift/src/grammar.json" -nt src/parser.c ]; then \
  /home/pad/github/ocaml-tree-sitter-semgrep/scripts/ocaml-tree-sitter-gen-c; \
else \
  echo "src/parser.c for swift is up to date."; \
fi
swift: Importing initial 'grammar.json'.
swift: Simplifying 'grammar.json' for ocaml-tree-sitter.
swift: Generating definitive 'parser.c'.

Unresolved conflict for symbol sequence:

  locally_permitted_modifiers_repeat1  •  '@'  …

Possible interpretations:

  1:  (locally_permitted_modifiers  locally_permitted_modifiers_repeat1)  •  '@'  …
  2:  (locally_permitted_modifiers_repeat1  locally_permitted_modifiers_repeat1  •  locally_permitted_modifiers_repeat1)

Possible resolutions:

  1:  Specify a left or right associativity in `locally_permitted_modifiers`
  2:  Add a conflict for these rules: `locally_permitted_modifiers`

make[2]: *** [Makefile:48: src/grammar.json] Error 1
make[2]: Leaving directory '/home/pad/github/ocaml-tree-sitter-semgrep/lang/swift'
make[1]: *** [Makefile:26: gen] Error 2
make[1]: Leaving directory '/home/pad/github/ocaml-tree-sitter-semgrep/lang/swift'
make: *** [Makefile:19: build] Error 2

The important part is problably:
swift: Simplifying 'grammar.json' for ocaml-tree-sitter.
which introduces some conflicts.

This is for semgrep/semgrep#2232

@mjambon
Copy link
Member

mjambon commented Feb 19, 2022

What's the priority for this?

I haven't looked at the problem but I hope there's a workaround by writing the original grammar differently.

@mjambon mjambon added blocking A task in some other project depends on this bug Something isn't working labels Feb 19, 2022
@aryx
Copy link
Collaborator Author

aryx commented Feb 19, 2022

Medium prio. Would be nice to start integrating swift.

@aryx
Copy link
Collaborator Author

aryx commented Feb 19, 2022

I was hoping to put Nat on swift, but he would be blocked by this issue.

@alex-pinkus
Copy link

Could I work around this in the Swift grammar by just declaring the conflict for locally_permitted_modifiers? That conflict would be inert in the full grammar but get you past this issue in the simplified one.

@mjambon mjambon added the priority:medium to do, not blocking users label Feb 19, 2022
@aryx
Copy link
Collaborator Author

aryx commented Feb 24, 2022

Maybe, but maybe it would hide a bug/limitation in ocaml-tree-sitter-semgrep that we need to fix anyway.
It would be nice to understand the root cause of why we get a conflict after doing the grammar preprocessing we do inside ocaml-tree-sitter-semgrep.

@mjambon
Copy link
Member

mjambon commented Feb 24, 2022

I haven't had a chance to figure this out yet and our whole team at r2c is away next week. We're getting fewer and fewer bugs with ocaml-tree-sitter but they're getting increasingly tricky. It's going to be tough for us to fix this until the week of March 7.

@mjambon
Copy link
Member

mjambon commented Mar 9, 2022

Here's what I got:

Unhiding the rule _locally_permitted_modifiers by renaming it locally_permitted_modifiers is sufficient to cause the conflict:

$ cp grammar.js grammar.js.orig
$ sed -i -e 's/_locally_permitted_modifiers/locally_permitted_modifiers/g' grammar.js
$ tree-sitter generate
Unresolved conflict for symbol sequence:

  locally_permitted_modifiers_repeat1  •  '@'  …

Possible interpretations:

  1:  (locally_permitted_modifiers  locally_permitted_modifiers_repeat1)  •  '@'  …
  2:  (locally_permitted_modifiers_repeat1  locally_permitted_modifiers_repeat1  •  locally_permitted_modifiers_repeat1)

Possible resolutions:

  1:  Specify a left or right associativity in `locally_permitted_modifiers`
  2:  Add a conflict for these rules: `locally_permitted_modifiers`

ocaml-tree-sitter makes this transformation for all the hidden rules and it's always been fine but apparently this particular rule locally_permitted_modifiers ends up being problematic. There must be something special with rules that start with an underscore, other than being hidden from tree-sitter's parse tree.

The official documentation states:

You may have noticed in the above examples that some of the grammar rule name like _expression and _type began with an underscore. Starting a rule’s name with an underscore causes the rule to be hidden in the syntax tree. This is useful for rules like _expression in the grammars above, which always just wrap a single child node. If these nodes were not hidden, they would add substantial depth and noise to the syntax tree without making it any easier to understand.

@mjambon
Copy link
Member

mjambon commented Mar 9, 2022

@alex-pinkus Forcing the rule to be inlined eliminates the conflict:

 module.exports = grammar({
   name: "swift",
+  inline: $ => [
+    $.locally_permitted_modifiers
+  ],
   conflicts: ($) => [

(A) Since it's important in the original grammar that _locally_permitted_modifiers be inlined, I suggest marking it as such by inserting an inline section. This would solve the problem for ocaml-tree-sitter:

module.exports = grammar({
  name: "swift",
  inline: $ => [
    $._locally_permitted_modifiers
  ],
  ...

(B) Otherwise, I think we can solve this by patching the grammar in ocaml-tree-sitter-semgrep, which is something we do to add semgrep syntax anyway.

I don't think we should modify the ocaml-tree-sitter code generator to force all unhidden rules to be inlined because that may not be correct or efficient in general.

Please let us know if you prefer solution A or B.

alex-pinkus added a commit to alex-pinkus/tree-sitter-swift that referenced this issue Mar 9, 2022
See semgrep/ocaml-tree-sitter-semgrep#286

When generating a simplified grammar, the `ocaml-tree-sitter` code
generator forces all hidden nodes to be visible. It appears that hidden
nodes automatically get marked as `inline`, and that without that
auto-inlining, the `_locally_permitted_modifiers` rule causes a
conflict.

Rather than making the semgrep folks work around their conflict, we just
check the `inline` declaration into the grammar itself so that it sticks
around in the simplified grammar.
@alex-pinkus
Copy link

I'm more than happy to put that in the original grammar -- to your point, since compilation fails without it, I'm (unintentionally) relying on that inlining behavior for correctness, so I might as well declare it.

alex-pinkus added a commit to alex-pinkus/tree-sitter-swift that referenced this issue Mar 9, 2022
See semgrep/ocaml-tree-sitter-semgrep#286

When generating a simplified grammar, the `ocaml-tree-sitter` code
generator forces all hidden nodes to be visible. It appears that hidden
nodes automatically get marked as `inline`, and that without that
auto-inlining, the `_locally_permitted_modifiers` rule causes a
conflict.

Rather than making the semgrep folks work around their conflict, we just
check the `inline` declaration into the grammar itself so that it sticks
around in the simplified grammar.
@mjambon
Copy link
Member

mjambon commented Mar 9, 2022

great, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocking A task in some other project depends on this bug Something isn't working priority:medium to do, not blocking users
Development

No branches or pull requests

3 participants