Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase dynamic precedence of call #185

Merged
merged 1 commit into from
May 18, 2022
Merged

Conversation

nmote
Copy link
Contributor

@nmote nmote commented May 17, 2022

This is a weird one.

I found that under certain circumstances, the pattern foo(1, 2) was not matching the source file foo(1, 2) in Semgrep. When I looked at the CST generated by tree sitter, I found that if foo(1, 2) is followed by a newline, it is parsed as a call, but if it is not, it's parsed as a constructor. Often, patterns are provided over the command line and do not include a newline, whereas source files usually do. I tried to reproduce the difference in parsing using tree sitter directly, and could not. This puzzled me. I considered the possibility that the extensions we've made to the grammar to parse Semgrep constructs might have affected this, but that seemed far-fetched to me.

I recalled that calls and constructor calls are truly ambiguous with each other, so I decided to start looking into how they are resolved. There is an entry in the conflicts array for _simple_user_type vs _expression, which is where the conflict ultimately manifests. However, there weren't any relevant calls to prec.dynamic which would give the parser information on how to resolve the conflict at runtime. So, I formed the theory that tree sitter arbitrarily resolves it in one direction, whereas ocaml-tree-sitter's generated code sometimes arbitrarily resolves it in the other direction. I made the change here, propagated it over to Semgrep, and now have the desired behavior where Semgrep parses foo(1, 2) as a call whether or not it is followed by a newline.

Test plan: Automated tests, plus manual testing with Semgrep described above.

@alex-pinkus
Copy link
Owner

Great deep dive, thank you for doing this!

@alex-pinkus alex-pinkus merged commit 636c976 into alex-pinkus:main May 18, 2022
@nmote nmote deleted the callprec branch May 18, 2022 01:50
nmote added a commit to semgrep/ocaml-tree-sitter-semgrep that referenced this pull request May 18, 2022
This pulls in alex-pinkus/tree-sitter-swift#185,
which fixes an ambiguous parse that led to unexpected failures to match
in Semgrep.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants