Increase dynamic precedence of call #185
Merged
This is a weird one.
I found that under certain circumstances, the pattern `foo(1, 2)` was not matching the source file `foo(1, 2)` in Semgrep. When I looked at the CST generated by tree-sitter, I found that if `foo(1, 2)` is followed by a newline, it is parsed as a call, but if it is not, it's parsed as a constructor. Often, patterns are provided over the command line and do not include a newline, whereas source files usually do. I tried to reproduce the difference in parsing using tree-sitter directly, and could not. This puzzled me. I considered the possibility that the extensions we've made to the grammar to parse Semgrep constructs might have affected this, but that seemed far-fetched to me.

I recalled that calls and constructor calls are truly ambiguous with each other, so I decided to start looking into how they are resolved. There is an entry in the `conflicts` array for `_simple_user_type` vs `_expression`, which is where the conflict ultimately manifests. However, there weren't any relevant calls to `prec.dynamic` which would give the parser information on how to resolve the conflict at runtime. So, I formed the theory that tree-sitter arbitrarily resolves it in one direction, whereas ocaml-tree-sitter's generated code sometimes arbitrarily resolves it in the other direction. I made the change here, propagated it over to Semgrep, and now have the desired behavior where Semgrep parses `foo(1, 2)` as a call whether or not it is followed by a newline.

Test plan: Automated tests, plus manual testing with Semgrep described above.
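
For readers unfamiliar with dynamic precedence, here is a toy model in Python of the resolution rule described above. It is a sketch, not tree-sitter's or ocaml-tree-sitter's actual implementation: when several parses of the same text survive, the one with the highest total dynamic precedence is chosen, and a tie leaves the outcome unspecified. The `Node` class, the node names `call_expression` and `constructor_invocation`, and the precedence value `1` are all assumptions made for illustration, not taken from this PR's diff.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A parse-tree node in this toy model (not a tree-sitter type)."""
    kind: str
    dynamic_prec: int = 0  # the value a rule would receive from prec.dynamic
    children: list = field(default_factory=list)


def total_dynamic_prec(node):
    """Sum the dynamic precedence over the whole subtree."""
    return node.dynamic_prec + sum(total_dynamic_prec(c) for c in node.children)


def resolve(candidates):
    """Return the unique highest-scoring parse, or None when the top score is tied."""
    scores = [total_dynamic_prec(c) for c in candidates]
    best = max(scores)
    winners = [c for c, s in zip(candidates, scores) if s == best]
    return winners[0] if len(winners) == 1 else None


# Two hypothetical readings of `foo(1, 2)`; the node names are illustrative.
ctor = Node("constructor_invocation")
call_before = Node("call_expression")                 # no dynamic precedence set
call_after = Node("call_expression", dynamic_prec=1)  # precedence raised (assumed value)

tied = resolve([call_before, ctor])   # None: nothing tells the model which to pick
chosen = resolve([call_after, ctor])  # the call reading wins
```

In the tied case, two implementations can each legitimately pick a different reading, which is consistent with the theory above. Once the call reading carries a higher dynamic precedence, the choice no longer depends on implementation details.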