disambiguate fold vs parenthesized assignment #239

brandonspark · 2024-01-25T23:21:39Z

Currently, there is an ambiguity in parsing fold expressions and parenthesized assignment expressions. This is described in issues #201 and #212.

This arises in the following trace:

( <id>    =
       ^ where we are here

The crux of the issue is that we do not know whether to shift the = forward, or to reduce the <id> to an expression_not_binary. If we do the former, we can no longer parse a fold expression, which expects <exp> <fold_operator> '...'. If we do the latter, we can no longer parse a parenthesized assignment expression, as an assignment expression does not permit arbitrary expressions to appear on the LHS.
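
For illustration, here is a minimal C++17 sketch of two constructs that both begin with `( <id> =`. The function names `assign_last` and `halvings` are hypothetical and do not come from the grammar or its test corpus; they only show why the parser cannot commit after seeing the `=`.

```cpp
// Hypothetical helper: a binary left fold over '='.
// After "(target =" the next token is "...", so this is a fold expression.
// It expands to ((target = v1) = v2) = ... and yields the last value.
template <typename... Ts>
constexpr int assign_last(int target, Ts... values) {
    return (target = ... = values);
}

// Hypothetical helper: a parenthesized assignment used as a sub-expression.
// After "(rest =" the next token starts an ordinary expression.
constexpr int halvings(int n) {
    int steps = 0;
    int rest = 0;
    while ((rest = n / 2) > 0) {
        n = rest;
        ++steps;
    }
    return steps;
}

// Compile-time checks of both helpers.
static_assert(assign_last(0, 1, 2, 3) == 3, "fold assigns the last value");
static_assert(halvings(8) == 3, "8 -> 4 -> 2 -> 1");
```

Both bodies share the same three-token prefix, so the choice between shifting and reducing has to be made before the distinguishing token is visible.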

The current behavior is that tree-sitter-cpp will reduce the <id>, and thus fail to parse any parenthesized assignment expression. Unfortunately, these are reasonably common, so this is actually disastrous.

This PR makes it so that parenthesized expressions specifically allow assignment expressions which feature expressions on the LHS, so that it can still parse even if we reduce the identifier. This causes the grammar to be a little more permissive than it needs to be, but it does not disallow valid programs, as another proposed solution to this problem does.

I believe it is better for us to be more permissive, in this case, so that we no longer disallow valid programs, with the understanding that we can restore stringency at a later date.

grammar.js

jdrouhard · 2024-01-26T16:33:32Z

Thanks!

amaanq · 2024-02-01T18:56:01Z

Hey this looks great, this was actually my original idea to fix this but I felt there could've been a better way, since this feels more like a bodge that just happens to work (in my head), but if it's the only solution then it's totally okay.

Though I do have a few nitpicks to point out (and wish we got resolved before merging) - namely not generating with master and the naming/exposing of the lhs rule, I'll just push a fix for that though so no worries.

## What:
This PR updates both the `semgrep-c` and `semgrep-cpp` modules with the latest changes from `tree-sitter-c` and `tree-sitter-cpp`.

## Why:
I recently fixed an [upstream tree-sitter issue](tree-sitter/tree-sitter-cpp#239) in the C++ grammar. Unfortunately, this means we need to take all of the new `tree-sitter-cpp` changes to pull it in.

## How:
Lots of grammar hacking. In several places, I augmented the C AST to be more in line with the CPP AST. This makes it noticeably less simple, but is required, as the `tree-sitter-c` grammar becomes more complex.

This leads to a gross amount of code duplication between the C and C++ translation code, but since the typed CSTs are different at the moment, there isn't a good way to prevent that.

This PR also makes use of the groundwork I laid down in #9681, which introduces the `preproc_if_poly` type (and friends): polymorphic types that encode the structure of the preprocessor statements that occur in C and C++. Here, we see that we save massive amounts of logic duplication with minimal boilerplate.

## Test plan:
`make test`

Added a parsing test for the parenthesized assignment thing that all of this was originally for, too.

disambiguate fold vs parenthesized assignment

e772504

aryx requested review from jdrouhard and amaanq January 26, 2024 09:56

jdrouhard reviewed Jan 26, 2024

alias new node

08ee919

jdrouhard merged commit 4ca37be into tree-sitter:master Jan 26, 2024
1 check failed

This was referenced Jan 26, 2024

update cpp semgrep/ocaml-tree-sitter-semgrep#472

Merged

chore(c/cpp): bring up to speed again semgrep/semgrep#9688

Merged
