DotLambda: add parser recovery #16238

DedSec256 · 2023-11-07T19:40:31Z

This PR adds parser recovery for the SynExpr.DotLambda for cases _. and _

Special thanks to @auduchinok for the consultation.

src/Compiler/pars.fsy

tests/service/ParserTests.fs

src/Compiler/pars.fsy

T-Gro

Thank you for this change.
I would also prefer to have these tests as syntaxtree tests.

Reason is, if anything changes in the SyntaxTree, the baselines can be easily re-created, whereas specific assertions need to be manually checked.

(which is not a big deal when there is 2 of them, but it is when we have hundreds)

DedSec256 · 2023-11-08T19:59:47Z

Interesting observations on the commit 27e8d65:

After adding recovery, in the following invalid code DotLambdas began to be parsed

let a = ( upcast _ ) : obj

let b = ( _ :> _ ) : obj

let c = ( _ :> obj)

Also parens expressions began to be parsed more correctly.

src/Compiler/pars.fsy

T-Gro · 2023-11-09T10:59:21Z

@DedSec256 :

Seeing the newly appeared _ recoveries, I would be inclined to only do a recovery for UNDERSCORE DOT .., but not for a standalone UNDERSCORE only.
Chance of breaking something existing (also precedence rules) would decrease dramatically.

src/Compiler/pars.fsy

auduchinok · 2023-11-09T11:01:46Z

Seeing the newly appeared _ recoveries, I would be inclined to only do a recovery for UNDERSCORE DOT .., but not for a standalone UNDERSCORE only.

Why, though? There're still errors on the incorrect code, so we aren't changing how any correct code is parsed, but surroundings are much better parsed in all the unfinished code examples.

The purpose of the parser recovery is to parse as much code in the most correct way as we can, so all the analysis works with the best possible tree. We don't have any ambiguity here, so it's clear that _ is an unfinished dot lambda, and being able to analyze things like parens or other pars of unfinished code is crucial for the tooling.

T-Gro · 2023-11-09T13:28:45Z

Seeing the newly appeared _ recoveries, I would be inclined to only do a recovery for UNDERSCORE DOT .., but not for a standalone UNDERSCORE only.

Why, though? There're still errors on the incorrect code, so we aren't changing how any correct code is parsed, but surroundings are much better parsed in all the unfinished code examples.

The purpose of the parser recovery is to parse as much code in the most correct way as we can, so all the analysis works with the best possible tree. We don't have any ambiguity here, so it's clear that _ is an unfinished dot lambda, and being able to analyze things like parens or other pars of unfinished code is crucial for the tooling.

I do not see a standalone _ as a lambda, e.g. in this example let a = ( upcast _ ) : obj from @DedSec256 above.

auduchinok · 2023-11-09T13:38:19Z

I do not see a standalone _ as a lambda, e.g. in this example let a = ( upcast _ ) : obj from @DedSec256 above.

Yes, but it should be something parseable when possible. And here it's unambiguously an unfinished dot lambda, as it's the only kind of an expression that could start with _. When there're more cases in the language it's possible to update these recovery rules, so it's not a problem.

Even looking at the same examples, it's clear that none of upcast or parens expressions were parsed prior to this change. It means that as soon as you type _, you get completely broken analysis in the tooling for some part of the tree, even though it could be parsed correctly prior the typing. This PR improves this a lot.

It's the same kind of recovery as in the most of these PRs and it's what makes a much better IDE experience.

It's OK to make assumptions about what an unfinished code may become after the the user types more into the editor. Not doing that makes analysis of surrounding code break. Consider interface recovery in #15943. We don't know whether it's going to be an interface ... end type representation or whether it's an interface ... with member declaration. But it doesn't matter much as long as we can provide improved tooling, and being able to parse it as something is the crucial part here.

T-Gro · 2023-11-09T13:42:30Z

... And here it's unambiguously an unfinished dot lambda, as it's the only kind of an expression that could start with _. When there're more cases in the language it's possible to update these recovery rules, so it's not a problem...

Couldn't it also be an unfinished identifier starting with an underscore?
What is what I would expect here, at least without the DotLambda feature in place.

auduchinok · 2023-11-09T13:46:19Z

Couldn't it also be an unfinished identifier starting with an underscore?

It's a good example, thanks! I think would be much better to produce an identifier expression indeed. It's still going to be parsed as an atomic expression, so everything seems good. The downside is it'd require more checks here and there about _ not actually being a valid identifier.

auduchinok · 2023-11-09T13:48:37Z

Another option is to produce a generic error expression node covering _ range. It's maybe even a better option.

So we have options to produce a dot lambda, an identifier, and a generic error node.

I think the choice should be made based on these things:

how many checks we need to add/update to make it ignored in the type checking
how many changes we need in features like code completion for it to work properly, i.e. suggesting names prefixed with _

T-Gro · 2023-11-09T13:56:15Z

If it were "just parsed" as an _ identifier, leaving potential error for typechecking.
Wouldn't it then cover all the needs, including completion working OOB for items prefixed with _ ?

auduchinok · 2023-11-09T13:57:14Z

If it were "just parsed" as an _ identifier, leaving potential error for typechecking.
Wouldn't it then cover all the needs, including completion working OOB for items prefixed with _ ?

That's what I'd hope for! 🙂

auduchinok · 2023-11-09T14:06:30Z

If it were "just parsed" as an _ identifier, leaving potential error for typechecking.
Wouldn't it then cover all the needs, including completion working OOB for items prefixed with _ ?

That's what I'd hope for! 🙂

The problem I see is it'd be really nice if we the tree could also show that this is an error node somehow in this case. Producing just an identifier moves the responsibility to various features to check it at a later stage, potentially duplicating code. The more I think about it, the more I'd prefer it to be parsed as a dotLambda (or an error, in the worst case) if features like code completion keep working, at least for now.

Maybe I'd suggest to keep it as is in this PR if we know that there's no regression in code completion for _ prefix, and do another iteration later, when we're not constrained by a close release date (i.e. we still would like to cherry-pick these changes into our release)? I think it needs a bit more thoughtful consideration and testing here.

auduchinok · 2023-11-09T14:19:16Z

Maybe SynExpr.FromParseError(SynExpr.Named(...)) would work, but it needs testing...

src/Compiler/SyntaxTree/SyntaxTrivia.fs

src/Compiler/pars.fsy

T-Gro · 2023-11-14T13:13:11Z

@DedSec256 : Thanks for the changes! Are we ready to merge here?

DedSec256 requested a review from a team as a code owner November 7, 2023 19:40

auduchinok reviewed Nov 7, 2023

View reviewed changes

src/Compiler/pars.fsy Outdated Show resolved Hide resolved

tests/service/ParserTests.fs Outdated Show resolved Hide resolved

T-Gro reviewed Nov 7, 2023

View reviewed changes

src/Compiler/pars.fsy Outdated Show resolved Hide resolved

T-Gro approved these changes Nov 7, 2023

View reviewed changes

DedSec256 force-pushed the ber.a/dotLambda-recovery branch from 4fdcfc4 to 601f0d0 Compare November 8, 2023 18:44

auduchinok reviewed Nov 9, 2023

View reviewed changes

src/Compiler/pars.fsy Outdated Show resolved Hide resolved

src/Compiler/pars.fsy Outdated Show resolved Hide resolved

T-Gro self-requested a review November 9, 2023 10:59

T-Gro reviewed Nov 9, 2023

View reviewed changes

src/Compiler/pars.fsy Show resolved Hide resolved

DedSec256 force-pushed the ber.a/dotLambda-recovery branch from 27e8d65 to 8128b34 Compare November 13, 2023 16:53

DedSec256 requested review from auduchinok and T-Gro November 13, 2023 18:08

auduchinok reviewed Nov 13, 2023

View reviewed changes

src/Compiler/SyntaxTree/SyntaxTrivia.fs Outdated Show resolved Hide resolved

src/Compiler/pars.fsy Outdated Show resolved Hide resolved

DedSec256 added 2 commits November 13, 2023 22:15

wip

7d7bc6e

wip

6a16eaa

DedSec256 force-pushed the ber.a/dotLambda-recovery branch from 2918a7f to 6a16eaa Compare November 13, 2023 19:20

T-Gro approved these changes Nov 14, 2023

View reviewed changes

auduchinok approved these changes Nov 14, 2023

View reviewed changes

psfinaki approved these changes Nov 14, 2023

View reviewed changes

psfinaki merged commit 52d98fe into dotnet:main Nov 14, 2023
24 checks passed

DedSec256 deleted the ber.a/dotLambda-recovery branch November 14, 2023 14:58

T-Gro mentioned this pull request Nov 23, 2023

No completion for unfinished shorthand lambda #16221

Closed

allisonchou mentioned this pull request Dec 15, 2023

[Automated] PRs inserted in VS build main-34415.28 #16445

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DotLambda: add parser recovery #16238

DotLambda: add parser recovery #16238

DedSec256 commented Nov 7, 2023

T-Gro left a comment

DedSec256 commented Nov 8, 2023

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023 •

edited

Loading

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023 •

edited

Loading

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023 •

edited

Loading

auduchinok commented Nov 9, 2023 •

edited

Loading

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023

auduchinok commented Nov 9, 2023 •

edited

Loading

auduchinok commented Nov 9, 2023

T-Gro commented Nov 14, 2023

DotLambda: add parser recovery #16238

DotLambda: add parser recovery #16238

Conversation

DedSec256 commented Nov 7, 2023

T-Gro left a comment

Choose a reason for hiding this comment

DedSec256 commented Nov 8, 2023

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023 • edited Loading

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023 • edited Loading

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023 • edited Loading

auduchinok commented Nov 9, 2023 • edited Loading

T-Gro commented Nov 9, 2023

auduchinok commented Nov 9, 2023

auduchinok commented Nov 9, 2023 • edited Loading

auduchinok commented Nov 9, 2023

T-Gro commented Nov 14, 2023

auduchinok commented Nov 9, 2023 •

edited

Loading

auduchinok commented Nov 9, 2023 •

edited

Loading

auduchinok commented Nov 9, 2023 •

edited

Loading

auduchinok commented Nov 9, 2023 •

edited

Loading

auduchinok commented Nov 9, 2023 •

edited

Loading