refactor/feat: refactor identifier parsing a bit #109203

Ezrashaw · 2023-03-16T06:27:06Z

+ error recovery for expected_ident_found

Prior art: #108854

rustbot · 2023-03-16T06:27:12Z

r? @compiler-errors

(rustbot has picked a reviewer for you, use r? to override)

compiler-errors · 2023-03-16T22:44:16Z

I don't have much time to study these parser code changes and validate they're correct/don't introduce other regressions.

maybe nils can take a look, or can re-roll
r? @Nilstrieb

Noratrieb · 2023-03-17T07:05:50Z

Can you separate the refactorings from the additional recovery? Either in a separate commit or a separate PR.

Ezrashaw · 2023-03-17T07:29:57Z

@Nilstrieb Will do soonish. I was going to, but forgot and couldn't be bothered lol 🙄.

Ezrashaw · 2023-03-17T09:28:59Z

@Nilstrieb Split into three commits; improving spans for HelpIdentifierStartsWithNumber was lumped in there

Noratrieb · 2023-03-18T12:25:14Z

compiler/rustc_parse/src/parser/pat.rs

@@ -395,7 +391,7 @@ impl<'a> Parser<'a> {
            } else {
                PatKind::Lit(const_expr)
            }
-        } else if self.can_be_ident_pat() {
+        } else if self.can_be_ident_pat() || self.is_lit_bad_ident().is_some() {


This code adds a parser regression. (Strictly speaking it doesn't add it: the regression was already here with the previous early return, which was also a mistake, but now is the second-best moment to fix it.)

    macro_rules! pat {
        ($p:pat) => {};
    }

    fn main() {
        pat!(3meow);
    }

This code should compile (as literals are valid patterns) but doesn't. For reference, replacing :pat with :expr makes it compile. It also compiles on stable, where the early return above hasn't landed yet.

This can be fixed by adding a self.may_recover() && before this check. I still don't exactly like it, but it should fix the regression. I would prefer it if this wasn't an eager recovery but instead only started to influence behavior once there truly was an error, but I can accept it if you add the may_recover() and don't want to refactor it further.

For the future, always remember to think about how such changes can influence parser behavior, and make sure to gate them behind a may_recover() if the parser isn't in an error state yet.
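
To make the suggested gating concrete, here is a minimal, self-contained sketch. It is a toy model, not rustc's parser: `ToyParser`, `Recovery`, `Pat`, and `bad_ident_suffix` are hypothetical names, and the toy treats any suffix after a run of digits as suspicious (it ignores valid suffixes such as `u8`). It only illustrates skipping an eager diagnostic when recovery is not allowed.

```rust
// Toy model only: none of these types are rustc's real parser API.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Recovery {
    Allowed,
    Forbidden,
}

#[derive(Debug, PartialEq)]
enum Pat {
    Ident(String),
    // A literal token kept verbatim, suffix included; validity is checked later.
    Lit(String),
}

struct ToyParser {
    recovery: Recovery,
}

impl ToyParser {
    fn may_recover(&self) -> bool {
        self.recovery == Recovery::Allowed
    }

    /// Returns the suffix of a number-like token such as `3meow`, if it has one.
    fn bad_ident_suffix(tok: &str) -> Option<&str> {
        // Leading ASCII digits are one byte each, so the count is a valid byte index.
        let digits = tok.chars().take_while(|c| c.is_ascii_digit()).count();
        if digits > 0 && digits < tok.len() {
            Some(&tok[digits..])
        } else {
            None
        }
    }

    fn parse_pat(&self, tok: &str) -> Result<Pat, String> {
        if tok.starts_with(|c: char| c.is_ascii_digit()) {
            // Only take the eager diagnostic path when recovery is permitted.
            if self.may_recover() {
                if let Some(suffix) = Self::bad_ident_suffix(tok) {
                    return Err(format!(
                        "identifiers cannot start with a number (suffix `{suffix}`)"
                    ));
                }
            }
            // Otherwise accept the token as a literal pattern and defer any error.
            return Ok(Pat::Lit(tok.to_string()));
        }
        Ok(Pat::Ident(tok.to_string()))
    }
}

fn main() {
    let matching_macro = ToyParser { recovery: Recovery::Forbidden };
    let normal_code = ToyParser { recovery: Recovery::Allowed };
    println!("{:?}", matching_macro.parse_pat("3meow"));
    println!("{:?}", normal_code.parse_pat("3meow"));
}
```

With recovery forbidden, `3meow` is kept as a literal pattern and any error is left to a later stage; with recovery allowed, the friendlier diagnostic fires immediately.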

Ezrashaw · 2023-03-19

Right, so I was the author of that suggestion. Bear in mind that I don't have years of experience working on rustc, but I don't think this is a regression.

If you compile the following code:

    macro_rules! pat {
        ($p:pat) => {
            let $p = 5;
        };
    }

    fn main() {
        pat!(3meow);
    }

(the same code but using the $p metavariable)

Then it emits an error. This has been the case since 1.0.0; the only thing this suggestion does is pick up on always-invalid code and provide a better error message.

Secondly, AFAIK Parser::may_recover isn't "correct" here. Technically speaking, it is only for eager token recovery, which neither the suggestion PR nor this PR introduces. (eager recovery meaning consuming multiple tokens that might be valid?)

> if the parser isn't in an error state yet.

We are in an error state though, a numeric literal with an invalid suffix is always invalid.

Maybe I'm completely wrong (*cough* like #107813 *cough*) though.

Noratrieb · 2023-03-19

> Secondly, AFAIK Parser::may_recover isn't "correct" here. Technically speaking, it is only for eager token recovery, which neither the suggestion PR nor this PR introduces.

You are absolutely right. Redirecting this to an early error like this is always wrong.

> a numeric literal with an invalid suffix is always invalid.

This is not quite correct. A literal with an invalid suffix is semantically invalid. This means that it's not allowed in post-expansion Rust code, as shown in your example, which correctly errors. But an invalid suffix is syntactically valid, so we can't error out eagerly, because macros might delete it, like in my example.
I am not really sure about the best way to get the nice diagnostic without introducing regressions. Maybe finding out where the example normally errors and then trying to add something there?

But don't worry about making mistakes here, introducing parser regressions like this happens to many others as well, it's hard to spot unless you're already aware of the potential problem^^
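
The syntactic/semantic distinction can be seen without touching the parser. The sketch below, assuming a reasonably recent stable toolchain, passes `3meow` to macros that never interpret it as a literal expression or pattern; the macro and function names are made up for illustration.

```rust
// Accepts any single token tree and turns it into text without interpreting it.
macro_rules! token_text {
    ($t:tt) => {
        stringify!($t)
    };
}

// Accepts a literal token and expands to something that never uses it.
macro_rules! discard_literal {
    ($l:literal) => {
        ()
    };
}

fn suffixed_token_text() -> &'static str {
    // `3meow` is a single literal token: the integer `3` with the suffix `meow`.
    token_text!(3meow)
}

fn main() {
    // The suffixed literal is matched and then dropped, so nothing rejects it.
    discard_literal!(3meow);
    println!("{}", suffixed_token_text());
}
```

The suffix only becomes an error once the token ends up as a literal expression or pattern in the expanded code, which is what happens in the `let $p = 5;` variant.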

Ezrashaw · 2023-03-19

> finding out where the example normally errors and then trying to add something there

Hmm, that'd be difficult because "invalid int lit suffix" errors are emitted while parsing expressions (which obviously can be in pattern position as well) and this diagnostic is only applicable to patterns. Maybe we could just put in self.may_recover and a fixme?

Noratrieb · 2023-03-19

You mean literals instead of expressions? That does sound a little tricky, adding a parameter would fix it but is also probably a little much.
So I guess the alternatives here are either having the diagnostic in other places where literals are allowed as well, or not having it at all. I don't want to just put broken code behind a FIXME, as these usually don't get fixed for quite a while.

Maybe you have some other ideas or you could try out adding the parameter if possible and see how bad it is.

Ezrashaw · 2023-03-19

Hmm, so what do we do from here?

Noratrieb · 2023-03-19

I'll leave it up to you whether you would prefer removing the diagnostic or whether you'd be fine with changing it so that it also shows the note inside expressions. I don't think it would hurt, so I'd be fine with either.

In the meantime, you could also split out the first and last commit into a separate PR if you like, I would approve that.

Ezrashaw · 2023-03-19

Are you sure may_recover wouldn't work here? It'll be a bit overreaching (all use in macros won't have the improved diagnostic) but it'll fix the regression.

Noratrieb · 2023-03-19

I think so, but I'm not entirely sure, I would need to check.
But actually, I just changed my mind about this PR. While this doesn't fix the regression, it doesn't introduce a new one either. We should merge this and I'll open an issue about the regression (which you can claim if you want, but of course don't have to).
We can continue this discussion on the issue.

Noratrieb · 2023-03-19

And may_recover is probably better than nothing, so let's do that anyways. It's not like this is a very critical issue anyways.

Ezrashaw · 2023-03-19T10:38:38Z

@Nilstrieb In any case, I've pushed my proposed changes with Parser::may_recover.

Ezrashaw · 2023-03-20T03:35:36Z

@Nilstrieb whoops, all fixed. Would you like a PR renaming may_recover to may_recover_lookahead as well?

Noratrieb · 2023-03-20T07:36:34Z

I've played around with it, and it turns out the input tokens to attribute/derive proc macros do have to be semantically valid, so doing the eager check there is okay. That means the self.may_recover() check here does catch all cases. It would be nice if you could add a comment to the may_recover call, roughly like this:

> Don't eagerly error on semantically invalid tokens when matching declarative macros, as the input to those doesn't have to be semantically valid.
> For attribute/derive proc macros this is not the case, so doing the recovery for them is fine.

> Would you like a PR renaming may_recover to may_recover_lookahead as well?

Yes, that would be useful (although the exact wording might be subject to some bikeshedding)

After you've added that comment and removed the comment on "can we recover here" (since you do recover there) this should be good to go.

Ezrashaw · 2023-03-20T07:49:51Z

@Nilstrieb
~~After you've added that comment and removed the comment on "can we recover here" (since you do recover there) this should be good to go.~~

~~Sorry, I meant can we recursively recover there? I'm not entirely sure that we shouldn't but I'm not sure.~~

EDIT: On second thought, probably not a good idea to recursively recover there, otherwise everything should be good to go?

Also, with the may_recover -> may_recover_lookahead, should I just create a PR and bikeshed it on the PR?

Noratrieb

@bors r+

Noratrieb · 2023-03-20T12:55:36Z

yes, just create a PR for that

Noratrieb · 2023-03-20T14:04:49Z

@bors r+

bors · 2023-03-20T14:04:51Z

📌 Commit 05b5046 has been approved by Nilstrieb

It is now in the queue for this repository.

…=Nilstrieb

refactor/feat: refactor identifier parsing a bit

+ error recovery for `expected_ident_found`

Prior art: rust-lang#108854

…iaskrgr

Rollup of 9 pull requests

Successful merges:

- rust-lang#108954 (rustdoc: handle generics better when matching notable traits)
- rust-lang#109203 (refactor/feat: refactor identifier parsing a bit)
- rust-lang#109213 (Eagerly intern and check CrateNum/StableCrateId collisions)
- rust-lang#109358 (rustc: Remove unused `Session` argument from some attribute functions)
- rust-lang#109359 (Update stdarch)
- rust-lang#109378 (Remove Ty::is_region_ptr)
- rust-lang#109423 (Use region-erased self type during IAT selection)
- rust-lang#109447 (new solver cleanup + implement coherence)
- rust-lang#109501 (make link clickable)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup

compiler-errors · 2023-04-06T19:39:24Z

compiler/rustc_parse/src/parser/diagnostics.rs

+            suffix,
+        }) = self.token.kind
+            && rustc_ast::MetaItemLit::from_token(&self.token).is_none()
+        {
+            Some((symbol.as_str().len(), suffix.unwrap()))


re: #110014

-            suffix,
+            suffix: Some(suffix),
         }) = self.token.kind
             && rustc_ast::MetaItemLit::from_token(&self.token).is_none()
         {
-            Some((symbol.as_str().len(), suffix.unwrap()))
+            Some((symbol.as_str().len(), suffix))

~~Would you like me to PR this?~~ Done.

would be nice if you could put up the fix, yes :)
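
The shape of this fix can be shown on a toy type. In the sketch below, `Lit`, `LitKind`, and `bad_ident_parts` are hypothetical stand-ins rather than rustc's definitions, and the `MetaItemLit::from_token` check from the diff is omitted. The point is only that matching `suffix: Some(suffix)` in the pattern makes the no-suffix case return `None` instead of panicking on `unwrap()`.

```rust
// Hypothetical stand-ins for the token literal in the diff; not rustc's types.
#[derive(Clone, Copy)]
enum LitKind {
    Integer,
    Str,
}

#[derive(Clone, Copy)]
struct Lit<'a> {
    kind: LitKind,
    symbol: &'a str,
    suffix: Option<&'a str>,
}

/// Returns (length of the numeric part, suffix) only for integers that carry a suffix.
fn bad_ident_parts<'a>(lit: &Lit<'a>) -> Option<(usize, &'a str)> {
    // Requiring `Some(suffix)` in the pattern means a literal without a suffix
    // simply fails to match, where `suffix.unwrap()` would have panicked.
    if let Lit { kind: LitKind::Integer, symbol, suffix: Some(suffix) } = *lit {
        Some((symbol.len(), suffix))
    } else {
        None
    }
}

fn main() {
    let with_suffix = Lit { kind: LitKind::Integer, symbol: "3", suffix: Some("meow") };
    let without_suffix = Lit { kind: LitKind::Integer, symbol: "3", suffix: None };
    let string = Lit { kind: LitKind::Str, symbol: "hi", suffix: Some("x") };

    println!("{:?}", bad_ident_parts(&with_suffix));
    println!("{:?}", bad_ident_parts(&without_suffix));
    println!("{:?}", bad_ident_parts(&string));
}
```

Pushing the `Option` check into the pattern keeps the "has a suffix" condition next to the other match conditions, so there is no separate code path on which the value could be missing.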

…on, r=compiler-errors fix: fix regression in rust-lang#109203 Fixes rust-lang#110014 r? `@compiler-errors`

…iaskrgr

Rollup of 6 pull requests

Successful merges:

- rust-lang#109806 (Workaround rust-lang#109797 on windows-gnu)
- rust-lang#109957 (diagnostics: account for self type when looking for source of unsolved type variable)
- rust-lang#109960 (Fix buffer overrun in bootstrap and (test-only) symlink_junction)
- rust-lang#110013 (Label `non_exhaustive` attribute on privacy errors from non-local items)
- rust-lang#110016 (Run collapsed GUI test in mobile mode as well)
- rust-lang#110022 (fix: fix regression in rust-lang#109203)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup

rustbot assigned compiler-errors Mar 16, 2023

rustbot added the S-waiting-on-review and T-compiler labels Mar 16, 2023

Ezrashaw commented Mar 16, 2023

compiler/rustc_parse/src/parser/diagnostics.rs (outdated)

compiler-errors reviewed Mar 16, 2023

compiler/rustc_parse/src/parser/item.rs (outdated)

compiler-errors reviewed Mar 16, 2023

compiler/rustc_span/src/lib.rs (outdated)

compiler/rustc_parse/src/parser/diagnostics.rs (outdated)

rustbot assigned Noratrieb and unassigned compiler-errors Mar 16, 2023

Ezrashaw force-pushed the refactor-ident-parsing branch from dd630e2 to 6b65663 Compare March 17, 2023 06:28

Ezrashaw force-pushed the refactor-ident-parsing branch from 6b65663 to 9eebc5e Compare March 17, 2023 09:27

Noratrieb requested changes Mar 18, 2023

Noratrieb added the S-waiting-on-author label and removed the S-waiting-on-review label Mar 18, 2023

Ezrashaw added 2 commits March 19, 2023 20:20

refactor: refactor identifier parsing somewhat

c9ddb73

refactor: improve "ident starts with number" error

b4e17a5

Ezrashaw force-pushed the refactor-ident-parsing branch 2 times, most recently from b1b182e to d7efda3 Compare March 19, 2023 10:37

Noratrieb reviewed Mar 19, 2023

compiler/rustc_parse/src/parser/pat.rs (outdated)

Noratrieb reviewed Mar 19, 2023

compiler/rustc_parse/src/parser/pat.rs (outdated)

Ezrashaw force-pushed the refactor-ident-parsing branch from d7efda3 to f08d17a Compare March 20, 2023 03:32

feat: implement error recovery in expected_ident_found

05b5046

Ezrashaw force-pushed the refactor-ident-parsing branch from f08d17a to 05b5046 Compare March 20, 2023 07:54

Noratrieb approved these changes Mar 20, 2023

bors added the S-waiting-on-bors label and removed the S-waiting-on-author label Mar 20, 2023

matthiaskrgr mentioned this pull request Mar 21, 2023

Rollup of 9 pull requests #109465

Closed

matthiaskrgr mentioned this pull request Mar 22, 2023

Rollup of 9 pull requests #109503

Merged

bors merged commit 34fa6da into rust-lang:master Mar 23, 2023

rustbot added this to the 1.70.0 milestone Mar 23, 2023

Ezrashaw deleted the refactor-ident-parsing branch March 26, 2023 03:33

dwrensha mentioned this pull request Apr 6, 2023

'rustc' panicked at 'called Option::unwrap() on a None value' #110014

Closed

compiler-errors reviewed Apr 6, 2023

Ezrashaw added a commit to Ezrashaw/rust that referenced this pull request Apr 6, 2023

fix: fix regression in rust-lang#109203

9dbf20e

matthiaskrgr mentioned this pull request Apr 6, 2023

Rollup of 6 pull requests #110024

Merged

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Apr 6, 2023

Rollup merge of rust-lang#110022 - Ezrashaw:fix-parser-ident-regressi…

903b439

…on, r=compiler-errors fix: fix regression in rust-lang#109203 Fixes rust-lang#110014 r? `@compiler-errors`
