-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor/feat: refactor identifier parsing a bit #109203
Conversation
(rustbot has picked a reviewer for you, use r? to override) |
I don't have much time to study these parser code changes and validate they're correct/don't introduce other regressions. maybe nils can take a look, or can re-roll |
dd630e2
to
6b65663
Compare
Can you separate the refactorings from the additional recovery? Either in a separate commit or a separate PR. |
@Nilstrieb Will do soonish. I was going to, but forgot and couldn't be bothered lol 🙄. |
6b65663
to
9eebc5e
Compare
@Nilstrieb Split into three commits; improving spans for |
@@ -395,7 +391,7 @@ impl<'a> Parser<'a> { | |||
} else { | |||
PatKind::Lit(const_expr) | |||
} | |||
} else if self.can_be_ident_pat() { | |||
} else if self.can_be_ident_pat() || self.is_lit_bad_ident().is_some() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code adds (not really adds, it was already here with the early return previously which was also a mistake but now is the second best moment to fix it) a parser regression.
macro_rules! pat {
($p:pat) => {};
}
fn main() {
pat!(3meow);
}
This code should compile (as literals are valid patterns) but doesn't. For reference, replacing :pat
with :expr
makes it compile. It also compiles on stable, where the early return above hasn't landed yet.
This can be fixed by adding a self.may_revover() &&
before this check. I still don't exactly like it, but it should fix the regression. I would prefer it if this wasn't an eager recovery but instead only started to influence behavior once there truly was an error, but I can accept it if you add the may_recover()
and don't want to refactor it further.
For the future, always remember to think about how such changes can influence parser behavior and make sure to gate it behind a may_recover()
if there the parser isn't in an error state yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so I was the author of that suggestion. Bear in mind that I don't have years on experience working on rustc but I don't think this is a regression.
If you compile the following code:
macro_rules! pat {
($p:pat) => {
let $p = 5;
};
}
fn main() {
pat!(3meow);
}
(the same code but using the $p
metavariable)
Then it emits an error. This is the case since 1.0.0
, the only thing this suggestion does is pick up on always invalid code and provide a better error message.
Secondly, AFAIK Parser::may_recover
isn't "correct" here. Technically speaking, it is only for eager token recovery which neither the suggestion PR nor this PR introduce that. (eager recovery meaning consuming multiple tokens that might be valid?)
if there the parser isn't in an error state yet.
We are in an error state though, a numeric literal with an invalid suffix is always invalid.
Maybe I'm completely wrong (*cough* like #107813 *cough*) though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Secondly, AFAIK Parser::may_recover isn't "correct" here. Technically speaking, it is only for eager token recovery which neither the suggestion PR nor this PR introduce that.
You are absolutely right. Redirecting this to am early error like this is always wrong.
a numeric literal with an invalid suffix is always invalid.
This is not quite correct. A literal with an invalid syntax is semantically invalid. This means that it's not allowed in post expansion Rust code, as shown in your example which correctly errors. But an invalid suffix is syntactically valid, so we can't error out because macros might delete it like in my example.
I am not really sure about the best way to get the nice diagnostic without introducing regressions. Maybe finding out where the example normally errors and then trying to add something there?
But don't worry about making mistakes here, introducing parser regressions like this happens to many others as well, it's hard to spot unless you're already aware of the potential problem^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
finding out where the example normally errors and then trying to add something there
Hmm, that'd be difficult because "invalid int lit suffix" errors are emitted while parsing expressions (which obviously can be in pattern position as well) and this diagnostic is only applicable to patterns. Maybe we could just put in self.may_recover
and a fixme?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean literals instead of expressions? That does sound a little tricky, adding a parameter would fix it but is also probably a little much.
So I guess the alternatives here are also having the diagnostic in other places where literals are allowed or not having it at all. I don't want to just put broken code behind a FIXME as these usually don't get fixed in quite a while.
Maybe you have some other ideas or you could try out adding the parameter if possible and see how bad it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so what do we do from here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave it up to you whether you would prefer removing the diagnostic or whether you'd be fine with changing it so that it also shows the note inside expressions. I don't think it would hurt, so I'd be fine with either.
In the meantime, you could also split out the first and last commit into a separate PR if you like, I would approve that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure may_recover
wouldn't work here? It'll be a bit overreaching (all use in macros won't have the improved diagnostic) but it'll fix the regression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, but I'm not entirely sure, I would need to check.
But actually, I just changed my mind about this PR. While this doesn't fix the regression, it doesn't introduce a new one either. We should merge this and I'll open an issue about the regression (which you can claim if you want, but of course don't have to).
We can continue this discussion on the issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And may_recover
is probably better than nothing, so let's do that anyways. It's not like this is a very critical issue anyways.
b1b182e
to
d7efda3
Compare
@Nilstrieb In any case, I've pushed my proposed changes with |
This comment has been minimized.
This comment has been minimized.
d7efda3
to
f08d17a
Compare
@Nilstrieb whoops, all fixed. Would you like a PR renaming |
I've played around with it and actually, the input tokens to attribute/derive proc macros do have to be semantically valid, so doing the eager check there is okay. So actually having the
Yes, that would be useful (although the exact wording might be subject to some bikeshedding) After you've added that comment and removed the comment on "can we recover here" (since you do recover there) this should be good to go. |
@Nilstrieb
EDIT: On second thought, probably not a good idea to recursively recover there, otherwise everything should be good to go? Also, with the |
f08d17a
to
05b5046
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bors r+
yes, just create a PR for that |
@bors r+ |
…=Nilstrieb refactor/feat: refactor identifier parsing a bit \+ error recovery for `expected_ident_found` Prior art: rust-lang#108854
…iaskrgr Rollup of 9 pull requests Successful merges: - rust-lang#108954 (rustdoc: handle generics better when matching notable traits) - rust-lang#109203 (refactor/feat: refactor identifier parsing a bit) - rust-lang#109213 (Eagerly intern and check CrateNum/StableCrateId collisions) - rust-lang#109358 (rustc: Remove unused `Session` argument from some attribute functions) - rust-lang#109359 (Update stdarch) - rust-lang#109378 (Remove Ty::is_region_ptr) - rust-lang#109423 (Use region-erased self type during IAT selection) - rust-lang#109447 (new solver cleanup + implement coherence) - rust-lang#109501 (make link clickable) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
suffix, | ||
}) = self.token.kind | ||
&& rustc_ast::MetaItemLit::from_token(&self.token).is_none() | ||
{ | ||
Some((symbol.as_str().len(), suffix.unwrap())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re: #110014
- suffix,
+ suffix: Some(suffix),
}) = self.token.kind
&& rustc_ast::MetaItemLit::from_token(&self.token).is_none()
{
- Some((symbol.as_str().len(), suffix.unwrap()))
+ Some((symbol.as_str().len(), suffix))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you like me to PR this? Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be nice if you could put up the fix, yes :)
…on, r=compiler-errors fix: fix regression in rust-lang#109203 Fixes rust-lang#110014 r? `@compiler-errors`
…iaskrgr Rollup of 6 pull requests Successful merges: - rust-lang#109806 (Workaround rust-lang#109797 on windows-gnu) - rust-lang#109957 (diagnostics: account for self type when looking for source of unsolved type variable) - rust-lang#109960 (Fix buffer overrun in bootstrap and (test-only) symlink_junction) - rust-lang#110013 (Label `non_exhaustive` attribute on privacy errors from non-local items) - rust-lang#110016 (Run collapsed GUI test in mobile mode as well) - rust-lang#110022 (fix: fix regression in rust-lang#109203) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
+ error recovery for
expected_ident_found
Prior art: #108854