Error reporting and recovery #16
Comments
For custom diagnostics, @eternaleye is suggesting using "spines", made up of vertebrae, each standing in for an incomplete rule/SPPF node, and providing access to its completed children (on the left of the error) and child vertebra (overlapping the error). We can even provide some typed APIs, although the level of detail (i.e. type safety) has to be balanced with ergonomics. We could have, for each
    enum /*partial::*/Child<T: ?Sized> {
        Complete(super::Handle<T>),
        Partial(/*partial::*/Handle<T> /* aka Vertebra<T> */),
        NotStarted,
    }
This way, the custom diagnostics could pattern-match on e.g.:
    Expr::Cast { expr: Child::Complete(expr), ty: Child::Partial(ty) }
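To make the pattern-matching idea concrete, here is a minimal sketch. It is not the crate's API: `Handle`, `Vertebra`, `PartialExpr` and `diagnose` are hypothetical stand-ins, the handles only carry source offsets, and `Child` is non-generic to keep the example short.

```rust
/// Stand-in for a fully parsed child: just the source range it covers.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Handle { start: usize, end: usize }

/// Stand-in for a vertebra: an incomplete node that overlaps the error.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Vertebra { start: usize }

#[derive(Debug, Clone, Copy, PartialEq)]
enum Child {
    Complete(Handle),
    Partial(Vertebra),
    NotStarted,
}

/// Partial view of a hypothetical `Expr` rule, one variant per alternative.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PartialExpr {
    Cast { expr: Child, ty: Child },
    Call { callee: Child, args: Child },
}

/// A custom diagnostic picks its message from the shape of the spine.
fn diagnose(e: &PartialExpr) -> String {
    match e {
        PartialExpr::Cast { expr: Child::Complete(expr), ty: Child::Partial(ty) } => format!(
            "expression at {}..{} is cast to a malformed type starting at {}",
            expr.start, expr.end, ty.start
        ),
        PartialExpr::Cast { expr: Child::Complete(expr), ty: Child::NotStarted } => format!(
            "expected a type after the expression at {}..{}",
            expr.start, expr.end
        ),
        PartialExpr::Call { callee: Child::Complete(_), args: Child::Partial(args) } => {
            format!("malformed argument list starting at {}", args.start)
        }
        // Shapes without a dedicated message fall back to a generic error.
        _ => "syntax error".to_string(),
    }
}
```

Each match arm corresponds to one "shape" of incomplete parse, so adding a new targeted message means adding a new arm rather than touching the parser.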
Potentially useful paper to look into: https://arxiv.org/abs/1804.07133
One of the simplest things we could do is keep a buffer of "attempted input matches" at the "most advanced input location", as we parse, keeping only the entries with the largest starting point.
Then they could be reported as an "expected one of ..." error.
rustc itself does something similar.
An optimization over this would be to not keep that buffer until there's an error, and then only redo the bit of the parse that errored, this time buffering input matches.
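A rough sketch of such a buffer, assuming the parser calls a hook whenever a terminal fails to match. The `Expected` type and its methods are hypothetical names for illustration, not an existing API.

```rust
use std::collections::BTreeSet;

/// Tracks what the parser tried to match at the most advanced input
/// position reached so far.
#[derive(Debug, Default)]
struct Expected {
    furthest: usize,
    items: BTreeSet<&'static str>,
}

impl Expected {
    /// Record that matching `what` was attempted, and failed, at `pos`.
    fn record(&mut self, pos: usize, what: &'static str) {
        if pos > self.furthest {
            // A more advanced failure makes all earlier entries irrelevant.
            self.furthest = pos;
            self.items.clear();
        }
        if pos == self.furthest {
            self.items.insert(what);
        }
        // Failures before `furthest` are dropped.
    }

    /// Render the buffered attempts as an "expected one of ..." error.
    fn message(&self) -> String {
        let list: Vec<&str> = self.items.iter().copied().collect();
        format!("error at {}: expected one of {}", self.furthest, list.join(", "))
    }
}
```

The set deduplicates repeated attempts at the same position, which matters in a GLL setting where several alternatives may try the same terminal. The deferred variant described above would leave `record` as a no-op on the first pass and only enable it when re-running the failing region.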
Another technique that would help localize and constrain an error is to use backward parsing (from #13) after an error in forward parsing, to find the "longest valid prefix and suffix" of the input. If the most advanced failures in both directions get close together, the syntax error can be localized to even a single token/character.
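As an illustration of the prefix/suffix idea only, the sketch below uses a toy grammar (single digits separated by `+`) and a simple scanner in place of real forward and backward GLL parses. The toy grammar and the names `viable_len` and `localize` are assumptions made for this example.

```rust
use std::ops::Range;

/// Toy grammar: single digits separated by `+`.
/// Returns how many leading items form a viable start of such a sum.
fn viable_len(items: impl Iterator<Item = char>) -> usize {
    let mut n = 0;
    for c in items {
        let ok = if n % 2 == 0 { c.is_ascii_digit() } else { c == '+' };
        if !ok {
            break;
        }
        n += 1;
    }
    n
}

/// Returns the character range that neither direction could consume, or
/// `None` when the forward scan reaches the end of the input.
/// An empty range means the problem sits exactly at that boundary.
fn localize(input: &str) -> Option<Range<usize>> {
    let chars: Vec<char> = input.chars().collect();
    let prefix = viable_len(chars.iter().copied());
    if prefix == chars.len() {
        return None;
    }
    // The reverse of a valid sum is also a valid sum, so the same scanner
    // can stand in for the backward parse.
    let suffix = viable_len(chars.iter().rev().copied());
    let start = prefix;
    let end = chars.len() - suffix;
    // Normalize in case the two scans overlap.
    Some(start.min(end)..start.max(end))
}
```

For `1+2+*3+4` the forward scan stops at index 4 and the backward scan stops at the same character, so the error is pinned to the single `*`.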
Error recovery could be done for a localized error by either:
f(x.) could recover as Call(Var("f"), Field(Var("x"), "")), instead of showing up as f(x.a) or similar (from picking a character that'd work) - this would be useful to IDEs
All approaches to recovery for a GLL parser can involve some amount of non-determinism, allowing multiple recovery possibilities to continue through, and picking the best outcome through heuristics.
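A small sketch of the "pick the best outcome" step, assuming each surviving recovery is summarized by its tree plus counts of skipped tokens and inserted placeholders. The `Candidate` type, the `best` function and the ranking itself (skipping input is worse than leaving a hole) are illustrative assumptions, not a proposed design.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Var(String),
    Field(Box<Expr>, String),
    Call(Box<Expr>, Box<Expr>),
}

/// One recovery outcome that made it to the end of the input.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    tree: Expr,
    /// Input tokens dropped to get a parse.
    skipped: usize,
    /// Empty placeholders left in the tree.
    inserted: usize,
}

/// Assumed heuristic: prefer fewer skipped tokens, then fewer placeholders.
fn best(candidates: Vec<Candidate>) -> Option<Candidate> {
    candidates.into_iter().min_by_key(|c| (c.skipped, c.inserted))
}
```

For `f(x.)`, a candidate that keeps the `.` and leaves an empty field name (`skipped: 0, inserted: 1`) would win over one that drops the `.` (`skipped: 1, inserted: 0`), which matches the IDE-friendly outcome described above.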