Error reporting and recovery #16

eddyb · 2018-08-21T19:50:08Z

One of the simplest things we could do is keep a buffer of "attempted input matches" at the "most advanced input location", as we parse, keeping only the entries with the largest starting point.
Then they could be reported as an "expected one of ..." error. rustc itself does something similar:

use x.

error: expected one of `::`, `;`, or `as`, found `.`
 --> src/lib.rs:1:6
  |
1 | use x.
  |      ^ expected one of `::`, `;`, or `as` here

An optimization over this would be to not keep that buffer until there's an error, and then only redo the bit of the parse that errored, this time buffering input matches.

Another technique that would help localize and constrain an error, is to use backward parsing (from #13) after an error in forward parsing, to find the "longest valid prefix and suffix" of the input, and if the most advanced failures in both directions get close together, then the syntax error can be localized to even one token/character.

Error recovery could be done for a localized error by either:

splicing in one of the expected inputs and using an incremental reparse (Incremental reparsing #12) to redo the parse, in the hope that it will get farther (this can be repeated as long as it keeps shrinking the error)
faking success on the most advanced failures, without actual input being used up by the parse, e.g. f(x.) could recover as Call(Var("f"), Field(Var("x"), "")), instead of showing up as f(x.a) or similar (from picking a character that'd work) - this would be useful to IDEs

All approaches to recovery for a GLL parser can involve some amount of non-determinism, allowing multiple recovery possibilities to continue through, and picking the best outcome through heuristics.

The text was updated successfully, but these errors were encountered:

eddyb · 2018-09-18T06:33:21Z

For custom diagnostics, @eternaleye is suggesting using "spines", made up of vertebrae, each standing in for an incomplete rule/SPPF node, and providing access to its completed children (on the left of the error) and child vertebra (overlapping the error).
(this has interactions with #17 - for efficiency, we'd want to build these spines while parsing but only keep the rightmost ones, and discard the rest, including the orphan SPPF nodes they kept alive)
That way, hand-written recursive descent diagnostic logic can be ported over, with the main distinction (other than exact API in use) being that the choice of diagnostic itself can be ambiguous.

We can even provide some typed APIs, although the level of detail (i.e. type safety) has to be balanced with ergonomics. We could have, for each Foo rule, a partial::Foo version, using, for fields:

enum /*partial::*/Child<T: ?Sized> {
    Complete(super::Handle<T>),
    Partial(/*partial::*/Handle<T> /* aka Verterbra<T> */),
    NotStarted,
}

This way, the custom diagnostics could pattern-match on e.g.:

Expr::Cast { expr: Child::Complete(expr), ty: Child::Partial(ty) }

eddyb · 2018-10-18T15:21:52Z

Potentially useful paper to look into: https://arxiv.org/abs/1804.07133

This was referenced Aug 21, 2018

GC the SPPF based on whether a node is still "reachable" #17

Open

Shrink the GSS based on whether calls can still be performed #18

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error reporting and recovery #16

Error reporting and recovery #16

eddyb commented Aug 21, 2018

eddyb commented Sep 18, 2018 •

edited

Loading

eddyb commented Oct 18, 2018

Error reporting and recovery #16

Error reporting and recovery #16

Comments

eddyb commented Aug 21, 2018

eddyb commented Sep 18, 2018 • edited Loading

eddyb commented Oct 18, 2018

eddyb commented Sep 18, 2018 •

edited

Loading