Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ refactor ] Multiline error report #1155

Merged
merged 1 commit into from
Mar 16, 2021

Conversation

andrevidela
Copy link
Collaborator

@andrevidela andrevidela commented Mar 4, 2021

@andrevidela andrevidela changed the title WIP - Multiline error report Multiline error report Mar 5, 2021
@andrevidela
Copy link
Collaborator Author

ready for review @gallais

@andrevidela andrevidela requested a review from gallais March 5, 2021 20:15
@andylokandy
Copy link
Contributor

andylokandy commented Mar 6, 2021

I'm still confused by why WithBounds tok, which could be NoBound, is enforced, while you can return tok when bound is not provided.

@andrevidela
Copy link
Collaborator Author

andrevidela commented Mar 6, 2021

It's explained in the commit message. Essentially, WithBounds used to always carry bounds but it also had a flag that indicates if the bounds are valid or not. This change makes it so that we know when a term has valid bounds or not. With the additional invariant that bounds are only valid if our grammar consumes its input. Indeed I assume that the token does not come from the input if the grammar is not guaranteed to consume. This is not quite the case because peek and alt both can return valid bounds while not being guaranteed to consume, but it's enough for most uses of the bound constructor. Bounds from non-consuming grammars can be recovered with clever use of position.

Almost all signatures return WithBounds True which means the parsed terms need to have valid bounds but that does not mean the flag is unnecessary, it means that every term that does not have valid bounds needs to be either merged with one which has (like in https://github.com/idris-lang/Idris2/pull/1155/files#diff-28f7a243932138226d35676705da183ae3ff559d1683ef8afb779e850004f378R692-R693) or needs to be given bounds (like in https://github.com/idris-lang/Idris2/pull/1155/files#diff-28f7a243932138226d35676705da183ae3ff559d1683ef8afb779e850004f378R855)

Edit: I just realised you're suggesting to return the token directly when there are no bounds. I haven't tried that but I suspect it would make mergebounds harder to use because now it needs to know in advance what the bounds are:

current:

mergeBounds : WithBounds lb ty -> WithBounds rb ty' -> WithBounds (lb || rb) ty'

updated:

mergeBounds : Either tok (WithBounds tok) -> Either tok (WithBounds tok) -> Either tok (WithBounds tok)

It doesn't even capture the invariant properly since if either the bounds are valid the result should have valid bounds, we could update this to carry proofs about Either but at this point it seems the original definition of mergeBounds is better suited.

src/Idris/Parser.idr Outdated Show resolved Hide resolved
@andylokandy
Copy link
Contributor

It doesn't even capture the invariant properly since if either the bounds are valid the result should have valid bounds, we could update this to carry proofs about Either but at this point it seems the original definition of mergeBounds is better suited.

Is there any concrete case where the proof is used?

@andrevidela
Copy link
Collaborator Author

Basically everywhere its using mergeBounds in the parser the parser expects valid bounds but either arguments aren't guaranteed to have bounds. Typically when parsing do blocks, we have a symbol do which always has bounds, but the block might not have bounds and be empty, but the entire parsed term has bounds because we merge its bounds with the bounds of the keyword do: https://github.com/idris-lang/Idris2/pull/1155/files#diff-28f7a243932138226d35676705da183ae3ff559d1683ef8afb779e850004f378R693

@andylokandy
Copy link
Contributor

I still can't figure out why the type-level proof for valid bound is necessary. (1) mergeBounds works before this PR (2) I can't find anything that uses Withbounds False in this PR.

@andrevidela
Copy link
Collaborator Author

andrevidela commented Mar 7, 2021

  • The type-level proof is necessary in order to have a correct implementation of .getBounds.
  • .getBounds is necessary in order to get the bounds of a symbol and forward it to failLoc which gives new bounds to an error
  • This is necessary because with the previous implementation, bounds for errors were reported with the next token to be parsed
  • Intuitively you would think that the old implementation of Fail could have taken a WithBounds tok (instead of Bound) and manually add it as the next token, however, this does not work because the type tok does not necessarily match with the type of the token parsed. Typically, symbols return () which is not PTerm.
  • Therefore, we need to use Bounds, therefore we need a reliable way to get a Bound, therefore we need an indication of when they are valid or not, and therefore the Bool flag

I can't find anything that uses Withbounds False in this PR.

Every calls to the constructor NoBounds is of type WithBounds False. Typically, the position primitive of the parser returns WithBounds False Bounds, because the value returned does not come from a valid position in the file being parsed.

@zhongzc
Copy link

zhongzc commented Mar 8, 2021

I think it should be fine to return an empty (col 0, row 0) Bound in case no bound is given, just like the isirrelevant in previous implementation. Because it's rare that a term has no bound, and eoi in blockEntry can be assigned the bound to the last row and last column plus one. So no need to carry the proof at type-level.

@andrevidela
Copy link
Collaborator Author

Yes this is correct, technically speaking you don't need the additional invariant because you can return a sentinel value just fine. In fact this is what most software does in practice, programming languages like Go or C do this routinely since they do not have ADTs. This is also what the previous implementation did by returning bounds of (-1, -1).

However I don't believe this is the right move when developing software. We have the opportunity to both document and enforce an invariant that we know should result in inaccurate diagnostics if broken (those are also particularly hard to notice since they don't result in runtime error). Maybe you're worried about the runtime performance, but it should be the same. That's because the flag simply moved position in the struct that the codegen uses for WIthBounds, it's now automatically handled by the compiler when matching on it, rather than checked manually by the programmer. If you are really worried about the runtime performance we can try to erase the flag but since every operation on it is O(1) I do not expect any measurable gain in performance

@andylokandy
Copy link
Contributor

andylokandy commented Mar 9, 2021

I agree with @zhongzc, but I rather worry about the balance between the assumption explicitness and the development efficiency. Even if Idris is more powerful in expressiveness than others, we should still think twice when we are able to use the power -- is it necessary? does it bring us runtime cost? is it harder to understand? In this case, I think the overwhelming syntax noise is not deserved by the only one (maybe not any) special case. Not only that, I think the special case is still well handled by the internal dynamical isIrrelavant field in WithBounds which is just hiding the type-level Bool into runtime.

@andrevidela
Copy link
Collaborator Author

I couldn't figure out what was wrong with the out-of-files bounds so I've reverted the changes to WithBounds @gallais

@andylokandy
Copy link
Contributor

@andrevidela A bug breaking the CI has been fixed by #1189. You need to rebase on the master.

@gallais gallais changed the title Multiline error report [ refactor ] Multiline error report Mar 16, 2021
@gallais gallais merged commit 405c266 into idris-lang:master Mar 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reporting single-quoted multi-line strings
5 participants