Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix extension pragma inserted below ghc options pragma #2364 #2392

Merged
merged 25 commits into from
Nov 30, 2021

Conversation

eddiemundo
Copy link
Collaborator

@eddiemundo eddiemundo commented Nov 23, 2021

Fixes #2364

It introduces a parser that uses the ghc api lexer to parse everything before the first "declaration".

Some additional features added:

There is a haddock comment that explains the rules the parser uses to gather next pragma location information, as well as some rationale behind why we aren't using the compiler's parser

Only thing left to do is fix any CI and ghc 9+ stuff.

@jneira
Copy link
Member

jneira commented Nov 24, 2021

@eddiemundo many thanks for working on this. Using parser combinators seems to be a much cleaner solution. However i wonder why we have to reinvent the ghc ast parser. I guess we are who cares about where the pragma should be put programatically 😟

@eddiemundo
Copy link
Collaborator Author

@eddiemundo many thanks for working on this. Using parser combinators seems to be a much cleaner solution. However i wonder why we have to reinvent the ghc ast parser. I guess we are who cares about where the pragma should be put programatically worried

The reason is because there are parse errors that can be fixed by a number of extensions, yet when those extensions are turned on can result in other parse errors... so in general if we use the compiler to get pragma annotations there will be some cases where we cannot insert an extension pragma. I could be wrong and maybe there is a way to get annotations without completing a parse, but I couldn't figure out how. So it seemed like re-implementation was the only way to solve all the cases.

@jneira
Copy link
Member

jneira commented Nov 24, 2021

i see, many thanks for the explanation

@eddiemundo
Copy link
Collaborator Author

Ugh, now that I've said that, I think there actually might be a way to use ghc api to do the parsing work. There is a lexTokenStream function that I looked at but didn't think it could do it, but on closer inspection probably can get the information needed. It'll probably read the entire file in most cases when that's not necessary, but that probably makes no practical speed difference. I'll try and see if that works later.

@eddiemundo
Copy link
Collaborator Author

I looked into using the lexer for parsing comments and pragmas, but I'm not convinced it's better than the standalone parser implementation.
The reasons are:

  • the lexer lexes the source into specific tokens that are too specific for the needs of inserting pragmas. I'd have to do a decent amount of experimentation to figure out what is and isn't the sort of thing that would be at the top of the file and when to stop lexing.
  • there would have to be 2 implementations because the annotations are stored differently in GHC 9.2.1.

@eddiemundo
Copy link
Collaborator Author

eddiemundo commented Nov 24, 2021

Actually I guess there wouldn't need to be 2 implementations, because the tokens can replace needing to look at annotations. The benefits I can see is that it is more likely to be correct.

Nevermind there would still be 2 implementations but they wouldn't be too different.

@eddiemundo eddiemundo marked this pull request as draft November 24, 2021 23:10
@nini-faroux
Copy link
Contributor

hi @eddiemundo

One thing I'm wondering is in the user's guide it says that OPTIONS_GHC is a file header pragma, so would it still be a user error (even if it compiles) to place it lower down the file or does it make any difference?

https://downloads.haskell.org/~ghc/7.0.3/docs/html/users_guide/pragmas.html

And I guess the shebangs should be at the top of the file too?

Aside from these though there's at least one other scenario that I didn't take into account before that needs to be fixed, for example a multiline language pragma separated by commas like.

Before: 
{-# LANGUAGE RecordWildCards,  
OverloadedStrings #-}

After: (but wrong)
{-# LANGUAGE RecordWildCards,
{-# LANGUAGE BangPatterns #-}   
OverloadedStrings #-}

@jneira
Copy link
Member

jneira commented Nov 25, 2021

Aside from these though there's at least one other scenario that I didn't take into account before that needs to be fixed, for example a multiline language pragma separated by commas like.

wow, it deserves a test marked as knownBroken (it it already does not exist)

@nini-faroux
Copy link
Contributor

I can do that this evening, I just realised it now after trying out more cases

@eddiemundo
Copy link
Collaborator Author

Yes, I think the solution that exists currently in this PR, and the solution that I'm working on that uses the ghc lexer correctly solve those cases already. I'm also trying to have it be able to work on annoying cases like

{- asdf
asdf -} -- | asdflj

and

{-# pragma
-#} {-
lasdjf
-}

so that pragmas are inserted inbetween the comments and the haddock (even if this is already a haddock error), and in between pragmas and the block comments (or line comments)

Also inserting after any type of pragma including unrecognized ones.

@nini-faroux
Copy link
Contributor

Ok great, that's better than depending on the user to place them at the top.

@eddiemundo eddiemundo marked this pull request as ready for review November 29, 2021 21:16
Copy link
Member

@jneira jneira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks really great, including new test cases and helpful comments, perfect, many thanks

@jneira jneira added the merge me Label to trigger pull request merge label Nov 30, 2021
@mergify mergify bot merged commit 083f542 into haskell:master Nov 30, 2021
drsooch pushed a commit to drsooch/haskell-language-server that referenced this pull request Dec 3, 2021
…askell#2392)

* new parser for stuff before first declaration

* remove unused pragmas, modify haddock comment on parser

* working but need to clean lots of little things and add more tests

* uncomment completions functions and tests (was trying to see why the test timeout), merge textedits to get around lsp-test applying text edits in reverse order, inserting pragma between lines fixes, some tests

* add line splitting tests, fix line splitting errors and among other things, add docs

* change comments, add cpp for setting use_pos_prags bit in PState

* add safeImportsOn to compat, fix ghc versions

* fix compat

* fix compat

* fix compat 3

* fix compat 4

* fix compat 5

* fix test

* fix compat 6

* add back some tests and investigate haskell#2375 later

Co-authored-by: Javier Neira <[email protected]>
@eddiemundo eddiemundo deleted the fix-pragma-inserts branch December 9, 2021 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merge me Label to trigger pull request merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extension pragma inserted below ghc options pragma (or shebang) no matter where it is in file
3 participants