Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Emergency workaround for GHC 7.8.1 on OSX #1740

Closed
hvr opened this issue Mar 31, 2014 · 33 comments
Closed

Emergency workaround for GHC 7.8.1 on OSX #1740

hvr opened this issue Mar 31, 2014 · 33 comments

Comments

@hvr
Copy link
Member

hvr commented Mar 31, 2014

The bug documented in Haddock#284 turned out be caused by Clang's specific CPP behavior. So it affects mostly OSX >= 10.9 users.

If things are left as they are, affected users would have to inject the ghc option -optP-P, i.e.

cabal haddock --ghc-options=-optP-P

By having Cabal workaround this, users could be shielded from this Clang-effect.

Some discussions that occurred on IRC, where @thoughtpolice explains the underlying issue:

 <thoughtpolice> here's the bizarre thing: it outputs #line pragmas like
 <thoughtpolice> #line 1 "some/random/file.hs"
 <thoughtpolice> #line 2 ...
 <thoughtpolice> #line 3 ...
 <thoughtpolice>
 <thoughtpolice> #line 4 ...
 <thoughtpolice> notice the newline
 <thoughtpolice> the strange this is, if you put another newline after 1, 2, and 4, it will work!
 <Fuuzetsu> Huh. And GHC does the expected thing?
 <thoughtpolice> so the actual 'problem' is that Clang/GHC don't play nicely when there are multi-line #line pragmas
 <thoughtpolice> Fuuzetsu: apparently. somehow. i guess
 <thoughtpolice> i really still can't quite understand it despite staring at it for a while
 <thoughtpolice> Fuuzetsu: here's the workaround, though: on OS X, just enforce "-optP-P" to be passed to GHC
 <Fuuzetsu> what is that option?
 <thoughtpolice> this should suppress the problem by suppressing #line info, which is only ever really useful in an error message/warning anyway. i think it's acceptable for haddock to enable it, and expect the program to compile/typecheck etc
 <thoughtpolice> I suppose a Cabal update might be in needed for the Haddock thing
 <thoughtpolice> but IMO with such an easy workaround at least, it's not 100% critical
 <thoughtpolice> no, the line marker thing
 <hvr> (what's the easy workaround btw? tweaking  haddock or tweaking ghc?)
 <thoughtpolice> neither, you have to tweak Cabal
 <thoughtpolice> by running 'cabal haddock --ghc-options=-optP-P'
 <thoughtpolice> right well, a minor cabal bump can happen to paper over it in the mean time
 <thoughtpolice> so i don't think that's a showstopper anymore, just the segfault thing
 <hvr> 'Cabal' or 'cabal-install'?
 <thoughtpolice> probably Cabal
 <thoughtpolice> since that's what will actually deal with invoking it
 <thoughtpolice> (note that it affects './Setup.hs haddock' too)
 <thoughtpolice> it has to know about the C compiler anyway to compile C files so that could be detected easily
 <thoughtpolice> the easiest fix is to be conservative, and just automatically do that when OS X Version >= 10.9
 <hvr> it doesn't hurt when gcc is used?
 <thoughtpolice> i guess there are people who could be on 10.8 with XCode 5 or whatever, but that seems unlikely and odd
 <thoughtpolice> it does not hurt when GCC is used
 <thoughtpolice> and you don't need to do anything
 <thoughtpolice> so detecting Clang is the most correct choice and can be done easily
 <thoughtpolice> note: easy way to detect clang (the right way to do it) something like
 <thoughtpolice> clang -dM -E - < /dev/null | grep __clang__
 <thoughtpolice> #define __clang__ 1
@tibbe
Copy link
Member

tibbe commented Mar 31, 2014

i really still can't quite understand it despite staring at it for a while

I would like to understand the issue before we do something. What's going wrong? Which program is to blame?

@hvr
Copy link
Member Author

hvr commented Mar 31, 2014

The only thing I can offer right now as additional bit of information (as I don't have OSX here to investigate myself) is

<Fuuzetsu> I know that much but I'm unsure where the workaround is planned to be implemented (cabal side? Haddock side?) and what's happening with it. AFAIK Austin went off to investigate further and I haven't heard since.
<thoughtpolice> Fuuzetsu: sorry I let that out. yes, a fix on the Cabal side need to happen unfortunately, Haddock cannot fix it
<Fuuzetsu> can't we just inject the option on our end or is it too late by that point?
<Fuuzetsu> also, do you expect this workaround to fix with 7.8.1 somehow?
<thoughtpolice> it's too late at that point unfortunately. When you run cabal haddock, it actually preprocesses immediately (Cabal does), then it runs Haddock
<thoughtpolice> it's the first part that needs the -optP-P, not the second, because the first is what generates the line markers in the first place

maybe @thoughtpolice can comment on this a bit more

@thoughtpolice
Copy link
Member

Just to say this first: I don't think this is quite a blocker for the release of GHC, and an 'immediate' fix might not be necessary. For one, there is a temporary workaround. Two, the problem isn't fully understood, so I'm hesitant to rush this when for the most.

That said - the interaction seems to be a bad one between Clang/GHC. I speculate the problem is this: Cabal first preprocesses the source files, and inserts line pragma markers. Then, Haddock is run over all the results of this. But Haddock uses GHC, which in turn invokes the preprocessor again. I suspect this second invocation is where the problem lies.

But after staring at it for an inordinate amount of time, I couldn't make heads or tails as to why. For one, GHC's Lexer actually already handles these markers natively - I even tried tweaking the lexer itself, to no avail. Second, GHC emits line markers in almost the exact same way - some markers with only one \n separator, others with two. And that doesn't explain why if you insert an \n\n after the markers, it works with Clang either.

Overall I don't think a temporary fix would be great in Cabal (I don't have time to write one, but I can test it), but in the mean time we can at least suggest a workaround for the moment.

I'll follow up later with steps to reproduce and verify all this.

@mzero
Copy link
Member

mzero commented Mar 31, 2014

Hold on - I don't think cabal is the right place to deal with this.
HP on OS X already ships with wrappers around both cabal and ghc. Hence, we have ample opportunity to inject -optP-P. Should we do that all the time on clang based systems? Only when we detect ghc has been asked to pre-process? Only when cabal is invoked with haddock?

@thoughtpolice
Copy link
Member

I think Cabal is the right place for the fix, even if it's temporary: there is another case the Haskell Platform cannot fix, which is when people may use custom Setup.hs binaries that get invoked for these stages. In particular, GHC invokes ./Setup haddock ... at one point in the build system, which the Platform can't fix (see ghc/ghc@52c6dc9 for the fix).

Cabal could easily determine if this workaround is needed. It is only needed during cabal haddock when Clang is the preprocessor, which Cabal can already determine based on ghc --info output. It's also easy to determine if Clang is the real deal:

$ clang -dM -E - < /dev/null | grep __clang__
#define __clang__ 1

@tibbe
Copy link
Member

tibbe commented Apr 1, 2014

If this is the only way to unbreak users I'm fine with it, but I'd like to understand the issue better first so we really are getting it fixed.

@mzero
Copy link
Member

mzero commented Apr 1, 2014

On Mon, Mar 31, 2014 at 10:51 PM, Austin Seipp [email protected]:

In particular, GHC invokes ./Setup haddock ... at one point in the build
system, which the Platform can't fix (see ghc/ghc@52c6dc9ghc/ghc@52c6dc970272437aa83a936fc1fe63977fa6178dfor the fix).

You mean GHC's build itself, right? At least that is what seems like in
that bug fix. People building with GHC will never encounter a situation
where GHC itself will call ./Setup haddock correct?

Cabal could easily determine if this workaround is needed. It is only
needed during cabal haddock when Clang is the preprocessor,

On OS X, we alter the settings file of GHC so that instead of calling gcc,
it calls a wrapper script: https://gist.github.com/mzero/7245290

("C compiler command", "/usr/bin/ghc-clang-wrapper")

That wrapper script adds a bunch of arguments to clang, and adjusts some
others. It can easily add -P either all the time, or just when -E is
present.

This won't work for the GHC build itself, but shouldn't that cover
everything else?

  • Mark

P.S.: How fast can we move to cpphs for preprocessing? :-)

@tibbe
Copy link
Member

tibbe commented Apr 1, 2014

It can easily add -P either all the time, or just when -E is present.

Ideally we'd limit this to only when calling Haddock. Otherwise we will get poor error messages for all Haskell code that uses CPP, as the line numbers will be wrong in GHC's error messages.

@cartazio
Copy link
Contributor

cartazio commented Apr 1, 2014

Alternatively: we could add the CPP settings patch to 7.8.1 and ship ghc with cpphs right?

@cartazio
Copy link
Contributor

cartazio commented Apr 1, 2014

(I realize that the CPPHS patch work was punted to 7.8.2, but maybe this merits revisting doing that solution so that none of these wrapper hacks are needed). Especially since it should now validate and install cleanly. (all that'd be needed is to bundle cpphs along with a suitably patched ghc)

@tibbe
Copy link
Member

tibbe commented Apr 1, 2014

@cartazio I don't want to delay 7.8.1 more. The cpphs thing is not tested well enough yet to know it doesn't break even more stuff.

@cartazio
Copy link
Contributor

cartazio commented Apr 1, 2014

fair enough, though with the most recent CPPHS lens does build as is, which is a pretty good stress test

@hvr
Copy link
Member Author

hvr commented Apr 4, 2014

fyi, one of the problems seem to be that the second pre-processing via by clang -E tends to insert line-pragmas with a leading whitespace char (this doesn't happen with GNU cpp), e.g.

{-# LINE 1 "Control/Applicative/Free.hs" #-}
# 1 "Control/Applicative/Free.hs"
# 1 "<built-in>" 1
# 1 "Control/Applicative/Free.hs" 2
{-# LINE 1 "src/Control/Applicative/Free.hs" #-}
# 1 "src/Control/Applicative/Free.hs"

 # 1 "<built-in>" 1

 # 20 "<built-in>"

 # 1 "dist/build/autogen/cabal_macros.h" 1
...

@23Skidoo
Copy link
Member

Looks like we agreed to not add a Cabal workaround for this. Please reopen if disagree.

@elaforge
Copy link

ghc-7.8.3 is out and we still have this problem. I think this is pretty bad user experience: parallel cabal no longer prints errors to stdout, so you don't even know which packages failed to get documentation. You find out much later when there are broken links on the local haddock. Then you eventually figure out to manually unpack and haddock a package and see the error, then google it and the only references are two bugs (this one and the haddock one), both closed with "will not fix", and no fix in sight.

The workaround seems to be to manually unpack and haddock each package with -P, then manually copy the haddock into place (a reinstall of everything has scary warnings about breaking other things). That's a lot of work! So the result is that OS X users either don't get reliable haddock, or can't use cabal-install.

The solutions seem to be use cpphs, or put a hack in cabal. cpphs seems like it may not happen for a while, so what's wrong with the cabal hack?

It would be nice at least to have a plan to fix documented in this thread so future people searching can find something more satisfying than "broken with no plan"! Is cpphs the plan?

In any case, for any future searchers, I eventually figured out you can add -P to "Haskell CPP Flags" in /usr/local/lib/ghc-7.8.3/settings, then reinstall everything.

@thoughtpolice
Copy link
Member

@elaforge GHC 7.8.3 contains a fix to the compiler that allows you to trivially set the C preprocessor for Haskell code to a separate tool like cpphs, as you noted. This makes it significantly easier for users to fix it without messing around too much, hopefully.

In the future we're seriously considering bundling cpphs with GHC itself as a separate tool, and having it default to it.

Personally I think the fix suggested in the OP should still be considered as a hack, but I can't reopen this ticket?

@mzero
Copy link
Member

mzero commented Jul 11, 2014

  1. Does adding "-P" to "Haskell CPP Flags" fix this, and is that safe for
    all compilations? If so, Haskell Platform will ship with a script that runs
    post-install and already is fixing up that field based on the user's
    installed C compiler. I can add this to that script.

  2. Does this affect the pre-built haddock that is in the 7.8.3 bindist
    already (the bindist I built last night), or only haddock built for some
    projects afterward?

@elaforge
Copy link

@thoughtpolice I assume that's "Haskell CPP command"? I'm all in favor of using cpphs. Would a complete hackage build prove it good enough? I don't know how to do that, but I know some people do.

@mzero
1 - All I can say is "works for me" (tm) :) I couldn't find any documentation for -P so I don't even know what it does!

2 - I'm using the 7.8.3 for OS X downloaded from http://www.haskell.org/ghc/download_ghc_7_8_3, if that's the one you built, then yes, it does affect that.

@mzero
Copy link
Member

mzero commented Jul 11, 2014

On Fri, Jul 11, 2014 at 12:17 PM, Evan Laforge [email protected]
wrote:

1 - All I can say is "works for me" (tm) :)

well.. that's a good data point! your systems has more than the average
amount of code and packages I imagine! :-)

I couldn't find any documentation for -P so I don't even know what it
does!

[966] : clang --help | grep -e '-P'

-P Disable linemarker output in -E mode

2 - I'm using the 7.8.3 for OS X downloaded from

http://www.haskell.org/ghc/download_ghc_7_8_3, if that's the one you
built, then yes, it does affect that.

Yikes - I wasn't clear. Is there missing/broken doc in that build? I
realize that doc it builds is subject to this problem... but I'm wondering
if any of the doc that comes befell this fate.

  • Mark

@hvr
Copy link
Member Author

hvr commented Jul 12, 2014

@elaforge

I assume that's "Haskell CPP command"? I'm all in favor of using cpphs. Would a complete hackage build prove it good enough? I don't know how to do that, but I know some people do.

Yeah, I'd say that's the most important metric. Maybe @snoyberg could help here by performing a cpphs-based Stackage build?

@cartazio
Copy link
Contributor

for what its worth, on my mac I'm using the following configure invocation on the binary dist

./configure --with-gcc=clang --with-hs-cpp=gcc-4.9

you can get CPPHS flags setup correctly if you do
./configure --with-gcc=clang --with-hs-cpp=cpphs
before you make install the bindist

NB: last I checked, you can't build GHC with cpphs, maybe thats improved,
theres still a few spots where GCC seems to be hard coded in the build system

@elaforge
Copy link

Oh, if you mean the docs bundled with ghc, they seem to all be present.

Also, as Johan mentioned, we would want -P only for haddock builds since it messes up line numbers in error msgs.

@mzero
Copy link
Member

mzero commented Jul 15, 2014

Okay - well, now we are back at square 1: I can have the OS X Platform installer fix up the newly added "Haskell CPP flags" settings entry to include -P on clang systems....

...but that will get used for all pre-processing, not just haddock, and it will break the line numbers in compilation error messages for -XCPP files.

@mzero
Copy link
Member

mzero commented Jul 15, 2014

AHA! may be solved

On Mac OS X, we wrap cabal. The purpose of the wrapper script is to install a default .cabal/~config for users. The new cabal config includes this option:

program-default-options
  haddock-options:

Can set it there?

program-default-options
  haddock-options: --optghc=-optP-P

In particular, when building haddock, does cabal call ghc to do the preprocessing step, or does haddock?

@mzero
Copy link
Member

mzero commented Jul 20, 2014

FOUND! The root cause is a bug in cabal.
See https://gist.github.com/mzero/d4ba11c567977111749a for the gory details

@tibbe
Copy link
Member

tibbe commented Jul 22, 2014

So what do we do about it. We could make another Cabal release in the 1.18 series and ship that with the HP instead of the Cabal that shipped with GHC 7.8.3. I guess that would be morally equivalent to how some Linux distros ship patched versions of libs. I'm happy to make the extra Cabal release, if someone cooks up a patch that actually solves the problem.

@tibbe tibbe reopened this Jul 22, 2014
@mzero
Copy link
Member

mzero commented Jul 22, 2014

I'm assuming that the call to preprocessComponent
https://github.com/haskell/cabal/blob/ghc-7.6.3-release/Cabal/Distribution/Simple/Haddock.hs#L203
(via local function pre) is *not *the code responsible for the first CPP
pass I see, and that prepareSrouces
https://github.com/haskell/cabal/blob/ghc-7.6.3-release/Cabal/Distribution/Simple/Haddock.hs#L236
is.

If so, then the tension is between the CPP call in prepareSources
https://github.com/haskell/cabal/blob/ghc-7.6.3-release/Cabal/Distribution/Simple/Haddock.hs#L250
... and the eventual -optghc=-XCPP option that is genreated in
renderPureArgs
https://github.com/haskell/cabal/blob/ghc-7.6.3-release/Cabal/Distribution/Simple/Haddock.hs#L438.
I can think of three approaches:

  1. Don't do the CPP processing in prepareSources - but I don't know if
    unlit (for .lhs files) can pass through preprocessing directives unharmed
  2. If we do the CPP processing in prepareSources (2.1: perhaps only if
    needed because of .lhs files) then it could fiddle with the args to remove
    the Extension info so that renderPureArgs doesn't end up adding
    -optghc=-XCPP
  3. Filter the -optghc=-XCPP from the generation of options in
    renderPureArgs

I suspect #2.1 might be the right thing to do... but I don't know enough
about preprocessing to know if my assumption at top is right, and how unlit
and CPP sources can be correctly handled. Also, there seems to be provision
in here for Haddock pre-2.0 which can't pass the -optghc? Not sure what
that is about.

​Lastly, this whole bit of logic seems to be missing in HEAD of master. Did
that mean that it got sucked into preprocessComponent? Or is it gone?

  • Mark

@tibbe
Copy link
Member

tibbe commented Jul 22, 2014

/cc @dcoutts

@mzero
Copy link
Member

mzero commented Jul 22, 2014

SCRAP ALL THAT! This commit: [5729bc5] does what we need... just back port?
Mind you this is in Cabal, not cabal-install....

@mzero
Copy link
Member

mzero commented Jul 22, 2014

Plan of action is: cherry-pick [5bcb6f], [a718eb0], [98c537f], [5729bc5], and [ba4ae3d]. If they solve the problem, then this bug is closed by those changes, already in master.

Then we'll back-port them to 1.18, and ship that in the OS X version of the Platform.

@mzero
Copy link
Member

mzero commented Jul 23, 2014

See pull request #2010 - fixes this issue. I have verified it on a clang based system, with GHC 7.8.3.

@tibbe
Copy link
Member

tibbe commented Jul 23, 2014

I have merged #2010 and will make a Cabal release this afternoon (CET).

@elaforge
Copy link

Many thanks to both mzero and tibbe for tracking down and fixing the problem, and for backporting. I just installed Cabal-1.18.1.4 and haddock is back in action.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants