Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stack build --ghc-output (or similar) to print GHC output during a build #426

Closed
spinda opened this issue Jun 26, 2015 · 31 comments
Closed

Comments

@spinda
Copy link

spinda commented Jun 26, 2015

Provide an option to pipe GHC's output to the terminal to better keep track of progress, instead of hanging on Cabal-1.22.4.0: build for dependencies that take some time to compile.

@snoyberg
Copy link
Contributor

We already have logic in place to send output to stdout under specific circumstances (namely: on the terminal, only one package left to build, and it's a "wanted" package). Adding a flag to simply override that logic with "always output to stdout" is trivial, though the result will likely not be what you're looking for unless you also use --jobs 1.

@chrisdone Thoughts?

@spinda
Copy link
Author

spinda commented Jun 26, 2015

although the result will likely not be what you're looking for unless you also use --jobs 1

As long as it gives some indication of GHC making progress, it's helpful.
Perhaps, in the multi-jobs context, output lines could be prefixed with the
name of the package generating them?

On Fri, Jun 26, 2015, 03:21 Michael Snoyman [email protected]
wrote:

We already have logic in place to send output to stdout under specific
circumstances (namely: on the terminal, only one package left to build, and
it's a "wanted" package). Adding a flag to simply override that logic with
"always output to stdout" is trivial, though the result will likely not be
what you're looking for unless you also use --jobs 1.

@chrisdone https://github.com/chrisdone Thoughts?


Reply to this email directly or view it on GitHub
#426 (comment)
.

@chrisdone
Copy link
Member

Yeah, outputting interleaved GHC output would be a bit messy. We could indeed prefix lines, that was discussed once. It might not be too bad.

@snoyberg
Copy link
Contributor

@spinda This part of the code base is pretty isolated. You interested in giving a crack at a PR on it? I can give you pointers of where to start.

@spinda
Copy link
Author

spinda commented Jun 26, 2015

@snoyberg I'm traveling from Friday to Monday, but if the delay isn't an issue then I'd be happy to have a go at it.

@snoyberg
Copy link
Contributor

Shouldn't be a problem at all. When you're ready to look into this, check out Stack.Build.Execute and look for withLogFile and console.

@snoyberg snoyberg added this to the 0.2.0.0 milestone Jun 26, 2015
@chrisdone
Copy link
Member

👍

@mgsloan
Copy link
Contributor

mgsloan commented Jun 29, 2015

This may well be an overengineered solution, but it seems like it ought to be possible to have uninterleaved GHC output if some of it is delayed. Here's how this would work:

  1. The first package being built would get control of stdout, so you get to see its live progress.

  2. When that package finishes building, and does so successfully, the output from the other builds that finished can be outputted in bulk.

  3. Then, it'd pick the next oldest build to get control over stdout (and it'll dump what it has so far for it). This makes it likely that you'll get the progress output from the package build that's holding things up. Continue with step 2.

Stack messages and logging would get interleaved with the ghc output, but that seems rather reasonable to me, if line buffered.

@snoyberg
Copy link
Contributor

@mgsloan I'm -1 on that solution, it seems like it's contrary to the stated goal of this issue: get regular output to stdout as long as a build is occurring. Delaying output from other processes would impair that. Additionally, implementing the solution you've described would be very difficult to do without introducing the possibility of significant memory usage or hurting performance in the common case.

@mgsloan
Copy link
Contributor

mgsloan commented Jun 29, 2015

Agreed, the complication isn't worth it.

@snoyberg
Copy link
Contributor

snoyberg commented Jul 7, 2015

@spinda Are you still planning on looking into this?

@rrnewton
Copy link
Contributor

(Continuing conversation here from the duplicate #642.)

Regarding the complexity of the implementation -- my hope was that this policy could be completely encapsulated in a function roughly of the type Stream (Stream ByteString) -> Stream ByteString, with the simplest trivial implementation being immediately merging all substream's lines on the output.

Once the policy is factored out like that, then any of these fancy policies could live in their own modules, expose that one function, and not touch anything else. (And also not be the default, probably.) Maybe with some help factoring the plumbing (to route through that single arbiter function), then subsequently myself or @spinda, or @mgsloan or whoever could easily implement and try out different policies.

A richer type than the one above will probably be necessary to have metadata for the substreams (e.g. human readable names) plus some way to implement the current policy of hiding output but still printing status messages....

I don't know conduit yet, but I'll have a look at the async/concurrent stuff out there for it and then make a specific proposal based on Conduit.

@mgsloan
Copy link
Contributor

mgsloan commented Jul 22, 2015

Even modules / functions that are only used in one spot add complexity - they'd also need to be wired up to the CLI.

my hope was that this policy could be completely encapsulated in a function roughly of the type Stream (Stream ByteString) -> Stream ByteString

In the other thread, @snoyberg mentioned concerns about huge logs building up in memory. For consumers of this stream that wait a while to send their data, this could easily be the case (and hard to catch when only tested on small examples). It's not clear how to resolve this while keeping the type of the function simple.

That said, I do think this output stuff could use some attention. Here are the things I'd like to see resolved:

  1. If you're just building one thing, you get the output in the console, but no log file. Here's the code. To me, it'd be more consistent to have both.

  2. A long running build or test can lead the user to think that it's stuck.

  3. It's easier to ignore warnings when you don't see all the output. Not sure what to do about this.

One somewhat hacky approach to these last two would be to attempt to parse out the following parts of output:

  • GHC's [N of M] prefixes telling compilation progress. Instead of using direct stdout, stack could use TTY stuff to tell you about the current progress of each package.
  • Parse out GHC messages and emit them out at the end.
  • This is a stretch, but possibly parse out hspec style test output.

Considering how hacky this is, I wouldn't want it to make it stack's canonical behavior, at least not at first. So, yeah, maybe it'd be good to have this be something where it's easy to swap in a different output behavior.

my hope was that this policy could be completely encapsulated in a function roughly of the type Stream (Stream ByteString) -> Stream ByteString

@rrnewton
Copy link
Contributor

Well it would be nice if the sticky line could report fine-grained module progress as well as coarse-grained progress. But that does require a hacky parsing of GHC output, right? (I suppose the alternative is to do like Jenkins and remember what typical amounts of time/console-output are for particular subtasks, to estimate progress based on that.)

Regarding memory usage:
It should be possible to have constant-memory stream transducers of the type Stream (Stream ByteString) -> Stream ByteString which are nice in practice. So, for example, even a strategy that typically passes through build-output when a subtask is complete, could have a max buffer size after which it passes through chunks of output even if not complete.

I think some experimentation could lead to a real-time behavior that feels comfortable, not leaving those "it looks stuck" windows of time, nor producing ugly interleavings.

@rrnewton
Copy link
Contributor

Actually, on second thought, even without having a "denominator", just having a counter in the sticky line that tracks the total number of lines of output from all subtasks.... that would go a huge way towards showing that the thing isn't stuck.

@snoyberg snoyberg modified the milestones: 0.3.0.0, 0.2.0.0 Aug 5, 2015
@scott-fleischman
Copy link
Contributor

Whatever you can do to improve the output for multiple projects would be appreciated. This is the most frustrating part of stack for me at the moment. I get a lot of spurious Travis CI build failures. Locally I'd much rather get all the output interleaved (with some prefix would be nice) than have no output until it hits an error. Thanks.

@cdepillabout
Copy link
Member

As mentioned in #896, I would like to see ghc warnings for my local packages when doing a multi-package stack build.

One work around is to pass the --pedantic flag to stack build, but there are times when I don't want the whole build to fail on a warning, I just want to see them.

Another work around is to pass a package name to stack build in order to get it to do a single-package build. When working on multiple packages simultaneously, this is kind of annoying.

@fizruk
Copy link
Contributor

fizruk commented Oct 26, 2015

What is the status on this? Is someone working on a PR or should I just try do that myself (seems pretty straightforward to implement)?

@snoyberg
Copy link
Contributor

I do not believe anyone else is working on this right now.

@cdepillabout
Copy link
Member

The other day I was trying to see template haskell output by passing --ghc-options='-ddump-splices' to stack, but stack didn't print anything to stdout.

Maybe a --ghc-output` flag would solve this problem as well?

@snoyberg
Copy link
Contributor

Dump output should be going into a dump file inside .stack-work

On Tue, Oct 27, 2015 at 6:15 AM, Dennis Gosnell [email protected]
wrote:

The other day I was trying to see template haskell output by passing
--ghc-options='-ddump-splices' to stack, but stack didn't print anything
to stdout.

Maybe a --ghc-output` flag would solve this problem as well?


Reply to this email directly or view it on GitHub
#426 (comment)
.

@cdepillabout
Copy link
Member

Dump output should be going into a dump file inside .stack-work

I did find the file it was going to, but I had assumed it would work like cabal where the code generated by template haskell is output to stdout.

@borsboom
Copy link
Contributor

@cdepillabout: We pass -ddump-hi -ddump-to-file to GHC because we use the .dump-hi files to find unlisted modules and TH dependencies. But there doesn't seem to be a way to tell GHC to only write the the dump-hi files but not other dumps (like -ddump-splices). Is having the splices written to files instead of to stdout causing any real problems for anyone? I usually prefer them to go to files anyway. I've added a FAQ entry, so at least it's documented now.

@cdepillabout
Copy link
Member

@borsboom Thanks for the updated documentation! Having the -ddump-splices output go to a file instead of stdout isn't causing me any real problems, but I had expected it to work the same as it does with cabal. I think the updated FAQ solves most of the problem!

@borsboom
Copy link
Contributor

Should probably use a different argument name than --ghc-output, since this would also include test and benchmark progress output. Maybe --all-tasks-output?

@decentral1se
Copy link
Member

Is having the splices written to files instead of to stdout causing any real problems for anyone?

@borsboom - I think I am getting stung by this.

EDIT: Oh wait, this was unrelated. Sorry for confusion

@snoyberg
Copy link
Contributor

I've opened PR #2594 to add a --dump-logs flag and dump-logs config option.

@nh2
Copy link
Collaborator

nh2 commented Jun 20, 2017

I've filed #3225 for a flag that outputs to stdout, instead of a file that one first has to find.

@saurabhnanda
Copy link

IIUC --dump-logs prints logs only after a build is complete. Would it be possible to output a . for every compiled module. Use-cases:

  • Avoid hitting timeouts [1] on some CI environments.
  • Reporting progress is important for impatient devs :)

[1] https://discuss.circleci.com/t/too-long-with-no-output-while-building-docker-image/12275

@mgsloan
Copy link
Contributor

mgsloan commented Oct 23, 2017

@saurabhnanda Should be possibly by parsing GHC output. How about opening a new issue about it? This one is closed and already has a ton of comments.

@snoyberg
Copy link
Contributor

Relevant PR: #4097

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests