Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/go: add timing information to -debug-actiongraph #15736

Closed
josharian opened this issue May 18, 2016 · 14 comments
Closed

cmd/go: add timing information to -debug-actiongraph #15736

josharian opened this issue May 18, 2016 · 14 comments
Labels
FeatureRequest FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. ToolSpeed
Milestone

Comments

@josharian
Copy link
Contributor

In order to get insight into how cmd/go schedules work, I added the ability to generate a trace using the same viewer as cmd/trace. There is a screenshot and sample html output at #15734. It was generated using the command go build -a -trace std.html std. A partial implementation (doesn't trace downloads, test running, etc.) can be seen at unstable url.

This issue is to discuss whether this would be worth adding to cmd/go in 1.8.

Advantages:

  • Nice clean way to understand where the time goes
  • Could be useful for people to find slow-compiling packages, which they could then decompose (or file issues for!)
  • Could be useful for people to find compilation bottlenecks due to dependencies, which they could then choose to break

Disadvantages:

  • Most helpful for people working on Go, not people using it
  • Might inspire people to do foolish things to break up packages or eliminate dependencies. For example, when I saw that compiling cmd/compile was significantly linear due to the dependency chain cmd/internal/obj/x86 -> cmd/compile/internal/ssa -> cmd/compile/internal/gc, I was immediately tempted to change the SSA rewrite rules to emit int literals (1234) rather than instruction codes (x86.AMOVL), to break that first dependency. Not a good idea.
    trace_cmd_compile

Feedback requested.

@josharian josharian added this to the Proposal milestone May 18, 2016
@randall77
Copy link
Contributor

I think this is pretty cool.
That said, parallelizing packages and parallelizing compilation within a
package (which we're also thinking about doing), are going after the same
idle cycles. Maybe we shouldn't do both if one of them would be enough.

On Wed, May 18, 2016 at 3:04 PM, Josh Bleecher Snyder <
[email protected]> wrote:

In order to get insight into how cmd/go schedules work, I added the
ability to generate a trace using the same viewer as cmd/trace. There is a
screenshot and sample html output at #15734
#15734. It was generated using the
command go build -a -trace std.html std. A partial implementation
(doesn't trace downloads, test running, etc.) can be seen at unstable url
josharian@a5a9f75
.

This issue is to discuss whether this would be worth adding to cmd/go in
1.8.

Advantages:

  • Nice clean way to understand where the time goes
  • Could be useful for people to find slow-compiling packages, which
    they could then decompose (or file issues for!)
  • Could be useful for people to find compilation bottlenecks due to
    dependencies, which they could then choose to break

Disadvantages:

  • Most helpful for people working on Go, not people using it
  • Might inspire people to do foolish things to break up packages or
    eliminate dependencies. For example, when I saw that compiling cmd/compile
    was significantly linear due to the dependency chain cmd/internal/obj/x86
    -> cmd/compile/internal/ssa -> cmd/compile/internal/gc, I was immediately
    tempted to change the SSA rewrite rules to emit int literals (1234) rather
    than instruction codes (x86.AMOVL), to break that first dependency. Not a
    good idea. [image: trace_cmd_compile]
    https://cloud.githubusercontent.com/assets/67496/15376338/43fa3772-1d09-11e6-9fe3-8597e9756e9d.png

Feedback requested.


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#15736

@bradfitz
Copy link
Contributor

That said, parallelizing packages and parallelizing compilation within a
package (which we're also thinking about doing), are going after the same
idle cycles. Maybe we shouldn't do both if one of them would be enough.

Good point. But parallelizing compilation within a package requires removing a bunch of remaining global variables in the compiler, which might take longer than Josh's proposal?

@randall77
Copy link
Contributor

On Wed, May 18, 2016 at 3:33 PM, Brad Fitzpatrick [email protected]
wrote:

That said, parallelizing packages and parallelizing compilation within a
package (which we're also thinking about doing), are going after the same
idle cycles. Maybe we shouldn't do both if one of them would be enough.

Good point. But parallelizing compilation within a package requires
removing a bunch of remaining global variables in the compiler, which might
take longer than Josh's proposal?

I don't know. Global variables are kinda hard, but not insurmountable.
Parallelizing compiles sounds like the code is easier, but it may make the
system harder to manage - how many parallel compiles should be running?
How much memory would that take? What if there's an error? The
multiprocess spiderweb gets unwieldy after a while.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#15736 (comment)

@josharian
Copy link
Contributor Author

parallelizing packages and parallelizing compilation within a
package (which we're also thinking about doing), are going after the same
idle cycles

Indeed. However, (a) there will always be inefficiencies in getting work scheduled, so doing both could help nevertheless and (b) as @bradfitz observed, scheduling compilation processes better helps a lot more if you start to ship compilation off over the network to other machines.

Parallelizing compiles sounds like the code is easier, but it may make the
system harder to manage - how many parallel compiles should be running?
How much memory would that take? What if there's an error? The
multiprocess spiderweb gets unwieldy after a while.

cmd/go already parallelizes compiles, just not super efficiently; it already handles errors and juggles the spiderweb. That said, I rather dislike the cmd/go codebase, in part for this reason. :)

cmd/go chooses ncpus as the number of parallel compiles to run, which can be altered with -p. My experiments with the -p flag so far indicate that doing fewer parallel compiles is clearly bad for wall time. And that despite the fact that cmd/compile actually already does rely on more than a single CPU (due to GC?): compiling with GOMAXPROCS=1 on my machine causes a roughly 30-50% slowdown.

Also worth noting is that not all of cmd/go's work is in the compiler--cgo and resulting c compilations can be a big part of compile times, and those are not likely to be as core-saturating as cmd/compile anytime soon.

So I'm inclined to say it is still worth pursuing both paths. And we have the tools to control concurrency on both fronts (-p, GOMAXPROCS), so benchmarking and experimentation is easy.

@randall77
Copy link
Contributor

On Wed, May 18, 2016 at 3:51 PM, Josh Bleecher Snyder <
[email protected]> wrote:

parallelizing packages and parallelizing compilation within a
package (which we're also thinking about doing), are going after the same
idle cycles

Indeed. However, (a) there will always be inefficiencies in getting work
scheduled, so doing both could help nevertheless and (b) as @bradfitz
https://github.com/bradfitz observed, scheduling compilation processes
better helps a lot more if you start to ship compilation off over the
network to other machines.

Parallelizing compiles sounds like the code is easier, but it may make the
system harder to manage - how many parallel compiles should be running?
How much memory would that take? What if there's an error? The
multiprocess spiderweb gets unwieldy after a while.

cmd/go already parallelizes compiles, just not super efficiently; it
already handles errors and juggles the spiderweb. That said, I rather
dislike the cmd/go codebase, in part for this reason. :)

cmd/go chooses ncpus as the number of parallel compiles to run, which can
be altered with -p. My experiments with the -p flag so far indicate that
doing fewer parallel compiles is clearly bad for wall time. And that
despite the fact that cmd/compile actually already does rely on more than a
single CPU (due to GC?): compiling with GOMAXPROCS=1 on my machine causes a
roughly 30-50% slowdown.

Also worth noting is that not all of cmd/go's work is in the compiler--cgo
and resulting c compilations can be a big part of compile times, and those
are not likely to be as core-saturating as cmd/compile anytime soon.

True.

So I'm inclined to say it is still worth pursuing both paths. And we have
the tools to control concurrency on both fronts (-p, GOMAXPROCS), so
benchmarking and experimentation is easy.

Scheduling multiple parallel-capable processes is tricky. Suppose we're in
the middle of a build and there's only one compile that can be started. Do
we give it all the rest of the available cpus? Maybe the answer is yes.
But then what if a previously-issued compile completes and now a bunch of
new compiles are ready to go? We've already committed almost all the
processors to that one compile, so there aren't many left for all the
(possibly on the critical path) compiles that are now ready. Maybe we
could steal some processors back from the compile we already started?
Pretty soon we've reimplemented the entire Go work-stealing scheduler in
cmd/go.

I'm not saying we shouldn't do this. I'd really like to have the tracing
output, at least. But getting the parallelization right isn't easy and
splitting up the decision point into two different binaries won't help any.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#15736 (comment)

@davecheney
Copy link
Contributor

I fear this line of investigation is mostly my fault because of my focus on
the from scratch compile time of jujud. It is a useful benchmark because it
runs in human time scales and has generated great results including the
huge improvements in link time.

Without excluding the excellent investigation that Josh has done, I think
that the focus should be on parallising cmd/compile as that will give
benefit to all compiler users, not just the ones that build front scratch
every time.

Thanks

Dave

On Thu, 19 May 2016, 09:11 Keith Randall, [email protected] wrote:

On Wed, May 18, 2016 at 3:51 PM, Josh Bleecher Snyder <
[email protected]> wrote:

parallelizing packages and parallelizing compilation within a
package (which we're also thinking about doing), are going after the same
idle cycles

Indeed. However, (a) there will always be inefficiencies in getting work
scheduled, so doing both could help nevertheless and (b) as @bradfitz
https://github.com/bradfitz observed, scheduling compilation processes
better helps a lot more if you start to ship compilation off over the
network to other machines.

Parallelizing compiles sounds like the code is easier, but it may make
the
system harder to manage - how many parallel compiles should be running?
How much memory would that take? What if there's an error? The
multiprocess spiderweb gets unwieldy after a while.

cmd/go already parallelizes compiles, just not super efficiently; it
already handles errors and juggles the spiderweb. That said, I rather
dislike the cmd/go codebase, in part for this reason. :)

cmd/go chooses ncpus as the number of parallel compiles to run, which can
be altered with -p. My experiments with the -p flag so far indicate that
doing fewer parallel compiles is clearly bad for wall time. And that
despite the fact that cmd/compile actually already does rely on more
than a
single CPU (due to GC?): compiling with GOMAXPROCS=1 on my machine
causes a
roughly 30-50% slowdown.

Also worth noting is that not all of cmd/go's work is in the
compiler--cgo
and resulting c compilations can be a big part of compile times, and
those
are not likely to be as core-saturating as cmd/compile anytime soon.

True.

So I'm inclined to say it is still worth pursuing both paths. And we have
the tools to control concurrency on both fronts (-p, GOMAXPROCS), so
benchmarking and experimentation is easy.

Scheduling multiple parallel-capable processes is tricky. Suppose we're in
the middle of a build and there's only one compile that can be started. Do
we give it all the rest of the available cpus? Maybe the answer is yes.
But then what if a previously-issued compile completes and now a bunch of
new compiles are ready to go? We've already committed almost all the
processors to that one compile, so there aren't many left for all the
(possibly on the critical path) compiles that are now ready. Maybe we
could steal some processors back from the compile we already started?
Pretty soon we've reimplemented the entire Go work-stealing scheduler in
cmd/go.

I'm not saying we shouldn't do this. I'd really like to have the tracing
output, at least. But getting the parallelization right isn't easy and
splitting up the decision point into two different binaries won't help any.


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#15736 (comment)


You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub
#15736 (comment)

@josharian
Copy link
Contributor Author

josharian commented May 19, 2016

Thanks, everyone, for the input, very helpful.

I've now also heard privately that this tracing was helpful, for a non-Go-contributor.

So based on this conversation, I'll plan to move forward on adding tracing to cmd/go early in 1.8, and put the other proposal (#15734) on hold until cmd/compile is more concurrent, at which point we can re-evaluate.

Is there an issue or place where I can track progress on (and contribute to) making cmd/compile more concurrent?

@randall77
Copy link
Contributor

I don't know of a issue for that. I'll open one.

On Thu, May 19, 2016 at 10:44 AM, Josh Bleecher Snyder <
[email protected]> wrote:

Thanks, everyone, for the input, very helpful.

I've now also heard privately that this tracing was helpful, for a non-Go
developer.

So based on this conversation, I'll plan to move forward on adding tracing
to cmd/go early in 1.8, and put the other proposal (#15734
#15734) on hold until cmd/compile is
more concurrent, at which point we can re-evaluate.

Is there an issue or place where I can track progress on (and contribute
to) making cmd/compile more concurrent?


You are receiving this because you commented.
Reply to this email directly or view it on GitHub
#15736 (comment)

@bradfitz bradfitz changed the title proposal: add tracing to cmd/go builds cmd/go: add tracing to builds Aug 22, 2016
@bradfitz
Copy link
Contributor

@josharian, this is marked early. Got a CL ready?

@josharian
Copy link
Contributor Author

A have a CL at 70%. Will put it at top of my queue. I also don't see harm in removing the early milestone; this is not as intrusive or risky as I originally thought.

@josharian
Copy link
Contributor Author

This CL just made it to the top of my queue for cleanup and mailing, but not quite fast enough. I'm out on vacation until late October. This might or might not happen for Go 1.8. If not 1.8, though, 1.9; I do intend to see it through.

If anyone wants to pick up the ball and run with it while I'm away, the WIP code is at this unstable link. The major TODOs are:

  • Split this CL into several. At a minimum, one doing a tracing refactor and one adding trace support to cmd/go.
  • Add tracing to non-build steps.

@quentinmit
Copy link
Contributor

I'm going to move this to Go1.8Maybe since you might or might not get to it for 1.8 but it's not going to block the 1.8 release. Good luck with your vacation, and I am also looking forward to seeing this feature in Go!

We can move this to Go1.9 if it misses 1.8.

@quentinmit quentinmit modified the milestones: Go1.8Maybe, Go1.8Early Sep 26, 2016
@quentinmit quentinmit added the NeedsFix The path to resolution is known, but the work has not been done. label Oct 10, 2016
@rsc rsc modified the milestones: Go1.9, Go1.8Maybe Oct 20, 2016
@bradfitz bradfitz modified the milestones: Go1.10, Go1.9 Jun 19, 2017
@rsc
Copy link
Contributor

rsc commented Oct 25, 2017

There's now an (intentionally) undocumented -debug-actiongraph=x.json option that will dump the action graph for a build. If we add the times to the action graph then I think this issue can be closed. I would prefer to keep it undocumented since I don't want to commit to the specific format, API, and so on.

@rsc rsc changed the title cmd/go: add tracing to builds cmd/go: add timing information to -debug-actiongraph Oct 25, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.11 Nov 28, 2017
@ianlancetaylor ianlancetaylor modified the milestones: Go1.11, Unplanned Jul 3, 2018
@rsc
Copy link
Contributor

rsc commented May 16, 2019

CL 177138 adds timing.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
FeatureRequest FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. ToolSpeed
Projects
None yet
Development

No branches or pull requests

8 participants