
achieve better parallelism when building deps #5014

Closed
matthiaskrgr opened this issue Feb 6, 2018 · 3 comments · Fixed by #5100

Comments

@matthiaskrgr
Member

Note: this is an oversimplified example.
Let's say my crate dependencies look like this:

A -requires-> B , C
B -requires-> D , E
A
├── B
│   ├── D
│   └── E
└── C

I noticed that cargo would build C and one of D or E in parallel, then the other one, then B, and finally A.
It would be smarter to realize that we can build D and E in parallel first, and then B and C in parallel.

before:

core1   core2
C       E
D
B
A

suggested:

core1   core2
E       D
C       B
A

The current behaviour also happens when forcing codegen units to 1.
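
For concreteness, here is a minimal sketch of the ordering being suggested (plain illustration code, nothing from Cargo; the `depth` helper and the hard-coded crate names are made up for this example). Ranking crates by how deep they sit below the root crate A and building the deepest first reproduces the suggested schedule: D and E, then B and C, then A.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Length of the longest chain of crates that still have to be built after
/// `krate` finishes, i.e. its distance below the root of the dependency tree.
fn depth(
    krate: &'static str,
    deps: &HashMap<&'static str, Vec<&'static str>>,
    memo: &mut HashMap<&'static str, usize>,
) -> usize {
    if let Some(&d) = memo.get(krate) {
        return d;
    }
    let d = deps
        .iter()
        .filter(|(_, ds)| ds.contains(&krate)) // every crate that depends on `krate`
        .map(|(&parent, _)| 1 + depth(parent, deps, memo))
        .max()
        .unwrap_or(0); // the root itself has depth 0
    memo.insert(krate, d);
    d
}

fn main() {
    // The example from this issue: A depends on B and C, B depends on D and E.
    let deps: HashMap<&str, Vec<&str>> = HashMap::from([
        ("A", vec!["B", "C"]),
        ("B", vec!["D", "E"]),
        ("C", vec![]),
        ("D", vec![]),
        ("E", vec![]),
    ]);
    let mut memo = HashMap::new();
    let mut order: Vec<&str> = deps.keys().copied().collect();
    order.sort_by_key(|k| Reverse(depth(*k, &deps, &mut memo)));
    // Deepest first: D and E (depth 2), then B and C (depth 1), then A (depth 0).
    println!("{:?}", order);
}
```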

@alexcrichton
Member

Indeed! Right now there are no heuristics for "what crate to schedule next", and this sounds like an excellent heuristic to me. The logic would go around here.

@lukaslueg
Contributor

lukaslueg commented Feb 7, 2018

We can do that using Hu's scheduling algorithm. It assumes that all packages take an equal amount of time to compile; that is never true, but it's the best bet we can make. We'd need to compute the path length from the top package to each ready-to-be-scheduled package, which in turn we can do with a simple topological sort (the DependencyQueue is guaranteed to never have cycles, right? Right?).
As far as I can see we would currently need to recompute that on every call to dequeue, because DependencyQueue is designed to be mutable while jobs are being dequeued (that is, we currently do not pre-compute a definite job schedule ahead of time).

I'm wondering whether the added complexity in dequeue() would be considered worth it. The runtime cost of doing a topological sort of the dependency queue on every call to dequeue() is probably very small.
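
To make that concrete, here is a rough sketch of the heuristic, using a simplified stand-in for `DependencyQueue` rather than Cargo's real type (the `Queue` struct and the `waiting_on`, `level`, and `dequeue` names are invented for illustration). Each call to `dequeue` recomputes, via a memoized walk over the reverse edges, the length of the longest path from a package up to the top package (its level in Hu's algorithm) and hands out the ready package that heads the longest remaining chain:

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for Cargo's DependencyQueue: each not-yet-built package
/// maps to the set of dependencies it is still waiting on.
struct Queue {
    waiting_on: HashMap<String, HashSet<String>>,
}

impl Queue {
    /// Length of the longest path from `pkg` up to a package nothing depends on
    /// (its level in Hu's algorithm). Assumes the graph is acyclic.
    fn level(&self, pkg: &str, memo: &mut HashMap<String, usize>) -> usize {
        if let Some(&l) = memo.get(pkg) {
            return l;
        }
        let l = self
            .waiting_on
            .iter()
            .filter(|(_, deps)| deps.contains(pkg)) // packages that depend on `pkg`
            .map(|(parent, _)| 1 + self.level(parent, memo))
            .max()
            .unwrap_or(0);
        memo.insert(pkg.to_string(), l);
        l
    }

    /// Among packages whose dependencies have all finished building, hand out the
    /// one with the highest level, so long chains are started as early as possible.
    fn dequeue(&mut self) -> Option<String> {
        let mut memo = HashMap::new();
        let ready: Vec<String> = self
            .waiting_on
            .iter()
            .filter(|(_, deps)| deps.is_empty())
            .map(|(pkg, _)| pkg.clone())
            .collect();
        let next = ready
            .into_iter()
            .max_by_key(|pkg| self.level(pkg, &mut memo))?;
        self.waiting_on.remove(&next);
        Some(next)
    }
}

fn main() {
    // The example from earlier in this issue: A -> {B, C}, B -> {D, E}.
    let set = |deps: &[&str]| -> HashSet<String> {
        deps.iter().map(|d| d.to_string()).collect()
    };
    let mut queue = Queue {
        waiting_on: HashMap::from([
            ("A".to_string(), set(&["B", "C"])),
            ("B".to_string(), set(&["D", "E"])),
            ("C".to_string(), set(&[])),
            ("D".to_string(), set(&[])),
            ("E".to_string(), set(&[])),
        ]),
    };
    // The deepest ready packages come out first: D or E, then the other one.
    println!("{:?}", queue.dequeue());
    println!("{:?}", queue.dequeue());
}
```

When a package actually finishes building it would of course also have to be removed from every other package's `waiting_on` set before the next `dequeue`; that bookkeeping (and anything resembling Cargo's real job queue) is left out of the sketch.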

@alexcrichton
Member

@lukaslueg sounds plausible to me! If we wanted to go a bit further, we could probably also estimate the cost of a crate via the size of all its input files. In any case the DependencyQueue basically hasn't changed since it was first added to Cargo, and IIRC it's not optimized at all and already does a bunch of linear work on each call to dequeue, so adding more probably can't hurt!

And yeah AFAIK there are no cycles there.
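
A crude version of that cost estimate could look something like the sketch below (a hypothetical `estimated_cost` helper, not a Cargo API): sum the sizes of the `.rs` files under a crate's source directory and use the total as a weight or tie-breaker alongside the depth heuristic.

```rust
use std::{fs, io, path::Path};

/// Very rough compile-cost proxy: total bytes of Rust source under `src_dir`.
/// (Hypothetical helper for illustration only; not part of Cargo.)
fn estimated_cost(src_dir: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(src_dir)? {
        let entry = entry?;
        let meta = entry.metadata()?;
        if meta.is_dir() {
            total += estimated_cost(&entry.path())?; // recurse into subdirectories
        } else if entry.path().extension().map_or(false, |ext| ext == "rs") {
            total += meta.len(); // file size in bytes as a stand-in for compile time
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // "src" here is just an example path.
    println!("estimated cost: {} bytes", estimated_cost(Path::new("src"))?);
    Ok(())
}
```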

alexcrichton added a commit to alexcrichton/cargo that referenced this issue Mar 1, 2018
Historically Cargo has been pretty naive about scheduling builds, basically just
greedily scheduling as much work as possible. As pointed out in rust-lang#5014, however,
this isn't guaranteed to always have the best results. If we've got a very deep
dependency tree that would otherwise fill up our CPUs, Cargo should ideally
schedule these dependencies first. That way, when we reach higher up in the
dependency tree, we should have more work available to fill in the cracks if
there are spare CPUs.

Closes rust-lang#5014
bors added a commit that referenced this issue Mar 1, 2018
Improve Cargo's scheduling of builds

Closes #5014
bors closed this as completed in #5100 on Mar 1, 2018