Improve Cargo's scheduling of builds #5100

Merged
merged 1 commit into from
Mar 1, 2018

Conversation

alexcrichton
Member

Historically Cargo has been pretty naive about scheduling builds, basically just
greedily scheduling as much work as possible. As pointed out in #5014, however,
this isn't guaranteed to always have the best results. If we've got a very deep
dependency tree that would otherwise fill up our CPUs, Cargo should ideally
schedule these dependencies first. That way when we reach higher up in the
dependency tree we should have more work available to fill in the cracks if
there are spare CPUs.
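
To illustrate the idea, here is a minimal sketch of depth-based prioritization. It is not Cargo's actual code: it assumes a precomputed depth for every job and a list of jobs whose dependencies are already built, and `pick_next`, `ready`, and `depth` are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;

/// Picks the ready job with the greatest precomputed depth.
/// Jobs missing from `depth` are treated as depth 0 (an assumption of this sketch).
fn pick_next<'a>(ready: &[&'a str], depth: &HashMap<&'a str, usize>) -> Option<&'a str> {
    ready
        .iter()
        .copied()
        .max_by_key(|name| depth.get(name).copied().unwrap_or(0))
}
```

With a greedy scheduler any ready job would do; here the job sitting deepest in the tree is started first, so shallower jobs remain available later to fill idle CPUs.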

Closes #5014

@rust-highfive

r? @matklad

(rust_highfive has picked a reviewer for you, use r? to override)

Member

@matklad matklad left a comment

I don't think this works as described, because the depth does not look like depth :-)

Some general thoughts:

It would be great if we could unit-test such algorithmic-y bits of Cargo. While I am absolutely in love with our integration tests suite, it's hard to use it to exercise such more internal bits. Another similar piece of computer-sciency stuff is, of course, the dependency resolution algorithm.

I wonder if it makes sense to invent some sort of "intermediate language" representation of the dependency graph, the patch section, features and other resolve-related stuff. If we can then dump and load this IR as JSON, it could help us write unit tests for stuff like "resolver does not do too many steps for this case" or "the optimal order of building these crates is this". It might also be useful for the build-plan work, and could unlock experimentation with SAT solvers for the resolver.


// add one to the depth of all our dependencies because, well, we're
// depending on them
*self.depth.entry(dep.clone()).or_insert(0) += 1;
Member

This... doesn't calculate depth at all? It looks more like an in-degree? There must be a max somewhere to calculate the depth.
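
The distinction shows up on a three-package chain. The sketch below mirrors the `+= 1` counting from the snippet above in a hypothetical standalone helper (`in_degree`, with `&str` package names assumed for brevity); it is an illustration, not the PR's code.

```rust
use std::collections::HashMap;

/// Counts how many packages directly depend on each package (its in-degree).
/// `deps` maps a package to its direct dependencies.
fn in_degree<'a>(deps: &HashMap<&'a str, Vec<&'a str>>) -> HashMap<&'a str, usize> {
    let mut counts: HashMap<&'a str, usize> = HashMap::new();
    for (pkg, direct) in deps {
        // Make sure packages nobody depends on still appear, with a count of 0.
        counts.entry(*pkg).or_insert(0);
        for dep in direct {
            // Incrementing once per dependent counts edges, not chain length.
            *counts.entry(*dep).or_insert(0) += 1;
        }
    }
    counts
}
```

For the chain `a -> b -> c` this yields 0 for `a` and 1 for both `b` and `c`, so `c` is indistinguishable from `b` even though it sits one level deeper. Getting a depth requires taking a maximum over dependents and adding one.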

}
self.depth.insert(key.clone(), 0);
Member

This might override the number already in hash map. If this can't happen due to the order in which dependencies are processed, let's add an assert here.
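
A sketch of what such an assert could look like, relying on the fact that `HashMap::insert` returns the previous value when the key was already present. The helper name `init_depth` and the `String` key type are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Records an initial depth of 0 for `key`.
/// Panics if a depth was already stored, instead of silently overwriting it.
fn init_depth(depth: &mut HashMap<String, usize>, key: &str) {
    let previous = depth.insert(key.to_string(), 0);
    assert!(previous.is_none(), "depth for {} was already recorded", key);
}
```

If the processing order really guarantees that the key is never present, the assert costs nothing in practice and documents the invariant.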

@matthiaskrgr
Member

Hmm, I tried this on my cargo-cache crate (it uses codegen-units = 1 and monolithic lto) on a dual core system and build time went from

    Finished release [optimized] target(s) in 312.25 secs
real	5m12,605s
user	10m13,616s
sys	0m11,779s

( cargo 0.26.0-nightly (1d6dfea 2018-01-26))
to

    Finished release [optimized] target(s) in 304.31 secs
real	5m4,656s
user	10m22,107s
sys	0m11,489s

( release profile compiled from git + this PR).

The cargo dependency was still compiled last though (although there should be room for better parallelism) :/

@matklad
Member

matklad commented Mar 1, 2018

> Hmm, I tried this on my cargo-cache crate (it uses codegen-units = 1 and monolithic lto) on a dual core system and build time went from

Yeah, prioritizing packages which have a lot of direct dependencies should improve scheduling I think, but the true depth heuristic should be even better.

@alexcrichton
Member Author

> I don't think this works as described, because the depth does not look like depth :-)

urgh too tired

> While I am absolutely in love with our integration tests suite, it's hard to use it to exercise such more internal bits.

There's nothing stopping us from using #[test] everywhere, it just isn't done that often right now. I can add a test for this.

@alexcrichton
Member Author

Updated!

depth(dep, map, results, cur_depth + 1);
}
}
}
Member

@matklad matklad Mar 1, 2018

I believe this is O(N^2) worst case. Not sure if it matters, but I couldn't resist the temptation of implementing a linear time dynamic programming algorithm :-)

    pub fn queue_finished(&mut self) {
        for key in self.dep_map.keys() {
            depth(key, &self.reverse_dep_map, &mut self.depth);
        }

        // Memoized depth: each key is computed once and cached in `results`.
        fn depth<K: Hash + Eq + Clone>(
            key: &K,
            map: &HashMap<K, HashSet<K>>,
            results: &mut HashMap<K, usize>,
        ) -> usize {
            // Sentinel for a key whose depth is still being computed.
            const IN_PROGRESS: usize = !0;

            if let Some(&depth) = results.get(key) {
                assert_ne!(depth, IN_PROGRESS, "cycle in DependencyQueue");
                return depth;
            }

            results.insert(key.clone(), IN_PROGRESS);

            // One more than the deepest in-edge, or 1 if there are none.
            let depth = map.get(key).into_iter().flat_map(|it| it)
                .map(|dep| depth(dep, map, results))
                .max().unwrap_or(0) + 1;

            *results.get_mut(key).unwrap() = depth;

            depth
        }
    }

The idea is to use the reverse map instead of the usual one, calculate the depth as the max over in-edges plus one, and do a sanity check for the absence of cycles just in case.
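
For trying the approach outside of Cargo, here is a standalone sketch of the same memoized recursion together with a small driver. The `all_depths` helper, the `&'static str` package names, and building the reverse map on the fly are assumptions of this example, not part of the PR.

```rust
use std::collections::{HashMap, HashSet};
use std::hash::Hash;

/// Memoized depth: 1 + the maximum depth over the packages that depend on `key`.
fn depth<K: Hash + Eq + Clone>(
    key: &K,
    reverse: &HashMap<K, HashSet<K>>,
    results: &mut HashMap<K, usize>,
) -> usize {
    // Sentinel marking a key whose depth is still being computed.
    const IN_PROGRESS: usize = usize::MAX;

    if let Some(&d) = results.get(key) {
        assert_ne!(d, IN_PROGRESS, "cycle in dependency graph");
        return d;
    }
    results.insert(key.clone(), IN_PROGRESS);

    let d = 1 + reverse
        .get(key)
        .into_iter()
        .flatten()
        .map(|parent| depth(parent, reverse, results))
        .max()
        .unwrap_or(0);

    results.insert(key.clone(), d);
    d
}

/// Computes a depth for every key of `deps` (package -> direct dependencies).
fn all_depths(deps: &HashMap<&'static str, Vec<&'static str>>) -> HashMap<&'static str, usize> {
    // Build the reverse map: dependency -> packages that depend on it.
    let mut reverse: HashMap<&str, HashSet<&str>> = HashMap::new();
    for (pkg, direct) in deps {
        for dep in direct {
            reverse.entry(*dep).or_default().insert(*pkg);
        }
    }
    let mut results = HashMap::new();
    for pkg in deps.keys() {
        depth(pkg, &reverse, &mut results);
    }
    results
}
```

Each key is inserted into `results` before its dependents are visited and is never recomputed afterwards, so every node and edge is handled once. For a graph where `app` depends on `mid` and `leaf_b`, and `mid` depends on `leaf_a`, the root `app` gets depth 1, `mid` and `leaf_b` get 2, and `leaf_a` gets 3.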

Member Author

Nice!

@matklad
Member

matklad commented Mar 1, 2018

@bors r+

@bors
Contributor

bors commented Mar 1, 2018

📌 Commit e54b5f8 has been approved by matklad

bors added a commit that referenced this pull request Mar 1, 2018
Improve Cargo's scheduling of builds

@bors
Contributor

bors commented Mar 1, 2018

⌛ Testing commit e54b5f8 with merge 558c7d2...

@bors
Contributor

bors commented Mar 1, 2018

☀️ Test successful - status-appveyor, status-travis
Approved by: matklad
Pushing 558c7d2 to master...

@bors bors merged commit e54b5f8 into rust-lang:master Mar 1, 2018
@matklad
Member

matklad commented Mar 1, 2018

@mattgathu it would be interesting to see the numbers with this PR merged! You can install Cargo off the master branch with

cargo install --git https://github.com/rust-lang/cargo --root /tmp

@matthiaskrgr
Member

@matklad was that for me?

@matthiaskrgr
Member

cargo 0.26.0-nightly (1d6dfea 2018-01-26)

    Finished release [optimized] target(s) in 309.39 secs

real	5m9,742s
user	10m11,713s
sys	0m14,417s

cargo from git ( 558c7d2 )

    Finished release [optimized] target(s) in 311.92 secs

real	5m12,308s
user	10m6,697s
sys	0m14,268s

so no noticeable improvements here. :/

@matthiaskrgr
Member

matthiaskrgr commented Mar 2, 2018

Hm, in my example cargo is still compiled last and alone (with no other crate in parallel) even after I added another entirely unneeded dep (tokio) to the crate deps.
So:

root_crate -req-> cargo
  [nothing req]      tokio

and it would still compile tokio and all its deps first and then cargo and then the base crate.

edit:

cargo tree:

cargo-cache v0.1.0 (file:///home/matthias/cargo-cache)
[dependencies]
├── cargo v0.24.0
│   [dependencies]
│   ├── atty v0.2.6
│   │   [dependencies]
│   │   └── libc v0.2.37
│   ├── crates-io v0.13.0
│   │   [dependencies]
│   │   ├── curl v0.4.11
│   │   │   [dependencies]
│   │   │   ├── curl-sys v0.4.1
│   │   │   │   [dependencies]
│   │   │   │   ├── libc v0.2.37 (*)
│   │   │   │   ├── libz-sys v1.0.18
│   │   │   │   │   [dependencies]
│   │   │   │   │   └── libc v0.2.37 (*)
│   │   │   │   │   [build-dependencies]
│   │   │   │   │   ├── cc v1.0.4
│   │   │   │   │   └── pkg-config v0.3.9
│   │   │   │   └── openssl-sys v0.9.27
│   │   │   │       [dependencies]
│   │   │   │       └── libc v0.2.37 (*)
│   │   │   │       [build-dependencies]
│   │   │   │       ├── cc v1.0.4 (*)
│   │   │   │       └── pkg-config v0.3.9 (*)
│   │   │   │   [build-dependencies]
│   │   │   │   ├── cc v1.0.4 (*)
│   │   │   │   └── pkg-config v0.3.9 (*)
│   │   │   ├── libc v0.2.37 (*)
│   │   │   ├── openssl-probe v0.1.2
│   │   │   ├── openssl-sys v0.9.27 (*)
│   │   │   └── socket2 v0.3.2
│   │   │       [dependencies]
│   │   │       ├── cfg-if v0.1.2
│   │   │       └── libc v0.2.37 (*)
│   │   ├── error-chain v0.11.0
│   │   │   [dependencies]
│   │   │   └── backtrace v0.3.5
│   │   │       [dependencies]
│   │   │       ├── backtrace-sys v0.1.16
│   │   │       │   [dependencies]
│   │   │       │   └── libc v0.2.37 (*)
│   │   │       │   [build-dependencies]
│   │   │       │   └── cc v1.0.4 (*)
│   │   │       ├── cfg-if v0.1.2 (*)
│   │   │       ├── libc v0.2.37 (*)
│   │   │       └── rustc-demangle v0.1.7
│   │   ├── serde v1.0.27
│   │   ├── serde_derive v1.0.27
│   │   │   [dependencies]
│   │   │   ├── quote v0.3.15
│   │   │   ├── serde_derive_internals v0.19.0
│   │   │   │   [dependencies]
│   │   │   │   ├── syn v0.11.11
│   │   │   │   │   [dependencies]
│   │   │   │   │   ├── quote v0.3.15 (*)
│   │   │   │   │   ├── synom v0.11.3
│   │   │   │   │   │   [dependencies]
│   │   │   │   │   │   └── unicode-xid v0.0.4
│   │   │   │   │   └── unicode-xid v0.0.4 (*)
│   │   │   │   └── synom v0.11.3 (*)
│   │   │   └── syn v0.11.11 (*)
│   │   ├── serde_json v1.0.10
│   │   │   [dependencies]
│   │   │   ├── dtoa v0.4.2
│   │   │   ├── itoa v0.3.4
│   │   │   ├── num-traits v0.2.1
│   │   │   └── serde v1.0.27 (*)
│   │   └── url v1.7.0
│   │       [dependencies]
│   │       ├── idna v0.1.4
│   │       │   [dependencies]
│   │       │   ├── matches v0.1.6
│   │       │   ├── unicode-bidi v0.3.4
│   │       │   │   [dependencies]
│   │       │   │   └── matches v0.1.6 (*)
│   │       │   └── unicode-normalization v0.1.5
│   │       ├── matches v0.1.6 (*)
│   │       └── percent-encoding v1.0.1
│   ├── crossbeam v0.3.2
│   ├── crypto-hash v0.3.0
│   │   [dependencies]
│   │   ├── hex v0.2.0
│   │   └── openssl v0.9.24
│   │       [dependencies]
│   │       ├── bitflags v0.9.1
│   │       ├── foreign-types v0.3.2
│   │       │   [dependencies]
│   │       │   └── foreign-types-shared v0.1.1
│   │       ├── lazy_static v1.0.0
│   │       ├── libc v0.2.37 (*)
│   │       └── openssl-sys v0.9.27 (*)
│   ├── curl v0.4.11 (*)
│   ├── docopt v0.8.3
│   │   [dependencies]
│   │   ├── lazy_static v1.0.0 (*)
│   │   ├── regex v0.2.6
│   │   │   [dependencies]
│   │   │   ├── aho-corasick v0.6.4
│   │   │   │   [dependencies]
│   │   │   │   └── memchr v2.0.1
│   │   │   │       [dependencies]
│   │   │   │       └── libc v0.2.37 (*)
│   │   │   ├── memchr v2.0.1 (*)
│   │   │   ├── regex-syntax v0.4.2
│   │   │   ├── thread_local v0.3.5
│   │   │   │   [dependencies]
│   │   │   │   ├── lazy_static v1.0.0 (*)
│   │   │   │   └── unreachable v1.0.0
│   │   │   │       [dependencies]
│   │   │   │       └── void v1.0.2
│   │   │   └── utf8-ranges v1.0.0
│   │   ├── serde v1.0.27 (*)
│   │   ├── serde_derive v1.0.27 (*)
│   │   └── strsim v0.6.0
│   ├── env_logger v0.4.3
│   │   [dependencies]
│   │   ├── log v0.3.9
│   │   │   [dependencies]
│   │   │   └── log v0.4.1
│   │   │       [dependencies]
│   │   │       └── cfg-if v0.1.2 (*)
│   │   └── regex v0.2.6 (*)
│   ├── error-chain v0.11.0 (*)
│   ├── filetime v0.1.15
│   │   [dependencies]
│   │   ├── cfg-if v0.1.2 (*)
│   │   └── libc v0.2.37 (*)
│   ├── flate2 v0.2.20
│   │   [dependencies]
│   │   ├── libc v0.2.37 (*)
│   │   └── miniz-sys v0.1.10
│   │       [dependencies]
│   │       └── libc v0.2.37 (*)
│   │       [build-dependencies]
│   │       └── cc v1.0.4 (*)
│   ├── fs2 v0.4.3
│   │   [dependencies]
│   │   └── libc v0.2.37 (*)
│   ├── git2 v0.6.11
│   │   [dependencies]
│   │   ├── bitflags v0.9.1 (*)
│   │   ├── libc v0.2.37 (*)
│   │   ├── libgit2-sys v0.6.19
│   │   │   [dependencies]
│   │   │   ├── curl-sys v0.4.1 (*)
│   │   │   ├── libc v0.2.37 (*)
│   │   │   ├── libssh2-sys v0.2.6
│   │   │   │   [dependencies]
│   │   │   │   ├── libc v0.2.37 (*)
│   │   │   │   ├── libz-sys v1.0.18 (*)
│   │   │   │   └── openssl-sys v0.9.27 (*)
│   │   │   │   [build-dependencies]
│   │   │   │   ├── cmake v0.1.29
│   │   │   │   │   [dependencies]
│   │   │   │   │   └── cc v1.0.4 (*)
│   │   │   │   └── pkg-config v0.3.9 (*)
│   │   │   ├── libz-sys v1.0.18 (*)
│   │   │   └── openssl-sys v0.9.27 (*)
│   │   │   [build-dependencies]
│   │   │   ├── cc v1.0.4 (*)
│   │   │   ├── cmake v0.1.29 (*)
│   │   │   └── pkg-config v0.3.9 (*)
│   │   ├── openssl-probe v0.1.2 (*)
│   │   ├── openssl-sys v0.9.27 (*)
│   │   └── url v1.7.0 (*)
│   ├── git2-curl v0.7.0
│   │   [dependencies]
│   │   ├── curl v0.4.11 (*)
│   │   ├── git2 v0.6.11 (*)
│   │   ├── log v0.3.9 (*)
│   │   └── url v1.7.0 (*)
│   ├── glob v0.2.11
│   ├── hex v0.2.0 (*)
│   ├── home v0.3.0
│   ├── ignore v0.2.2
│   │   [dependencies]
│   │   ├── crossbeam v0.2.12
│   │   ├── globset v0.2.1
│   │   │   [dependencies]
│   │   │   ├── aho-corasick v0.6.4 (*)
│   │   │   ├── fnv v1.0.6
│   │   │   ├── log v0.3.9 (*)
│   │   │   ├── memchr v2.0.1 (*)
│   │   │   └── regex v0.2.6 (*)
│   │   ├── lazy_static v0.2.11
│   │   ├── log v0.3.9 (*)
│   │   ├── memchr v1.0.2
│   │   │   [dependencies]
│   │   │   └── libc v0.2.37 (*)
│   │   ├── regex v0.2.6 (*)
│   │   ├── thread_local v0.3.5 (*)
│   │   └── walkdir v1.0.7
│   │       [dependencies]
│   │       └── same-file v0.1.3
│   ├── jobserver v0.1.9
│   │   [dependencies]
│   │   └── libc v0.2.37 (*)
│   ├── libc v0.2.37 (*)
│   ├── libgit2-sys v0.6.19 (*)
│   ├── log v0.3.9 (*)
│   ├── num_cpus v1.8.0
│   │   [dependencies]
│   │   └── libc v0.2.37 (*)
│   ├── same-file v0.1.3 (*)
│   ├── scoped-tls v0.1.0
│   ├── semver v0.8.0
│   │   [dependencies]
│   │   ├── semver-parser v0.7.0
│   │   └── serde v1.0.27 (*)
│   ├── serde v1.0.27 (*)
│   ├── serde_derive v1.0.27 (*)
│   ├── serde_ignored v0.0.4
│   │   [dependencies]
│   │   └── serde v1.0.27 (*)
│   ├── serde_json v1.0.10 (*)
│   ├── shell-escape v0.1.3
│   ├── tar v0.4.14
│   │   [dependencies]
│   │   ├── filetime v0.1.15 (*)
│   │   └── libc v0.2.37 (*)
│   ├── tempdir v0.3.6
│   │   [dependencies]
│   │   ├── rand v0.4.2
│   │   │   [dependencies]
│   │   │   └── libc v0.2.37 (*)
│   │   └── remove_dir_all v0.3.0
│   │       [dependencies]
│   │       ├── kernel32-sys v0.2.2
│   │       │   [dependencies]
│   │       │   └── winapi v0.2.8
│   │       │   [build-dependencies]
│   │       │   └── winapi-build v0.1.1
│   │       └── winapi v0.2.8 (*)
│   ├── termcolor v0.3.5
│   ├── toml v0.4.5
│   │   [dependencies]
│   │   └── serde v1.0.27 (*)
│   └── url v1.7.0 (*)
│   [dev-dependencies]
│   └── filetime v0.1.15 (*)
├── clap v2.30.0
│   [dependencies]
│   ├── ansi_term v0.10.2
│   ├── atty v0.2.6 (*)
│   ├── bitflags v1.0.1
│   ├── strsim v0.7.0
│   ├── textwrap v0.9.0
│   │   [dependencies]
│   │   └── unicode-width v0.1.4
│   ├── unicode-width v0.1.4 (*)
│   └── vec_map v0.8.0
├── fs2 v0.4.3 (*)
├── git2 v0.6.11 (*)
├── humansize v1.1.0
├── rayon v1.0.0
│   [dependencies]
│   ├── either v1.4.0
│   └── rayon-core v1.4.0
│       [dependencies]
│       ├── crossbeam-deque v0.2.0
│       │   [dependencies]
│       │   ├── crossbeam-epoch v0.3.0
│       │   │   [dependencies]
│       │   │   ├── arrayvec v0.4.7
│       │   │   │   [dependencies]
│       │   │   │   └── nodrop v0.1.12
│       │   │   ├── cfg-if v0.1.2 (*)
│       │   │   ├── crossbeam-utils v0.2.2
│       │   │   │   [dependencies]
│       │   │   │   └── cfg-if v0.1.2 (*)
│       │   │   ├── lazy_static v0.2.11 (*)
│       │   │   ├── memoffset v0.2.1
│       │   │   ├── nodrop v0.1.12 (*)
│       │   │   └── scopeguard v0.3.3
│       │   └── crossbeam-utils v0.2.2 (*)
│       ├── lazy_static v1.0.0 (*)
│       ├── libc v0.2.37 (*)
│       ├── num_cpus v1.8.0 (*)
│       └── rand v0.4.2 (*)
└── walkdir v2.1.4
    [dependencies]
    └── same-file v1.0.2

@alexcrichton alexcrichton deleted the better-dep-ordering branch March 6, 2018 16:53
@ehuss ehuss added this to the 1.26.0 milestone Feb 6, 2022

Successfully merging this pull request may close these issues.

achive better parallelism when building deps
6 participants